**Remote Sensing by Satellite Gravimetry**

Editors

**Thomas Gruber Annette Eicker Frank Flechtner**

MDPI • Basel • Beijing • Wuhan • Barcelona • Belgrade • Manchester • Tokyo • Cluj • Tianjin

*Editors* Thomas Gruber Technische Universität München Germany

Annette Eicker HafenCity University Hamburg Germany

Frank Flechtner Helmholtz-Zentrum Potsdam Deutsches GeoForschungsZentrum—GFZ Germany

*Editorial Office* MDPI St. Alban-Anlage 66 4052 Basel, Switzerland

This is a reprint of articles from the Special Issue published online in the open access journal *Remote Sensing* (ISSN 2072-4292) (available at: https://www.mdpi.com/journal/remotesensing/special_issues/satellite_gravimetry).

For citation purposes, cite each article independently as indicated on the article page online and as indicated below:

LastName, A.A.; LastName, B.B.; LastName, C.C. Article Title. *Journal Name* **Year**, *Volume Number*, Page Range.

**ISBN 978-3-0365-0008-9 (Hbk) ISBN 978-3-0365-0009-6 (PDF)**

© 2021 by the authors. Articles in this book are Open Access and distributed under the Creative Commons Attribution (CC BY) license, which allows users to download, copy and build upon published articles, as long as the author and publisher are properly credited, which ensures maximum dissemination and a wider impact of our publications.

The book as a whole is distributed by MDPI under the terms and conditions of the Creative Commons license CC BY-NC-ND.



## **About the Editors**

**Thomas Gruber** (Dr.) is a Senior Scientist at the Institute of Astronomical and Physical Geodesy at the Technical University of Munich. He coordinated the level-2 processing team for the GOCE mission and led several studies for future gravity field missions.

**Annette Eicker** (Prof. Dr.) is a Professor of "Geodesy and Adjustment Theory" at HafenCity University in Hamburg, Germany. She works on GRACE data analysis and geophysical applications and is currently the president of the Inter-Commission Committee for "Geodesy and Climate Research" of the International Association of Geodesy (IAG).

**Frank Flechtner** (Prof. Dr.) is Head of Section "Global Geomonitoring and Gravity Field" at the Helmholtz-Zentrum Potsdam Deutsches GeoForschungsZentrum and Professor of "Physical Geodesy" at the Technical University of Berlin. He was the GRACE Co-PI and is currently responsible for the operation of German-contributed mission elements for GRACE Follow-On.

## **Preface to "Remote Sensing by Satellite Gravimetry"**

During the last two decades, satellite gravimetry has become a new remote sensing technique providing a detailed global picture of the physical structure of the Earth. With the CHAMP, GRACE, and GOCE missions, spatial mass irregularities and the transport of mass in the Earth system could be systematically observed and monitored from space. With the GRACE Follow-On mission, launched in 2018, the time series of mass transport observations is continued for another number of years, which enables the disentanglement of anthropogenic and natural sources of climate change impact on the Earth system. A wide range of Earth science disciplines and operational observing systems benefit from these observations and have been enabled to improve their models and to get new insights into processes of the Earth system (e.g., water cycle, continental hydrology, ocean modelling, ice sheet and glacier melting, lithosphere modelling). The value of satellite gravimetry has been acknowledged by the international Earth science community in various resolutions, and in the meantime it is regarded as a new remote sensing tool, providing information that is complementary to other remote sensing techniques. Therefore, the Earth science community and space agencies are currently planning future satellite gravimetry missions in order to secure sustained observations of mass distribution and mass transport on a long-term basis with higher accuracy and better spatial and temporal resolution. This book is a collection of papers reporting results and research on various aspects of satellite gravimetry. It includes contributions about modelling the static and time-variable Earth gravity field using satellite gravimetry data; about design and capabilities of next-generation satellite gravimetry missions; and about results obtained in Earth science applications, namely for the cryosphere, the hydrosphere, and earthquake research.

> **Thomas Gruber, Annette Eicker, Frank Flechtner** *Editors*

## *Article* **The GFZ GRACE RL06 Monthly Gravity Field Time Series: Processing Details and Quality Assessment**

**Christoph Dahle 1,\*, Michael Murböck 1,2, Frank Flechtner 1,2, Henryk Dobslaw 1, Grzegorz Michalak 1, Karl Hans Neumayer 1, Oleh Abrykosov 1,†, Anton Reinhold 1, Rolf König 1, Roman Sulzbach 1,3 and Christoph Förste <sup>1</sup>**


Received: 10 August 2019; Accepted: 6 September 2019; Published: 11 September 2019

**Abstract:** Time-variable gravity field models derived from observations of the Gravity Recovery and Climate Experiment (GRACE) mission, whose science operations phase ended in June 2017 after more than 15 years, enabled a multitude of studies of Earth's surface mass transport processes and climate change. The German Research Centre for Geosciences (GFZ), routinely processing such monthly gravity fields as part of the GRACE Science Data System, has reprocessed the complete GRACE mission and released an improved GFZ GRACE RL06 monthly gravity field time series. This study provides an insight into the processing strategy of GFZ RL06 which has been considerably changed with respect to previous GFZ GRACE releases, and modifications relative to the precursor GFZ RL05a are described. The quality of the RL06 gravity field models is analyzed and discussed both in the spectral and spatial domain in comparison to the RL05a time series. All results indicate significant improvements of about 40% in terms of reduced noise. It is also shown that the GFZ RL06 time series is a step forward in terms of consistency, and that errors of the gravity field coefficients are more realistic. These findings are confirmed as well by independent validation of the monthly GRACE models, as done in this work by means of ocean bottom pressure in situ observations and orbit tests with the GOCE satellite. Thus, the GFZ GRACE RL06 time series allows for a better quantification of mass changes in the Earth system.

**Keywords:** satellite gravimetry; GRACE; Level-2 processing; time-variable gravity field; mass change monitoring

#### **1. Introduction**

During its successful science operations phase of more than 15 years (April 2002 through June 2017), the Gravity Recovery and Climate Experiment (GRACE) mission enabled breakthroughs in monitoring the terrestrial water cycle (e.g., [1,2]), ice sheet and glacier mass balances (e.g., [3,4]), sea-level change (e.g., [5,6]) and ocean bottom pressure variations (e.g., [7,8]). A comprehensive overview of numerous other GRACE-related studies and their contributions to understanding changes in the global climate system is given by Tapley et al. [9]. These results are based on time-variable, in general monthly, global gravity field models.

Such models, so-called GRACE Level-2 products, are routinely generated by the joint US-German GRACE Science Data System (SDS) consisting of the Center for Space Research at the University of Texas at Austin (CSR), NASA's Jet Propulsion Laboratory (JPL), and the German Research Centre for Geosciences (GFZ). To provide consistent long-term gravity field time series of the highest possible quality to the user community, the SDS has recently reprocessed its gravity field solutions over the complete GRACE mission duration. This latest reprocessing, referred to as release 06 (RL06) [10–12], comprises updated background models and processing standards which are also applied to consistently process the first release of gravity field solutions based on data from the GRACE Follow-On (GRACE-FO) mission [13]. The GRACE-FO satellites were successfully launched in May 2018 and are designed to continue the unique GRACE data record for at least five additional years.

Apart from the SDS, recent GRACE gravity field time series are provided by other processing centers, e.g., ITSG-Grace2018 [14], CNES/GRGS RL04 [15], AIUB RL02 [16], or Tongji-Grace2018 [17].

The purpose of this work is to demonstrate the progress that has been achieved with the current GFZ GRACE RL06 time series compared to its precursor GFZ RL05a [18]. First, an overview of the GRACE gravity field processing procedure at GFZ is given and modifications implemented within the RL06 reprocessing are described (Section 2). Thereafter, results are discussed with focus on the internal quality in the spectral and spatial domain comparing RL06 with RL05a (Section 3). Additionally, both time series are validated by external data (Section 4). Finally, conclusions are drawn, and the main findings of this work are summarized (Section 5). All results confirm that GFZ's RL06 time series is a significant step forward in terms of accuracy, consistency and reliability and thus enables a better quantification of climate change related phenomena in the Earth system.

#### **2. GRACE Level-2 Gravity Field Processing at GFZ**

GRACE global monthly gravity field recovery at GFZ is based on the so-called "dynamical approach" using GFZ's Earth Parameter and Orbit System (EPOS) software package (https://www.gfz-potsdam.de/en/section/global-geomonitoring-and-gravity-field/topics/earth-system-parameters-and-orbit-dynamics/earth-parameter-and-orbit-system-software-epos/). Underlying satellite orbit perturbations rely on a precise numerical orbit integration taking into account all reference system and force model related quantities [19]. The integrated orbit is then fitted to the GRACE tracking observations, i.e., GPS code and carrier phase observations and K-band inter-satellite ranging data between the two GRACE satellites. This step is done in a least-squares adjustment process solving iteratively for both satellites' state vector at the beginning of each arc, observation-specific parameters, in particular GPS receiver clock offsets, GPS carrier phase ambiguities and calibration parameters for the accelerometers, and other arc-specific parameters such as empirical accelerations. The term "arc" refers to the time length of the integrated orbit starting with one initial state vector, which is typically one day. After convergence of the initial orbit adjustment with the a priori force models, the observation equations are extended by partial derivatives for the unknown global parameters describing the gravitational potential, represented by spherical harmonic (SH) gravity field coefficients. Arc-by-arc normal equation (NEQ) systems are generated in this way from the observation equations and accumulated over nominally one month to one overall system which is then solved by matrix inversion. For the complete GRACE mission 163 of these NEQ systems and corresponding Level-2 gravity field products have been derived.
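The accumulation of arc-wise NEQ systems into one monthly system can be illustrated with a minimal sketch. It is not the EPOS implementation: the function names, the diagonal weight matrix and the tiny two-parameter example are assumptions made for illustration, and the reduction of arc-specific parameters before stacking is ignored.

```python
def normal_equations(A, weights, l):
    """Return (N, b) with N = A^T P A and b = A^T P l for a diagonal P."""
    n_par = len(A[0])
    N = [[0.0] * n_par for _ in range(n_par)]
    b = [0.0] * n_par
    for row, w, obs in zip(A, weights, l):
        for i in range(n_par):
            b[i] += w * row[i] * obs
            for j in range(n_par):
                N[i][j] += w * row[i] * row[j]
    return N, b


def accumulate(neqs):
    """Sum arc-wise (N, b) pairs into one overall system."""
    n_par = len(neqs[0][1])
    N = [[0.0] * n_par for _ in range(n_par)]
    b = [0.0] * n_par
    for N_arc, b_arc in neqs:
        for i in range(n_par):
            b[i] += b_arc[i]
            for j in range(n_par):
                N[i][j] += N_arc[i][j]
    return N, b


def solve(N, b):
    """Solve N x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [rhs] for row, rhs in zip(N, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x


# Hypothetical example: two "arcs" observing the same two global parameters.
arc1 = normal_equations([[1.0, 0.0], [1.0, 1.0]], [1.0, 1.0], [1.0, -1.0])
arc2 = normal_equations([[0.0, 1.0], [2.0, 1.0]], [1.0, 1.0], [-2.0, 0.0])
N_month, b_month = accumulate([arc1, arc2])
x = solve(N_month, b_month)  # estimate of the two global parameters
```

Because the normal equations are additive, the monthly solution obtained this way is identical to the one from a single adjustment over all observations of the month, provided the arcs share the same global parameters.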

In the following subsections, specific aspects relevant for GRACE gravity field processing are discussed and modifications in GFZ's RL06 processing relative to the processing of the previous release GFZ RL05a are described.

#### *2.1. GPS Constellation*

Since GPS tracking data are processed, the precise orbits and sender clock offsets of the GPS constellation, nominally consisting of 32 satellites, must be known. Using GFZ's EPOS software, it has been demonstrated by König [20] that an integrated processing of both GPS satellites and the GRACE satellites, i.e., simultaneous orbit determination and parameter adjustment at the observation level, is beneficial for the determination of the terrestrial reference frame (TRF). However, the dynamic part of the TRF in König [20] is limited to the SH gravity field coefficients of degrees one and two and all parameters have been estimated daily. When expanding the parameter space of the gravity field to a maximum degree and order (d/o) of 90 or higher, as is typical for monthly GRACE gravity field solutions and which requires daily NEQs to be stacked to obtain a monthly solution, such an integrated approach becomes quite demanding in terms of computational efficiency and proper separation of daily and monthly Earth system and other parameters. Thus, for GRACE gravity field recovery, it is common practice to determine precise GPS orbits and clock offsets beforehand using GPS tracking data from a globally distributed ground station network, and then keep them fixed in the subsequent GRACE orbit and gravity field adjustment process. At GFZ, the GPS constellation used for gravity field determination is traditionally generated in-house, allowing for the best possible consistency.

For processing GRACE RL06, the GPS constellation has been reprocessed as well. Compared to the previous RL05 GPS constellation, the following changes have been implemented: (1) application of the ITRF2014 reference frame realized by the IGS14 ground station network (https://mediatum.ub.tum.de/doc/1341338/1341338.pdf) instead of ITRF2008/IGS08; (2) increase in the number of GPS ground stations from approx. 70 to approx. 120 to 140; (3) improved solar radiation pressure parameterization; and (4) adaptation of the background models according to GRACE RL06 standards (see Section 2.3). To assess the level of accuracy of GFZ's RL05 and RL06 GPS constellations, daily root mean square (RMS) values of position differences with respect to final orbits provided by the International GNSS Service (IGS) for all available GPS satellites are calculated. During the GRACE mission period, these 3D RMS values are typically in the range of 3 to 5 cm in the first years until end of 2006 and around 3 cm for all years thereafter for the RL06 constellation. The corresponding global 1D RMS over the whole 15 years is 1.96 cm for RL06 and 2.28 cm for RL05, revealing that the current RL06 constellation is closer to and more consistent with the official IGS products compared to its predecessor.
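The two RMS measures quoted above can be related with a short sketch. It assumes a common convention that is not spelled out in the text: the 3D RMS is taken over the lengths of the position difference vectors, and the 1D RMS pools all three components, so that the 1D value equals the 3D value divided by the square root of three. The function names and the sample differences are hypothetical.

```python
import math


def rms_3d(diffs):
    """RMS of the 3D position difference lengths (same unit as the input)."""
    total = sum(dx * dx + dy * dy + dz * dz for dx, dy, dz in diffs)
    return math.sqrt(total / len(diffs))


def rms_1d(diffs):
    """RMS over all individual components, i.e. rms_3d / sqrt(3)."""
    total = sum(dx * dx + dy * dy + dz * dz for dx, dy, dz in diffs)
    return math.sqrt(total / (3 * len(diffs)))


# Hypothetical position differences (in cm) between two orbit products.
diffs_cm = [(2.0, -1.5, 1.0), (-1.0, 2.5, -2.0), (1.5, 1.0, -1.0)]
ratio = rms_3d(diffs_cm) / rms_1d(diffs_cm)  # equals sqrt(3) by construction
```

Under this convention, a 1D RMS of 1.96 cm corresponds to a 3D RMS of about 3.4 cm, which is in line with the range of 3D values quoted for the RL06 constellation.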

#### *2.2. GRACE Observations*

Both GFZ RL05a and RL06 are based on official GRACE Level-1B (L1B) instrument data processed and provided by JPL [21]. In particular, the following L1B observations are used:


For GFZ RL05a, L1B RL02 data were used. The same RL02 data are used for RL06 in case of ACC1B and GPS1B. Regarding KBR1B and SCA1B, an improved L1B RL03 dataset [22] has been made available by JPL and is used for RL06. At the end of the mission, i.e., during the period November 2016 through June 2017, the ACC instrument aboard GRACE-B was turned off due to battery issues, and thus transplanted ACC observations from GRACE-A have to be used (in the following, this period is denoted by "GRACE single ACC"). For RL05a, a simple ACC data transplant was used, which had only attitude and time corrections applied. As part of JPL's L1B RL03 dataset, an improved transplant version [23], additionally corrected for thruster spikes, has been made available and is used for RL06.

Before orbit and gravity field determination with EPOS, the L1B data are preprocessed as follows for RL05a: (1) KRR observations are modified by adding the light time correction and the antenna offset correction (both also taken from the KBR1B products); (2) GPS observations are cleaned, phase cycle slips are detected, and the data are downsampled to 30 s; (3) ACC observations are downsampled from 1 Hz to 0.2 Hz by simple decimation and data gaps < 100 s are interpolated; and (4) data gaps < 500 s in the SCA1B data are interpolated. For RL06, the only modification concerns (4): SCA1B data are downsampled from 1 Hz to 0.2 Hz, smoothed and data gaps are filled, all done simultaneously using spherical quadrangle interpolation (SQUAD). After preprocessing, the L1B observations are analyzed for data gaps caused by satellite-specific events such as maneuvers and reboots of the Instrument Processing Unit (IPU). In case of larger gaps, the nominally one day long arcs are split into two or more shorter arcs over partial days. The minimum length of an arc, however, is set to three hours. To avoid any unwanted effects in the observations around such events, a margin of ten minutes is applied before and after data gaps when defining the final arcs.
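The arc definition rules described above (splitting at data gaps, a ten-minute margin on both sides of a gap, and a minimum arc length of three hours) can be sketched as follows. The function name and the representation of gaps as (start, end) pairs in seconds of day are assumptions for illustration. The text does not state how large a gap must be to force a split, so the sketch assumes that only such gaps are passed in.

```python
DAY_S = 86400          # nominal arc length: one day
MARGIN_S = 600         # ten-minute margin before and after each gap
MIN_ARC_S = 3 * 3600   # minimum arc length: three hours


def split_arcs(gaps, start=0, end=DAY_S, margin=MARGIN_S, min_arc=MIN_ARC_S):
    """Split [start, end] into arcs that avoid the given gaps plus margins.

    gaps: list of (gap_start, gap_end) in seconds, assumed non-overlapping.
    Candidate arcs shorter than min_arc are discarded.
    """
    arcs = []
    t = start
    for gap_start, gap_end in sorted(gaps):
        arc_end = gap_start - margin
        if arc_end - t >= min_arc:
            arcs.append((t, arc_end))
        t = max(t, gap_end + margin)
    if end - t >= min_arc:
        arcs.append((t, end))
    return arcs


# Hypothetical day with one mid-day gap and one gap shortly after midnight.
arcs_midday = split_arcs([(40000, 41000)])
arcs_early = split_arcs([(5000, 6000)])
```

In the second example the piece before the gap is shorter than three hours and is therefore dropped, leaving a single arc that starts ten minutes after the gap.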

#### *2.3. Background Models*

By definition, GRACE gravity field models represent geophysical signals caused by variations in the terrestrial water storage, mass loss in polar ice sheets and inland glacier systems, ocean mass variations, glacial isostatic adjustment and large earthquakes. Consequently, gravity variations caused by solid Earth and pole tides, atmosphere and ocean tides or short-term non-tidal atmospheric and oceanic mass variations are not supposed to be included in the gravity field solutions and are therefore taken into account during the data processing via background models. On the other hand, any error contained in the background models degrades the quality of the gravity field solutions, especially at low frequencies [24]. Thus, some of these background models have been updated for processing the GFZ RL06 time series (Table 1).


**Table 1.** Overview of background models used for GFZ RL05a and RL06 processing.

Among the background models listed in Table 1, the choice of the Atmosphere and Ocean De-aliasing (AOD) and the ocean tide model has the most impact on the quality of the monthly GRACE solutions and hence their geophysical interpretation [24]. The AOD1B model is an official GRACE L1B product routinely generated at GFZ. Compared to its precursor AOD1B RL05, the most recent release AOD1B RL06 has higher temporal (3-hourly vs. 6-hourly) and spatial (maximum SH d/o 180 vs. 100) resolution. Further details are described by Dobslaw et al. [31], where also improvements in GRACE gravity field processing by means of variance reduction of K-band range-acceleration residuals are already reported when using AOD1B RL06 instead of RL05. Regarding ocean tides, similar improvements are observed when using FES2014 instead of EOT11a. Generally, the choice of the ocean tide model does not significantly affect the quality of individual monthly GRACE solutions, but regional deficiencies, especially at high latitudes, can become visible in GRACE time series analysis. In this context, Ray et al. [34] reported that the FES2014 model, used for GFZ RL06, shows improvements in most of the polar regions compared to its precursor and performs comparable to other state-of-the-art ocean tide models.

#### *2.4. Processing Strategy*

The processing scheme of GRACE gravity field processing at GFZ consists of the following steps: (1) GPS data editing; (2) KRR data editing; (3) generation of a priori orbits; (4) generation of arc-wise NEQs; and (5) accumulation to and solving of monthly NEQs. For all previous GFZ GRACE releases from RL01 to RL05a, the way these steps are performed is more or less identical and has been described by Schmidt [35]. For the current GFZ GRACE RL06 processing, the strategy has been considerably changed and is described in this section. An overview of the GFZ RL05a and RL06 processing strategies is given in Table 2.


**Table 2.** Overview of GFZ RL05a and RL06 processing strategies.

Editing of GPS data is done by automated elimination during an iterative precise orbit determination (POD). The elimination is based both on an *n*-sigma criterion, with varying values for *n* in the different iterations, and additionally an absolute threshold for the size of GPS residuals. The main difference in the GPS data editing step is that for RL06 the POD for both GRACE satellites is done completely independently from each other in contrast to RL05a, where down-weighted KRR observations were included and a common POD for GRACE-A and -B was applied. The goal for RL06 is to obtain best possible absolute orbit accuracy for each of the two spacecraft and thus to avoid that the elimination of GPS observations for one satellite is influenced by the other. Another difference concerns the empirical corrections for GPS phase center variations (PCV): For RL05a, a certain unvaried PCV correction based on one month (April 2008) of GPS residuals has been applied for the whole time series, whereas for RL06, monthly PCV corrections are computed from the corresponding monthly residuals. A novelty of RL06 is that also GPS code residual variation maps per month are computed and applied. In general, these empirical corrections are very stable over time, but they are significantly affected by systematic patterns whenever radio occultation measurements are activated onboard one of the GRACE satellites (Figure 1). Because activation and deactivation of these measurements has occurred several times throughout the GRACE mission (mostly related to satellite swap maneuvers), a monthly computation of these corrections has been chosen as the preferable option. Finally, the a priori weights for GPS phase and code observations have been changed from RL05a to RL06 to better reflect the actual level of the RMS of the corresponding residuals (see Table 2).
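The combination of an *n*-sigma criterion with an absolute residual threshold can be sketched as below. This is a simplified illustration under stated assumptions: the values of *n* per iteration and the absolute limit are hypothetical (the text does not give them), and the residuals are kept fixed between iterations, whereas in an actual POD the orbit is re-estimated and the residuals change after each elimination.

```python
import math


def edit_residuals(residuals, n_sigmas=(5.0, 4.0, 3.0), abs_limit=0.05):
    """Return a list of keep-flags after iterative n-sigma editing.

    residuals: observation residuals (e.g., GPS phase residuals in m).
    n_sigmas:  n for each iteration (hypothetical values).
    abs_limit: absolute threshold on |residual| (hypothetical value).
    """
    keep = [abs(r) <= abs_limit for r in residuals]
    for n in n_sigmas:
        kept = [r for r, k in zip(residuals, keep) if k]
        if not kept:
            break
        rms = math.sqrt(sum(r * r for r in kept) / len(kept))
        keep = [k and abs(r) <= n * rms for r, k in zip(residuals, keep)]
    return keep


# Hypothetical residuals: 20 good values, one gross outlier, one marginal one.
res = [0.001] * 20 + [0.2, 0.02]
flags = edit_residuals(res)
```

In this example the gross outlier is removed by the absolute threshold before any RMS is computed, so it cannot inflate the first *n*-sigma bound, and the marginal value is removed in a later, tighter iteration.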

**Figure 1.** Code residual variation maps of GRACE-A (**top**) and GRACE-B (**bottom**) for June 2014 (**left**) and August 2014 (**right**). Radio occultation measurements on GRACE-A were activated in June 2014 and deactivated in August 2014 and vice versa on GRACE-B, after a satellite swap in July 2014.

For KRR data editing, a common POD for GRACE-A and -B is performed. In case of RL05a, where down-weighted KRR observations were already included in the previous step, the KRR weight was increased by setting the a priori standard deviation to 0.1 μm/s, and elimination of KRR observations was automatically done based on an 8-sigma criterion. Further editing of GPS observations during this step was not explicitly turned off. In contrast, for RL06 KRR observations are only introduced in this step, with a slightly increased a priori standard deviation of 0.3 μm/s which, as done similarly for GPS, better matches the corresponding orbit fit RMS. No automated elimination at all is applied; instead, possible elimination of KRR observations, if needed, is based on visual inspection of the KRR residuals. Usually, observations are edited if the residuals exceed a threshold of 3 μm/s, but this threshold is not fixed and other editing criteria such as anomalous behavior of the KRR residuals over a certain period of time within one arc might be applied. By modifying the KRR data editing as described, inadvertent elimination of observations over areas with large geophysical signals is avoided and about 1% more KRR observations (approx. 0.3 days accumulated over one month) remain and are used for gravity field determination in the case of RL06.

The purpose of generating the a priori orbit is to assure that convergence of a POD using the edited GPS and KRR observations and iterated orbit parameter estimates is reached after one iteration. If this condition is fulfilled, arc-wise NEQs are set up starting with the observations and initial parameters from the a priori orbit run. In case of RL05a, the observation weights from the previous data editing steps remained unchanged, whereas for RL06 individual arc-wise weighting for GPS and KRR observations has been introduced. Arc-wise a priori standard deviations are based on the RMS of the corresponding residuals after the KRR editing run: In case of KRR, they are equal to the RMS values; for GPS, the RMS values are multiplied by an empirical factor of 7. This intentional down-weighting of GPS observations relative to the K-band observations is necessary to obtain better gravity field solutions and is also done similarly by other processing centers (see, e.g., [10,16]).
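The arc-wise weighting rule can be written down in a few lines. The sketch assumes that the weight of an observation is the inverse of its a priori variance; the function names and sample residuals are hypothetical, while the factor of 7 for GPS is taken from the text.

```python
import math

GPS_FACTOR = 7.0  # empirical down-weighting factor for GPS (from the text)


def rms(values):
    """Root mean square of a list of residuals."""
    return math.sqrt(sum(v * v for v in values) / len(values))


def arc_sigmas(krr_residuals, gps_residuals):
    """A priori standard deviations for one arc from post-editing residuals."""
    return {
        "krr": rms(krr_residuals),
        "gps": GPS_FACTOR * rms(gps_residuals),
    }


def weight(sigma):
    """Observation weight as inverse a priori variance (assumed convention)."""
    return 1.0 / sigma ** 2


# Hypothetical residuals for one arc: KRR in m/s, GPS phase in m.
sig = arc_sigmas([3e-7, -3e-7, 3e-7], [0.003, -0.003, 0.003])
gps_weight_ratio = weight(sig["gps"]) / weight(rms([0.003, -0.003, 0.003]))
```

With this convention, scaling the GPS standard deviation by 7 reduces the GPS weights by a factor of 49 relative to weights based on the residual RMS alone.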

Another modification from RL05a to RL06 regards the time-variable a priori gravity field which in case of RL05a was used as background model also during these two last-mentioned processing steps. This required restoring of the monthly mean of these a priori fields to the estimated monthly GRACE solution to provide users a Level-2 gravity field product which contains the full time-variable signal as expected per definition. For RL06, the time-variable gravity field background model is only used during the data editing steps, where background modeling is desired to be as realistic as possible, but not during the steps where gravity field determination takes place. Thus, any possible bias that might be introduced by applying a remove-restore procedure as described for RL05a is avoided for RL06.

#### *2.5. Parametrization*

In conjunction with the processing strategy, the orbit and instrument parametrization has also been significantly changed from GFZ RL05a to RL06 (see Table 3).

In RL05a processing, empirical accelerations were only set up and estimated during the GPS data editing step. In the subsequent steps, these parameters were completely removed, and empirical K-band parameters were introduced [35]. For RL06, empirical accelerations are set up more frequently (once per orbital revolution) and remain as parameters throughout all processing steps to assure a consistent parametrization from GPS editing until gravity field estimation. To avoid that these parameters absorb too much gravity field signal, an a priori standard deviation of 1E-8 m/s<sup>2</sup> is applied as constraint. In contrast to RL05a, no K-band parameters at all are set up in RL06. Internal tests have shown that the additional estimation of K-band parameters does not significantly impact the estimated gravity field solution, but leads to a degradation in orbit accuracy.

Further modifications from RL05a to RL06 affect the ACC instrument parameters. Regarding the ACC biases, their number has been decreased. Now for RL06, usually three biases per arc are estimated in along-track and radial direction (at the beginning, middle, and end of the arc), and nine in cross-track direction (at the beginning and end, and equally spaced in between). The number of biases can become less in case the arc length is shorter than 24 h, as the minimum spacing between two biases is set to three hours. Between the epochs where biases are estimated, the bias is modeled as a natural cubic spline function (RL05a: linear interpolation). Regarding ACC scale factors, one per arc and direction is estimated in case of RL06. This is a lesson learned from RL05a processing, where for most of the time series the scales were fixed to one, which turned out to be not optimal especially in the later years of the GRACE mission. Therefore, this was modified within RL05a to estimating 3-hourly scales, which helped to improve the quality of the RL05a solutions, but on the other hand tends to over-parameterize the gravity field estimation process. Another notable difference between RL05a and RL06 is the parametrization during the "GRACE single ACC" period: Whereas for RL05a the parametrization was the same as for the rest of the GRACE mission, a fully populated scale factor matrix is estimated per arc in case of RL06. This has been proposed first by Klinger and Mayer-Gürr [36] and is applied to the complete ITSG-Grace2016/2018 time series, as well as to other recent releases such as the CSR RL06 [10] and JPL RL06 [11] time series. For GFZ RL06, the additional six parameters, i.e., the off-diagonal elements of the scale factor matrix, are constrained with an a priori standard deviation of 1E-3 since otherwise, the inversion of the NEQ system becomes unstable and in many cases would fail.
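The placement of the ACC bias epochs within an arc can be sketched as follows. The nominal counts (three for along-track and radial, nine for cross-track) and the minimum spacing of three hours are taken from the text. How the count is reduced for shorter arcs is not specified, so the rule used here (the largest count that still respects the minimum spacing) is an assumption, as is the function name.

```python
def bias_epochs(arc_length_h, nominal_count, min_spacing_h=3.0):
    """Return equally spaced bias epochs (hours since arc start).

    The count is reduced (assumed rule) so that neighboring epochs are
    at least min_spacing_h apart; at least start and end are kept.
    """
    max_count = int(arc_length_h // min_spacing_h) + 1
    count = max(2, min(nominal_count, max_count))
    step = arc_length_h / (count - 1)
    return [i * step for i in range(count)]


cross_track_24h = bias_epochs(24.0, 9)   # nominal arc, cross-track
along_track_24h = bias_epochs(24.0, 3)   # nominal arc, along-track or radial
cross_track_12h = bias_epochs(12.0, 9)   # shortened arc
```

For the nominal 24 h arc, nine equally spaced cross-track biases are exactly three hours apart, which matches the stated minimum spacing.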

In summary, the total number of orbit and instrument parameters described above has decreased for RL06 compared to most of the RL05a solutions, and is nearly identical compared to the RL05a solutions with modified ACC parametrization. Moreover, as already mentioned, the parametrization is now consistent during all processing steps. It has to be mentioned as well that there are other parameters not listed in Table 3, namely GPS receiver clock offsets (2880 per 1-day arc and satellite) and GPS phase ambiguities (approx. 400 per 1-day arc and satellite). However, these two parameter groups are pre-eliminated before gravity field estimation and there are no differences in their treatment between RL05a and RL06 processing.

Finally, regarding the main parameters of interest, i.e., the gravity field parameters, the maximum SH d/o estimated was slightly increased from 90 (RL05a) to 96 (RL06). New with GFZ RL06, an independently estimated GRACE time series only up to d/o 60 is provided additionally. These two decisions have been made jointly within the GRACE SDS to ensure better consistency between the three SDS time series than was the case with RL05. Furthermore, both KRR and GPS observations contribute to the full spectrum of the monthly gravity field estimates in GFZ RL06, whereas in GFZ RL05a the contribution of GPS was limited to SH d/o 80.
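To give a feeling for the parameter counts involved, the following sketch counts SH coefficients up to a given maximum d/o and the number of clock epochs per arc. The minimum degree of 2 is an assumption (the text does not state which low degrees are estimated). The link between the 2880 clock offsets per 1-day arc and the 30 s GPS sampling is an inference from the numbers given in Sections 2.2 and 2.5, not a statement from the text.

```python
def sh_coefficient_count(n_max, n_min=2):
    """Number of SH coefficients C_nm, S_nm for degrees n_min..n_max.

    Each degree n contributes 2n + 1 coefficients (S_n0 is excluded).
    """
    return (n_max + 1) ** 2 - n_min ** 2


def clock_epochs(arc_seconds=86400, sampling_seconds=30):
    """Number of observation epochs in an arc at the given sampling."""
    return arc_seconds // sampling_seconds


counts = {n_max: sh_coefficient_count(n_max) for n_max in (60, 90, 96)}
epochs_per_day = clock_epochs()
```

Under the assumed minimum degree, raising the maximum d/o from 90 to 96 adds a little over a thousand gravity field parameters per monthly solution.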

**Table 3.** Number and properties of GFZ RL05a and RL06 orbit and instrument parameters (numbers are per arc representative for the nominal arc length of 24 h).


R, T, N: radial, along-track and cross-track direction in SRF. (1) period from 2003/01 through 2013/05; (2) period from 2002/04 through 2002/12 and 2013/06 through 2017/06; (3) period from 2002/04 through 2016/08; (4) period from 2016/11 through 2017/06 ("GRACE single ACC").

#### *2.6. Orbit Quality*

The quality of the GRACE orbits determined prior to gravity field estimation was not a focus of GFZ RL05a processing. For GFZ RL06, the modifications in processing strategy and parametrization are motivated not only by the goal of obtaining improved gravity fields, but also by that of obtaining orbits of high quality.

A first indication that this goal is reached is given by GPS phase and code residuals of the GPS editing runs which are extremely stable during the whole mission with arc-wise RMS values of about 3 mm and 40 cm, respectively (for the "GRACE single ACC" period, only a slight increase is observed). In contrast to RL05a, the size of GPS residuals does not increase at all when adding KRR observations in the subsequent processing steps which can be attributed to the consistent parametrization.

Another independent orbit validation during RL06 processing is routinely done by means of satellite laser ranging (SLR) observations from ground stations to the GRACE satellites, provided by the International Laser Ranging Service (ILRS). Coarse outliers in the SLR observations are eliminated by a 20 cm threshold. Mean values and standard deviations of GRACE-A and -B SLR residuals per year are shown in Figure 2 for all available stations and a subset of high-quality stations. The definition of such a subset has been outlined by Arnold et al. [37] and the same 12 stations are used here for better comparability. These high-quality stations contribute between 50% and 75% of all available observations. It can be seen that the standard deviations are in the range of 20 mm to 25 mm for all stations and about 15 mm for the high-quality stations. For the year 2010, the standard deviations for GRACE-A are 24.5 mm and 13.7 mm, respectively, which agrees very well with values of 24.4 mm and 12.3 mm, respectively, reported in [37]. As for the GPS residuals, the values are the same whether only GPS observations are used or KRR observations are added, and are also very similar for both GRACE satellites. Increased standard deviations, in particular for recent years, are observed for the a priori orbits, which can be explained by the fact that these orbits are determined without a time-variable gravity background model and GPS observations are down-weighted. The mean values of GRACE-A and -B are also very consistent, independent of the processing step or whether all or the high-quality stations are evaluated. They are mostly in the range of −10 mm to −15 mm, which is relatively large. However, this is not necessarily due to the orbit quality, but may also be caused by incorrect values for the GPS phase center offset (PCO) of the GRACE GPS navigation antennas. For GFZ RL06, PCOs relative to the antenna reference point provided by Montenbruck et al. [38] are used. A geometrical offset (distance between the satellites' center of mass and the antenna reference point) of −444 mm is added (only for the z-component in the SRF), resulting in a total L3 PCO in z-direction of −391.7 mm. This value is approx. 22 mm larger than the corresponding value derived from the official GRACE L1B vector offset product for the GPS main antenna (VGN1B, see [21]), which might at least partly explain the relatively large negative offsets reported here.
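As a short check of the numbers above, the z-component of the L3 PCO relative to the antenna reference point that is implied by the two quoted values follows from a simple difference. The symbol names are introduced here for illustration only, and the result is derived from the quoted totals rather than taken from [38]:

$$
z_{\mathrm{PCO}}^{\mathrm{ARP}} = z_{\mathrm{PCO}}^{\mathrm{total}} - z_{\mathrm{geom}} = -391.7\ \mathrm{mm} - (-444\ \mathrm{mm}) = +52.3\ \mathrm{mm}
$$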

Overall, the quality of the GFZ GRACE RL06 orbits is satisfactory, confirming that the processing changes relative to GFZ RL05a have been a step in the right direction.

**Figure 2.** Mean and standard deviation per year of GRACE-A (blue) and -B (red) SLR residuals during the different GRACE processing steps for all available stations and a subset of high-quality stations.
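The SLR statistics described above (outlier screening followed by a mean and a standard deviation) can be sketched in a few lines. This is only an illustration under stated assumptions: the function name, the sample residuals, the inclusive comparison against the threshold, and the use of the sample standard deviation are hypothetical choices, not details of the GFZ processing. Only the 20 cm threshold is taken from the text.

```python
import statistics

OUTLIER_THRESHOLD_MM = 200.0  # 20 cm screening threshold stated in the text


def slr_residual_stats(residuals_mm, threshold_mm=OUTLIER_THRESHOLD_MM):
    """Return (mean, standard deviation, count) of SLR residuals after screening.

    Residuals whose absolute value exceeds the threshold are treated as
    coarse outliers and discarded (assumed screening rule).
    """
    kept = [r for r in residuals_mm if abs(r) <= threshold_mm]
    if len(kept) < 2:
        raise ValueError("need at least two residuals after screening")
    return statistics.mean(kept), statistics.stdev(kept), len(kept)


# Hypothetical residuals in mm; the 350 mm value is rejected as a coarse outlier.
example = [-12.0, -8.0, -30.0, 5.0, 350.0, -15.0]
mean_mm, std_mm, n_kept = slr_residual_stats(example)
```

In practice such statistics would be accumulated per satellite, per year, and per station subset, as in Figure 2.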

#### **3. Results**

GFZ GRACE RL06 gravity field results are analyzed and discussed in comparison to GFZ's RL05a GRACE time series in the following subsections. Most of the results shown are relative to a climatology model (individually derived for each time series), which has been estimated as follows: The dominant signal content of the time series is approximated by fitting a suitable parameter model coefficient-wise to the monthly solutions. Here, eight parameters describing the constant and linear part as well as periodic sine and cosine amplitudes for annual, semi-annual and 161-day (GRACE aliasing period for the ocean tide S2) periods are used. Furthermore, the formal errors of the monthly SH coefficients are used as a priori information to weight each individual monthly coefficient when estimating the climatology. Months with short repeat cycles (i.e., solutions which were regularized in RL05a) as well as the seven "GRACE single ACC" solutions are excluded.
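The eight-parameter climatology fit described above can be illustrated with a minimal weighted least-squares sketch for a single SH coefficient. Several points are assumptions made for illustration only: epochs are expressed in years relative to an arbitrary reference epoch, the weights are taken as the inverse squared formal errors, and all function names are hypothetical. Plain Python is used to keep the sketch free of dependencies.

```python
import math

# Annual, semi-annual, and 161-day (S2 alias) periods, expressed in years.
PERIODS_YEARS = (1.0, 0.5, 161.0 / 365.25)


def design_row(t):
    """Eight basis functions at epoch t: bias, trend, and three cos/sin pairs."""
    row = [1.0, t]
    for period in PERIODS_YEARS:
        phase = 2.0 * math.pi * t / period
        row += [math.cos(phase), math.sin(phase)]
    return row


def solve_linear_system(matrix, rhs):
    """Solve a small dense system by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    aug = [list(row) + [b] for row, b in zip(matrix, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    solution = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(aug[i][j] * solution[j] for j in range(i + 1, n))
        solution[i] = (aug[i][n] - tail) / aug[i][i]
    return solution


def fit_climatology(epochs, values, sigmas):
    """Weighted least-squares fit of the eight parameters to one coefficient."""
    n_par = 2 + 2 * len(PERIODS_YEARS)
    normal = [[0.0] * n_par for _ in range(n_par)]
    rhs = [0.0] * n_par
    for t, y, sigma in zip(epochs, values, sigmas):
        row = design_row(t)
        weight = 1.0 / sigma ** 2  # assumed inverse-variance weighting
        for i in range(n_par):
            rhs[i] += weight * row[i] * y
            for j in range(n_par):
                normal[i][j] += weight * row[i] * row[j]
    return solve_linear_system(normal, rhs)


def residuals(epochs, values, params):
    """Monthly values minus the fitted climatology."""
    return [y - sum(p * b for p, b in zip(params, design_row(t)))
            for t, y in zip(epochs, values)]


# Synthetic check: ten years of noise-free monthly samples of a known signal.
epochs = [i / 12.0 for i in range(120)]
truth = [1.0, 0.2, 0.5, -0.3, 0.1, 0.05, 0.02, -0.04]
values = [sum(p * b for p, b in zip(truth, design_row(t))) for t in epochs]
params = fit_climatology(epochs, values, [1.0] * len(epochs))
```

Excluded months (short repeat cycles, "GRACE single ACC" solutions) would simply be left out of `epochs`, `values`, and `sigmas` before the fit.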

#### *3.1. Formal and Empirical Errors*

In this subsection, formal and empirical errors of the GFZ RL06 time series are analyzed in the spectral domain and compared to GFZ RL05a. The formal errors are the standard deviations of the gravity field parameters estimated in the least-squares adjustment process, i.e., the square root of the diagonal of the gravity field parameter part of the variance-covariance matrices. In principle, these errors should give a good indication of the real errors of the estimated parameters. However, as variance-covariance information of the input data is applied only insufficiently (observations) or not at all (background models), the formal errors are typically too optimistic.

Another way to quantify the errors of such a gravity field time series is through empirically derived values. Here, the residuals with respect to the climatology described above are defined as the empirical errors of the time series.

Figure 3 shows the RMS over the whole time series of the empirical and formal errors for RL05a and RL06. The main differences between empirical and formal errors can be seen in the very low SH degrees and around so-called resonance orders. Due to residual signals in the very low degrees, which are not covered by the eight parameters, the empirical errors show much higher values than the formal ones here. Around the resonance orders, which are integer multiples of approximately 15, it is well known that GRACE errors are larger due to systematic effects from temporal aliasing caused by background model errors. This effect is not very well represented in the formal errors.

**Figure 3.** (**a**–**d**) SH spectra of empirical error RMS for RL05a (**a**) and RL06 (**b**); and of formal error RMS for RL05a (**c**) and RL06 (**d**); (**e**) Ratio of SH spectra of empirical error RMS "RL05a/RL06"; (**f**) SH degree amplitudes of the ratio of empirical error RMS "RL05a/RL06" (blue), and the ratios "empirical/formal error RMS" for RL05a (red), and RL06 (green).

Compared to RL05a (Figure 3a,c), Figure 3b,d reveal smaller empirical errors and more realistic formal errors for RL06. This becomes even clearer when looking at the ratio of the empirical error RMS values between RL05a and RL06 (Figure 3e), which is > 1 for nearly all coefficients, and also when plotted as amplitudes per SH degree (Figure 3f). Also, the degree amplitudes of the ratio between empirical and formal errors indicate that the GFZ RL06 formal errors are more realistic, as smaller variations than for GFZ RL05a are visible and the curve is closer to the value of one.
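The quantities compared in this subsection can be written down compactly. The following sketch assumes a small dense covariance matrix and per-coefficient lists of monthly residuals and formal errors. The function names and the sample numbers are hypothetical and serve only to illustrate the definitions (formal error as square root of the covariance diagonal, RMS over the time series, and the ratio "empirical/formal error RMS").

```python
import math


def formal_errors(covariance):
    """Standard deviations: square roots of the covariance matrix diagonal."""
    return [math.sqrt(covariance[i][i]) for i in range(len(covariance))]


def rms(values):
    """Root mean square of a sequence of numbers."""
    return math.sqrt(sum(v * v for v in values) / len(values))


def error_ratio(monthly_residuals, monthly_formal_errors):
    """Ratio 'empirical/formal error RMS' for one coefficient over all months."""
    return rms(monthly_residuals) / rms(monthly_formal_errors)


# Hypothetical 2x2 covariance block and a two-month series for one coefficient.
sigmas = formal_errors([[4.0, 0.5], [0.5, 9.0]])
ratio = error_ratio([3.0, -4.0], [1.0, 1.0])
```

A ratio close to one would indicate formal errors that describe the empirical scatter well, while values well above one point to overly optimistic formal errors.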

#### *3.2. Degree Amplitudes*

Difference degree amplitudes relative to climatology are shown in Figure 4. The spread of monthly degree amplitudes is smaller for RL06 than for RL05a, which illustrates that RL06 is a more homogeneous time series. Significant differences between the RL05a and RL06 median degree amplitudes are already visible at approx. SH degree 15, and almost all monthly RL06 degree amplitude curves are well below the median RL05a curve (or vice versa) for medium and high degrees, indicating a notably improved signal-to-noise ratio for RL06. When only looking at the "GRACE single ACC" period, large improvements from RL05a to RL06 are achieved according to the corresponding median degree amplitudes. These improvements are not only present for medium and high degrees, but also for the very low degrees. However, the quality of the "GRACE single ACC" solutions is still significantly worse than for the rest of the GRACE mission, even for RL06.

**Figure 4.** Degree amplitudes relative to a climatology model for GFZ RL05a (**left**) and GFZ RL06 (**right**); thin lines represent monthly solutions (without "GRACE single ACC" solutions and those regularized in RL05a), and bold lines represent the median curves (the same curves are shown in both plots) for RL05a (red), RL06 (96 × 96, green), RL06 (60 × 60, blue), and the "GRACE single ACC" months only for RL05a (black) and RL06 (96 × 96, grey).
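For readers who want to reproduce curves of this kind, a common definition of the difference degree amplitude is the square root of the summed squared residual coefficients per degree. The sketch below assumes this definition, dimensionless coefficients stored as nested lists indexed `[n][m]`, and hypothetical function names. Any scaling to geoid height or EWH that may have been applied for Figure 4 is omitted.

```python
import math
import statistics


def degree_amplitudes(delta_c, delta_s):
    """Degree amplitudes from residual coefficients indexed as [n][m], m <= n."""
    amplitudes = []
    for n in range(len(delta_c)):
        power = sum(delta_c[n][m] ** 2 + delta_s[n][m] ** 2 for m in range(n + 1))
        amplitudes.append(math.sqrt(power))
    return amplitudes


def median_curve(monthly_amplitudes):
    """Median over all months, evaluated degree by degree."""
    n_degrees = len(monthly_amplitudes[0])
    return [statistics.median(month[n] for month in monthly_amplitudes)
            for n in range(n_degrees)]


# Hypothetical residual coefficients up to degree 2.
delta_c = [[0.0], [0.0, 0.0], [3.0, 0.0, 0.0]]
delta_s = [[0.0], [0.0, 0.0], [0.0, 4.0, 0.0]]
amps = degree_amplitudes(delta_c, delta_s)
medians = median_curve([[1.0, 2.0], [3.0, 4.0], [5.0, 0.0]])
```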

#### *3.3. RMS of Residuals in the Spatial Domain*

To assess the quality of the GFZ RL06 time series in the spatial domain, monthly residual SH coefficients relative to climatology are converted to gridded mass anomalies in terms of equivalent water height (EWH). To reduce the impact of spatially correlated noise, the solutions are de-correlated and smoothed by applying the non-isotropic DDK filter [39]. Then, RMS values of the time series per grid point are calculated and shown in Figure 5. For the period until August 2016, i.e., without the "GRACE single ACC" months, a clear reduction of RMS variability for RL06 has been achieved (Figure 5a,b). Geophysical signals over continental areas, where they are much larger than over the oceans, are less superimposed by the typical GRACE striping pattern and thus better detectable. Variability over the oceans is generally expected to be rather small and is therefore often interpreted as an upper error bound for monthly global GRACE gravity field models. Latitude-dependent weighted RMS (wRMS) values over the oceans decrease from 6.8 cm (RL05a) to 4.0 cm (RL06, 41% relative improvement) when DDK5 filtered, and from 3.4 cm (RL05a) to 2.1 cm (RL06, 38% relative improvement) when DDK3 filtered. For the "GRACE single ACC" period, Figure 5c shows DDK3 filtered RMS variability to allow a direct comparison with the period before, but it becomes obvious that much stronger decorrelation and smoothing would be required here to extract geophysical signals. Nevertheless, the corresponding wRMS values over ocean decrease again significantly from 9.5 cm (RL05a) to 6.1 cm (RL06, 36% relative improvement).

**Figure 5.** RMS of the time series of residuals (cm EWH) relative to a climatology model (without months regularized in RL05a) for GFZ RL05a (left) and GFZ RL06 (right); the following different cases are shown: period from 2002/04 through 2016/08, DDK5 filtered (**a**) and DDK3 filtered (**b**); and "GRACE single ACC" period, DDK3 filtered (**c**).
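The wRMS values and relative improvements quoted in this subsection can be illustrated as follows. The sketch assumes cosine-of-latitude weights on a regular latitude/longitude grid and a boolean ocean mask. Both are common choices, but they are assumptions here, as are the function names. The three pairs of wRMS values in the final lines are the ones reported in the text.

```python
import math


def weighted_rms(values_cm, latitudes_deg, is_ocean):
    """Latitude-weighted RMS over ocean grid points, weights = cos(latitude)."""
    numerator = 0.0
    denominator = 0.0
    for value, lat, ocean in zip(values_cm, latitudes_deg, is_ocean):
        if ocean:
            weight = math.cos(math.radians(lat))  # assumed area weighting
            numerator += weight * value * value
            denominator += weight
    return math.sqrt(numerator / denominator)


def relative_improvement(old, new):
    """Fractional reduction from old to new, e.g. 0.41 for 41%."""
    return (old - new) / old


# Hypothetical three-point grid; the land point (mask False) is ignored.
wrms = weighted_rms([2.0, 2.0, 100.0], [0.0, 60.0, 0.0], [True, True, False])

# wRMS pairs (RL05a, RL06) in cm EWH as reported in the text.
improvements = [round(100 * relative_improvement(old, new))
                for old, new in [(6.8, 4.0), (3.4, 2.1), (9.5, 6.1)]]
```

The computed percentages agree with the 41%, 38%, and 36% stated above.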

Monthly wRMS values over the oceans for the complete GRACE time series (DDK5 filtered) are shown in Figure 6. Again, it becomes visible that GFZ RL06 is a clear improvement over RL05a in terms of noise reduction and homogeneity. Some months where RL06, particularly for the 96 × 96 time series, exhibits larger wRMS values than RL05a can be attributed to short period repeat orbit cycles (the most harmful repeat orbits during the GRACE mission are: 61/4 around September 2004, 46/3 around May 2012, 77/5 around December 2013, 31/2 around February 2015). RL05a solutions for these months were regularized, which is no longer the case for RL06, as the additionally provided RL06 60 × 60 time series, which is less sensitive to these short period repeat orbits, might be analyzed instead. Apart from the repeat cycles just mentioned, the wRMS values of the RL06 96 × 96 and 60 × 60 time series are nearly identical in most months. Periods where these values are notably larger are again related to less harmful repeat cycles such as the long-lasting 107/7 repeat orbit around December 2009. Finally, Figure 6 also shows that the "GRACE single ACC" months are of much lower quality than the rest of the time series. At least, the RL06 solution for May 2017 is now of comparable quality (it must be noted that for this solution GRACE-B ACC data is actually available and used).

**Figure 6.** wRMS over the oceans (cm EWH) of DDK5 filtered residuals relative to a climatology model for the complete GFZ RL05a (red), GFZ RL06 (96 × 96, green), and GFZ RL06 (60 × 60, blue) time series.

#### *3.4. Low Degree Harmonics*

In this subsection, time series of selected low degree SH coefficients are analyzed, starting with C20. This coefficient is known to be poorly estimated from GRACE (see, e.g., [40]), and it is common practice to replace it, e.g., with estimates derived from SLR observations to geodetic satellites. Despite the fact that the GFZ GRACE RL06 C20 values have significantly improved compared to GFZ RL05a (Figure 7a), a replacement of C20 is still recommended for RL06 before using the time series for geophysical interpretation. Available SLR-based replacement time series which are consistent with RL06 standards are, e.g., GRACE Technical Note TN-11 generated by CSR [41], or a similar time series provided by GFZ [42] which is also shown in Figure 7a.

Two other coefficients requiring special attention are C21 and S21. When analyzing surface mass variations from the GRACE SDS RL05 time series, Wahr et al. [43] recommended corrections to these coefficients to account for effects of the applied mean pole model. Since all three SDS RL06 time series including GFZ RL06 are processed based on a linear mean pole model which conforms to the updated IERS2010 mean pole convention (http://iers-conventions.obspm.fr/chapter7.php), this recommendation is no longer applicable to these reprocessed time series. Looking at the GFZ RL06 C21 time series in comparison to GFZ RL05a (Figure 7b), however, one can see an anomalous behavior during the "GRACE single ACC" period. Although the RL05a time series already shows larger amplitudes in that period, this is even more pronounced in RL06. A similar behavior is also visible for S21 (Figure 7c). The reason for these anomalies in C21 and S21 is not yet fully understood and is subject to further investigation. As it is clearly correlated with the use of ACC transplant data, a possible explanation would be that it is due to inaccurate modeling of surface forces, potentially in conjunction with an inappropriate parametrization. First experiments at GFZ combining GRACE and SLR on NEQ level have revealed promising results and might lead to a replacement time series, similar to C20, to overcome these deficiencies in the near future. It should be mentioned here that a GRACE+SLR combination would not be a novelty as, e.g., the GRACE solutions provided by the CNES/GRGS group are in fact already based on a combination with SLR [15].

**Figure 7.** Time series of SH coefficients C20 (**a**); C21 (**b**); and S21 (**c**); each plot shows values of GFZ RL05a (red), GFZ RL06 (96 × 96, green), and GFZ RL06 (60 × 60, blue); for C20, the SLR-based time series by König et al. [42] is shown additionally (black).

#### **4. External Validation**

Due to the uniqueness of GRACE Level-2 products as observable for studies of Earth surface mass transport and climate change, it is nontrivial to validate them against independent data or models, and thus to reliably assess the quality of different GRACE time series in terms of signal content rather than only assessing their internal noise level. In the following subsections, two methods to evaluate the quality of the GFZ RL06 and RL05a time series by external data are presented.

#### *4.1. OBP Validation*

First, the GFZ GRACE RL06 and RL05a solutions are independently validated by comparing them with ocean bottom pressure (OBP) in situ observations.

The OBP database used here was initially compiled by Macrander et al. [44] and consists of 167 stations which are irregularly scattered over the oceans covering the time period from 2002 through 2010 with observation lengths for individual stations of up to eight years. The station data are preprocessed as outlined by Poropat et al. [7] to obtain time series of OBP observations with removed trends, tidal variability, outliers, and discontinuities at certain dates related to instrument issues including maintenance and battery replacement.

For the OBP validation, the GFZ RL06 and RL05a Level-2 solutions are post-processed as follows: The C20 coefficients are replaced with GRACE Technical Notes TN-11 and TN-07 [40] for RL06 and RL05a, respectively; the effects of glacial isostatic adjustment are corrected by subtracting the model by A et al. [45]; co-seismic signatures from three megathrust earthquakes are removed with estimates from the GOCO06s model [46]; and approximated degree-1 coefficients according to Bergmann-Wolf et al. [47] are added. The DDK filter is applied to de-correlate the solutions: DDK4 is used for the long-term trend component, whereas DDK2 is used for the annual and semi-annual components and for the remaining residual monthly signals. Please note that for five monthly solutions with particularly poor signal-to-noise ratio, the DDK1 filter is used. Finally, the monthly GAD background model [48] including atmospheric surface pressure and non-tidal OBP is added back.

These post-processed GRACE data are evaluated at the locations of the OBP in situ recorders. Regionally different linear trends, specifically caused by changing sea level, are removed. Since GRACE data do not represent a point-measurement, but an average over a large area, areas of coherent OBP variability are identified and the GRACE OBP data are averaged over these areas [49]. The selection of these areas follows [7].

To compare GRACE and in situ OBP variations at the OBP in situ sites, relative explained variances, defined as $\sigma_r^2 = \left(\sigma^2_{\text{in situ OBP}} - \sigma^2_{\text{in situ OBP} - \text{GRACE OBP}}\right) / \sigma^2_{\text{in situ OBP}}$, and correlation coefficients are calculated from both time series. Generally, positive relative explained variances are observed for about 35% of the OBP in situ stations (Figure 8a), indicating that GRACE-derived and observed OBP variations correspond rather poorly in many regions. However, improvements in relative explained variance for GFZ RL06 compared to GFZ RL05a become visible in most regions (Figure 8c). The same conclusion can be drawn for the correlations, where a slight increase for GFZ RL06 can again be seen in most regions (Figure 8d). Generally, correlation coefficients between GRACE and in situ OBP are within the range of 0.1 to 0.7 for most of the stations (Figure 8b); the corresponding 25th, 50th, and 75th percentiles are 0.21, 0.34, and 0.54, respectively. Overall, a slightly better performance of GFZ RL06 over GFZ RL05a in explaining OBP variability over wide regions is achieved.

**Figure 8.** (**a**) Relative explained variances $\sigma_r^2$ at in situ OBP stations for GFZ RL06; (**b**) Correlation coefficients between in situ OBP and GRACE OBP for GFZ RL06; (**c**) Difference of relative explained variances for GFZ RL06 and GFZ RL05a; (**d**) Difference of correlation coefficients for GFZ RL06 and GFZ RL05a. For (**c**) and (**d**), red colors indicate improvements of GFZ RL06 over GFZ RL05a; stations with relative explained variances or correlation coefficients < 0 for both RL06 and RL05a are marked with white crosses.
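The two station-wise metrics used in this subsection can be sketched as follows, assuming two equally sampled, already post-processed time series per station. The function names and the toy series are hypothetical. The example also shows that the relative explained variance becomes negative when subtracting GRACE increases the variance of the in situ series.

```python
import math
import statistics


def relative_explained_variance(in_situ, grace):
    """(var(in situ) - var(in situ - GRACE)) / var(in situ)."""
    difference = [a - b for a, b in zip(in_situ, grace)]
    var_obs = statistics.pvariance(in_situ)
    return (var_obs - statistics.pvariance(difference)) / var_obs


def correlation(x, y):
    """Pearson correlation coefficient of two equally long series."""
    mean_x, mean_y = statistics.mean(x), statistics.mean(y)
    s_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    s_xx = sum((a - mean_x) ** 2 for a in x)
    s_yy = sum((b - mean_y) ** 2 for b in y)
    return s_xy / math.sqrt(s_xx * s_yy)


# Toy series: a perfect match and a sign-flipped (anti-correlated) series.
in_situ = [1.0, -1.0, 2.0, -2.0]
perfect = list(in_situ)
flipped = [-v for v in in_situ]

rev_perfect = relative_explained_variance(in_situ, perfect)
rev_flipped = relative_explained_variance(in_situ, flipped)
corr_flipped = correlation(in_situ, flipped)
```

Because both variances are computed over the same number of samples, the choice between population and sample variance does not change the ratio.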

#### *4.2. GOCE Orbit Tests*

As another independent validation, orbits of ESA's Gravity field and steady-state Ocean Circulation Explorer (GOCE) mission are used to compare the quality of monthly GRACE gravity solutions. The GOCE satellite [50], in orbit from March 2009 until November 2013, had a very low orbital altitude of about 255 km and thus shows a rather high sensitivity to the Earth's gravity field.

For these orbit tests, dynamic orbits are fitted to GOCE kinematic 3D orbit positions which are taken as observations (i.e., not directly the GPS tracking data). These kinematic orbits (Precise Science Orbits) have been generated at AIUB [51] and are provided within the GOCE High Level Processing Facility. For this study, GOCE orbit tests have been carried out for four months (November and December 2009, October and November 2010), each consisting of 30 individual GOCE arcs with a length of 1.25 days. The ocean tide model used here is FES2014 [28] to SH d/o 100. The reference system and gravitational force modeling follow the IERS 2010 conventions [33]. GOCE common mode accelerations are used during orbit computation instead of non-gravitational force models. The scale factors of the common mode accelerations cannot be accurately estimated per arc because the drag-free control system has compensated most of the signal; they are therefore fixed to one [52]. This value is accurate within 3%, as demonstrated by Visser and van den IJssel [53]. The GOCE gradiometer works best in the measurement bandwidth of 10 to 200 s [54], and consequently the common mode accelerations include an instrumental bias. Therefore, three common mode acceleration biases per arc are estimated, one in each direction, in addition to the initial state vector.

For each month, two versions of orbit fits are computed which are identical except for the background gravity field model: The first version uses the corresponding GFZ RL06 monthly solutions, the second one uses the GFZ RL05a solutions instead, both up to SH d/o 90. Due to the high gravitational sensitivity of GOCE, the monthly GRACE models are filled up with SH coefficients from the long-term static GOCE model GO\_CONS\_GCF\_2\_DIR\_R6 [55] up to d/o 240 to achieve reasonable orbit fits at the level of a few centimeters.
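The fill-up of a monthly model with static coefficients can be pictured as a simple merge of two coefficient sets. The sketch below is an assumption about how such a merge might look, not the software used for this study: coefficients are stored in dictionaries keyed by (degree, order), the function name is hypothetical, and coefficients missing from the monthly set fall back to the static model.

```python
def fill_up(monthly, static, max_monthly_degree=90):
    """Combine a monthly model (up to max degree) with a static model above it.

    Both inputs map (degree, order) -> (C, S). Monthly coefficients replace
    static ones only for degrees up to max_monthly_degree.
    """
    combined = dict(static)
    for (degree, order), coefficients in monthly.items():
        if degree <= max_monthly_degree:
            combined[(degree, order)] = coefficients
    return combined


# Hypothetical coefficient values, chosen only to make the merge visible.
monthly_model = {(2, 0): (1.0, 0.0), (91, 0): (9.0, 9.0)}
static_model = {(2, 0): (5.0, 0.0), (91, 0): (2.0, 0.0), (240, 240): (3.0, 4.0)}
background = fill_up(monthly_model, static_model)
```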

The results of the GOCE orbit tests are listed in Table 4. When using GFZ RL06 instead of GFZ RL05a as background model, the RMS values of the orbit fits are clearly reduced for all four months, with relative improvements of GFZ RL06 over GFZ RL05a ranging from 12% to 25%. The significant differences in the orbit fits show that this kind of orbit validation test is an appropriate tool for the validation of monthly GRACE gravity field solutions. For future validation purposes, it is planned to extend the orbit validation to the complete GOCE mission period and to investigate whether orbits of other Low Earth Orbiting satellites such as CHAMP and Swarm can be used as well.


**Table 4.** RMS of orbit fits [cm] for the time-variable GFZ RL05a and RL06 gravity field models and (only for reference) for the static model GO\_CONS\_GCF\_2\_DIR\_R6. RMS values are based on 3D residuals and represent mean values of the 30 individual arcs within a particular month.
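The statistic described in the caption (an RMS of 3D residuals per arc, averaged over the arcs of a month) could be computed as sketched below. The definition of the 3D RMS as the root mean squared length of the residual vectors is an assumption, and the function names and residual values are hypothetical.

```python
import math


def arc_rms_3d(residuals):
    """RMS of 3D position residuals (dx, dy, dz) for a single arc."""
    squared = sum(dx * dx + dy * dy + dz * dz for dx, dy, dz in residuals)
    return math.sqrt(squared / len(residuals))


def monthly_mean_rms(arcs):
    """Mean of the per-arc 3D RMS values within one month."""
    return sum(arc_rms_3d(arc) for arc in arcs) / len(arcs)


# Two hypothetical arcs with one residual vector each (units: cm).
arcs = [[(3.0, 4.0, 0.0)], [(0.0, 0.0, 1.0)]]
mean_rms_cm = monthly_mean_rms(arcs)
```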

#### **5. Conclusions**

GFZ has reprocessed an improved monthly gravity field time series for the complete GRACE mission consisting of 163 gravity field models (Level-2 products) in the period from April 2002 through June 2017. This GFZ GRACE RL06 time series incorporates a reprocessed in-house GPS constellation (orbits and clocks), reprocessed Level-1B K-band ranging, star camera and accelerometer (ACC) transplant observations (L1B RL03 dataset provided by JPL), updated background models for tidal (FES2014) and non-tidal (AOD1B RL06) mass variations, and a considerably modified processing strategy including a different parametrization with (for most months) even fewer parameters compared to the precursor GFZ RL05a. Key features of the new RL06 processing strategy are a strict separation of GPS and K-band data editing, manual instead of automated sigma-based K-band data editing, and omission of a time-variable gravity background model during the gravity field estimation step. Main differences in RL06 parametrization compared to RL05a are consistently estimated parameters throughout all processing steps, fewer ACC parameters (biases and scale factors), and, exclusively for the last months where ACC transplant data has to be used for GRACE-B ("GRACE single ACC" period), the estimation of a fully populated ACC scale factor matrix. Independent validation by satellite laser ranging (SLR) observations reveals a satisfactory quality of the GRACE orbits prior to gravity field adjustment with standard deviations of SLR residuals < 20 mm for selected high-quality stations.

With the new GFZ RL06 time series, significant improvements have been achieved: The noise is considerably reduced and, consequently, geophysical signals are better detectable and can be analyzed at smaller spatial scales. Relative improvements over GFZ RL05a in terms of residual RMS variability are about 40% for both DDK3 and DDK5 filtered solutions. Furthermore, the complete time series is more homogeneous. Although the GFZ RL06 formal errors are still too optimistic for most of the gravity field coefficients, they exhibit a more realistic behavior, and also empirical errors in terms of residuals relative to a climatology model are smaller than for GFZ RL05a. The quality of the gravity fields within the "GRACE single ACC" period is clearly worse compared to the rest of the time series, but relative to RL05a, the RL06 solutions are also significantly improved here. Special attention needs to be paid to the C21 and S21 coefficients showing unrealistic amplitudes during that period. A combined GRACE+SLR replacement time series for these coefficients might help to mitigate this issue, as first investigations at GFZ have indicated. Regarding the C20 coefficient, known to be poorly estimated from GRACE, it is still advised to replace the values by external time series, e.g., derived from SLR. Such a time series, consistently processed with GRACE RL06 standards, is also provided by GFZ [42].

External validation by means of comparison with in situ ocean bottom pressure observations as well as orbit tests with the GOCE satellite confirm that improvements have been achieved with GFZ RL06 over RL05a, thus enabling a better understanding of phenomena in the Earth system related to climate change. To put the relative improvement from RL05a to RL06 in context with the relative improvements between all GFZ GRACE releases since RL01, Figure 9 shows gravity field anomalies for all GFZ GRACE releases so far, using the month August 2003 as an example. The corresponding relative improvements in terms of wRMS over ocean are as follows: RL01 to RL02: 14%, RL02 to RL03: 24%, RL03 to RL04: 4%, RL04 to RL05a: 0%, RL05a to RL06: 41%. This is another clear indication of the remarkable improvements achieved with the GFZ RL06 reprocessing and also shows that, even more than 15 years after the first instrument data release, a substantial gain in the quality of monthly GRACE gravity field products is possible thanks to reprocessing efforts regarding Level-1 and Level-2 products, but also improved background models and enhanced processing strategies. Hence, reprocessing of a GRACE RL07 time series is already planned, for which a final release of Level-1 products will be available. Apart from using these new Level-1 data and possible background model updates, the specific focus at GFZ for RL07 (or other likely upcoming releases) will be on the reported C21/S21 issue, as well as on a further reduction of noise as achieved by other groups (see, e.g., [14]). Whereas modification or fine tuning of the parametrization is always a promising option in view of improvements, in particular the application of an improved stochastic modeling of errors in observations and background models is envisaged.

A comparison between GFZ RL06 and recently published GRACE time series by other processing centers is not the purpose of this work; however, such comparisons were already done in several other studies: Göttl et al. [56] report an increased consistency of the SDS (CSR, JPL, GFZ) RL06 and ITSG-Grace2018 solutions compared to the SDS RL05 and ITSG-Grace2016 solutions. Kvas et al. [14] investigated the signal content of the SDS RL06 and ITSG-Grace2018 solutions by evaluating river basin averages and conclude that all four solutions exhibit the same signal content. Adhikari et al. [57] calculated sea-level fingerprints using the SDS RL06 time series and find that differences between these three solutions are within 1-sigma uncertainties.

**Figure 9.** Gravity field anomalies in terms of cm EWH (DDK3 filtered) for the month 2003/08 for all GFZ GRACE releases so far: RL01 (**top left**), RL02 (**top middle**), RL03 (**top right**), RL04 (**bottom left**), RL05a (**bottom middle**), and RL06 (**bottom right**).

The GFZ GRACE RL06 monthly gravity field time series consists of fully unconstrained spherical harmonic (SH) Level-2 products, i.e., no regularization at all is applied, and is provided in two versions as agreed upon within the GRACE SDS: (1) up to SH degree and order 96; and (2) up to SH degree and order 60. GFZ GRACE RL06 is available at GFZ's Information System and Data Center (ISDC) archive (https://isdc.gfz-potsdam.de/grace-isdc/) along with related documentation ([12]; Release Notes for GFZ RL06 Level-2 products (ftp://isdcftp.gfz-potsdam.de/grace/DOCUMENTS/RELEASE\_NOTES/GRACE\_GFZ\_L2\_Release\_Notes\_for\_RL06.pdf); GRACE Level-2 User Handbook (ftp://isdcftp.gfz-potsdam.de/grace/DOCUMENTS/Level-2/GRACE\_L2\_Gravity\_Field\_Product\_User\_Handbook\_v4.0.pdf)).

GFZ GRACE RL06 processing standards and background models are also used for the initial GFZ GRACE-FO Level-2 product release [58]. Moreover, GFZ GRACE/GRACE-FO RL06 Level-2 products are the basis for GFZ's web portal GravIS (Gravity Information Service, http://gravis.gfz-potsdam.de), jointly developed with the Alfred-Wegener-Institut and TU Dresden, where dedicated Level-3 products for hydrological, oceanic and polar ice-sheet applications are visualized and offered for download. Finally, the GFZ GRACE RL06 time series contributes to the newly established International Combination Service for Time-variable Gravity Fields (COST-G), a product center of the International Gravity Field Service (IGFS).

**Author Contributions:** Conceptualization, C.D.; Processing of GRACE orbits and gravity fields, C.D.; Processing of GPS orbits, A.R.; Software, C.D., M.M., G.M., K.H.N. and O.A.; Validation of results, C.D., M.M., H.D., R.S. and C.F.; Writing—Original Draft Preparation, C.D.; Writing—Review and Editing, M.M., F.F., H.D., R.S. and C.F.; Visualization, C.D., M.M. and R.S.; Supervision, F.F., H.D. and R.K.; Project Administration, F.F.; Funding Acquisition, F.F., H.D. and R.K.

**Funding:** This research was partly funded by the German Ministry for Education and Research (BMBF) with FKZ 03F0654A, and by the German Research Foundation (DFG) within Research Group 2736 NEROGRAV (New Refined Observations of Climate Change from Spaceborne Gravity Missions).

**Acknowledgments:** We would like to thank the German Space Operations Center (GSOC) of the German Aerospace Center (DLR) for continuously providing nearly 100% of the raw telemetry data of the twin GRACE satellites. Valuable comments by four anonymous reviewers helped to improve the manuscript and are highly appreciated.

**Conflicts of Interest:** The authors declare no conflict of interest. The funders had no role in the design of the study; in the collection, analyses, or interpretation of data; in the writing of the manuscript, or in the decision to publish the results.

#### **References**


© 2019 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).

## *Article* **SLR, GRACE and Swarm Gravity Field Determination and Combination**

**Ulrich Meyer <sup>1,</sup>\*, Krzysztof Sośnica <sup>2</sup>, Daniel Arnold <sup>1</sup>, Christoph Dahle <sup>1,3</sup>, Daniela Thaller <sup>4</sup>, Rolf Dach <sup>1</sup> and Adrian Jäggi <sup>1</sup>**


Received: 28 February 2019; Accepted: 16 April 2019; Published: 22 April 2019

**Abstract:** Satellite gravimetry allows for determining large-scale mass transport in the system Earth and for quantifying ice mass change in polar regions. We provide, evaluate and compare a long time-series of monthly gravity field solutions derived either by satellite laser ranging (SLR) to geodetic satellites, by GPS and K-band observations of the GRACE mission, or by GPS observations of the three Swarm satellites. While GRACE provides gravity signal at the highest spatial resolution, SLR sheds light on mass transport in polar regions at larger scales also in the pre- and post-GRACE era. To bridge the gap between GRACE and GRACE Follow-On, we also derive monthly gravity fields using Swarm data and perform a combination with SLR. To correctly take all correlations into account, this combination is performed on the normal equation level. Validating the Swarm/SLR combination against GRACE during the overlapping period January 2015 to June 2016, the best fit is achieved when down-weighting Swarm compared to the weights determined by variance component estimation. While between 2014 and 2017 SLR alone slightly overestimates mass loss in Greenland compared to GRACE, the combined gravity fields match significantly better in the overlapping time period and the RMS of the differences is reduced by almost 100 Gt. After 2017, both SLR and Swarm indicate moderate mass gain in Greenland.

**Keywords:** satellite gravimetry; ice mass change; GRACE; SLR; Swarm; normal equation combination

#### **1. Introduction**

The Gravity Recovery and Climate Experiment satellite mission (GRACE) [1] dedicated to the observation of temporal variations of the gravity field allows for the quantification of ice mass loss of glacier accumulations in polar and sub-polar regions (e.g., [2–4]). However, this high resolution information is limited to the life-time of the GRACE satellites (2002–2017) and of the GRACE-FO (Follow On) [5] mission that was launched in May 2018, but suffered a failure of the main instrument processing unit between July and October 2018. To date, no other single satellite mission or proxy is able to bridge the gap between GRACE and GRACE-FO with comparable quality (compare, e.g., [6,7] or [8]). We therefore present a combination on the normal equation level of alternative satellite gravimetric data considering several missions.

Satellites not dedicated to gravity field determination already collected data before GRACE and during the gap between GRACE and GRACE-FO. These are mainly the geodetic satellite laser ranging (SLR) missions, e.g., the Laser Geodynamics Satellites (LAGEOS) or, at a significantly lower orbit altitude, the Laser Relativity Satellite (LARES) as the youngest member of the SLR family, which also provide information about temporal variations of the Earth gravity field at the lowest degrees of the spherical harmonic spectrum. The geodetic SLR satellites are well suited for gravimetry owing to their spherical geometry and their favorable area-to-mass ratio that minimizes the effect of surface forces [9]. For studies of temporal gravity field variations derived by SLR in the pre-GRACE era, see Cheng et al. [10], Bianco et al. [11] and Cheng and Tapley [12]. For a focus on variations in Earth oblateness, where GRACE results are unreliable [13], compare Cox and Chao [14], Cheng and Tapley [15] and Bloßfeld et al. [16].

In addition, other Earth observing satellites at orbits below 1500 km altitude, so-called low Earth orbiters (LEOs), which are equipped with GPS receivers for precise orbit determination, may serve for deriving the mass distribution of the Earth and its temporal variations at low to medium resolution [17]. Due to the time-period covered and their low orbit altitude, the three satellites of the Swarm mission [18] are well suited to bridge the gap between GRACE and GRACE-FO.

We present a long-term series covering 1995–2018 of low-resolution monthly gravity field coefficients derived at the Astronomical Institute of the University of Bern (AIUB) from a combined orbit determination of LAGEOS and the geodetic SLR satellites in low Earth orbits [19]. The derived gravity fields are compared to time-series of GRACE solutions spanning 2003–2016, which were also determined at AIUB [20]. Finally, the GPS-derived kinematic Swarm orbits that are routinely computed at AIUB [21] are also exploited to derive monthly gravity fields for the time-period 2014–2018.

Due to the different orbital altitudes and ground track patterns of the satellites, and due to the different observation techniques used, the gravity fields derived differ in their spherical harmonic and corresponding spatial resolution [22]. The truncation of the spherical harmonic expansion of the gravity field leads to spatial leakage that has to be taken into account when comparing the results (e.g., [4,23–26]). Furthermore, the high-resolution GRACE gravity fields are commonly smoothed to suppress noise in the high-degree coefficients (e.g., [27–29]). We briefly introduce the representation of the gravity field in a spherical harmonic expansion and exemplify the problem of signal loss due to leakage or filter attenuation in Section 2. In the following, all monthly gravity fields are truncated to the same spherical harmonic degree for the sake of comparison of the different techniques.

To illustrate the capability of the different satellite gravimetric techniques to track temporal variations of the mass distribution within the system Earth, we transform the gravity variations to variations in equivalent water height (EWH) [30] or mass variations in Greenland and at the coast of West Antarctica and compare their spatial distribution and development with time. Moreover, we fit deterministic signal models of secular and seasonal variations to consecutive five-year time-periods and compare the derived mass trends.
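The deterministic signal model mentioned above can be sketched as a linear least-squares fit. The following is a minimal sketch, assuming a model consisting of a bias, a linear trend and a single annual sinusoid (the exact set of periodic terms is an assumption here); the function name `fit_trend_annual` and the synthetic numbers are purely illustrative and are not results of this study.

```python
import numpy as np

def fit_trend_annual(t_years, y):
    """Least-squares fit of y(t) = a + b*(t - t0) + c*cos(2*pi*t) + d*sin(2*pi*t).

    t_years : epochs in decimal years (assumed, e.g. mid-month epochs)
    y       : mass or EWH values at these epochs
    Returns (trend per year, annual amplitude).
    """
    t = np.asarray(t_years, dtype=float)
    t0 = t.mean()                      # centre the time axis for conditioning
    omega = 2.0 * np.pi                # one cycle per year
    A = np.column_stack([
        np.ones_like(t),               # bias
        t - t0,                        # secular trend
        np.cos(omega * t),             # annual cosine term
        np.sin(omega * t),             # annual sine term
    ])
    coeff, *_ = np.linalg.lstsq(A, np.asarray(y, dtype=float), rcond=None)
    trend = coeff[1]
    amplitude = np.hypot(coeff[2], coeff[3])
    return trend, amplitude

# Synthetic five-year monthly series (illustrative numbers only)
t = 2010.0 + (np.arange(60) + 0.5) / 12.0
y = -250.0 * (t - 2010.0) + 80.0 * np.sin(2.0 * np.pi * t)
trend, amplitude = fit_trend_annual(t, y)
```

Applied to consecutive five-year windows of a regional mass time-series, the first return value would correspond to the mass trend of that window.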

Comparable studies were already performed by Matsuo et al. [31], Talpe et al. [32] and Bonin et al. [33]. The latter concluded that gravity field models truncated at spherical harmonic degree 5 are not able to correctly separate mass loss in Greenland and Antarctica and that, while the inter-annual variations in none of the SLR time-series are realistic, long-term mass trends are well captured in the regularized AIUB SLR time-series truncated at degree 10 [19], as opposed to other series. Bonin et al. [33] suggest either reducing the temporal resolution of the SLR-derived gravity fields to reduce their scatter, which in our view is counter-intuitive because it also reduces the separability of the mass loss signal in Greenland and Antarctica, or combining SLR with other data.

We do not continue our regularized SLR-only degree 10 time-series beyond 2014 because in combination with Swarm the full sensitivity of SLR can be exploited without regularization. The combination approach is favorable because it is independent of external information not based on the original observations. We provide an unconstrained SLR-only degree 6 time-series and a combined SLR + Swarm time-series where SLR entered to degree and order 10. The decorrelation of the individual spherical harmonic coefficients is made possible by the Swarm data. Due to the consistent processing of Swarm and SLR, we are able to perform the combination on the normal equation level, taking correlations between reference frame (co-estimated in the case of SLR), force model and orbit parameters into account [34]. Combinations of SLR with CHAllenging Minisatellite Payload (CHAMP), GRACE or Gravity field and steady-state Ocean Circulation Explorer (GOCE) for gravity field determination were also studied by Moore et al. [35], Cheng et al. [36], Maier et al. [37] and Haberkorn et al. [38].

#### **2. Materials and Methods**

The dominating force acting on a satellite is due to Earth gravity. Satellite orbits at low altitudes are sensitive to mass distribution and redistribution and therefore to mass transport in the Earth system. The long-term mean gravity field of the Earth has been determined by the dedicated gravimetric satellite mission GFZ-1 [39], by CHAMP [40] that also observed the magnetosphere, and by GRACE [41,42] and GOCE [43,44], again dedicated to gravimetry. Temporal variations in gravity have mainly been derived from GRACE (see Wouters et al. [45] for an overview), but also from CHAMP [46] and other Earth observing LEOs such as the three satellites of the Swarm mission [47] or from the fleet of geodetic SLR satellites (e.g., [10–12]).

A prerequisite for gravity field determination is the precise observation of the satellite orbits. Today this is done either by laser ranging or by GPS or DORIS tracking of the satellites (the latter being irrelevant here due to the choice of missions). Critical for orbit modeling and signal separation is the knowledge of all forces acting on the satellites, the so-called background force model, and the set of parameters estimated to represent the orbit, improve the a priori force model and absorb model or instrument errors, the so-called orbit parameterization. In detail, the force model consists of a priori models of the gravitational forces by the Earth and third bodies, ocean, atmosphere and solid Earth tides, satellite-specific models of the surface forces such as air drag, solar radiation pressure and Earth albedo, and an empirical or pseudo-stochastic part [48]. In case of dedicated gravity missions, the surface forces are normally measured by accelerometers onboard the satellites [49] (for technical reasons, only the sum of all forces plus an unknown but constant bias can be observed).

At AIUB, the gravity field determination from satellite observations is treated as a generalized orbit determination problem [48]. The satellites' orbits are co-estimated with the instrument specific parameters like accelerometer scale factors and with the parameters updating the a priori background forces, i.e., the weight coefficients *Clm* and *Slm* of the expansion of the Earth's gravitational potential *V* in spherical harmonics [50]

$$V(r, \phi, \lambda) = \frac{GM}{R} \sum_{l=0}^{\infty} \left(\frac{R}{r}\right)^{l+1} \sum_{m=0}^{l} P_{lm}(\sin \phi) \left(C_{lm} \cos m\lambda + S_{lm} \sin m\lambda\right), \tag{1}$$

where *r*, *φ*, *λ* are the spherical coordinates in an Earth-fixed reference frame, *GM* is the product of the gravitational constant and the Earth's mass, *R* is the semi-major axis of the Earth and *Plm* are the fully normalized Legendre functions of degree *l* and order *m*. In case of SLR, air drag scale factors, station and geocenter coordinates, and Earth orientation parameters are also co-estimated [19].
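The evaluation of Equation (1) at a single point can be illustrated as follows. This is a minimal sketch, not the AIUB implementation: it assumes coefficient arrays `C[l, m]` and `S[l, m]` truncated at a finite maximum degree, the usual geodetic (4π) full normalization of the Legendre functions, and typical values for *GM* and *R* that are assumptions and not taken from this text.

```python
import numpy as np

def normalized_legendre(lmax, sin_phi):
    """Fully normalized associated Legendre functions P[l, m](sin_phi)."""
    t = float(sin_phi)
    u = np.sqrt(max(0.0, 1.0 - t * t))             # cos(phi)
    P = np.zeros((lmax + 1, lmax + 1))
    P[0, 0] = 1.0
    if lmax >= 1:
        P[1, 1] = np.sqrt(3.0) * u
    for m in range(2, lmax + 1):                    # sectorial terms P[m, m]
        P[m, m] = np.sqrt((2.0 * m + 1.0) / (2.0 * m)) * u * P[m - 1, m - 1]
    for m in range(0, lmax):                        # recursion in degree per order
        for l in range(m + 1, lmax + 1):
            a = np.sqrt((4.0 * l * l - 1.0) / (l * l - m * m))
            if l - 2 >= m:
                b = np.sqrt(((l - 1.0) ** 2 - m * m) / (4.0 * (l - 1.0) ** 2 - 1.0))
                prev2 = P[l - 2, m]
            else:
                b, prev2 = 0.0, 0.0
            P[l, m] = a * (t * P[l - 1, m] - b * prev2)
    return P

def potential(r, phi, lam, C, S, GM=3.986004415e14, R=6378136.3):
    """Potential V of Equation (1), truncated at lmax = C.shape[0] - 1.

    r in metres, phi (latitude) and lam (longitude) in radians.
    GM and R are assumed typical values, not taken from the text.
    """
    lmax = C.shape[0] - 1
    P = normalized_legendre(lmax, np.sin(phi))
    V = 0.0
    for l in range(lmax + 1):
        m = np.arange(l + 1)
        angular = np.sum(P[l, : l + 1] * (C[l, : l + 1] * np.cos(m * lam)
                                          + S[l, : l + 1] * np.sin(m * lam)))
        V += (R / r) ** (l + 1) * angular
    return GM / R * V
```

With only the degree 0 coefficient set to one, the sketch reduces to the point-mass potential *GM*/*r*, which is a convenient sanity check.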

Deficiencies of the force model are mitigated by pseudo-stochastic accelerations or pulses [51]. In order to reduce the absorption of gravity field signal by these parameters, their frequency has to be tailored to the orbit altitude and observation sampling of the specific satellite and their magnitude has to be limited by constraints depending on the satellites' environment at orbit altitude. By extensive cross-validation with other analysis centers in the frame of the European Gravity Service for Improved Emergency Management (EGSIEM) [52] project, it could be shown that mass trends and the amplitude of seasonal variations in the AIUB gravity fields are not affected by the pseudo-stochastic parameterization. Since a separate determination of sub-groups of the parameter space leads to a regularization favoring the a priori force model applied [53], orbit, stochastic and force model parameters have to be determined together in one adjustment process.

The observations used for orbit determination are either the 1D-ranges (normal points [54]) observed by SLR, the kinematic orbits of Earth observing LEOs determined by precise point positioning (PPP) [55] from high-low GPS data that are used as pseudo-observations [56], or low-low range-rates derived from the K-band inter-satellite link in case of the two satellites of the GRACE mission [57]. The sampling of these observation types is very diverse. In case of SLR, it depends on the inhomogeneous global distribution of the SLR stations [19]. Certain regions of the Earth, such as the polar regions, are not covered by observations at all due to the lack of suitably positioned stations. All information about gravity variations is derived from the orbit dynamics. Therefore, the orbit modeling of SLR satellites has to be mainly dynamical (based on physical models) and the pseudo-stochastic parameterization of the orbits has to be very limited to allow for gravity field determination. On the other hand, GPS and K-band observations are normally given at very high sampling rates of 1 s, 5 s or 10 s and in case of polar orbits the observation distribution is global and densest near the poles. Several pseudo-stochastic parameters may be set up per orbital revolution of the Swarm or GRACE satellites.

The spatial resolution of the resulting gravity fields depends on the orbital altitude and ground track pattern of the satellites. In case of sparse SLR tracking, gravity variations beyond degree 2 can only be determined by the combined evaluation of several satellites (see [10–12]). Satellites at lower inclinations are helpful to decorrelate the individual gravity field coefficients [19]. In case of the Earth observing LEOs, the achievable temporal resolution of subsequent gravity field solutions directly depends on sub-cycles of the ground track pattern, which vary depending on the orbit altitude (decreasing with time in case of the LEOs). The GRACE mission was designed to deliver monthly gravity fields [1] and since then monthly temporal resolution has become the standard, even if 10-day [58], and even daily "snapshot" solutions [59] of the gravity field are also available.

#### *2.1. SLR*

An important prerequisite for the determination of the long wavelength part of the gravity field is the exact modeling of the surface forces acting on the satellites. To simplify the modeling of the surface forces, most of the geodetic SLR satellites have a low area-to-mass ratio [9]. The high-flying SLR satellites LAGEOS 1 and 2 are mainly sensitive to the Earth's flattening C20 and its variations with time [60]. Higher spectral resolution of the gravity field can be achieved by combined processing of the fleet of geodetic SLR satellites at lower orbits (SLR-LEOs) and at different inclinations of the orbital plane (see [10–12,19]). We exploit the two LAGEOS satellites and the following SLR-LEOs for the determination of large scale temporal gravity field variations: Starlette, AJISAI, Stella, Larets, LARES, and the old Earth observation satellite Beacon-C that is carrying a laser retro-reflector array and is included due to its low orbit inclination. The orbit characteristics of all satellites used are compiled in Table 1.

**Table 1.** Orbit characteristics of satellites used and a priori observation errors, as determined by variance analysis of residuals (Beacon-C is down-weighted due to large surface forces acting on the non-spherical satellite).


The same a priori model of gravitational forces is consistently used for the SLR, GRACE and Swarm processing. To avoid affecting the derived temporal gravity field variations by a priori information, we use a static a priori gravity field. Furthermore, the background force model consists of solid Earth tides, ocean tides, ocean pole tides, and de-aliasing of ocean and atmosphere mass variations (AOD) [61]. Specific for the SLR satellites are models of the surface forces for air drag, solar radiation pressure and albedo that take into account the properties of the individual SLR satellites. The force model constituents and the resolution of their expansion in spherical harmonics (if applicable) are listed in Table 2.

**Table 2.** Background force model details for processing of low Earth orbiters. Where the force model constituent is expanded in spherical harmonics, the max. degree/order is given. Where this is not the case, it is only indicated if the listed model is applied or not. ACC indicates that surface forces are observed by accelerometers.


Even if SLR is sensitive to gravity field variations at higher degrees, at monthly resolution, only spherical harmonics coefficients (SHC) of degrees 2 to 5 and of degree 6 and order 1 can be determined from SLR alone, i.e., without regularization. At higher degrees SLR-only gravity field solutions suffer from the strong correlations between individual SHC [19]. A prerequisite for the separation of individual SHC, even at the low degree of 5, is that SLR LEOs orbiting at different inclinations are used [10]. In this context Beacon-C and LARES are helpful due to their rather exotic orbit inclinations (Table 1).

However, the correct localization of the mass loss signal is not possible at this low resolution [33]. Therefore, we perform a combined solution with Swarm (see Section 2.4). In combination with Swarm, the SHC are decorrelated by the Swarm observations. To exploit the sensitivity of SLR beyond degree 5 in the combination and to avoid omission or commission errors [67], the SHC are set up to degree and order 10. In case of SLR-only monthly solutions, coefficients of degree 6 (all but order 1) and of degrees 7–10 are fixed to zero. In case of combined solutions, all coefficients are estimated.

The parameter space of the SLR solutions is complemented by monthly estimates of the SLR station coordinates, by daily piecewise-linear estimates of the geocenter coordinates and by monthly estimates of the Earth rotation parameters [68]. As indicated above, the number of pseudo-stochastic parameters is very limited due to the sparse observation coverage. Only along-track pulses are estimated, once per orbital revolution of the satellites. The stochastic orbit parameterization is completed by periodic terms at the orbit revolution frequency. The complete SLR orbit parameterization and the time intervals for which the individual parameters are set up are detailed in Table 3.


**Table 3.** Orbit parameterization of SLR satellites and time intervals for which the individual parameters are estimated.

#### *2.2. GRACE*

The GRACE satellites were launched in March 2002 into polar orbits at an initial orbit altitude of 500 km [1]. For 15 years, they provided information about temporal variations of the Earth gravity field at unprecedented spatial and temporal resolution. The mission ended in October 2017 due to battery failure after decay to an orbit altitude of 330 km. The satellites were equipped with GPS receivers for orbit determination [57]. The key instrument for gravity field recovery was a K-band inter-satellite link which provided range measurements with micrometer accuracy [69]. Due to long-periodic systematic errors that cancel out by differentiation [70], most analysis centers use range-rates for gravity field recovery that were derived from the original range observations by numerical differentiation. Surface forces acting on the GRACE satellites were measured with onboard accelerometers in order to separate them from the gravitational forces [49].

We estimated monthly gravity fields from K-band range-rates and kinematic orbits that were used instead of the original GPS phase observations for efficiency reasons (the equivalence of both approaches was demonstrated by Jäggi et al. [71]). Therefore, in the first step, kinematic satellite positions were determined using the Bernese GNSS Software 5.2 (AIUB, Bern, Switzerland) [72]. The kinematic orbits were introduced as pseudo-observations together with their epoch-wise covariances [56]. Normal equations (NEQs) were computed for both observation types on a daily basis, combined and summed up to monthly batches. All but the gravity field parameters were pre-eliminated from the combined daily NEQs. Note that the K-band observations only contain line-of-sight, i.e., predominantly along-track information, and a K-band-only orbit solution would therefore be rank-deficient [70].
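The pre-elimination and stacking of daily NEQs described above can be sketched generically. This is a minimal sketch under stated assumptions, not the Bernese GNSS Software implementation: the function names are hypothetical, the gravity field parameters are assumed to be ordered first in each daily system, and the remaining (orbit and instrument) parameters are assumed to be day-specific.

```python
import numpy as np

def pre_eliminate(N, b, keep):
    """Reduce the NEQ system N x = b to the parameters indexed by `keep`.

    The other parameters are eliminated implicitly (Schur complement), so the
    reduced system yields the same estimates for the kept parameters.
    """
    keep = np.asarray(keep)
    drop = np.setdiff1d(np.arange(N.shape[0]), keep)
    N_kk = N[np.ix_(keep, keep)]
    N_kd = N[np.ix_(keep, drop)]
    N_dd = N[np.ix_(drop, drop)]
    # Solve N_dd X = [N_dk | b_d] in one call
    X = np.linalg.solve(N_dd, np.column_stack([N_kd.T, b[drop]]))
    N_red = N_kk - N_kd @ X[:, :-1]
    b_red = b[keep] - N_kd @ X[:, -1]
    return N_red, b_red

def monthly_neq(daily_systems, n_gravity):
    """Sum the reduced daily NEQs (gravity parameters assumed to come first)."""
    keep = np.arange(n_gravity)
    N_month = np.zeros((n_gravity, n_gravity))
    b_month = np.zeros(n_gravity)
    for N_day, b_day in daily_systems:
        N_red, b_red = pre_eliminate(N_day, b_day, keep)
        N_month += N_red
        b_month += b_red
    return N_month, b_month
```

Solving the stacked monthly system then gives the gravity field parameters as if all daily parameters had been estimated together in one adjustment.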

The accelerometer provides biased accelerations in the instrument frame that is closely aligned to the radial, along-track, and cross-track direction of the co-rotating orbital tripod. To account for the biases, daily constant accelerations are estimated in radial and cross-track directions. In along-track, four polynomial parameters are estimated per day to also absorb temperature induced variations in the accelerometer measurements that are in conflict with the ultra-sensitive K-band observations [20]. However, the key element of the Celestial Mechanics Approach (CMA) that was developed at AIUB for orbit and gravity field determination [48] is the estimation of rather frequent pseudo-stochastic accelerations (or pulses) to compensate deficiencies in the a priori force model [51]. Accelerations are estimated at 15 min intervals in all three directions of the orbital tripod. They are constrained to zero in order to minimize absorption of gravity signal. The parameterization applied for GRACE processing is summarized in Table 4.

For the a priori force model, the static part of AIUB-GRACE03S is used, but with a higher spherical harmonic resolution than in the SLR processing to take into account the high sensitivity of the K-band observable. In addition, the background models for tidal effects and AOD were chosen consistently with the SLR processing (details on the force model can be found in Table 2).

#### *2.3. Swarm*

A number of Earth observation LEOs at orbit altitudes of 400–600 km that are equipped with GPS receivers may also be exploited to derive information on gravity field variations [17,56]. Prime candidates to bridge the gap between GRACE and the GRACE-FO mission, albeit at lower spatial resolution, are the three satellites of ESA's Earth's Magnetic Field and Environment Explorer (Swarm) mission [18], which were launched in November 2013. All three satellites of the constellation circle the Earth in near-circular polar orbits, Swarm A and C at 460 km and Swarm B at 530 km altitude (as of April 2014). They are equipped with GPS receivers and accelerometers, but the data of the latter turned out to be disturbed by slow temperature-induced bias variations and sudden bias changes [73] and are not routinely used for orbit and gravity field determination.

At AIUB, kinematic orbits of all three Swarm satellites are determined routinely [21]. While for the first mission phase until June 2014 gravity field results are degraded due to high solar and consequently ionospheric activity and non-optimal GPS receiver settings [74,75], the data quality is nominal for the time period critical to bridge the gap between GRACE and GRACE-FO. By evaluation of the kinematic Swarm orbits, monthly gravity fields can be determined that are sensitive to temporal gravity variations up to about degree 13 [47].

The background force model for the Swarm processing is defined in the same way as for SLR and GRACE (see Table 2). Again the static part of AIUB-GRACE03S is used as a priori model and AOD is applied for de-aliasing of short-term variations in the atmosphere and ocean masses. As in the case of GRACE, no models for surface forces are applied. To compensate for the unused accelerometer observations, the constraints on the pseudo-stochastic accelerations are loosened [21]. Details on the Swarm orbit parameterization can be found in Table 4.


**Table 4.** Orbit parameterization of Earth observation satellites and time intervals for which the individual parameters are estimated.

#### *2.4. Combination of Swarm and SLR on the Normal Equation Level*

To fill the gap between GRACE and GRACE-FO, we propose a combination of gravity fields derived by multi-SLR and Swarm analysis as detailed above. A very simple combination of SLR and GRACE, in fact the replacement of C20 estimates in the monthly GRACE gravity fields by values derived from SLR, is already common practice [76]. In contrast to this, we perform a combination of Swarm and SLR on the NEQ level, i.e., we solve the equation

$$(w_{SLR}\mathbf{N}_{SLR} + w_{Swarm}\mathbf{N}_{Swarm})\,\mathbf{dx} = w_{SLR}\mathbf{b}_{SLR} + w_{Swarm}\mathbf{b}_{Swarm} \tag{2}$$

for the vector of unknown gravity field coefficients *dx*, where *NSLR* and *NSwarm* are the normal equation matrices of SLR and Swarm, *bSLR* and *bSwarm* are the corresponding right-hand side vectors of the individual normal equation systems, and *wSLR* and *wSwarm* are weighting factors. The solution of the combined normal equation system is superior to a combination on a solution level because correlations between gravity field coefficients and other force model, orbit, instrument and reference frame parameters are taken into account [34]. Since all satellite data have been processed consistently and all but the gravity field parameters were pre-eliminated from the individual NEQs, the combination is straightforward.
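The weighted combination of Equation (2) translates directly into a few lines of linear algebra. The following is a minimal sketch, assuming both normal equation systems refer to the same ordered set of gravity field parameters and that all other parameters have already been pre-eliminated; the function name `combine_neq` is hypothetical.

```python
import numpy as np

def combine_neq(N_slr, b_slr, N_swarm, b_swarm, w_slr=1.0, w_swarm=1.0):
    """Solve (w_slr*N_slr + w_swarm*N_swarm) dx = w_slr*b_slr + w_swarm*b_swarm.

    Returns the estimated corrections dx and the combined normal matrix N.
    """
    N = w_slr * N_slr + w_swarm * N_swarm
    b = w_slr * b_slr + w_swarm * b_swarm
    dx = np.linalg.solve(N, b)
    return dx, N
```

Because the weights enter before the inversion, a coefficient that is weakly determined or strongly correlated in one system can still be stabilized by the other system, which is the practical advantage over averaging two separately computed solutions.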

The key question of the combination is the ratio of the weights *wSwarm* and *wSLR* assigned to the different observation techniques [38]. We first derive monthly weights by variance component estimation (VCE) [77]. As shown in Figure 1, the ratio *wSwarm* : *wSLR* of the weights derived by VCE ranges from 20 to 95. To assess the plausibility of these weights, we derive alternative weights based on the accuracy estimates of the SLR ranges, which typically vary around 2 cm (see Table 1), and the GPS L1 phase observations of Swarm, for which Schreiter et al. [78] provide accuracy estimates close to 3 mm in case of strong ionosphere activity and close to 2 mm in case of low ionosphere activity (as long as no advanced observation screening in the region of the geomagnetic equator is performed). Based on these accuracy assumptions, a ratio of weights in the range from 44 to 100 can be expected, which is quite close to what is determined by VCE.
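The expected range of weight ratios can be reproduced with a short calculation, assuming that the weight of each technique is inversely proportional to the variance of its observations and taking 20 mm for the SLR ranges and 3 mm or 2 mm for the GPS phase observations:

$$\frac{w_{Swarm}}{w_{SLR}} = \frac{\sigma_{SLR}^2}{\sigma_{GPS}^2}: \qquad \left(\frac{20\ \mathrm{mm}}{3\ \mathrm{mm}}\right)^2 \approx 44, \qquad \left(\frac{20\ \mathrm{mm}}{2\ \mathrm{mm}}\right)^2 = 100.$$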

**Figure 1.** Ratios of monthly weights of the Swarm- with respect to the SLR-normal equations, as derived by variance component estimation.

The slightly positive trend in the relative weights visible in Figure 1 is explained by the decaying orbit altitude of the Swarm satellites and the corresponding increase in sensitivity to the gravity field. The seasonal variation is probably related to the seasonal tracking characteristics of the SLR stations and the fact that many more stations are located in the northern than in the southern hemisphere.

Test combinations were performed based on constant weighting ratios *wSwarm* : *wSLR* = 100:1, 10:1, 2.5:1 or 1:1. To assess the contribution of either Swarm or SLR to the combined solution, resolution matrices *RSLR* and *RSwarm* were determined from the individual normal equation matrices *NSLR*, *NSwarm* and the inverse of the combined matrix *N* = *wSLRNSLR* + *wSwarmNSwarm* following the approach described in Sneeuw [79]:

$$\mathbf{R}_{SLR} = \mathbf{N}^{-1} \mathbf{N}_{SLR}, \tag{3}$$

$$\mathbf{R}_{Swarm} = \mathbf{N}^{-1} \mathbf{N}_{Swarm}. \tag{4}$$

The contribution numbers of the individual SHC are found on the main diagonals of the resolution matrices and are presented in triangle plots in Figure 2 for Swarm (left column) and SLR (right column). The contributions to the sine coefficients of the spherical harmonic spectrum are shown in the left-hand part of the triangle plots, those to the cosine coefficients in the right-hand part, and the contributions to the zonal SHC (order 0) can be found in between. The contribution numbers vary between 0 (no influence) and 1 (determined by 100% from the corresponding observations). In the middle column of Figure 2, the mean contribution per degree is given.
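The contribution numbers can be computed from the normal matrices as sketched below. This is a minimal sketch with one explicit assumption: the weights are applied to the individual normal matrices before the resolution matrices of Equations (3) and (4) are formed, so that the SLR and Swarm contributions to each coefficient sum to one. The function name is hypothetical.

```python
import numpy as np

def contribution_numbers(N_slr, N_swarm, w_slr=1.0, w_swarm=1.0):
    """Main diagonals of the resolution matrices of SLR and Swarm.

    Assumption: the weighted matrices w_slr*N_slr and w_swarm*N_swarm play the
    role of N_SLR and N_Swarm in Equations (3) and (4).
    """
    Nw_slr = w_slr * N_slr
    Nw_swarm = w_swarm * N_swarm
    N = Nw_slr + Nw_swarm                      # combined normal matrix
    R_slr = np.linalg.solve(N, Nw_slr)         # N^{-1} N_SLR
    R_swarm = np.linalg.solve(N, Nw_swarm)     # N^{-1} N_Swarm
    return np.diag(R_slr), np.diag(R_swarm)
```

Repeating the call for different weight ratios (for example 100:1, 10:1, 2.5:1 and 1:1) gives the kind of per-coefficient comparison that the triangle plots visualize.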

**Figure 2.** Contribution per spherical harmonics coefficient of Swarm (**left**) and SLR (**right**) satellites and mean contribution per degree (**middle column**) in case of relative weighting 100:1 (**top**), 10:1 (**second row**), 2.5:1 (**third row**) or equal weight (**bottom row**).

In case of a weight ratio of 100:1 (Figure 2, first row), only the degree 2 SHC are significantly influenced by SLR; C20 is in fact dominated by SLR. Applying relative weights of 10:1 (Figure 2, second row), still mainly the degree 2 SHC are affected by SLR. Considering the strength of the SLR-derived temporal gravity variations, the suppression of the contribution of SLR to the other SHC is not justified. A combination based on the weights derived by VCE (Figure 1) is therefore pointless.

Decreasing the relative weight of the GPS observations further (Figure 2, rows three and four), the contribution of Swarm is step by step reduced to the sectorial (equal degree and order) and near sectorial SHC, where the sensitivity of GPS observations is strong [70], and to C30 that in case of SLR is weakly determined due to correlations with other zonal coefficients [68].

#### *2.5. Spectral Resolution, Signal Leakage, and Filter Loss*

The Earth gravity field is commonly represented by a spherical harmonic expansion (Equation (1)), truncated at a certain maximum degree *lmax* < ∞. The maximum degree (and consequently order) of this expansion determines the spatial scale of the represented signal. Monthly GRACE gravity fields are available to degree and order 90 (corresponding to a spatial scale of 460 km at the equator), Swarm monthly gravity fields are expanded to degree/order 70 but contain significant time-variable signal only up to about degree 13 (approx. 3100 km at the equator) [47]. SLR-derived gravity fields are even limited to degree 6 (approx. 6700 km at the equator) due to correlations between individual SHC at higher degrees that cannot be determined in unconstrained solutions given the inhomogeneous observation coverage [10].

The truncation of the spherical harmonic expansion at a certain maximum degree causes signal leakage (e.g., [23–26]). To demonstrate this effect, we simulate a mass layer with uniform mass distribution within the island of Greenland, while the mass over the rest of the globe is set to zero. The simulated mass distribution is expanded in a series of spherical harmonics and reconstructed from the SHC truncating the series at various values of *lmax*. In Figure 3a, the reconstructed mass distributions are shown and the percentage of the integrated mass still contained inside the shorelines of Greenland is listed.
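The leakage effect can be reproduced qualitatively with a simplified geometry. The following is a minimal sketch that replaces the shoreline of Greenland by a spherical cap of uniform mass (an assumption made here for simplicity; a cap radius of about 7.5° gives an area comparable to Greenland), so that only zonal Legendre coefficients are needed and the retained mass fraction follows in closed form. The numbers it produces are therefore not those of Figure 3.

```python
import numpy as np

def legendre_at(nmax, x):
    """Unnormalized Legendre polynomials P_0..P_nmax at x (three-term recurrence)."""
    p = np.empty(nmax + 1)
    p[0] = 1.0
    if nmax >= 1:
        p[1] = x
    for n in range(1, nmax):
        p[n + 1] = ((2 * n + 1) * x * p[n] - n * p[n - 1]) / (n + 1)
    return p

def retained_fraction(lmax, cap_radius_deg=7.5):
    """Fraction of the cap's mass found inside the cap after truncation at lmax."""
    x = np.cos(np.radians(cap_radius_deg))
    p = legendre_at(lmax + 1, x)
    # I_l = integral of P_l(t) over the cap, with t = cos(psi) running from x to 1
    I = np.empty(lmax + 1)
    I[0] = 1.0 - x
    l = np.arange(1, lmax + 1)
    I[1:] = (p[l - 1] - p[l + 1]) / (2 * l + 1)
    degrees = np.arange(lmax + 1)
    # Legendre coefficients of the unit cap are (2l+1)/2 * I_l; integrating the
    # truncated series over the cap and dividing by the integral of the full cap:
    return np.sum((2 * degrees + 1) / 2.0 * I**2) / (1.0 - x)

fractions = {lmax: retained_fraction(lmax) for lmax in (6, 10, 60)}
```

Each additional degree adds a non-negative term, so the retained fraction grows monotonically with the truncation degree and approaches one only slowly because of the sharp edge of the cap.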

**Figure 3.** Integrated mass within Greenland reconstructed at different truncation degrees, (**a**) without filter or (**b**) smoothed by a 300 km Gaussian filter.

The signal content is also attenuated by filters to suppress the noise inherent to the gravity field models. A very simple filter that is commonly applied to GRACE monthly gravity fields is a Gaussian filter with 300 km filter radius [27]. In the spatial domain, the Gaussian filter represents a weighted average, the weighting function being bell-shaped with a half-width radius of 300 km. In the spectral domain, the filter corresponds to degree-dependent scaling factors that quickly approach zero for higher degrees (from 1 at degree 0, the scaling factor has already dropped to 0.01 at degree 60). The experiment on the signal leakage by truncation of the spherical harmonic expansion is repeated, additionally applying a 300 km Gaussian filter for smoothing (Figure 3b). Due to the filter attenuation at medium to high degrees, the signal loss is even more dramatic than in the unfiltered case. More advanced filter types like the non-isotropic filter proposed by Han et al. [28] and the decorrelation filter DDK [29] were designed to minimize the filter loss, but the principal problem of signal attenuation persists.
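The degree-dependent scaling factors of a Gaussian filter can be generated with a short recursion. This is a minimal sketch, assuming the widely used recursion for Gaussian averaging weights, normalized here so that the weight at degree 0 equals 1, and a mean Earth radius of 6371 km (an assumed value). The absolute values depend on this normalization and on how the filter radius is defined, so they need not reproduce the numbers quoted above.

```python
import numpy as np

def gaussian_weights(lmax, radius_km=300.0, earth_radius_km=6371.0):
    """Degree-dependent Gaussian smoothing weights W[0..lmax], with W[0] = 1.

    radius_km is the half-width radius of the bell-shaped averaging function.
    The recursion is adequate for moderate lmax but becomes numerically
    unstable once the weights get very small.
    """
    b = np.log(2.0) / (1.0 - np.cos(radius_km / earth_radius_km))
    W = np.zeros(lmax + 1)
    W[0] = 1.0
    if lmax >= 1:
        W[1] = (1.0 + np.exp(-2.0 * b)) / (1.0 - np.exp(-2.0 * b)) - 1.0 / b
    for l in range(1, lmax):
        W[l + 1] = -(2.0 * l + 1.0) / b * W[l] + W[l - 1]
    return W

W = gaussian_weights(60)
```

Filtering then amounts to multiplying every coefficient of degree *l* by `W[l]`, which makes explicit why signal at medium and high degrees is attenuated together with the noise.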

Both experiments show that gravity fields truncated at different maximum degrees cannot be compared directly. At the very limited resolution of the SLR-derived gravity fields, only a fraction of the original mass is localized inside the borders of Greenland. The true mass localization and change can be recovered approximately by applying iterative forward-modeling approaches [25,26], which require additional assumptions such as mass loss being concentrated in fast-flowing sections of ice streams or at the coast. Nevertheless, leakage (in and out) and filter loss remain major limitations to the quantification of mass transport from satellite gravimetry. In the following, we truncate all gravity fields to the same low degree for the sake of comparison. Furthermore, we refrain from the use of filters.

To derive mass from SHC, first the dimensionless SHC are scaled by the factors provided by Wahr et al. [30] to transform them to EWH. Then, global 1° grids of EWH are computed from the scaled SHC by spherical harmonic synthesis (Equation (1)). The series expansion is truncated at the maximum degree of choice (either 6 or 10). To transform the EWH grids to mass, each grid cell is multiplied by its area (dependent on latitude) and the density of water (1000 kg/m³). By integration over the area of interest, e.g., Greenland, the final mass estimates are derived.
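The last two steps, from an EWH grid to an integrated mass, can be sketched as follows. This is a minimal sketch under stated assumptions: the grid is a regular 1° latitude-longitude grid with rows ordered from south to north, the Earth is treated as a sphere with an assumed radius, and the region of interest is given by a hypothetical boolean mask. The scaling of the SHC to EWH and the spherical harmonic synthesis are not shown.

```python
import numpy as np

R_EARTH = 6378136.3      # m, assumed reference radius
RHO_WATER = 1000.0       # kg/m^3, as stated in the text

def cell_areas(lat_edges_deg, dlon_deg, radius=R_EARTH):
    """Areas (m^2) of grid cells between consecutive latitude edges."""
    s = np.sin(np.radians(lat_edges_deg))
    return radius**2 * np.radians(dlon_deg) * np.diff(s)

def integrate_mass(ewh_grid, mask, step_deg=1.0):
    """Mass (kg) of an EWH grid (m) summed over the cells selected by `mask`.

    ewh_grid, mask : arrays of shape (n_lat, n_lon), rows from south to north
    """
    n_lat = ewh_grid.shape[0]
    lat_edges = -90.0 + step_deg * np.arange(n_lat + 1)
    area = cell_areas(lat_edges, step_deg)[:, None]   # identical for all longitudes
    return np.sum(ewh_grid * area * RHO_WATER * mask)

# Example: 1 m of water everywhere, integrated over the whole globe
ewh = np.ones((180, 360))
total_mass_gt = integrate_mass(ewh, np.ones((180, 360), dtype=bool)) / 1.0e12  # 1 Gt = 1e12 kg
```

For a regional estimate, the mask would be true only for the grid cells inside the area of interest, e.g., Greenland.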

#### **3. Results and Discussion**

Figure 4 shows the integrated secular effect of glacial isostatic adjustment (GIA), ice mass and snow mass change in the polar regions during the period 2010 to 2014, as derived from un-smoothed degree 60 monthly gravity field models determined from GRACE GPS and K-band observations. The mass loss is mainly related to ice loss (dynamic or by melting and run-off). GIA, i.e., the relaxation of the crust in reaction to the large-scale ice melt after the last ice age, counteracts ice mass loss; the actual ice mass loss is therefore even larger than indicated by the figures (estimates for the mass change induced by GIA vary from 1 Gt/year to 20 Gt/year for Greenland and from 55 Gt/year to 110 Gt/year for Antarctica [32]). Snow mass depends on the season and largely cancels out in a multi-year mean. The observation coverage is densest near the poles due to the polar orbits of the GRACE satellites and consequently the noise is lowest near the poles. At lower latitudes, the noisy striping typical for GRACE becomes visible.

**Figure 4.** Trends in equivalent water height (EWH) as observed by GRACE in the Arctic (**left**) or Antarctic (**right**) region in the period 2010–2014, derived from unfiltered d/o 60 solutions.

For comparison to SLR-derived gravity field models the GRACE results are truncated at degree/order 10 and the corresponding EWH trends are shown in Figure 5. The trend signal that is well localized over the continents in Figure 4 leaks out over the oceans in Figure 5, as predicted by our simulation (Figure 3).

**Figure 5.** Trends in equivalent water height (EWH) in polar regions derived from GRACE gravity fields truncated at d/o 10.

In Figure 6, global plots of GRACE-derived trends 2010–2014 in EWH (left) and the amplitudes of seasonal variations (right) are provided, both truncated at degree/order 10. Trends are mainly visible at high latitudes, where large scale ice mass loss is the main reason for negative trends, GIA for positive trends in North America and Fennoscandia. At this low spherical harmonic resolution, it is difficult to state if negative trends near the west coast of North America are related to drought. Positive trends in the Amazon region probably are related to inter-annual variability that is not strictly seasonal. Seasonal variations (Figure 6, right) are related to hydrology and are strongest in the tropical and subtropical regions with strong seasonal variation in rainfall.

**Figure 6.** Global trends 2010–2014 in EWH (**left**) and amplitude of annual EWH variations (**right**) derived from GRACE gravity fields truncated at d/o 10.

Figure 7 provides the same information as Figure 6, but derived from SLR. In the top row, the unconstrained degree 6 gravity field models, and, in the bottom row, the constrained degree/order 10 solutions are evaluated. Especially in the case of the EWH trends, the improvement of the localization with a higher maximum degree is obvious. In the case of seasonal variations, the amplitudes in all but the Amazon basin are heavily attenuated compared to GRACE (Figure 6, right).

**Figure 7.** Global mass trends (in equivalent water height) as observed by SLR between 2010 and 2014 (**left**) and amplitude of annual variations (**right**). Top: unconstrained monthly gravity fields up to degree 6, bottom: constrained monthly gravity fields up to degree 10.

Figures 8 and 9 focus on EWH trends derived from SLR for the time periods 1995–1999, 2000–2004, 2005–2009 and 2010–2014 in polar regions. The trends are determined either from unconstrained degree 6 solutions (top row) or from regularized monthly gravity field models up to spherical harmonic degree and order 10 (bottom row). We can confirm the simulation-based findings of Bonin et al. [33] that degree 5 gravity fields do not guarantee the correct localization of mass variations in Greenland and Antarctica. For the regularized degree 10 solutions, however, we observe rather precise localization of the SLR-derived mass trends in Greenland and Antarctica (compare the sub-figures for the time-period 2010–2014 in Figures 8 and 9 with the corresponding results from GRACE in Figure 5). Along the coasts of Greenland and close to the coast of West Antarctica, the SLR gravity fields indicate significant mass loss over the ocean, but this is to be expected when the effect of leakage is taken into account (Figure 3).

**Figure 8.** Trends in EWH as observed by SLR in the Arctic region during 5-year periods between 1995 and 2014. **Top**: unconstrained monthly gravity fields up to degree 6, **bottom**: constrained monthly gravity fields up to degree 10.

**Figure 9.** Trends in EWH as observed by SLR in the Antarctic region during 5-year periods between 1995 and 2014. **Top**: unconstrained monthly gravity fields up to degree 6, **bottom**: constrained monthly gravity fields up to degree 10.

In the following, time-series of mass change within certain regions are studied to demonstrate the agreement of SLR and Swarm with GRACE (both truncated at degree/order 6 corresponding to the unconstrained SLR solutions). SHC of degree 1 and C20 are removed, the latter because the GRACE estimates of C20 suffer from temporal aliasing [13] and accelerometer instrument noise [80] (the effect of C20 alone on mass change in Greenland amounts to about 100 Gt in the time period from 2000 to 2018). No re-scaling to compensate for signal leakage is applied, nor are corrections for GIA or for seasonal variations due to snow mass. For a complete treatment of ice mass loss as derived from GRACE data, we refer to the literature in this field (e.g., [4]).

While GRACE observations cover the period 2002–2017, earlier estimates of mass change have to be based on observations of the geodetic SLR satellites. After the end of the GRACE mission, mass estimates are based either on SLR or on Swarm observations. The good agreement of SLR- and GRACE-derived seasonal variations in the Amazon basin is obvious in Figure 10, while Swarm tends to slightly overestimate the seasonal variation.

**Figure 10.** Monthly mass estimates for the Amazon basin. GRACE and Swarm gravity fields were truncated at degree 6 to match the resolution of SLR gravity fields. Spherical harmonic coefficient C20 was excluded from the comparison.

The evolution of the Greenland mass is shown in Figure 11. Again, the long-term agreement of all three observation types is very good. SLR does not reveal significant mass loss in Greenland in the pre-GRACE era. Obviously, the GRACE mission was launched just in time to observe the onset of ice loss in Greenland. After acceleration in the ice mass loss, the GRACE mass estimates almost level out in the time period from 2014 to 2016. SLR tends to slightly overestimate the Greenland mass loss between 2014 and 2017, while, after 2017, even a small mass gain is visible in both the SLR and Swarm derived mass estimates. The reduction of the SLR- and Swarm-derived mass loss therefore is time-shifted by about three years compared to GRACE. Bonin et al. [33] also observe diverging mass trends for Greenland but state that, compared to the independent time-series of mass change derived by the Ice-sheet Mass Balance Inter-comparison Exercise (IMBIE) for Greenland and Antarctica [81], in the long term, these deviations of SLR average out. The inter-annual variability is larger in the monthly SLR estimates than observed by GRACE at this low spherical harmonic degree, and even larger in the monthly Swarm solutions.

**Figure 11.** Monthly mass estimates for entire Greenland (see masks in Figure 6). GRACE and Swarm gravity fields were truncated at degree 6 to match the resolution of SLR gravity fields. Spherical harmonic coefficient C20 was excluded from the comparison.

While the accuracy of the GRACE mass estimates at this very low resolution can be considered constant in the time period shown, because temporal variations in noise due to the changing satellite environment and technical problems are mainly manifest in the weakly determined high degree/order SHC, the quality of the SLR-derived estimates depends on the station network, which was even sparser in the 1990s than it is today, and on the number of satellites contributing to the monthly solutions. A reduction in the scatter can be observed after the launch of LARES in 2012. The monthly Swarm solutions exhibit even greater scatter than the SLR solutions and additionally seem to over-estimate the seasonal variation compared to GRACE. After 2014, the Swarm and the SLR results match well in trend and in phase. During 2014, the Swarm results are impaired by high ionosphere activity, non-optimal GPS receiver settings, and in the first half of 2014 by reduced observation sampling [75].

Subsequently, mass trends were determined for the coastal regions of Greenland with strong mass loss, the region of the inland ice sheet, where a weak gain in mass is observed by GRACE, and the coast of West Antarctica. The masks used are provided in Figure 12. The separation of Greenland into regions of mass gain and regions of mass loss was done based on Figure 4. The mask of Antarctica and its glacial sub-basins is derived from Horwath and Dietrich [82].

**Figure 12.** Masks defining inland and coastal regions of Greenland (**left**) and the coast of West Antarctica (**right**). The resolution of the masks is 1°.

Starting with the first monthly estimates from SLR in 1995, we fit deterministic models including bias, trend and seasonal variations within 5-year periods to integrated mass estimates of either SLR or GRACE monthly gravity field models summed up over the coastal areas (Figure 13, left) or the inland ice sheet (Figure 13, right) of Greenland, or the coast of West Antarctica (Figure 14).
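The fit described above can be sketched as an ordinary least-squares problem. This is a minimal illustration, not the processing actually used for Figures 13 and 14: the function name `fit_trend_seasonal`, the restriction of the seasonal model to a single annual sine/cosine pair, and the synthetic input series are all assumptions made for the example.

```python
import numpy as np

def fit_trend_seasonal(t_years, mass_gt):
    """Least-squares fit of bias, linear trend and an annual sine/cosine pair.

    t_years : epochs in decimal years
    mass_gt : mass estimates in Gt at those epochs
    Returns (bias in Gt, trend in Gt/year, annual amplitude in Gt).
    """
    t = np.asarray(t_years, dtype=float)
    y = np.asarray(mass_gt, dtype=float)
    dt = t - t.mean()                      # centre the time axis of the period
    A = np.column_stack([
        np.ones_like(dt),                  # bias
        dt,                                # linear trend
        np.cos(2.0 * np.pi * t),           # annual cosine term
        np.sin(2.0 * np.pi * t),           # annual sine term
    ])
    coeff, *_ = np.linalg.lstsq(A, y, rcond=None)
    bias, trend, c, s = coeff
    return bias, trend, np.hypot(c, s)

# Synthetic (made-up) 5-year monthly series: bias 5 Gt, trend -60 Gt/year,
# annual signal with 30 Gt amplitude and an arbitrary phase.
t = 2010.0 + (np.arange(60) + 0.5) / 12.0
y = 5.0 - 60.0 * (t - t.mean()) + 30.0 * np.cos(2.0 * np.pi * (t - 0.2))

bias, trend, amplitude = fit_trend_seasonal(t, y)
print(f"trend = {trend:.1f} Gt/year, annual amplitude = {amplitude:.1f} Gt")
```

Applying the function separately to each 5-year window of a regional mass series yields one trend per period. Centring the time axis makes the bias the mean level of the window.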

**Figure 13.** Monthly mass estimates in Greenland from GRACE and SLR, and best fitting trends for 5-year periods for the region of mass loss along the coast (**left**) or the inland region of mass gain (**right**). Each 5-year period is centered around zero.

In the case of SLR, the trend estimation is based either on the unconstrained degree 6 monthly gravity field solutions or on the regularized degree 10 solutions (compare Figures 8 and 9 for the corresponding spatial plots). For comparison, the original degree 60 GRACE gravity fields were truncated at degree 6 or 10, and mass trends in 5-year intervals were determined from them as well.

The SLR trend estimates reveal good agreement with GRACE (truncated at the corresponding degree) for the time periods where a direct comparison is possible. However, due to signal leakage at the very low resolution of SLR, the absolute mass loss is drastically underestimated. When truncated at degree 6 or 10, neither GRACE nor SLR is able to trace the inland mass gain observed by GRACE at higher resolution, because the spatial resolution is too limited (Figure 13, right).

Prior to the start of GRACE, no distinct mass trend within Greenland can be observed. Since the start of the highly sensitive GRACE observations, the mass loss at the coast is accelerating within the time period covered by Figure 13. This fact has already been reported, e.g., by Velicogna [83] and recently by Bevis et al. [84], and is visible in SLR and GRACE estimates alike. In case of the degree 6 SLR solutions, the five year trend estimates are 2000–2004: −21.5 Gt/year, 2005–2009: −43.6 Gt/year, 2010–2014: −63.1 Gt/year. As explained above, due to leakage, these values are drastically underestimated compared to the degree 60 GRACE solutions 2003–2004: −179.8 Gt/year, 2005–2009: −184.4 Gt/year, 2010–2014: −302.7 Gt/year (no GIA corrections applied).

Comparable results are achieved for the area of significant ice mass loss along the coast of West Antarctica (Figure 14), where, due to the bad localization of the mass loss in the degree 6 solutions, the five year trends derived from the degree 10 SLR gravity fields 2000–2004: −5.4 Gt/year, 2005–2009: −15.1 Gt/year, and 2010–2014: −50.7 Gt/year are quoted here together with the degree 60 GRACE solutions 2003–2004: −85.5 Gt/year, 2005–2009: −121.1 Gt/year, and 2010–2014: −201.0 Gt/year. Again, the trend estimates at the given low degree of SLR are underestimates due to leakage, but the increase in ice melt is captured very impressively.

**Figure 14.** Monthly mass estimates at the coast of West Antarctica from GRACE and SLR, and best fitting trends for 5-year periods. Each 5-year period is centered around zero.

The individual and the combined monthly mass estimates within Greenland for the two extreme cases of 100:1 and 1:1 relative weighting of Swarm and SLR are shown in Figure 15. The combination starts with January 2015 because, in 2014, the GRACE data are still quite complete while the Swarm data is impaired by strong ionosphere activity and non-optimal GPS receiver settings (receiver settings were adapted in May 2015). During 2015 and in the first half of 2016, a direct comparison to GRACE results is possible. After August 2016, the accelerometer on GRACE B was completely switched off and the processing of GRACE data at AIUB stopped. From the direct comparison, the impression that Swarm generally slightly overestimates the amplitude of the mass variations while SLR overestimates the secular mass loss is confirmed.

**Figure 15.** Monthly mass estimates within Greenland. All time-series truncated at degree 6 and with C20 excluded.

In case of relative weighting 100:1, the combined mass estimates closely follow the Swarm-only results. This can be expected in view of the results of the contribution analysis (Figure 2), even more so because C20 is excluded from the analysis. In the case of equal weights, the combined solution follows more closely the GRACE mass estimates, but still with higher inter-annual variability.

The monthly differences between GRACE and the individual or combined alternative mass estimates are shown in Figure 16. During 2015, the combination of SLR and Swarm with equal weighting has the smallest differences with respect to GRACE; in the first half of 2016, the combination 0.4\*SLR+Swarm is slightly closer. The RMS of the differences with respect to GRACE over all months is 166 Gt for the SLR-only solutions, 110 Gt in case of Swarm-only, 110 Gt for the combination 0.01\*SLR+Swarm, 83 Gt for 0.1\*SLR+Swarm, 67 Gt for 0.4\*SLR+Swarm and 68 Gt in case of equal weighting.

**Figure 16.** Monthly differences between GRACE and SLR (blue), Swarm (green) or combined mass estimates within Greenland. Missing values indicate gaps in the GRACE time-series.

These results clearly indicate that a combination with close to equal weights of Swarm (GPS) and SLR is closer to the GRACE reference than when weights are based on variance analysis of observation residuals. This can be explained by systematic errors in the kinematic orbits due to the GPS phase processing. In fact, GPS also has to be down-weighted in combination with the GRACE K-band observations (by a factor of approx. 200, see [34]). Recent experiments using one month of kinematic GRACE orbits derived with undifferenced integer-fixed GPS phase ambiguities indicate that the inconsistency between K-band and GPS is reduced. A better assessment will be possible as soon as longer time-series of kinematic orbits based on undifferenced integer-fixed ambiguities become available, but this will be the topic of a future publication.
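The role of the relative weight in such a combination can be illustrated with a toy example on the normal equation level. The sketch below is not the authors' software: the function `combine_neqs`, the three-parameter problem and the noise levels are assumptions, with technique `a` standing in for Swarm and technique `b` for SLR, so that `w_b` plays the role of the factor in expressions such as 0.4\*SLR+Swarm.

```python
import numpy as np

def combine_neqs(N_a, b_a, N_b, b_b, w_b):
    """Solve the combined normal equations (N_a + w_b * N_b) x = b_a + w_b * b_b."""
    return np.linalg.solve(N_a + w_b * N_b, b_a + w_b * b_b)

# Synthetic observation equations for two techniques sharing the same parameters.
rng = np.random.default_rng(0)
x_true = np.array([1.0, -2.0, 0.5])
A_a = rng.normal(size=(40, 3))
A_b = rng.normal(size=(40, 3))
y_a = A_a @ x_true + 0.05 * rng.normal(size=40)
y_b = A_b @ x_true + 0.20 * rng.normal(size=40)

# Normal equations of each technique (unit weights within each technique).
N_a, b_a = A_a.T @ A_a, A_a.T @ y_a
N_b, b_b = A_b.T @ A_b, A_b.T @ y_b

# One combined solution per relative weight of technique b.
solutions = {w: combine_neqs(N_a, b_a, N_b, b_b, w) for w in (0.01, 0.1, 0.4, 1.0)}

# Cross-check: the same solution follows from stacking the scaled observations.
w = 0.4
A_stack = np.vstack([A_a, np.sqrt(w) * A_b])
y_stack = np.concatenate([y_a, np.sqrt(w) * y_b])
x_stack, *_ = np.linalg.lstsq(A_stack, y_stack, rcond=None)
```

Scaling a normal equation system by a factor is equivalent to scaling the variance of the underlying observations by the inverse of that factor, which is why the choice of weight is so sensitive to systematic errors that a variance analysis of residuals does not capture.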

#### **4. Conclusions**

Satellite laser ranging can contribute to the derivation of mass change estimates at the spatial scale of the Amazon basin, Greenland, or West Antarctica prior to the GRACE mission. SLR and high-low tracking of the Swarm satellites both may be used to bridge the gap between GRACE and GRACE-FO concerning the large scale mass loss in the polar regions of the Earth. When all individual gravity field results are truncated at the same spherical harmonic degree, the results can be compared directly. They match well for Greenland with a somewhat higher inter-annual variability of the SLR monthly mass estimates. Swarm can only be included in the comparison near the end of the life-time of the GRACE satellites. The seasonal variations visible in Swarm monthly mass estimates seem to be slightly over-estimated. On the other hand, the SLR derived mass loss within Greenland is larger than what can be extracted from GRACE observations between 2014 and 2017. After 2017, both SLR and Swarm results indicate moderate mass gain within Greenland.

For a correct localization of the mass change signal along the coast of West Antarctica, a spherical harmonic expansion up to degree 10 is desirable. This can be achieved by the regularization of SLR-only gravity field models or, preferably, by a combination with data from other LEO satellites, e.g., Swarm. A combination on the normal equation level, correctly taking into account all correlations between estimated and pre-eliminated parameters, is feasible as long as all satellite data are processed consistently. The best fit between an SLR/Swarm combination and GRACE for the period from January 2015 to August 2016, where both Swarm and GRACE observations of good quality are available, is achieved when SLR and Swarm normal equations are combined with almost equal weights.

It is foreseen to continue the combination of SLR data, taking into account the contributions by different analysis centers, and the combination of SLR with Swarm and possibly also GRACE-FO on the normal equation level in the frame of the Combination Service for Time-variable Gravity field models (COST-G) [34,52], the newly established product center of the International Gravity Field Service (IGFS) under the umbrella of the International Association of Geodesy (IAG).

**Author Contributions:** GRACE processing, derivation of mass trends, conceptualization and writing—U.M.; SLR processing and investigation—K.S.; determination of kinematic GRACE and Swarm orbits—D.A.; Swarm processing—C.D.; software development, supervision and funding—D.T.; supervision and project administration (SLR)—R.D.; supervision and project administration (GRACE, Swarm)—A.J.

**Funding:** This research was partly funded by BKG "Vertrag über die Weiterentwicklung der Berner GNSS Software, 21.04.2017" and by the ESA project Swarm-DISC, contract No. 4000109587/13/I-NB.

**Acknowledgments:** We want to thank three anonymous reviewers for their invaluable comments.

**Conflicts of Interest:** The authors declare no conflict of interest. The funders had no role in the design of the study; in the collection, analyses, or interpretation of data; in the writing of the manuscript, or in the decision to publish the results.



#### **References**


© 2019 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).

## *Article* **A New Approach to Earth's Gravity Field Modeling Using GPS-Derived Kinematic Orbits and Baselines**

#### **Xiang Guo \* and Qile Zhao**

GNSS Research Center, Wuhan University, Wuhan 430079, China **\*** Correspondence: xiangguo@whu.edu.cn

Received: 13 June 2019; Accepted: 19 July 2019; Published: 21 July 2019

**Abstract:** Earth's gravity field recovery from GPS observations collected by low earth orbiting (LEO) satellites is a well-established technique, and kinematic orbits are commonly used for that purpose. Nowadays, more and more satellites are flying in close formations. The GPS-derived kinematic baselines between them can reach millimeter precision, which is more precise than the centimeter-level kinematic orbits. Thus, it has long been expected that the more precise kinematic baselines can deliver better gravity field solutions. However, this expectation has not yet been met in practice. In this study, we propose a new approach to gravity field modeling, in which kinematic orbits of the reference satellite and baseline vectors between the reference satellite and its accompanying satellite are jointly inverted. To validate the added value, data from the Gravity Recovery and Climate Experiment (GRACE) satellite mission are used. We derive kinematic orbits and inter-satellite baselines of the twin GRACE satellites from the GPS data collected in the year 2010. Then two sets of monthly gravity field solutions up to degree and order 60 are produced. One is derived from kinematic orbits of the twin GRACE satellites ('orbit approach'). The other is derived from kinematic orbits of GRACE A and baseline vectors between GRACE A and B ('baseline approach'). Analysis of observation postfit residuals shows that noise in the kinematic baselines is notably lower than in the kinematic orbits, by 50, 47 and 43% for the along-track, cross-track and radial components, respectively. Regarding the gravity field solutions, analysis in the spectral domain shows that noise of the gravity field solutions beyond degree 10 can be significantly reduced when the baseline approach is applied, with cumulative errors up to degree 60 being reduced by 34% compared to the orbit approach. In the spatial domain, the recovered mass changes with the baseline approach are more consistent with those inferred from the K-Band Ranging based solutions. Our results demonstrate that the proposed baseline approach is able to provide better gravity field solutions than the orbit approach. The findings may facilitate, among others, bridging the gap between the GRACE and GRACE Follow-On satellite missions.

**Keywords:** Earth's gravity field; kinematic orbit; kinematic baseline; GRACE

#### **1. Introduction**

Knowledge of temporal variations of the Earth's gravity field is of importance to understand large-scale mass transport at and below the Earth's surface. It has been mostly observed with the ultra-precise K-Band ranging (KBR) measurements from the Gravity Recovery and Climate Experiment (GRACE) mission [1] and its successor, i.e., the GRACE Follow-On (GFO) mission. However, there exists a gap between GRACE and GFO. Nowadays, there is a consensus that GPS-based high-low satellite-to-satellite tracking (hl-SST) can play an important role in bridging the gap [2–7]. For that purpose, GPS-derived kinematic orbits are usually taken as pseudo-observations in gravity field modeling. Since GPS data are about three orders of magnitude less precise than the KBR data, they are only sensitive to temporal variations at low spherical harmonic degrees (<20) of the Earth's gravity field. Furthermore, it is well known that kinematic orbits are very sensitive to observation geometry and various systematic errors, e.g., mismodeling of GPS antenna phase center, high-order ionosphere-induced errors, near-field multipath, and uncalibrated hardware delays [8–10]. As a result, the recovered gravity field solutions usually suffer from systematic errors, particularly at low degrees [11]. Therefore, efforts are still ongoing to further improve the GPS-based gravity field solutions. Among others, GPS-derived kinematic baselines between formation flying satellites have drawn much attention, due to their higher precision than the kinematic orbits.

To exploit the more precise space baselines, two approaches have been proposed in the context of the celestial mechanics approach to gravity field modeling [12]: the 'observation equation approach' and the 'GRACE-type approach' [13]. In the former, individual observation equations for both satellites are first established based on their positions; then the two individual observation equations are differenced to form the baseline observation equations; finally, the normal equations (NEQs) are established from the baseline observation equations. Since only the baselines are used, orbit parameters for one of the two satellites have to be fixed to the a priori values in this approach. Unfortunately, this operation seriously degrades the gravity field solutions at very low degrees [13,14]. In the GRACE-type approach, individual observation and normal equations for both satellites are established based on their kinematic orbits in a first step; in a second step, the observation and normal equations for baseline vectors or baseline lengths are established based on kinematic baselines; in a third step, the NEQs from the first and second steps are combined to form the final NEQs. Unlike the observation equation approach, orbit parameters for both satellites are estimated in the GRACE-type approach. Recent studies based on this approach all use baseline lengths to perform gravity field recovery, and the improvements are shown to be insignificant [13] or negligible [15] when compared to the gravity field solutions based solely on kinematic orbits.

To make full use of the precise kinematic baseline vectors, we propose a new approach in this study, where kinematic orbits of the reference satellite and kinematic baseline vectors between the reference satellite and its accompanying satellite are taken as observations during gravity field recovery. Hereafter, we denote the new approach as the 'baseline approach'. Accordingly, the approach based solely on kinematic orbits is denoted as the 'orbit approach'. To demonstrate the added value of the new approach, data collected by the GRACE mission in 2010 are used. We first compute the kinematic orbits and baselines of the twin GRACE satellites. Then two sets of monthly gravity field solutions are produced: One is based on kinematic orbits of the twin satellites with the orbit approach (hereafter denoted as the 'orbit solution' where applicable), the other is based on kinematic orbits of GRACE A and kinematic baseline vectors between GRACE A and B with the baseline approach (hereafter denoted as the 'baseline solution' where applicable). To evaluate the quality of the GPS-based solutions, we use as the reference the WHU RL01 monthly gravity field solutions produced at the GNSS Research Center of Wuhan University [16,17]. Those solutions are mainly based on the GRACE KBR measurements and therefore are much more accurate than any GPS-based solutions.

This paper is organized as follows: Section 2 presents in detail the data and methods adopted for gravity field modeling. Results are shown in Section 3, and discussed in Section 4. Finally, Section 5 is left for conclusions.

#### **2. Data and Methods**

#### *2.1. Data and Models*

Data processing in this study is performed with the Position And Navigation Data Analyst (PANDA) software, which is developed at the GNSS Research Center of Wuhan University and has been widely used in precise orbit determination for both GNSS satellites and low Earth orbiters [18,19]. Recently, the dynamic approach to gravity field modeling has been implemented in PANDA and successfully applied to produce GRACE monthly gravity field solutions [16,17,20,21]. In this approach, data processing consists of two steps when the gravity field is estimated from kinematic orbits. In the first step, a priori dynamic orbits are computed by fitting to the kinematic orbits through numerically integrating the equation of motion defined by the a priori force models (cf. Table 1). During this process, orbits are expressed as truncated Taylor series with respect to the unknown parameters (cf. Table 1), about the computed a priori orbits. The partials with respect to those parameters are obtained by solving the so-called variational equations. Only non-gravity parameters are adjusted at this step. In the second step, the gravity parameters are included. Based on the partial derivatives with respect to all unknown parameters, daily NEQs are set up for the purpose of the subsequent least-squares adjustment. Arc-specific parameters are then pre-eliminated, and the daily NEQs are accumulated into monthly NEQs. The monthly NEQ matrix is eventually inverted in order to obtain the corrections of the spherical harmonic coefficients (SHCs) with respect to the a priori gravity values.
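The pre-elimination and accumulation of daily NEQs can be sketched with a small linear-algebra example. This is an illustrative sketch, not PANDA code: the function `pre_eliminate`, the parameter counts and the synthetic design matrices are assumptions. Arc-specific parameters are removed from each daily system via the Schur complement, the reduced systems are summed, and the result is compared with a solution in which all parameters are estimated jointly.

```python
import numpy as np

def pre_eliminate(N, b, n_arc):
    """Eliminate the first n_arc (arc-specific) parameters from N x = b.

    Returns the reduced normal matrix and right-hand side for the remaining
    (global) parameters, i.e. the Schur complement of the arc block.
    """
    N_aa, N_ag, N_gg = N[:n_arc, :n_arc], N[:n_arc, n_arc:], N[n_arc:, n_arc:]
    b_a, b_g = b[:n_arc], b[n_arc:]
    K = np.linalg.solve(N_aa, np.column_stack([N_ag, b_a]))  # N_aa^-1 [N_ag | b_a]
    return N_gg - N_ag.T @ K[:, :-1], b_g - N_ag.T @ K[:, -1]

rng = np.random.default_rng(1)
n_arc, n_glob, n_days, n_obs = 2, 3, 3, 30
x_glob = np.array([0.3, -1.2, 2.0])         # stands in for the gravity parameters

N_month = np.zeros((n_glob, n_glob))
b_month = np.zeros(n_glob)
A_full = np.zeros((n_days * n_obs, n_days * n_arc + n_glob))
y_full = np.zeros(n_days * n_obs)

for day in range(n_days):
    A = rng.normal(size=(n_obs, n_arc + n_glob))
    x_arc = rng.normal(size=n_arc)          # differs from day to day
    y = A @ np.concatenate([x_arc, x_glob]) + 0.01 * rng.normal(size=n_obs)

    # Daily NEQ, pre-elimination of arc parameters, accumulation.
    N_red, b_red = pre_eliminate(A.T @ A, A.T @ y, n_arc)
    N_month += N_red
    b_month += b_red

    # Same observations placed in one joint system, for the cross-check only.
    rows = slice(day * n_obs, (day + 1) * n_obs)
    A_full[rows, day * n_arc:(day + 1) * n_arc] = A[:, :n_arc]
    A_full[rows, n_days * n_arc:] = A[:, n_arc:]
    y_full[rows] = y

x_hat = np.linalg.solve(N_month, b_month)   # from accumulated reduced NEQs
x_joint, *_ = np.linalg.lstsq(A_full, y_full, rcond=None)
```

The global parameters obtained from the accumulated reduced system agree with the last `n_glob` entries of the joint solution, while the matrix that has to be inverted stays at the size of the global parameter set.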

Details about the data and models used in this study are also described in Table 1. The kinematic orbits of the twin GRACE satellites used in this study are produced in house based on a zero-difference data processing scheme. To achieve the best accuracy, both the single-difference ambiguities between GPS satellites at each GRACE satellite and double-difference ambiguities between the GRACE satellites and GPS satellites are resolved. Accuracy of the kinematic orbits obtained in this way is about 1.5 cm as inferred from satellite laser ranging residuals. More details can be found in Reference [22]. The kinematic baselines are produced with a single-difference data processing scheme as described in Reference [23]. In this scheme, the positions of the reference satellite (GRACE A in this study) are not estimable and have to be fixed to the a priori values. This requires the accuracy of the a priori orbits to be better than 10 cm for millimeter precision baseline determination [24]. For that purpose, we exploit the abovementioned ambiguity-fixed kinematic orbits, which are fully qualified with an accuracy better than 2 cm. Accuracy of the kinematic baselines obtained in this way is demonstrated to be slightly better than 4 mm, as inferred from KBR residuals, which is consistent with other studies [13,25]. During the solution process, the accelerometer observations are used to model the non-gravitational forces. For each component in the accelerometer frame, the scale factors are estimated on a monthly basis, and the biases are modeled with a third-order polynomial whose coefficients are estimated on a daily basis to account for thermal variations. Finally, the arc length is set to 24 h in our data processing scheme.


**Table 1.** Data and models used for gravity field modeling.

#### *2.2. The Orbit and Baseline Approach*

In the orbit approach, only kinematic orbits of the twin GRACE satellites are taken as observations. The linearized observation equations are expressed as follows:

$$\begin{array}{l} \mathbf{y}_A = \mathbf{y}_A^0 + \mathbf{d}_A^{\mathrm{ng}} \mathbf{x}_A^{\mathrm{ng}} + \mathbf{d}_A^{\mathrm{g}} \mathbf{x}^{\mathrm{g}} + \boldsymbol{\varepsilon}_{y_A} \\ \mathbf{y}_B = \mathbf{y}_B^0 + \mathbf{d}_B^{\mathrm{ng}} \mathbf{x}_B^{\mathrm{ng}} + \mathbf{d}_B^{\mathrm{g}} \mathbf{x}^{\mathrm{g}} + \boldsymbol{\varepsilon}_{y_B} \end{array} \tag{1}$$

where **y** is the satellite position vector, which is extracted from the kinematic orbits, and **y**<sup>0</sup> is its computed counterpart; **x**<sup>ng</sup> denotes the non-gravity parameter vector, which consists of satellite initial state vector, accelerometer scale and bias parameters (cf. Table 1); **x**<sup>g</sup> denotes the gravity parameters in the form of SHCs up to degree and order 60; **d**<sup>ng</sup> and **d**<sup>g</sup> denote the design matrices with respect to the non-gravity and gravity parameters, respectively; finally, ε denotes the noise, which includes noise in both the position and its computed counterpart. Data processing consists of three steps. First, individual observation and normal equations for the two satellites are established based on kinematic orbits. Second, the two individual NEQs are combined, and arc-specific parameters are pre-eliminated from the daily NEQs. Third, the daily NEQs are accumulated into monthly NEQs, which are eventually solved to obtain monthly gravity field solutions.

In the baseline approach, the observations consist of kinematic orbits of GRACE A and kinematic baseline vectors between GRACE A and B. The linearized observation equations are expressed as follows:

$$\begin{array}{l} \mathbf{y}_A = \mathbf{y}_A^0 + \mathbf{d}_A^{\mathrm{ng}} \mathbf{x}_A^{\mathrm{ng}} + \mathbf{d}_A^{\mathrm{g}} \mathbf{x}^{\mathrm{g}} + \boldsymbol{\varepsilon}_{y_A} \\ \mathbf{y}_{AB} = \mathbf{y}_{AB}^0 + \mathbf{d}_B^{\mathrm{ng}} \mathbf{x}_B^{\mathrm{ng}} - \mathbf{d}_A^{\mathrm{ng}} \mathbf{x}_A^{\mathrm{ng}} + \left(\mathbf{d}_B^{\mathrm{g}} - \mathbf{d}_A^{\mathrm{g}}\right) \mathbf{x}^{\mathrm{g}} + \boldsymbol{\varepsilon}_{y_{AB}} \end{array} \tag{2}$$

where **y**<sub>AB</sub> and **y**<sup>0</sup><sub>AB</sub> are the baseline vector and its computed counterpart, respectively; ε<sub>y<sub>AB</sub></sub> denotes the noise, which consists of noise in both the baseline and its computed counterpart; the other symbols are the same as those in Equation (1). Data processing includes the following steps: First, kinematic orbits of GRACE B are derived from the kinematic orbits of GRACE A and the kinematic baseline vectors between GRACE A and B, i.e., **y**<sub>B</sub> = **y**<sub>A</sub> + **y**<sub>AB</sub>. Second, individual observation equations for both satellites are established based on the kinematic orbits. Third, the baseline observation equations are formed by differencing the two orbit observation equations, i.e., **y**<sub>AB</sub> = **y**<sub>B</sub> − **y**<sub>A</sub>. Fourth, individual NEQs are established from observation equations of kinematic orbits of GRACE A and observation equations of kinematic baseline vectors between GRACE A and B, respectively. Fifth, the individual NEQs are combined and arc-specific parameters are pre-eliminated from the daily NEQs. Finally, the daily NEQs are accumulated into monthly NEQs, which are eventually inverted to obtain monthly gravity field solutions.

As described above, six observations per epoch are used in both the orbit and baseline approaches. In the orbit approach, the observations consist of the position vectors of GRACE A and B; in the baseline approach, they consist of the position vector of GRACE A and the baseline vector between GRACE A and B. It should be noted that the same kinematic orbits of GRACE A are used, and the same parameters are set up, in both approaches (cf. Table 1). In the following calculations, we further replace the kinematic orbits of GRACE B in the orbit approach with those constructed from the kinematic orbits of GRACE A and the kinematic baseline vectors between GRACE A and B. Our test experiments show that the impact of this replacement on the gravity field solutions is minor. One might therefore consider the baseline approach to be equivalent to the orbit approach, since the baseline vectors are just linear combinations of the two kinematic orbits used in the orbit approach, and such a combination does not change the final solutions in the framework of weighted least-squares adjustment if the stochastic models for the different observations are properly accounted for [32]. However, this is not the case when the observations are contaminated by systematic errors, which have long been a problem for time-varying gravity field recovery from spaceborne GPS data. In that case, the baseline approach has at least two advantages over the orbit approach. First, the baseline vectors are free of common-mode errors from the GPS satellites, owing to the single-difference data processing scheme adopted when deriving them from the GPS measurements. Thus, the baselines are more precise than kinematic orbits calculated with a zero-difference scheme. Second, according to Hill's equations [33], errors in the initial state vectors and background force models lead to constant or slowly varying perturbations in the computed a priori orbits, which are used to establish the observation equations of the kinematic orbits. While these systematic errors enter the **O**−**C** (observed minus computed) vectors and contaminate the final gravity field solutions in the orbit approach, they can be mitigated when forming the baseline observation equations by differencing the two orbit observation equations. This is particularly true for the GRACE mission, because the errors of the two satellites are largely common: the two spacecraft are identical and not far separated (about 220 km) in a coplanar orbit. Therefore, one may expect the baseline approach to deliver better gravity field solutions than the orbit approach, which will also be demonstrated later in this article.
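The second argument, the cancellation of largely common errors by differencing, can be made concrete with a toy simulation. All numbers below (error amplitudes, period, noise level) are hypothetical and chosen only to show the mechanism; they are not taken from GRACE data.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 5000
t = np.arange(n)

# Hypothetical slowly varying systematic error shared by both satellites [m].
common = 0.03 * np.sin(2.0 * np.pi * t / 540.0) + 0.02

# Per-satellite errors: common part plus independent white noise [m].
err_A = common + 0.01 * rng.standard_normal(n)
err_B = common + 0.01 * rng.standard_normal(n)

# Differencing the two orbit equations removes the common part.
err_AB = err_B - err_A

def rms(v):
    return float(np.sqrt(np.mean(v ** 2)))

rms_orbit = rms(err_B)
rms_baseline = rms(err_AB)
```

In this toy setting the differenced error contains only the independent noise, amplified by a factor of about √2. Differencing therefore pays off only when the common part dominates, which is the situation argued for above.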

#### *2.3. Data Weighting Scheme*

Generally speaking, errors in kinematic orbits are often temporally correlated, although ambiguity-fixing can largely reduce the correlations [34]. On the other hand, deficiencies in the background force models also lead to correlated errors in the computed dynamic orbits [35]. As a result, noise in the residuals is usually frequency-dependent (i.e., colored). In this study, we adopt the frequency-dependent data weighting (FDDW) concept proposed by Reference [36] to account for the colored noise during the solution process. Recently, the FDDW concept has successfully been applied to GRACE KBR data processing with the classical dynamic approach and has notably reduced noise in the WHU RL01 monthly gravity solutions [17]. To represent the dependence of noise on frequency, we consider noise power spectral density (PSD), which is estimated from postfit observation residuals. For that purpose, we start from a simple assumption of white noise in the observations during the first computation. Then the noise PSDs are estimated from the postfit residuals and are used for applying the FDDW scheme in the second computation. No further iteration is needed, due to the accurate background models. The reader is referred to Reference [36] for more details.
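The two-pass idea (white-noise assumption first, then a noise PSD estimated from the postfit residuals) can be sketched in a strongly simplified form. The sketch below uses a smoothed periodogram and frequency-domain whitening of a single series; this is an assumption made for illustration and not necessarily how Reference [36] implements FDDW. The function names, the smoothing width and the synthetic AR(1) residuals are hypothetical.

```python
import numpy as np

def smoothed_psd(residuals, half_width=8):
    """Estimate a noise PSD from residuals with a smoothed periodogram."""
    r = residuals - residuals.mean()
    spec = np.abs(np.fft.rfft(r)) ** 2 / r.size
    kernel = np.ones(2 * half_width + 1)
    num = np.convolve(spec, kernel, mode="same")
    den = np.convolve(np.ones_like(spec), kernel, mode="same")
    return num / den                      # edge-corrected moving average

def whiten(series, psd):
    """Down-weight each frequency by the inverse square-root PSD."""
    spec = np.fft.rfft(series - series.mean())
    floor = 1e-12 * psd.max()             # guard against division by zero
    return np.fft.irfft(spec / np.sqrt(np.maximum(psd, floor)), n=series.size)

def lag1_corr(x):
    x = x - x.mean()
    return float(np.dot(x[:-1], x[1:]) / np.dot(x, x))

# Synthetic "postfit residuals" with colored (AR(1)) noise.
rng = np.random.default_rng(1)
n = 4096
white = rng.standard_normal(n)
colored = np.empty(n)
colored[0] = white[0]
for i in range(1, n):
    colored[i] = 0.9 * colored[i - 1] + white[i]

psd = smoothed_psd(colored)               # estimated after the first pass
whitened = whiten(colored, psd)           # used in the second pass

r_colored = lag1_corr(colored)
r_whitened = lag1_corr(whitened)
```

In a least-squares adjustment, the same operation would be applied to the observation vector and to every column of the design matrix, so that the transformed residuals are approximately white.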

#### **3. Results**

#### *3.1. Analysis in the Spectral Domain*

Figure 1 shows the square-root PSD, hereafter denoted as PSD<sup>1/2</sup>, estimated from the postfit residuals of kinematic orbits of GRACE A and kinematic baselines between GRACE A and B for a typical month of January 2010. Since the PSD<sup>1/2</sup>s for kinematic orbits of both satellites and in both approaches show similar patterns, only those for GRACE A are displayed here. It can be observed that the PSD<sup>1/2</sup>s for both kinematic orbits and baselines exhibit clear frequency-dependent behaviors, which necessitates the FDDW scheme adopted in this study. A comparison between the two sets of PSD<sup>1/2</sup>s reveals that PSD<sup>1/2</sup>s for the kinematic baselines are generally lower than those for the kinematic orbits. On average, PSD<sup>1/2</sup>s for the kinematic baselines are reduced by 50, 47 and 44% at the along-track, cross-track and radial components, respectively, when compared to those for the kinematic orbits. Obviously, the kinematic baselines are of higher precision than the kinematic orbits.

As mentioned above, the KBR-based WHU RL01 monthly gravity field solutions are chosen as the ground truth. In fact, we also tried the official GRACE monthly solutions for that purpose. The results revealed that the differences were negligible. Therefore, to assess the computed GPS-based gravity field solutions, we subtract from them the WHU RL01 solutions and the obtained residual SHCs are interpreted as the 'true' errors of the GPS-based solutions. To facilitate analysis in the spectral domain, we calculate the degree errors (DEG) and cumulative errors (CUM) of different solutions in terms of geoid heights with the following formulas:

$$\begin{aligned} DEG_l &= R\sqrt{\sum_{m=0}^{l} \left(\Delta C_{lm}^2 + \Delta S_{lm}^2\right)} \\ CUM_l &= \sqrt{\sum_{i=2}^{l} DEG_i^2} \end{aligned} \tag{3}$$

where R is the reference radius (6,378,137.46 m), and Δ*C<sub>lm</sub>*, Δ*S<sub>lm</sub>* are the residual SHCs with respect to WHU RL01. Figure 2 displays the degree errors of different GPS-based solutions and the degree signals represented by WHU RL01 in January 2010. It can be seen that the degree errors of the baseline solutions are smaller than those of the orbit solutions, except for relatively low degree terms. To make the discussion more comprehensive, we further calculate the RMS (root mean square) for each SHC time series, as well as for the associated formal errors in 2010. The results are displayed in Figure 3 and are in general similar to those displayed in Figure 2. One can see that degree errors beyond degree 10 are significantly reduced when the baseline approach is applied. One can also observe that the true errors clearly depart from the formal errors below degree 15 in both cases. This is likely attributable to the presence of systematic errors in the observations, e.g., mismodeling of the GPS antenna phase center and high-order ionosphere-induced errors, as mentioned before. Table 2 further lists the cumulative geoid height errors up to degrees 10, 20 and 60. It reveals that the cumulative errors up to degrees 20 and 60 are reduced by 15 and 34%, respectively, when the baseline approach is applied. From these results, we can conclude that the proposed baseline approach can improve the gravity field beyond degree 10.
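For illustration, the degree errors and cumulative errors of Equation (3) can be evaluated as follows. This is a minimal sketch: the array layout (`[l, m]` indexing with zeros for *m* > *l*) and the function names are assumptions, and the residual coefficients are made-up numbers. Since *DEG<sub>l</sub>* already carries the factor R, the cumulative error is accumulated from *DEG<sub>l</sub>* directly.

```python
import numpy as np

R = 6378137.46  # reference radius in metres, as quoted in the text

def degree_errors(dC, dS, radius=R):
    """Geoid height error per degree.

    dC, dS: residual SHCs as (lmax+1, lmax+1) arrays indexed [l, m],
            with zeros where m > l.
    """
    return radius * np.sqrt(np.sum(dC ** 2 + dS ** 2, axis=1))

def cumulative_errors(deg, lmin=2):
    """Cumulative geoid height error from degree lmin up to each degree."""
    cum = np.zeros_like(deg)
    cum[lmin:] = np.sqrt(np.cumsum(deg[lmin:] ** 2))
    return cum

# Made-up residual coefficients up to degree 3.
dC = np.zeros((4, 4))
dS = np.zeros((4, 4))
dC[2, 0] = 3.0e-10
dS[2, 1] = 4.0e-10
dC[3, 3] = 12.0e-10

deg = degree_errors(dC, dS)     # metres
cum = cumulative_errors(deg)    # metres
```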

**Figure 1.** PSD<sup>1/2</sup>s estimated from postfit residuals of kinematic orbits of GRACE A and kinematic baselines between GRACE A and B.

**Figure 2.** Geoid height errors per degree for different gravity field solutions in January 2010.

**Figure 3.** Geoid height errors per degree for different gravity field solutions (RMS values in 2010).

**Table 2.** Cumulative geoid height errors (cm) up to degree 10, 20 and 60 of different solutions (RMS values in 2010).


#### *3.2. Analysis in the Spatial Domain*

In this section, we compare the ability of the orbit and baseline approaches to recover temporal gravity field variations and mass anomalies in the spatial domain. Since the unfiltered solutions are dominated by high-frequency noise, post-processing is required. In this study, we use the Gaussian filter [37] for that purpose. To account for the impact of the filter width on the results, we consider two radii, 750 and 1000 km. Hereafter, we denote the corresponding solutions as the 'G750' and 'G1000' filtered solutions, respectively. As was done in the spectral domain, we also compare the GPS-based solutions with the KBR-based WHU RL01 solutions. For consistency with the GPS-based solutions, the latter are post-processed with the same Gaussian filters. In addition, the *C*<sub>20</sub> terms, to which GRACE is less sensitive owing to its polar orbit [38], are replaced in all gravity field solutions with the SLR-derived values [39]. As regards the degree 1 terms, which cannot be derived from GRACE data alone, we use the estimates obtained by combining GRACE data and geophysical models as described in References [40,41].
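As an illustration of the Gaussian filter, the degree-dependent weights can be generated with the commonly used recursion for Gaussian averaging kernels. The sketch assumes that this is the form meant by Reference [37]. The weights are scaled so that *W*<sub>0</sub> = 1, the mean Earth radius of 6371 km is an assumed value, and the function name and cut-off tolerance are hypothetical. The forward recursion loses accuracy once the weights become very small, so the sketch sets the remaining weights to zero below a tolerance.

```python
import numpy as np

def gaussian_weights(radius_km, lmax, a_km=6371.0, tol=1e-10):
    """Degree weights W_l of a Gaussian filter with the given half-width."""
    b = np.log(2.0) / (1.0 - np.cos(radius_km / a_km))
    w = np.zeros(lmax + 1)
    w[0] = 1.0
    if lmax >= 1:
        w[1] = (1.0 + np.exp(-2.0 * b)) / (1.0 - np.exp(-2.0 * b)) - 1.0 / b
    for l in range(1, lmax):
        nxt = -(2 * l + 1) / b * w[l] + w[l - 1]
        if nxt < tol:        # recursion no longer reliable: leave zeros
            break
        w[l + 1] = nxt
    return w

w750 = gaussian_weights(750.0, 40)
w1000 = gaussian_weights(1000.0, 40)

# Filtering multiplies every coefficient of degree l by W_l, e.g.
# clm_filtered = w750[:, None] * clm   for arrays indexed [l, m].
```

A larger radius damps a given degree more strongly, which is why the G1000 solutions are smoother than the G750 solutions.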

To perform the analysis in the spatial domain, we first transform the monthly SHCs to mass anomalies in terms of equivalent water heights (EWHs) on a 1° × 1° grid, as explained in Reference [37]. The correction for the Earth's oblateness has been applied as proposed by Ditmar [42]. As mass variations (if any) are primarily linear and/or seasonal, a deterministic model composed of an offset, a trend, and annual and semi-annual terms is fitted to the time series of mass anomalies per grid node. Then, a comparison with the WHU RL01 solutions allows us to assess the signals and noise of the mass anomalies inferred from the GPS-based solutions.
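The deterministic model (offset, trend, annual and semi-annual terms) can be fitted per grid node by ordinary least squares, as sketched below. The function name, the phase convention *A* cos(2π*t* − φ) and the synthetic two-year series are assumptions for illustration only.

```python
import numpy as np

def fit_trend_seasonal(t_years, values):
    """Least-squares fit of offset, trend, annual and semi-annual terms.

    Returns the six coefficients plus the annual amplitude and phase,
    with the annual signal written as A * cos(2*pi*t - phase).
    """
    w = 2.0 * np.pi
    G = np.column_stack([
        np.ones_like(t_years), t_years,
        np.cos(w * t_years), np.sin(w * t_years),
        np.cos(2.0 * w * t_years), np.sin(2.0 * w * t_years),
    ])
    coef, *_ = np.linalg.lstsq(G, values, rcond=None)
    amplitude = float(np.hypot(coef[2], coef[3]))
    phase = float(np.arctan2(coef[3], coef[2]))   # radians
    return coef, amplitude, phase

# Synthetic monthly series (mid-month epochs, in years); values in cm EWH.
t = (np.arange(24) + 0.5) / 12.0
ewh = (1.0 + 0.5 * t
       + 3.0 * np.cos(2.0 * np.pi * t - 1.0)
       + 0.4 * np.sin(4.0 * np.pi * t))

coef, amp, phase = fit_trend_seasonal(t, ewh)
```

A phase in days, as reported later in Table 5, would follow from `phase / (2 * np.pi) * 365.25` under this convention.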

#### 3.2.1. Gridded Mass Anomalies

Figure 4 displays geographical maps of the derived periodic annual signals in mass anomalies inferred from the G750 filtered solutions. It can be seen that after applying the G750 filter, the estimates are still dominated by strong high-frequency noise. In the case of the G1000 filtered solutions (Figure 5), high-frequency noise is largely suppressed, and the signal patterns inferred from the GPS-based solutions become similar to WHU RL01. Among others, annual signals over the Amazon river basin in South America can be clearly seen in the GPS-based solutions.

**Figure 4.** Periodic annual signals in mass anomalies in terms of equivalent water heights inferred from different G750 filtered solutions. From top to bottom, the plots display the results based on the WHU-RL01, orbit and baseline solutions, respectively.

**Figure 5.** Same as in Figure 4, but for the G1000 filtered solutions.

We have also computed the RMS differences of gridded mass anomalies between the GPS-based solutions and WHU RL01. The geographical distribution of the RMS differences is displayed in Figure 6. It can be observed that the RMS differences for the baseline solutions are generally smaller than those for the orbit solutions when the G750 filter is applied. A further calculation reveals that the weighted mean of the gridded RMS differences (weighted by the cosine of latitude) for the baseline solutions is reduced by 14 and 6% for the G750 and G1000 filtered solutions, respectively, when compared to that for the orbit solutions (cf. Table 3). These results demonstrate that the mass anomalies inferred from the baseline solutions are more consistent with WHU RL01 and are, therefore, more accurate than those from the orbit solutions.
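The cosine-of-latitude weighting compensates for the shrinking area of grid cells towards the poles. A minimal sketch on a 1° × 1° grid is given below; the function name and the made-up RMS field (2 cm equatorward of 60°, 10 cm poleward) are assumptions for illustration.

```python
import numpy as np

def cos_lat_weighted_mean(grid, lat_deg):
    """Mean of a (nlat, nlon) grid weighted by the cosine of latitude."""
    w = np.cos(np.radians(lat_deg))[:, None] * np.ones_like(grid)
    return float(np.sum(w * grid) / np.sum(w))

lat = np.arange(-89.5, 90.0, 1.0)      # cell-centre latitudes
lon = np.arange(0.5, 360.0, 1.0)       # cell-centre longitudes

# Made-up RMS differences in cm: larger values at high latitudes.
rms_grid = np.full((lat.size, lon.size), 2.0)
rms_grid[np.abs(lat) > 60.0, :] = 10.0

weighted = cos_lat_weighted_mean(rms_grid, lat)
unweighted = float(rms_grid.mean())
```

Here the unweighted mean overstates the contribution of the polar cells, which cover one third of the grid rows but only about 13% of the sphere.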



**Figure 6.** Geographical distribution of the gridded RMS differences with respect to WHU RL01 for different GPS-based solutions. The top and bottom rows show the results for the orbit and baseline solutions, respectively.

#### 3.2.2. Regional Mass Anomalies

To compare the performance of the orbit and baseline approaches in the context of mean regional signals, we select the Amazon river basin as the target region. As shown in Figure 5, the Amazon river basin exhibits significant annual variations, which are mainly of hydrological origin [43,44].

Figure 7 displays the time series of mean mass anomalies over the Amazon river basin inferred from different solutions in 2010. To evaluate the consistency between the GPS-based solutions and WHU RL01, we calculate the correlation coefficients and RMS differences between the mass anomaly time-series. The results confirm that the GPS-based solutions obtained with the baseline approach are more consistent with WHU RL01, as evidenced by larger correlation coefficients and smaller RMS differences (cf. Table 4).

**Figure 7.** Time series of mean mass anomalies over the Amazon river basin inferred from the G750 (left) and G1000 (right) filtered solutions in 2010.


**Table 4.** Correlation coefficients and RMS differences (cm) between mass anomaly time series over the Amazon river basin inferred from the GPS-based solutions and WHU RL01.

Table 5 further lists the annual amplitudes and phases and the associated formal errors over the Amazon river basin inferred from different solutions. One can see that the GPS-based solutions (after applying a Gaussian filter) can reproduce the signals as WHU RL01 does, particularly when the baseline approach is applied. The signals inferred from the baseline solutions are closer to those inferred from the WHU RL01 solutions when compared to the orbit solutions.

**Table 5.** Annual amplitudes (cm) and phases (day) and the associated formal errors of mass anomalies over the Amazon river basin derived from different solutions.


All these results lead us to conclude that the proposed baseline approach is able to provide better regional mass anomaly estimates, when compared to the orbit approach.

#### **4. Discussion**

As shown in the text, with the proposed baseline approach we have achieved more promising results than previous studies which are based on the GRACE-type approach [13,15], when exploiting GPS-derived kinematic baselines for Earth's gravity field recovery. While improvement of the gravity field solutions was insignificant or even negligible in previous investigations, it is shown to be significant in our case with cumulative errors up to degree 60 being reduced by 34%. We attribute this improvement mainly to the fact that we have made full use of the precise three-dimensional baseline vectors in our approach, whereas only the baseline lengths were used in previous studies.

Our results also show that the gravity field solutions still cannot benefit from the proposed baseline approach at the very low degree terms (below degree 10) at present. Because these low degree terms are very prone to systematic errors in the observations, further efforts are still needed to identify and reduce the possible systematic errors.

Another issue is that we have made use of the high precision accelerometer observations to account for the non-gravitational forces acting on the GRACE satellites, which may not be available for other formation flying satellites. Indeed, high precision accelerometer observations are indispensable to achieve the best possible gravity field solutions from hl-SST data. In fact, it is hardly possible to obtain realistic gravity field solutions if the non-gravitational forces cannot be adequately modeled either with accelerometer observations or with physical models. Thus, this is the prerequisite for both the orbit and baseline approach. Because the kinematic baselines are more precise than kinematic orbits (Figure 1), it is reasonable that the baseline approach can deliver better gravity field solutions, at least for formation flying satellites equipped with high precision accelerometers.

#### **5. Conclusions**

In this study, we have proposed a new approach to Earth's gravity field modeling based on GPS-derived kinematic orbits and baselines from formation flying satellites. In this approach, kinematic orbits of the reference satellite and kinematic baselines between the reference satellite and its accompanying satellite are taken as observations to perform gravity field recovery (denoted as the 'baseline approach' in the text). Compared to the orbit approach, which is based solely on kinematic orbits, systematic errors in observations and background force models can be suppressed in the proposed approach. To demonstrate the added value of the proposed approach, two sets of monthly gravity field solutions have been produced based on one year of kinematic orbits and baseline vectors of the GRACE satellites using both the baseline and orbit approaches. A comparison in the spectral domain shows that the gravity field solutions beyond degree 10 can be notably improved when the baseline approach is applied, with cumulative errors up to degree 20 and 60 being reduced by 15% and 34%, respectively, as compared to the orbit approach. Analysis in the spatial domain shows that the GPS-based gravity field solutions obtained with the baseline approach are more consistent with the KBR-based solutions. This is evidenced by smaller RMS differences and larger correlation coefficients with respect to the KBR-based solutions. These results, therefore, demonstrate that the proposed baseline approach can provide better time-varying gravity field solutions than the orbit approach. Our findings may facilitate, among others, bridging the gap between GRACE and GFO missions by using hl-SST GPS data from formation flying satellites.

**Author Contributions:** X.G. conceived and performed the experiments, analyzed the data and wrote the paper. Q.Z. supported the research and helped with the text.

**Funding:** This research was funded by the National Natural Science Foundation of China, grant numbers 41574030 and 41374083; the APC was funded by grant 41574030.

**Acknowledgments:** The GRACE level-1B data are publicly available from the ftp site: ftp://isdcftp.gfz-potsdam.de/grace/Level-1B. The numerical calculations in this research have been done on the supercomputing system in the Supercomputing Center of Wuhan University. Finally, we thank the three anonymous reviewers for their helpful comments, which helped us improve the manuscript.

**Conflicts of Interest:** The authors declare no conflict of interest. The funders had no role in the design of the study; in the collection, analyses, or interpretation of data; in the writing of the manuscript, or in the decision to publish the results.

#### **References**


© 2019 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).

## *Article* **Improved Estimates of Geocenter Variability from Time-Variable Gravity and Ocean Model Outputs**

#### **Tyler C. Sutterley 1,2,***∗***,† and Isabella Velicogna 2,3**


Received: 7 August 2019; Accepted: 4 September 2019; Published: 10 September 2019

**Abstract:** Geocenter variations describe the motion of the Earth's center of mass with respect to its center of figure, and represent global-scale redistributions of the Earth's mass. We investigate different techniques for estimating geocenter motion from combinations of time-variable gravity measurements from the Gravity Recovery and Climate Experiment (GRACE) and GRACE Follow-On missions, and bottom pressure outputs from ocean models. Here, we provide self-consistent estimates of geocenter variability incorporating the effects of self-attraction and loading, and investigate the effect of uncertainties in atmospheric and oceanic variation. The effects of self-attraction and loading from changes in land water storage and ice mass change affect both the seasonality and long-term trend in geocenter position. Omitting the redistribution of sea level affects the average annual amplitudes of the *x*, *y*, and *z* components by 0.2, 0.1, and 0.3 mm, respectively, and affects geocenter trend estimates by 0.02, 0.04 and 0.05 mm/yr for the *x*, *y*, and *z* components, respectively. Geocenter estimates from the GRACE Follow-On mission are consistent with estimates from the original GRACE mission.

**Keywords:** GRACE; GRACE-FO; time-variable gravity; geocenter; reference frames; self-attraction and loading

#### **1. Introduction**

Variations in the Earth's geocenter reflect the largest scale variability of mass within the Earth system, and are essential inclusions for the complete recovery of surface mass change from time-variable gravity [1,2]. The Earth's geocenter is the difference between the Earth's center of mass and center of figure, which is represented by the degree one spherical harmonic terms [3,4]. Estimates of geocenter position have important applications in the determination of terrestrial reference frame variations, satellite altimeter orbit fluctuations, and mass transport from time-variable gravity [1,5]. Geocenter positions are not stationary. Variations in geocenter have been attributed to changes in terrestrial water storage, glacier and ice sheet mass, atmospheric and oceanic circulation, geodynamic processes and other mass transport processes (Figure 1) [1,6–8].

Measurements of time-variable gravity from the Gravity Recovery and Climate Experiment (GRACE) and the GRACE Follow-On (GRACE-FO) missions are set in a center of mass (CM) reference frame, in which the total degree one variations are inherently zero [9–11]. However, the individual contributions to degree one variations in the CM reference frame, such as from oceanic processes or terrestrial water storage change, are not necessarily zero [4]. Applications set in a center of figure (CF) reference frame, such as the recovery of mass variations of the oceans, hydrosphere and cryosphere, also require the inclusion of degree one terms to be fully accurate [2,4]. The exclusion of degree one terms can have a significant impact on estimates of ocean mass [12], ice sheet mass change [13] and terrestrial hydrology [14] due to far-field signals leaking into each regional estimate.

There are presently two methods that regularly provide estimates of geocenter variability: (1) measurements from satellite laser ranging (SLR) [5,15] and (2) calculations from time-variable gravity and ocean model outputs [2,16]. Global inversions of GRACE data, GPS data, and modeled ocean bottom pressure can also provide long-term or detrended estimates of geocenter variability [17,18]. Trends in SLR-derived solutions can be contaminated by network effects, such as station drift [19,20]. They are not necessarily related to long-term changes in geocenter position. Geocenter estimates from inversions of time-variable gravity and ocean model outputs can provide trend estimates. However, they require information about the oceanic redistribution of mass from changes in terrestrial water storage and ice mass change [2]. Here, we improve geocenter estimates from time-variable gravity and ocean model output combinations by including the effects of self-attraction and loading. We test the overall sensitivity of the estimates to uncertainties in atmospheric pressure, ocean bottom pressure and glacial isostatic adjustment. In the following sections, we discuss (1) the data utilized for this research, (2) how we implement and test our method, (3) the results from our method and (4) the overall implications of the research.

**Figure 1.** Schematic of the major geophysical processes contributing to mass variability sensed by time-variable gravity.

#### **2. Data**

#### *2.1. Time-Variable Gravity*

We use monthly GRACE/GRACE-FO Release-6 (RL06) gravity solutions provided by the University of Texas Center for Space Research (UTCSR), the German Research Centre for Geosciences (GeoForschungsZentrum, GFZ) and the Jet Propulsion Laboratory (JPL) for the period April 2002–May 2019 [9–11,21]. Each Level-2 gravity field solution consists of fully-normalized spherical harmonic coefficients (*C̃<sub>lm</sub>*, *S̃<sub>lm</sub>*) of degree, *l*, and order, *m*. We substitute the *C̃*<sub>20</sub> coefficients derived from GRACE/GRACE-FO data with estimates from satellite laser ranging (SLR) from NASA Goddard Space Flight Center due to anomalous variability in the GRACE-derived coefficients [22]. For periods when the satellite pairs had a single fully-operating accelerometer, we replace the *C̃*<sub>30</sub> coefficients derived from GRACE/GRACE-FO data with SLR-derived estimates [22]. The monthly GRACE/GRACE-FO RL06 products have non-tidal atmospheric mass variability removed using outputs from the European Centre for Medium-Range Weather Forecasts (ECMWF) and ERA-Interim [23]. Non-tidal oceanic mass variability is removed from the Release-6 GRACE/GRACE-FO data using outputs from the Max Planck Institute ocean model (MPIOM) [24]. We correct for the effects of Glacial Isostatic Adjustment (GIA), which is the viscoelastic response to changes in ice mass since the last glacial maximum, using outputs from the A et al. [25] compressible Earth model using ICE-6G ice history. The impact of different ice histories and mantle rheologies is tested using expected GIA rates from Caron et al. [26]. We account for the elastic deformation of the solid Earth induced by variations in surface mass loading using load Love numbers of gravitational potential, *k<sub>l</sub>*, calculated by Wahr et al. [27]. The degree one load Love number of gravitational potential, *k*<sub>1</sub>, is updated to reflect the use of a CF reference frame [28]. We smooth the GRACE/GRACE-FO coefficients using a 300 km radius Gaussian averaging function to reduce the impact of random spherical harmonic errors [4,29]. We filter the coefficients to reduce the impact of correlated north/south "striping" errors [30].

As an independent assessment of geocenter variability, we use monthly estimates derived from satellite laser ranging (SLR) [15]. The traditional SLR-derived geocenter solutions reflect the best fit to the center of the SLR network (CN) [31]. However, SLR-derived solutions can be affected by network-effects when individual stations drift relative to each other due to tectonics, surface loading or other processes [19]. Here, we use both the traditional CN-CM solutions [15] and the new UTCSR CF-CM solutions that try to account for these network-effect range biases [5]. We correct both sets of SLR solutions for the gravitational effects of non-tidal atmospheric and oceanic variability using the 6-h Release-6 de-aliasing product provided by GFZ [23].

#### *2.2. Atmospheric Reanalyses and Ocean Models*

We estimate the effect of uncertainty in atmospheric pressure by comparing the geocenter outputs from the GRACE/GRACE-FO atmospheric de-aliasing products (GAA) with estimates from ERA-Interim [32], MERRA-2 [33], NCEP-DOE-2 [34] and JRA-55 [35] reanalysis outputs. ERA-Interim is computed by ECMWF and is available starting from 1979 [32]. MERRA-2 is computed by the NASA Global Modeling and Assimilation Office (GMAO) and is available starting from 1980 [33]. NCEP-DOE-2 is computed by the National Centers for Environmental Prediction (NCEP) and is available starting from 1979 [34]. JRA-55 is computed by the Japan Meteorological Agency (JMA) and is available starting from 1958 [35]. Atmospheric pressure anomalies for each reanalysis are calculated relative to the mean over 2001–2002.

We estimate the effect of uncertainty in oceanic circulation by substituting the GRACE/GRACE-FO ocean bottom pressure (OBP) product (GAD) derived from MPIOM with estimates from ECCO-JPL near real-time Kalman-filtered (kf080i) simulations [36,37] and ECCO Version 4 Release 3 (V4r3) simulations [38,39]. Each ocean model incorporates different model physics, data assimilation schemes, and atmospheric forcings. MPIOM is a global ocean model forced with atmospheric outputs from ECMWF medium-range forecasts, and is coupled with a prognostic sea ice model [24]. ECCO-JPL is a regional ocean model configured to resolve tropical ocean circulation (79.5°S–78.5°N latitudinal limits) that is forced with atmospheric outputs from NCEP Reanalysis, and the model does not include sea ice [36]. ECCO V4r3 is a global ocean model that is forced with atmospheric outputs from ERA-Interim, and is coupled with a sea ice model [38,40,41]. The OBP data from ECCO V4r3 were converted from anomalies relative to depth, *φ<sub>bot</sub>*, into a time series of absolute OBP in pascals using the model standard density, *ρ*<sub>0</sub>, and the average gravitational acceleration of the Earth, *g*. As Boussinesq-type models conserve volume rather than mass, OBP anomalies for each ECCO model were calculated relative to the global average OBP at each time step [42]. Temporal anomalies in OBP are calculated relative to the mean over 2003–2007, following the JPL Tellus OBP product documentation [36].
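The last two processing steps (removal of the global-average OBP at each time step, then removal of the 2003–2007 temporal mean) can be sketched as follows. The sketch starts from absolute OBP in pascals; the conversion from *φ<sub>bot</sub>* is not reproduced here because its exact expression is not given in the text. The function name, the array layout and the synthetic fields are assumptions for illustration.

```python
import numpy as np

def obp_anomalies(obp, area, ocean_mask, years, ref=(2003, 2007)):
    """OBP anomalies relative to the global mean and a reference period.

    obp:        (ntime, nlat, nlon) absolute bottom pressure [Pa]
    area:       (nlat, nlon) grid-cell areas
    ocean_mask: (nlat, nlon) boolean, True over the ocean
    years:      (ntime,) epochs in decimal years
    """
    w = np.where(ocean_mask, area, 0.0)
    obp_ocean = np.where(ocean_mask, obp, 0.0)   # ignore land values
    # Area-weighted global-average OBP at each time step.
    global_mean = np.sum(obp_ocean * w, axis=(1, 2)) / np.sum(w)
    spatial_anom = obp_ocean - global_mean[:, None, None]
    # Temporal anomaly relative to the mean over the reference period.
    in_ref = (years >= ref[0]) & (years < ref[1] + 1)
    anom = spatial_anom - spatial_anom[in_ref].mean(axis=0)
    return np.where(ocean_mask, anom, np.nan)

# Synthetic example: a static pressure field plus a spatially uniform drift.
nt, nlat, nlon = 8, 4, 6
years = 2002.5 + np.arange(nt)
rng = np.random.default_rng(2)
base = 1.0e7 + 1.0e5 * rng.standard_normal((nlat, nlon))
drift = 50.0 * np.arange(nt)
obp = base[None, :, :] + drift[:, None, None]
area = np.ones((nlat, nlon))
mask = np.ones((nlat, nlon), dtype=bool)
mask[0, :] = False                                # one row of "land"

anom = obp_anomalies(obp, area, mask, years)
```

In this synthetic case the uniform drift is absorbed entirely by the global mean, so the anomalies over the ocean vanish. This is the behavior intended for a volume-conserving model.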

#### **3. Methods**

A change in surface mass density, *σ*(*θ*, *φ*, *t*), of a thin layer at the Earth's surface at time, *t*, colatitude, *θ*, and longitude, *φ*, can be decomposed into a series of fully-normalized spherical harmonic coefficients, *C̃<sub>lm</sub>*(*t*) and *S̃<sub>lm</sub>*(*t*), after allowing for the Earth's elastic response [4].

$$\begin{Bmatrix} \tilde{C}_{lm}(t) \\ \tilde{S}_{lm}(t) \end{Bmatrix} = \frac{3}{4\pi a\rho_e} \frac{1+k_l}{2l+1} \int \sigma(\theta,\phi,t) \, P_{lm}(\cos\theta) \begin{Bmatrix} \cos m\phi \\ \sin m\phi \end{Bmatrix} \, d\Omega \tag{1}$$

where *a* is the average radius of the Earth, *ρ<sub>e</sub>* is the average density of the Earth, *k<sub>l</sub>* is the gravitational load Love number of degree *l*, *P<sub>lm</sub>* is the associated Legendre polynomial of degree *l* and order *m*, and *d*Ω is the element of solid angle, sin *θ dθ dφ*. The change in the surface mass density field, *σ*(*θ*, *φ*, *t*), can be calculated through a summation of the fully-normalized spherical harmonics as shown in Wahr et al. [4].

$$\sigma(\theta,\phi,t) = \frac{a\rho_e}{3} \sum_{l=0}^{\infty} \sum_{m=0}^{l} \frac{2l+1}{1+k_l} P_{lm}(\cos\theta) \left\{ \tilde{C}_{lm}(t) \cos m\phi + \tilde{S}_{lm}(t) \sin m\phi \right\} \tag{2}$$

For this work, we use surface mass spherical harmonic coefficients, denoted here as *C<sub>lm</sub>*(*t*) and *S<sub>lm</sub>*(*t*), which are calculated from the fully-normalized spherical harmonic coefficients, *C̃<sub>lm</sub>*(*t*) and *S̃<sub>lm</sub>*(*t*).

$$\begin{Bmatrix} C_{lm}(t) \\ S_{lm}(t) \end{Bmatrix} = \frac{a\rho_e}{3} \frac{2l+1}{1+k_l} \begin{Bmatrix} \tilde{C}_{lm}(t) \\ \tilde{S}_{lm}(t) \end{Bmatrix} \tag{3}$$
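As a brief illustration of Equation (3), the degree-dependent scaling from fully-normalized coefficients to surface mass coefficients can be written as below. The numerical values of the Earth's average radius and density and the load Love numbers in the example are placeholders (they are not the values of Wahr et al. [27]); the function name and array layout (`[l, m]` indexing) are likewise assumptions.

```python
import numpy as np

A_EARTH = 6.371e6      # average radius of the Earth [m], assumed value
RHO_EARTH = 5517.0     # average density of the Earth [kg/m^3], assumed value

def to_surface_mass(clm_tilde, slm_tilde, k_load):
    """Scale fully-normalized SHCs to surface mass SHCs [kg/m^2].

    clm_tilde, slm_tilde: (lmax+1, lmax+1) arrays indexed [l, m]
    k_load:               (lmax+1,) gravitational load Love numbers
    """
    lmax = clm_tilde.shape[0] - 1
    l = np.arange(lmax + 1)
    factor = A_EARTH * RHO_EARTH / 3.0 * (2 * l + 1) / (1.0 + k_load)
    return factor[:, None] * clm_tilde, factor[:, None] * slm_tilde

# Placeholder input: a single degree-2 coefficient and made-up Love numbers.
clm = np.zeros((3, 3))
slm = np.zeros((3, 3))
clm[2, 0] = 1.0e-10
k_load = np.array([0.0, 0.0, -0.3])

C, S = to_surface_mass(clm, slm, k_load)
```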

Time-variable gravity measurements from GRACE and GRACE-FO are set in a reference frame of instantaneous center of mass (CM) [9–11]. In this reference frame, spherical harmonic coefficients of degree one, *C*10, *C*11, and *S*11, are inherently zero [4]. However, applications set in a center of figure (CF) reference frame require estimates of degree one variability to be accurate [2,4]. Here, we estimate the degree one variations by combining time-variable gravity measurements with ocean model outputs following Swenson et al. [2]. We start by partitioning the surface mass density changes into the individual land and ocean components (Equation (4)). For this, we use an ocean function, *ϑ*(*θ*, *φ*), with coastlines buffered by 300 km in order to limit the leakage of mass from the land into the ocean estimate [4].

$$\begin{aligned} \sigma(\theta,\phi,t) &= \sigma_{\text{land}}(\theta,\phi,t) + \sigma_{\text{ocean}}(\theta,\phi,t) \\ \sigma_{\text{ocean}}(\theta,\phi,t) &= \vartheta(\theta,\phi) \,\sigma(\theta,\phi,t) \end{aligned} \tag{4}$$

The oceanic components of the degree one spherical harmonics (*C*<sub>10</sub><sup>*ocean*</sup>, *C*<sub>11</sub><sup>*ocean*</sup>, and *S*<sub>11</sub><sup>*ocean*</sup>) can be calculated from the changes in ocean mass, *σ<sub>ocean</sub>*(*θ*, *φ*, *t*) [2,4].

$$\begin{aligned} C_{10}^{\text{ocean}}(t) &= \frac{1}{4\pi} \int P_{10}(\cos\theta) \,\sigma_{\text{ocean}}(\theta, \phi, t) \, d\Omega \\ C_{11}^{\text{ocean}}(t) &= \frac{1}{4\pi} \int P_{11}(\cos\theta) \,\sigma_{\text{ocean}}(\theta, \phi, t) \, \cos\phi \, d\Omega \\ S_{11}^{\text{ocean}}(t) &= \frac{1}{4\pi} \int P_{11}(\cos\theta) \,\sigma_{\text{ocean}}(\theta, \phi, t) \, \sin\phi \, d\Omega \end{aligned} \tag{5}$$
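On a regular grid, the integrals of Equation (5) can be approximated by a simple midpoint quadrature. The sketch below assumes the 4π-normalized degree one functions *P*<sub>10</sub>(cos *θ*) = √3 cos *θ* and *P*<sub>11</sub>(cos *θ*) = √3 sin *θ*; the function name, the 1° grid and the synthetic ocean mass field are assumptions for illustration.

```python
import numpy as np

def degree_one_from_grid(sigma_ocean, colat, lon):
    """Degree one coefficients of a gridded ocean mass field (Equation (5)).

    sigma_ocean: (ntheta, nphi) values at cell centres
    colat, lon:  1-D regular cell-centre coordinates in radians
    """
    dth = colat[1] - colat[0]
    dph = lon[1] - lon[0]
    th, ph = np.meshgrid(colat, lon, indexing="ij")
    d_omega = np.sin(th) * dth * dph           # element of solid angle
    p10 = np.sqrt(3.0) * np.cos(th)
    p11 = np.sqrt(3.0) * np.sin(th)
    norm = 4.0 * np.pi
    c10 = np.sum(p10 * sigma_ocean * d_omega) / norm
    c11 = np.sum(p11 * np.cos(ph) * sigma_ocean * d_omega) / norm
    s11 = np.sum(p11 * np.sin(ph) * sigma_ocean * d_omega) / norm
    return c10, c11, s11

colat = np.radians(np.arange(0.5, 180.0, 1.0))
lon = np.radians(np.arange(0.5, 360.0, 1.0))
th, ph = np.meshgrid(colat, lon, indexing="ij")

# Synthetic field built from known degree one coefficients (2.0, 0.0, 0.5).
sigma = (2.0 * np.sqrt(3.0) * np.cos(th)
         + 0.5 * np.sqrt(3.0) * np.sin(th) * np.sin(ph))

c10, c11, s11 = degree_one_from_grid(sigma, colat, lon)
```

With this normalization the quadrature returns the coefficients used to build the field, up to the discretization error of the 1° grid.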

Assuming that the changes in oceanic mass can be determined from the global surface mass density field using an ocean function (Equation (4)), the oceanic contributions to degree one can also be estimated from the global set of spherical harmonics [2,4]. Here, we separate the degree one terms in the spherical harmonic summation from the higher degree terms.

$$\begin{aligned} C_{10}^{\text{ocean}}(t) &= \frac{C_{10}(t)}{4\pi} \int P_{10}(\cos\theta) \,\vartheta(\theta,\phi) \, P_{10}(\cos\theta) \,d\Omega \\ &+ \frac{C_{11}(t)}{4\pi} \int P_{10}(\cos\theta) \,\vartheta(\theta,\phi) \, P_{11}(\cos\theta) \,\cos\phi \,d\Omega \\ &+ \frac{S_{11}(t)}{4\pi} \int P_{10}(\cos\theta) \,\vartheta(\theta,\phi) \, P_{11}(\cos\theta) \,\sin\phi \,d\Omega \\ &+ \frac{1}{4\pi} \int P_{10}(\cos\theta) \,\vartheta(\theta,\phi) \, \sum_{l=2}^{\infty} \sum_{m=0}^{l} P_{lm}(\cos\theta) \,\left\{C_{lm}(t)\cos m\phi + S_{lm}(t)\sin m\phi\right\} \,d\Omega \end{aligned} \tag{6}$$

$$\begin{aligned} C_{11}^{\text{ocean}}(t) &= \frac{C_{10}(t)}{4\pi} \int P_{11}(\cos\theta) \,\cos\phi \,\vartheta(\theta,\phi) \,P_{10}(\cos\theta) \,d\Omega \\ &+ \frac{C_{11}(t)}{4\pi} \int P_{11}(\cos\theta) \,\cos\phi \,\vartheta(\theta,\phi) \,P_{11}(\cos\theta) \,\cos\phi \,d\Omega \\ &+ \frac{S_{11}(t)}{4\pi} \int P_{11}(\cos\theta) \,\cos\phi \,\vartheta(\theta,\phi) \,P_{11}(\cos\theta) \,\sin\phi \,d\Omega \\ &+ \frac{1}{4\pi} \int P_{11}(\cos\theta) \,\cos\phi \,\vartheta(\theta,\phi) \,\sum_{l=2}^{\infty} \sum_{m=0}^{l} P_{lm}(\cos\theta) \,\left\{C_{lm}(t)\cos m\phi + S_{lm}(t)\sin m\phi\right\} \,d\Omega \end{aligned} \tag{7}$$

$$\begin{aligned} S_{11}^{\text{ocean}}(t) &= \frac{C_{10}(t)}{4\pi} \int P_{11}(\cos\theta) \,\sin\phi \,\vartheta(\theta,\phi) \,P_{10}(\cos\theta) \,d\Omega \\ &+ \frac{C_{11}(t)}{4\pi} \int P_{11}(\cos\theta) \,\sin\phi \,\vartheta(\theta,\phi) \,P_{11}(\cos\theta) \,\cos\phi \,d\Omega \\ &+ \frac{S_{11}(t)}{4\pi} \int P_{11}(\cos\theta) \,\sin\phi \,\vartheta(\theta,\phi) \,P_{11}(\cos\theta) \,\sin\phi \,d\Omega \\ &+ \frac{1}{4\pi} \int P_{11}(\cos\theta) \,\sin\phi \,\vartheta(\theta,\phi) \,\sum_{l=2}^{\infty} \sum_{m=0}^{l} P_{lm}(\cos\theta) \,\left\{C_{lm}(t)\cos m\phi + S_{lm}(t)\sin m\phi\right\} \,d\Omega \end{aligned} \tag{8}$$

If the oceanic contributions to degree one variability (*C*<sup>ocean</sup><sub>10</sub>, *C*<sup>ocean</sup><sub>11</sub>, and *S*<sup>ocean</sup><sub>11</sub>) can be estimated from an ocean model, then the unknown complete degree one terms (*C*10, *C*11, and *S*11) in Equations (6)–(8) can be calculated from the residual between the oceanic degree one terms and the measured mass change over the ocean calculated using all other degrees of the global spherical harmonics [2].

$$
\begin{bmatrix}
C\_{10}(t) \\
C\_{11}(t) \\
S\_{11}(t)
\end{bmatrix} = \begin{bmatrix}
I\_{10C}^{10C} & I\_{11C}^{10C} & I\_{11S}^{10C} \\
I\_{10C}^{11C} & I\_{11C}^{11C} & I\_{11S}^{11C} \\
I\_{10C}^{11S} & I\_{11C}^{11S} & I\_{11S}^{11S}
\end{bmatrix}^{-1} \begin{bmatrix}
C\_{10}^{\text{ocean}}(t) - G\_{10C}(t) \\
C\_{11}^{\text{ocean}}(t) - G\_{11C}(t) \\
S\_{11}^{\text{ocean}}(t) - G\_{11S}(t)
\end{bmatrix} \tag{9}
$$

The *I*-matrix in Equation (9) is composed of the degree one terms in the spherical harmonic summations from Equations (6)–(8) [2,4]:

$$\begin{aligned} I\_{10C}^{10C} &= \frac{1}{4\pi} \int P\_{10}(\cos\theta) \,\vartheta(\theta,\phi) \,P\_{10}(\cos\theta) \,d\Omega \\ I\_{11C}^{10C} &= \frac{1}{4\pi} \int P\_{10}(\cos\theta) \,\vartheta(\theta,\phi) \,P\_{11}(\cos\theta) \,\cos\phi \,d\Omega \\ I\_{11S}^{10C} &= \frac{1}{4\pi} \int P\_{10}(\cos\theta) \,\vartheta(\theta,\phi) \,P\_{11}(\cos\theta) \,\sin\phi \,d\Omega \\ I\_{10C}^{11C} &= \frac{1}{4\pi} \int P\_{11}(\cos\theta) \,\cos\phi \,\vartheta(\theta,\phi) \,P\_{10}(\cos\theta) \,d\Omega \\ I\_{11C}^{11C} &= \frac{1}{4\pi} \int P\_{11}(\cos\theta) \,\cos\phi \,\vartheta(\theta,\phi) \,P\_{11}(\cos\theta) \,\cos\phi \,d\Omega \\ I\_{11S}^{11C} &= \frac{1}{4\pi} \int P\_{11}(\cos\theta) \,\cos\phi \,\vartheta(\theta,\phi) \,P\_{11}(\cos\theta) \,\sin\phi \,d\Omega \\ I\_{10C}^{11S} &= \frac{1}{4\pi} \int P\_{11}(\cos\theta) \,\sin\phi \,\vartheta(\theta,\phi) \,P\_{10}(\cos\theta) \,d\Omega \\ I\_{11C}^{11S} &= \frac{1}{4\pi} \int P\_{11}(\cos\theta) \,\sin\phi \,\vartheta(\theta,\phi) \,P\_{11}(\cos\theta) \,\cos\phi \,d\Omega \\ I\_{11S}^{11S} &= \frac{1}{4\pi} \int P\_{11}(\cos\theta) \,\sin\phi \,\vartheta(\theta,\phi) \,P\_{11}(\cos\theta) \,\sin\phi \,d\Omega \end{aligned} \tag{10}$$

The measured mass changes over the ocean, *G*10*C*, *G*11*C*, and *G*11*S*, are estimated from time-variable gravity measurements of degree 2 and greater as listed in Equations (6)–(8) [2].

$$\begin{aligned} G\_{10C}(t) &= \frac{1}{4\pi} \int P\_{10}(\cos\theta) \,\vartheta(\theta,\phi) \sum\_{l=2}^{\infty} \sum\_{m=0}^{l} P\_{lm}(\cos\theta) \left\{ C\_{lm}(t) \cos m\phi + S\_{lm}(t) \sin m\phi \right\} \,d\Omega\\ G\_{11C}(t) &= \frac{1}{4\pi} \int P\_{11}(\cos\theta) \cos\phi \,\vartheta(\theta,\phi) \sum\_{l=2}^{\infty} \sum\_{m=0}^{l} P\_{lm}(\cos\theta) \left\{ C\_{lm}(t) \cos m\phi + S\_{lm}(t) \sin m\phi \right\} \,d\Omega\\ G\_{11S}(t) &= \frac{1}{4\pi} \int P\_{11}(\cos\theta) \sin\phi \,\vartheta(\theta,\phi) \sum\_{l=2}^{\infty} \sum\_{m=0}^{l} P\_{lm}(\cos\theta) \left\{ C\_{lm}(t) \cos m\phi + S\_{lm}(t) \sin m\phi \right\} \,d\Omega \end{aligned} \tag{11}$$
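The inversion in Equation (9) can be illustrated with a small numerical sketch. It is not the processing code used in this study: the grid resolution, the midpoint quadrature, the toy ocean function and all function names are assumptions made for illustration, and the degree one functions are assumed to be fully normalized so that their mean square over the sphere is one.

```python
import numpy as np

def degree_one_basis(theta, phi):
    """Fully normalized degree-one functions P10, P11 cos(phi), P11 sin(phi)."""
    p10 = np.sqrt(3.0) * np.cos(theta)
    p11 = np.sqrt(3.0) * np.sin(theta)
    return np.stack([p10, p11 * np.cos(phi), p11 * np.sin(phi)])

def degree_one_matrix(ocean_function, nlat=180, nlon=360):
    """Approximate the I-matrix of Equation (10) with a midpoint rule."""
    dtheta, dphi = np.pi / nlat, 2.0 * np.pi / nlon
    theta = (np.arange(nlat) + 0.5) * dtheta   # colatitude cell centers
    phi = (np.arange(nlon) + 0.5) * dphi       # longitude cell centers
    th, ph = np.meshgrid(theta, phi, indexing="ij")
    basis = degree_one_basis(th, ph)
    # ocean function times the area of each cell, divided by 4*pi
    weight = ocean_function(th, ph) * np.sin(th) * dtheta * dphi / (4.0 * np.pi)
    # imat[i, j] = (1/4pi) * integral of basis_i * ocean_function * basis_j
    return np.einsum("ikl,kl,jkl->ij", basis, weight, basis)

def solve_degree_one(imat, c_ocean, g_terms):
    """Equation (9): solve I x = (ocean-model terms - G terms) for x."""
    return np.linalg.solve(imat, np.asarray(c_ocean) - np.asarray(g_terms))

# Toy ocean function (hypothetical): ocean wherever cos(colatitude) < 0.5
toy_ocean = lambda th, ph: (np.cos(th) < 0.5).astype(float)
imat = degree_one_matrix(toy_ocean)
true_terms = np.array([1.0, -2.0, 0.5])   # made-up C10, C11, S11
g_terms = np.array([0.3, 0.1, -0.2])      # made-up degree >= 2 terms (G)
c_ocean = imat @ true_terms + g_terms     # forward model, Equations (6)-(8)
recovered = solve_degree_one(imat, c_ocean, g_terms)
```

With an ocean function equal to one everywhere, the matrix reduces to the identity, which is a convenient check on the normalization; a realistic buffered coastline mask makes the matrix non-diagonal, which is why the full inverse is needed.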

#### *3.1. Eustatic Sea Level from Land Surface Fluxes*

The MPIOM model used to correct GRACE/GRACE-FO data for ocean circulation does not include the effects of land-sea exchange and the corresponding sea level response [24]. As the land-sea flux signal would be present in the ocean mass variations, we estimate the monthly land-sea flux using GRACE/GRACE-FO time-variable gravity measurements when calculating the oceanic contributions to degree one (*C*<sup>ocean</sup><sub>10</sub>, *C*<sup>ocean</sup><sub>11</sub>, and *S*<sup>ocean</sup><sub>11</sub>) [12]. We use the same 300 km buffered coastline mask to calculate our land function (L = 1 − *ϑ*). At each time step, we use solutions to the sea level equation to calculate the spatial pattern of sea level variation induced by the change in land mass [43–45]. These changes in sea level are often referred to as the effects of self-attraction and loading (SAL) or as the sea level fingerprints (SLF) of the land mass change [46]. As the SLFs can differ significantly from global ocean averages, geocenter estimates that assume a uniform redistribution of terrestrial water fluxes can be negatively impacted [16]. Here, we use a pseudo-spectral approach for solving the sea level equation in which we assume the Earth deforms elastically from the modern-day mass change without the manifestation of viscoelastic effects [47,48]. When calculating each SLF, we use a static ocean function with the same 300 km buffered coastlines to verify that mass is being conserved.

#### *3.2. Iterated Solutions*

In order to estimate the full component of land-sea mass transport, an initial estimate of the geocenter variability is needed to calculate the total land mass [2]. Initially, we use annual and semi-annual geocenter estimates from Chen et al. [7], which are calculated from land-surface model estimates of soil moisture and snow water equivalent. We then calculate the eustatic sea level change induced from the land-sea flux and use it to generate a full geocenter estimate with Equation (9). The initial geocenter estimate from Chen et al. [7] is then replaced with the newly calculated estimate, and the sequence is repeated until the difference between successive iterations falls below a threshold value (see flowchart in Figure 2).
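The iteration described above is a fixed-point loop. The sketch below shows only the control flow; the `update` callable stands in for the full land-sea flux and Equation (9) calculation, and the names, tolerance and iteration limit are hypothetical, since the threshold value is not stated here.

```python
def iterate_geocenter(initial, update, tol=1e-6, max_iter=50):
    """Repeat current -> update(current) until successive estimates agree.

    initial : starting degree one estimate (e.g. a seasonal model)
    update  : hypothetical callable returning a new estimate from the old one
    tol     : assumed convergence threshold on the largest coefficient change
    """
    current = list(initial)
    for iteration in range(1, max_iter + 1):
        new = list(update(current))
        change = max(abs(n - c) for n, c in zip(new, current))
        current = new
        if change < tol:
            return current, iteration
    raise RuntimeError("geocenter iteration did not converge")

# Toy update with a known fixed point at 2.0 for every coefficient
toy_update = lambda coeffs: [0.5 * value + 1.0 for value in coeffs]
solution, n_iterations = iterate_geocenter([0.0, 0.0, 0.0], toy_update)
```

Raising an error when the limit is reached, rather than silently returning the last estimate, makes a non-converging configuration visible.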

#### *3.3. Spherical Harmonics of Atmospheric and Oceanic Variability*

Changes in atmospheric pressure, *p*, and ocean bottom pressure, *pbot*, at colatitude, *θ*, and longitude, *φ*, will impact the Earth's gravitational field. These changes can be decomposed into a series of fully-normalized spherical harmonic coefficients representing the induced gravitational change after allowing for the Earth's elastic response [49,50]:

$$\begin{Bmatrix} \tilde{C}\_{lm}(t) \\ \tilde{S}\_{lm}(t) \end{Bmatrix} = \frac{3}{4\pi\rho\_{\ell}} \frac{1+k\_{l}}{(2l+1)} \int \xi\_{l}(\theta,\phi,t) \, P\_{lm}(\cos\theta) \begin{Bmatrix} \cos m\phi \\ \sin m\phi \end{Bmatrix} \, d\Omega \tag{12}$$

The vertical integral, *ξl*(*θ*, *φ*, *t*), in Equation (12) is determined based on assumptions of the Earth's geometry and the vertical structure of the atmosphere or ocean [49].

Here, spherical harmonics from NCEP-DOE-2 and JRA-55 reanalyses are calculated assuming a thin layer two-dimensional atmosphere with a realistic Earth geometry incorporating the model orography, *h*(*θ*, *φ*), and estimates of geoid height, *N*(*θ*, *φ*) (Equations (12) and (13)), while harmonics from ERA-Interim and MERRA-2 reanalyses are calculated assuming a three-dimensional atmospheric geometry integrating over the model layers (Equations (12) and (14)) [49].

$$\xi\_l(\theta,\phi,t) = \left(\frac{a + h(\theta,\phi) + N(\theta,\phi)}{a}\right)^{l+2} \frac{p\_0(\theta,\phi,t)}{g(\theta,\phi)}\tag{13}$$

$$\xi\_l(\theta,\phi,t) = -\int\_{p\_0}^0 \left(\frac{a + z(\theta,\phi) + N(\theta,\phi)}{a}\right)^{l+2} \frac{dp}{g(\theta,\phi,z)}\tag{14}$$

Similarly, spherical harmonics from ECCO kf080i and ECCO V4r3 ocean bottom pressure are calculated assuming a thin layer two-dimensional ocean with a realistic Earth geometry incorporating estimates of geoid height and ocean bathymetry, *d*(*θ*, *φ*) (Equations (12) and (15)).

$$\xi\_l(\theta,\phi,t) = \left(\frac{a + N(\theta,\phi) - d(\theta,\phi)}{a}\right)^{l+2} \frac{p\_{\text{bot}}(\theta,\phi,t)}{g(\theta,\phi)}\tag{15}$$
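The two thin-layer forms, Equations (13) and (15), differ only in the radial offset applied to the reference radius. A minimal sketch of that shared structure follows; the function name is hypothetical and the default value of `a` (a mean Earth radius of 6371 km) is an assumption, as the value used in the study is not given in this section.

```python
def thin_layer_integral(l, pressure, gravity, radial_offset, a=6.371e6):
    """Thin-layer vertical integral of Equations (13) and (15).

    l             : spherical harmonic degree
    pressure      : surface or ocean bottom pressure in Pa
    gravity       : local gravitational acceleration in m/s^2
    radial_offset : h + N for the atmosphere (Eq. 13), N - d for the ocean (Eq. 15), in m
    a             : reference radius in m (assumed value)
    """
    return ((a + radial_offset) / a) ** (l + 2) * pressure / gravity

# Atmosphere over 1000 m orography with zero geoid height (made-up values)
xi_atmosphere = thin_layer_integral(2, 90000.0, 9.80665, 1000.0 + 0.0)
# Ocean with 4000 m bathymetry and zero geoid height (made-up values)
xi_ocean = thin_layer_integral(2, 4.0e7, 9.80665, 0.0 - 4000.0)
```

The radial factor is raised to the power *l* + 2, so the geometric correction grows with degree and is smallest for the degree one terms of interest here.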

#### *3.4. Time Series Analysis*

We calculate the average geocenter change by fitting a least-squares model to the estimated geocenter time series that simultaneously solves for constant, linear and quadratic terms along with annual, semiannual and 161-day oscillating terms [51]. The 161-day oscillating terms account for aliasing of the S2 tidal constituent in the monthly GRACE/GRACE-FO time-variable gravity fields [52].
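A regression of this form can be sketched with an ordinary least-squares solve. The column ordering, the function names and the synthetic test series are assumptions for illustration; time is taken in decimal years, and in practice it is often referenced to the midpoint of the record to improve conditioning.

```python
import numpy as np

def design_matrix(t_years, periods_years=(1.0, 0.5, 161.0 / 365.25)):
    """Columns: constant, linear, quadratic, then sin/cos pairs per period."""
    t = np.asarray(t_years, dtype=float)
    columns = [np.ones_like(t), t, t ** 2]
    for period in periods_years:
        omega = 2.0 * np.pi / period
        columns += [np.sin(omega * t), np.cos(omega * t)]
    return np.column_stack(columns)

def fit_geocenter_series(t_years, values):
    """Fit all terms simultaneously; return coefficients and annual amplitude."""
    design = design_matrix(t_years)
    coeffs, *_ = np.linalg.lstsq(design, np.asarray(values, dtype=float), rcond=None)
    annual_amplitude = np.hypot(coeffs[3], coeffs[4])
    return coeffs, annual_amplitude

# Synthetic monthly series with a known trend and annual cycle (made-up values)
t = np.arange(0.0, 15.0, 1.0 / 12.0)
series = 0.3 + 0.2 * t + 1.5 * np.sin(2.0 * np.pi * t) + 0.5 * np.cos(2.0 * np.pi * t)
coeffs, annual_amplitude = fit_geocenter_series(t, series)
```

Fitting the 161-day terms together with the others keeps aliased S2 energy from leaking into the semiannual and trend estimates.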

**Figure 2.** Flowchart of our processing scheme for estimating geocenter variations from time-variable gravity data and ocean model outputs. Blue nodes denote datasets, red nodes denote calculations, and the green node denotes the final converged solution.

#### **4. Results**

#### *4.1. Simulated Geocenter Estimates*

We test the efficacy of our methodology for estimating geocenter variations by running experiments using fields of simulated global mass variability [2]. Sets of coefficients are constructed using estimates of terrestrial water storage (TWS) calculated from the Global Land Data Assimilation System (GLDAS) NOAH land surface model [53], estimates of surface mass balance (SMB) change for glaciers and ice caps calculated from the Regional Atmospheric Climate Model (RACMO2.3) [54,55] and estimates of Greenland and Antarctic ice sheet mass balance from the mass budget method (MBM) [56]. Monthly GLDAS TWS estimates were calculated by combining the snow water equivalent, canopy water storage and soil moisture variables for non-glaciated regions [57]. We include the mass changes from glaciers and ice sheets to incorporate more processes that affect inter-annual geocenter variability [8]. The SMB of a glacier represents the sum of mass accumulation from snow and rain minus the surface ablation from meltwater runoff, sublimation, and snow drift erosion [58,59]. Cumulative anomalies in SMB were calculated for glaciers and ice caps in reference to a 1961–1990 baseline [60]. MBM estimates of ice sheet mass balance were calculated by combining SMB outputs from RACMO2 with estimates of total ice discharge [56,61]. The sea level fingerprints of the synthetic data were calculated as an estimate of oceanic variability [48]. The ocean function used when calculating the sea level fingerprints in the synthetic experiments is an update of Hall et al. [62] that incorporates Antarctic grounded ice delineations [63] and more accurate Greenland coastlines [64].

We test three different scenarios: (1) a uniform ocean redistribution of the land mass change with a static seasonal geocenter estimate similar to Swenson et al. [2], (2) a uniform ocean distribution of the land mass change that iteratively solves for the geocenter and (3) an oceanic redistribution taking into account self-attraction and loading effects that iteratively solves for the geocenter. Each model run uses spherical harmonic coefficients of degree two and greater that are calculated from the synthetic data. The harmonics are truncated to degree and order 60 and processed using the same 300 km Gaussian averaging function and decorrelation filter used with the GRACE/GRACE-FO data [4,29,30]. In order to calculate the full component of the land mass change, we include seasonal estimates of degree one variation from Chen et al. [7]. In the two iteration scenarios, we replace the degree one coefficients for each run with the calculated results of the previous run. The process is repeated until the difference between estimates on successive iterations falls below a threshold value.

We compare the results of each scenario against the exact time series of degree one variations calculated from the synthetic dataset (Figure 3). The RMS differences between the scenarios and the original time series are 0.32 mm, 0.21 mm, and 0.11 mm for the static, iterated, and iterated SLF scenarios, respectively. This represents differences of 21%, 14%, and 7%, respectively, for each of the three scenarios compared to the standard deviation of the original time series. Calculating the land mass change using a static seasonal estimate of geocenter variability underestimates both the short-term and long-term variability in geocenter motion. Self-attraction and loading effects limit the recovery of geocenter variations in the two scenarios that assume a uniform redistribution of land-sea fluxes. The recovery of the seasonal variability of each coefficient is significantly improved when self-attraction and loading effects are incorporated (Figure 4).
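The two comparison metrics used above, the RMS difference and its size relative to the standard deviation of the reference series, can be written compactly. This is a sketch with hypothetical function names and made-up sample values; whether a population or sample standard deviation was used is not stated, and the population form is assumed here.

```python
import math

def rms_difference(recovered, reference):
    """Root-mean-square difference between two equal-length series."""
    squared = [(r - s) ** 2 for r, s in zip(recovered, reference)]
    return math.sqrt(sum(squared) / len(squared))

def percent_of_std(recovered, reference):
    """RMS difference as a percentage of the reference standard deviation."""
    mean = sum(reference) / len(reference)
    variance = sum((s - mean) ** 2 for s in reference) / len(reference)
    return 100.0 * rms_difference(recovered, reference) / math.sqrt(variance)

# Made-up series: the recovered values are offset from the reference by 0.1
reference = [0.0, 2.0, 0.0, -2.0]
recovered = [0.1, 2.1, 0.1, -1.9]
rms = rms_difference(recovered, reference)
relative = percent_of_std(recovered, reference)
```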

**Figure 3.** Time series of actual and recovered geocenter variations, (**a**) *x*, (**b**) *y* and (**c**) *z*, in mm from synthetic fields of mass variability derived from GLDAS NOAH v2.1 land surface model outputs [53], RACMO2.3 surface mass balance outputs [54,55] and ice sheet mass balance estimates [56]. The gray line is the original degree one time series calculated directly from the synthetic dataset. The orange, purple and green lines are the derived degree one time series using different scenarios to calculate the land-sea fluxes. The orange line uses a static seasonal geocenter from Chen et al. [7], the purple line uses an iterated self-consistent geocenter, and the green line uses an iterated self-consistent geocenter taking into account the effects of self-attraction and loading.

**Figure 4.** Annual amplitudes of (**a**) actual and (**b–d**) recovered degree one surface mass variations in mm water equivalent calculated from synthetic fields of mass variability derived from GLDAS NOAH v2.1 land surface model outputs [53], RACMO2.3 surface mass balance outputs [54,55] and ice sheet mass balance estimates [56]. In the recovered solutions, the land-sea exchange is estimated using (**b**) a static seasonal geocenter from Chen et al. [7], (**c**) an iterated self-consistent geocenter, and (**d**) an iterated self-consistent geocenter taking into account the effects of self-attraction and loading.

#### *4.2. Recovered Geocenter Estimates*

We estimate sets of degree one coefficients that are corrected for the effects of non-tidal atmospheric and oceanic variation [2]. These coefficients are directly applicable to time-variable gravity applications for estimating surface mass change [4]. We calculate geocenter variability estimates for each of the processing centers (CSR, GFZ, JPL) of the GRACE/GRACE-FO Science Data System (SDS) using the iterated self-consistent geocenter method that incorporates self-attraction and loading effects (Figure 5). Geocenter results from all three processing centers largely agree in terms of the amplitude and phase of the seasonal signal, along with the long-term trend in each coefficient. Self-attraction and loading effects impact the trend of the estimated geocenter solution for all three processing centers (Figure 6). Using a static seasonal geocenter to estimate the total land-sea flux results in weaker trends for all three processing centers. The trend and annual amplitudes of each coefficient calculated using time-variable gravity fields from each processing center are listed in Table 1. Annual amplitudes for solutions derived from satellite laser ranging (SLR) after correcting for non-tidal atmospheric and oceanic variability are listed for comparison [5,15]. The listed uncertainties for each coefficient are 95% confidence intervals, but do not take into account spherical harmonic errors or uncertainties in the geophysical corrections.

**Table 1.** Geocenter motion annual amplitudes, annual phase, and trends for 2002–2017 derived using satellite laser ranging (SLR) and time-variable gravity fields from the Center for Space Research (CSR), the German Research Centre for Geosciences (GFZ) and the Jet Propulsion Laboratory (JPL) corrected for the effects of non-tidal atmospheric and oceanic variation. Errors denote the 95% confidence level.

**Figure 5.** Time series of recovered geocenter variations, (**a**) *x*, (**b**) *y* and (**c**) *z*, in mm calculated using an iterated self-consistent geocenter with self-attraction and loading effects from time-variable gravity fields provided by the Center for Space Research (orange), the German Research Centre for Geosciences (purple) and the Jet Propulsion Laboratory (green). The gray shading denotes the period between the GRACE and GRACE-FO missions.

**Figure 6.** Trends in recovered degree one surface mass variations in mm water equivalent calculated from time-variable gravity fields provided by the Center for Space Research (**a**–**c**), the German Research Centre for Geosciences (**d**–**f**) and the Jet Propulsion Laboratory (**g**–**i**). The land-sea exchange is calculated in the first column (**a**,**d**,**g**) with a static seasonal geocenter from Chen et al. [7], in the second column (**b**,**e**,**h**) with an iterated self-consistent geocenter, and in the third column (**c**,**f**,**i**) with an iterated self-consistent geocenter taking into account the effects of self-attraction and loading.

#### *4.3. Uncertainty Estimates*

The total uncertainty in the estimated geocenter time series represents a combination of GRACE/GRACE-FO measurement error, signal leakage between ocean and land, and uncertainty in the geophysical corrections. We assess the impact of these uncertainties, which we assume are uncorrelated, on the trend, annual amplitude and annual phase to the 95% confidence level. Monthly errors in the GRACE/GRACE-FO Level-2 spherical harmonics are estimated by first smoothing each coefficient with a 13-month Loess-type algorithm [13] and then calculating the residuals between the original and smoothed coefficients [65]. Spherical harmonic uncertainties contribute 0.11, 0.09 and 0.19 mm to the *x*, *y*, and *z* components for coefficients derived from CSR time-variable gravity fields.

We estimate the contribution from uncertainty in ocean circulation using ocean bottom pressure outputs from ECCO-JPL near real-time Kalman-filtered simulations and ECCO V4r3 simulations [36,38]. The time series of recovered degree one coefficients derived from CSR time-variable gravity fields using ocean bottom pressure outputs from ECCO-JPL kf080i [36], ECCO V4r3 [38] and MPIOM [23] are shown in Figure 7. The trend and annual amplitudes of each coefficient derived using the different ocean bottom pressure estimates are listed in Table 2. Solutions from ECCO-JPL agree with ECCO V4r3 and MPIOM derived solutions for the *y* and *z* components, but differ in terms of trend in the *x*-component [2]. Solutions from ECCO V4r3 and MPIOM agree for all coefficients between 2002 and 2014.

**Figure 7.** Time series of recovered geocenter variations, (**a**) *x*, (**b**) *y* and (**c**) *z*, in mm calculated using an iterated self-consistent geocenter with self-attraction and loading effects from time-variable gravity fields provided by the Center for Space Research using ocean bottom pressure outputs from ECCO-JPL real-time Kalman-filtered simulations (orange) [36], ECCO Version 4 Release 3 simulations (purple) [38], and Max Planck Institute ocean model (MPIOM) [24].

**Table 2.** Geocenter motion annual amplitudes, annual phase, and trends for 2002–2015 derived using time-variable gravity fields from the Center for Space Research (CSR) using ocean bottom pressure outputs from ECCO-JPL real-time Kalman-filtered simulations (kf080i) [36] and ECCO Version 4 Release 3 simulations (V4r3) [38]. Errors denote the 95% confidence level.


We estimate the uncertainty in atmospheric circulation by using an ensemble of reanalysis outputs [32–35]. Differences between the GRACE/GRACE-FO atmospheric de-aliasing product (GAA) and outputs from ERA-Interim [32], MERRA-2 [33], NCEP-DOE-2 [34], and JRA-55 [35] reanalyses will affect the estimates of geocenter motion. The average monthly uncertainty in geocenter variation due to uncertainties in atmospheric circulation is 0.07, 0.10, and 0.24 mm for the *x*, *y*, and *z* components, respectively.
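Because the error sources are assumed to be uncorrelated, individual contributions combine in quadrature. The sketch below applies a root-sum-square to the spherical harmonic and atmospheric contributions quoted in this section; it is only an illustration of the arithmetic and gives a partial figure, since the ocean circulation, leakage and GIA contributions are not included. The function and variable names are hypothetical.

```python
import math

def combine_uncorrelated(*contributions):
    """Root-sum-square of independent error contributions."""
    return math.sqrt(sum(value ** 2 for value in contributions))

# Monthly contributions in mm quoted in the text (x, y, z components)
harmonic_error = {"x": 0.11, "y": 0.09, "z": 0.19}
atmospheric_error = {"x": 0.07, "y": 0.10, "z": 0.24}

# Partial combination only: other error sources are deliberately left out
partial_error = {
    axis: combine_uncorrelated(harmonic_error[axis], atmospheric_error[axis])
    for axis in harmonic_error
}
```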

Glacial Isostatic Adjustment (GIA) affects both ocean mass and land-sea flux calculations used to reconstruct the geocenter variation. The contribution of GIA uncertainty is calculated by comparing our best case estimate against the expected GIA rate from Caron et al. [26]. Using coefficients from Caron et al. [26] affects trend estimates by 0.04, 0.003, and 0.20 mm/yr for the *x*, *y*, and *z* components, respectively. This represents trend differences in the *x*, *y*, and *z* components of 21%, 3%, and 28%, respectively, compared with the values derived using GIA outputs from A et al. [25] with ICE-6G ice history. Uncertainty in GIA is a limiting factor for determining rates of geocenter change from reconstructions with time-variable gravity and ocean models.

#### **5. Discussion**

There is better agreement between the estimates derived from GRACE/GRACE-FO and the new CF-CM SLR-derived solutions compared with SLR-derived CN-CM estimates (Figure 8, Table 1) [5]. Self-attraction and loading effects improve the correspondence with SLR-derived solutions, particularly in terms of average annual amplitude (Figure 9). However, there are still differences in the seasonal amplitudes of the *z*-component of geocenter motion (*C*10). Using a different ocean model to derive the oceanic component of geocenter variability cannot fully explain the difference between the solutions (Figure 8). Uncertainties in the GRACE spherical harmonics and uncertainties in atmospheric variation can partially explain the differences in the solutions. In addition, uncertainties in the ability to correct for network effects in the SLR-derived CF-CM solutions can also help explain some discrepancies with the solutions derived from GRACE/GRACE-FO.

There is strong agreement between the three processing centers for each of the degree one coefficients during both the GRACE and GRACE-FO periods. However, between the end of 2015 and the end of the GRACE mission, there is disagreement between centers in estimates of the *x* and *y* components of geocenter motion (*C*<sup>11</sup> and *S*11). Operational procedures enacted to maintain the battery life of the GRACE satellites during the latter stages of the mission affected the quality of the time-variable gravity fields. The accelerometer onboard GRACE-B was turned off in September 2016 to reduce the battery load and maintain the operation of the microwave ranging instrument. Independent methods have been developed by the GRACE processing centers to spatiotemporally transplant the accelerometer data retrieved from GRACE-A for GRACE-B [66]. The increased uncertainty in the gravitational fields likely affects our ability to estimate the geocenter variability, particularly for the months with a single operating accelerometer.

**Figure 8.** Time series of measured and recovered geocenter variations, (**a**) *x*, (**b**) *y* and (**c**) *z*, in mm from satellite laser ranging (orange and purple) and time-variable gravity fields provided by the Center for Space Research (green). The SLR-derived solutions in orange (CN-CM) are the traditional solutions that center the SLR network [15], and the SLR-derived solutions in purple (CF-CM) include the effects of local site displacements [5]. The solutions derived from GRACE/GRACE-FO in green use an iterated self-consistent geocenter that takes into account the effects of self-attraction and loading.

**Figure 9.** Annual amplitudes of measured and recovered degree one surface mass variations in mm water equivalent calculated from (**a**–**c**) time-variable gravity fields provided by the Center for Space Research, and (**d**) satellite laser ranging (SLR) CF-CM solutions. In the solutions derived from GRACE/GRACE-FO, the land-sea exchange is estimated using (**a**) a static seasonal geocenter from Chen et al. [7], (**b**) an iterated self-consistent geocenter, and (**c**) an iterated self-consistent geocenter taking into account the effects of self-attraction and loading.

Swenson et al. [2] provide a method for deriving geocenter motion from time-variable gravity measurements and estimates of ocean bottom pressure. The full component of oceanic mass is estimated in Swenson et al. [2] by calculating the land-sea fluxes using time-variable gravity measurements. We expand upon this method by using an iterated, self-consistent geocenter when calculating the full component of the land mass change, and by including self-attraction and loading effects when redistributing the mass over the ocean. The necessity of these inclusions only became evident with a longer record of time-variable gravity measurements. Using a static, seasonal geocenter omits the contribution of inter-annual fluctuations in degree one to the total land mass change. Self-attraction and loading effects impact both the seasonal amplitudes and the long-term trend of the recovered geocenter estimate (Figures 6 and 9). This is due to the spatial patterns of sea level fingerprints that can deviate strongly from the uniform average, particularly over the long term from changes in glaciers and ice sheets [48,67].

Sun et al. [16] expanded on Swenson et al. [2] by simultaneously estimating oblateness, *C*20, variations along with geocenter variations. They test the sensitivity of geocenter estimation methods to geophysical processes, such as glacial isostatic adjustment and ocean redistributions due to self-attraction and loading. In Sun et al. [16] the GRACE spherical harmonics are truncated to degree and order 45 and not processed with a decorrelation filter. Here, we use the full expansion of spherical harmonics for each processing center and filter for correlated errors in the GRACE/GRACE-FO harmonics [30]. We test the efficacy of the two methods using our synthetic reconstruction of global mass change. While the two techniques are statistically similar, we find that expanding to higher degrees and orders and filtering produces more consistent estimates compared with the synthetic estimate. The Sun et al. [16] method is used when calculating the JPL Tellus geocenter product. The differences between the geocenter estimates computed here and the estimates provided by JPL Tellus are largely during the final months of the GRACE mission and initial months of the GRACE-FO mission. Estimates of Antarctic ice sheet mass balance from time-variable gravity are sensitive to uncertainties in geocenter variation [68]. We find that differences between the two geocenter estimates affect Antarctic ice sheet mass balance estimates by 8–10 Gt/yr between 2002 and 2019.

Here, we use a buffered 300 km ocean function to calculate the sea level fingerprints in order to conserve mass and to reduce the leakage between land and ocean [4]. A full-resolution ocean function with realistic coastlines could be used if sets of scaling factors could be derived for both the land and ocean mass change. Typically, the use of scaling factors is applicable only for deriving seasonal fluctuations in land mass [69]. A more complete scaling factor, such as from Hsu and Velicogna [67] for the land mass change, could possibly improve estimates of geocenter variation. The effects of self-attraction and loading would likely become more evident with a full-resolution ocean function as the effects can be more pronounced in coastal areas [48].

#### **6. Conclusions**

Geocenter variations are an important representation of global mass change, with applications in satellite gravimetry and in satellite orbit determination. Here, we investigate the effects of different calculations of the eustatic sea level caused by changes in land mass for estimating geocenter variations from combinations of ocean model outputs and time-variable gravity measurements from the GRACE and GRACE Follow-On missions. We find that using an iterated, self-consistent geocenter incorporating self-attraction and loading effects provides the best estimate of geocenter variation. Uncertainties from glacial isostatic adjustment, ocean circulation and atmospheric circulation limit the determination of geocenter position from this technique. The annual amplitudes of the *z*-component of geocenter variation differ between estimates from this technique and estimates derived from satellite laser ranging (SLR). The effects of self-attraction and loading improve the correspondence with SLR-derived coefficients, but there are still discrepancies worth further investigation. Estimates of geocenter variations using data from the GRACE Follow-On mission are consistent with estimates from the GRACE mission, which enables the extension of the geocenter time series going forward.

**Author Contributions:** T.C.S. performed the analysis and wrote the manuscript. I.V. supervised the project, contributed to the analysis, helped analyze the data and provided comments/feedback.

**Funding:** Research was supported by an appointment to the NASA Postdoctoral Program at NASA Goddard Space Flight Center, administered by Universities Space Research Association under contract with NASA.

**Acknowledgments:** This work was performed at NASA Goddard Space Flight Center, the Jet Propulsion Laboratory, and University of California, Irvine. The authors thank our editor Thomas Gruber, and four anonymous reviewers for their comments and suggestions on improving this manuscript. The authors would like to thank John Ries (UTCSR) for his work to produce the geocenter solutions from satellite laser ranging (SLR) and for his helpful discussions on geocenter variability. The authors wish to thank Byron Tapley, Frank Flechtner, Michael Watkins, Srinivas Bettadpur and the GRACE/GRACE-FO Science Data System (SDS) teams for their work to produce the GRACE and GRACE-FO gravity solutions. The authors also wish to thank the German Space Operations Center (GSOC) of the German Aerospace Center (DLR) for providing the raw GRACE and GRACE-FO telemetry data to the processing centers. GRACE and GRACE-FO data are available from the NASA Physical Oceanography Distributed Active Archive Center (PO.DAAC) at https://podaac.jpl.nasa.gov/grace and the GFZ Information System and Data Center (ISDC) at http://isdc.gfz-potsdam.de/grace-isdc/. Reanalysis outputs are available from ERA-Interim at https://www.ecmwf.int/en/forecasts/datasets/reanalysis-datasets/era-interim, from MERRA-2 at https://gmao.gsfc.nasa.gov/reanalysis/MERRA-2/, from NCEP-DOE-2 at https://rda.ucar.edu/datasets/ds091.0/, and from JRA-55 at http://jra.kishou.go.jp/JRA-55/index\_en.html. Documentation for the JPL Tellus ocean bottom pressure product is available at https://grace.jpl.nasa.gov/data/get-data/ocean-bottom-pressure/. Geocenter data from this project are available on Figshare under a CC BY 4.0 license at https://doi.org/10.6084/m9.figshare.7388540.
The following programs are provided by this project for processing the GRACE/GRACE-FO data: https://github.com/tsutterley/read-GRACE-harmonics reads the Level-2 spherical harmonic data, and https://github.com/tsutterley/read-GRACE-geocenter reads the geocenter data provided by this project.

**Conflicts of Interest:** The authors declare no conflict of interest.

#### **References**


© 2019 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).

## **An Assessment of the GOCE High-Level Processing Facility (HPF) Released Global Geopotential Models with Regional Test Results in Turkey**

#### **Bihter Erol \*, Mustafa Serkan Işık and Serdar Erol**

Istanbul Technical University, Civil Engineering Faculty, Geomatics Engineering Department, Maslak 34469, Istanbul, Turkey; isikm@itu.edu.tr (M.S.I.); erol@itu.edu.tr (S.E.)

**\*** Correspondence: bihter@itu.edu.tr; Tel.: +90-212-285-3821

Received: 23 December 2019; Accepted: 6 February 2020; Published: 10 February 2020

**Abstract:** The launch of dedicated satellite missions at the beginning of the 2000s led to significant improvement in the determination of Earth gravity field models. As a consequence of this progress, both the accuracies and the spatial resolutions of the global geopotential models increased. However, the spectral behaviors and the accuracies of the released models vary mainly depending on their computation strategies. These strategies are briefly explained in this article. Comprehensive quality assessment of the gravity field models by means of spectral and statistical analyses provides a comparison of the gravity field mapping accuracies of these models, as well as providing an understanding of their progress. The practical benefit of these assessments by means of choosing an optimal model with the highest accuracy and best resolution for a specific application is obvious for a broad range of geoscience applications, including geodesy and geophysics, that employ Earth gravity field parameters in their studies. From this perspective, this study aims to evaluate the GOCE High-Level Processing Facility geopotential models including recently published sixth releases using different validation methods recommended in the literature, and investigate their performances comparatively and in addition to some other models, such as GOCO05S, GOGRA04S and EGM2008. In addition to the validation statistics from various countries, the study specifically emphasizes the numerical test results in Turkey. It is concluded that the performance improves from the first generation RL01 models toward the final RL05 models, which were based on the entire mission data. This outcome was confirmed when the releases of different computation approaches were considered. The accuracies of the RL05 models were found to be similar to GOCO05S, GOGRA04S and even to RL06 versions but better than EGM2008, in their maximum expansion degrees. 
Regarding the results obtained from these tests using the GPS/leveling observations in Turkey, the contribution of the GOCE data to the models was significant, especially between the expansion degrees of 100 and 250. In the study, the tested geopotential models were also considered for detailed geoid modeling using the remove-compute-restore method. It was found that the best-fitting geopotential model with its optimal expansion degree (please see the definition of optimal degree in the article) improved the high-frequency regional geoid model accuracy by almost 15%.

**Keywords:** gravity field satellite missions; GOCE; GRACE; GOCE High-Level Processing Facility (HPF); Earth gravity field; geoid; spectral enhancement method (SEM); GPS/leveling

#### **1. Introduction**

The low-earth orbiting (LEO) satellite missions initiated a new stage in Earth gravity field studies and led to unprecedented progress in determining global gravitational field models and related parameters. In addition to the satellite laser ranging (SLR) tracking data, as a fundamental technique for gravity field determination since the earlier research [1], the recent contribution of the LEO satellites significantly improved global geopotential models (GGMs). The mentioned improvements are both in terms of precision and spatial resolution, and hence they can be assumed as a breakthrough with a stunning impact on geodetic science and related fields [2,3]. The observations by LEO satellites primarily rely on satellite-to-satellite tracking (SST), which includes high-to-low (hl-SST) as well as low-to-low SST (ll-SST) [4–8]. The fundamental principle of SST techniques is based on the precise measurements of distance variations between the satellites, which can be used to determine the orbit perturbations and hence the gravitational force acting on the satellite. In addition to SST, dedicated satellite gravity gradiometry (SGG), which is a sensitive detection technique of the space gravitational gradients, is also used in the LEO satellite technology and the gravitational field of the Earth is determined globally with high precision and resolution by means of combined SGG and SST observations.

As a chronological overview of the dedicated Earth gravity field satellite missions, the Challenging Minisatellite Payload (CHAMP) was launched first, in July 2000, as a German Research Center for Geosciences (GFZ) project, and pioneered the LEO satellite missions for mapping the gravity field on a global scale. High-low SST with a space-borne GPS receiver on board CHAMP enabled the determination of the long-wavelength static gravity field of Earth [9]. Thereafter, the Gravity Recovery and Climate Experiment (GRACE) twin satellites were launched in March 2002 as a joint project of the US National Aeronautics and Space Administration (NASA) and the German Aerospace Center (DLR), aiming to track the changes in the gravity field, initially for a five-year duration. The GRACE mission employed dual-frequency one-way K-band phase measurements transmitted and received by both satellites [7,10]. With the mission data, the Earth's gravity field was mapped with a 200–300 km half-wavelength resolution every 30 days. Many studies and applications in Earth science disciplines have thus benefited from the information on mass transport and redistribution in the Earth system provided by GRACE-based observations [11]. The science mission of GRACE ended in October 2017. As its successor, the Gravity Recovery and Climate Experiment Follow-On (GRACE-FO) mission was launched in May 2018 to extend GRACE's observations and hence to continue its contribution to research [11,12].

Other than the GRACE and GRACE-FO missions, the Gravity field and steady-state Ocean Circulation Explorer (GOCE) was launched in March 2009 as a European Space Agency (ESA) project. The GOCE satellite mission utilized the satellite gravity gradiometer (SGG) for the first time in combination with the hl-SST technique. In addition to a novel SGG technique (see, e.g., [13,14]), with its low orbit altitude, the GOCE mission was dedicated to producing a near-global map of Earth's static gravity field with high accuracy and resolution [5,8,15,16]. However, it should also be noted that considering the noise characteristics of the gradiometer, the SGG data alone are not sufficient to determine the low-degree harmonic coefficients of the gravity field with high accuracy. Therefore, the GPS SST data are essential to determine the precise orbit, and thus geo-locate the tensor observables, and the long-wavelength part of the gravity field. With the GOCE mission, an approximate 1–2 cm accuracy of geoid undulations and 1–2 mGal accuracy of gravity anomalies were targeted. An unprecedented spatial resolution of 100 km, which approximately corresponds to 200 degrees of spherical harmonic expansion, was accomplished in mapping the gravity field with the contribution of the GOCE satellite mission data [17].
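The correspondence quoted above between a spatial resolution of about 100 km and degree 200 follows from a widely used rule of thumb: the half-wavelength resolution of an expansion truncated at degree ℓ is roughly πR/ℓ, or about 20,000 km divided by the maximum degree. The following minimal sketch illustrates this; the mean Earth radius of 6371 km and the function name are assumptions made for illustration and are not taken from the article.

```python
import math

R_MEAN_KM = 6371.0  # assumed mean Earth radius in km (illustrative value)


def half_wavelength_km(l_max: int) -> float:
    """Approximate half-wavelength resolution (km) of an expansion up to l_max."""
    if l_max <= 0:
        raise ValueError("l_max must be a positive integer")
    return math.pi * R_MEAN_KM / l_max


# Resolution for a GRACE-like, a GOCE-like and an EGM2008-like maximum degree
for l_max in (100, 200, 2190):
    print(l_max, round(half_wavelength_km(l_max), 1), "km")
```

For a maximum degree of 200 this gives approximately 100 km, in line with the figure stated in the text.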

A number of gravity field solutions were produced by the High-Level Processing Facility for GOCE (HPF) using the GOCE data. The official level 2 products of HPF can be obtained through datacenters such as the International Center of Global Earth Models (ICGEM) for GOCE users (geodesists, solid Earth scientists, oceanographers, etc.) [18–21].

In the released products, the maximum degrees of the spherical harmonic expansion models (GGMs), and hence their spectral contents, vary over a wide range. Regarding the spectral bands, expressed by the maximum expansion degree, the GGMs can be classified as low degree (having a degree/order (d/o) of harmonic expansion between 0 and 100), medium degree (100 < d/o ≤ 300), high degree (300 < d/o ≤ 3000) and ultra-high degree (3000 < d/o) models (see, e.g., [15]). This categorization of the GGMs is adopted in the present article.
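The categorization by maximum degree can be written as a small lookup. This is only a sketch of the class boundaries listed above; the function name and the returned labels are hypothetical.

```python
def classify_ggm(max_degree: int) -> str:
    """Return the spectral class of a GGM from its maximum degree/order."""
    if max_degree < 0:
        raise ValueError("max_degree must be non-negative")
    if max_degree <= 100:
        return "low degree"
    if max_degree <= 300:
        return "medium degree"
    if max_degree <= 3000:
        return "high degree"
    return "ultra-high degree"


# Example: a degree-2190 model such as EGM2008 falls in the high-degree class
print(classify_ggm(2190))
```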

This article aims to provide an overview of the geopotential models, particularly those released after the GOCE mission, in terms of the results of comprehensive quality validations using independent terrestrial data. The numerical tests include the GGM validations and their use in Turkey with the high-accuracy GPS/leveling data. These data include high order (first, second and third order) network quality GNSS coordinates (approximately 1–2 cm three-dimensional position accuracy) and orthometric heights (with ~2–3 cm accuracy) in the benchmarks. Turkey is one of the restricted areas in terms of terrestrial gravity field data sharing for global geopotential model calculations. As a consequence, the results of this study may be assumed to be a new statistical contribution to the literature on the model performance assessments of gravity field mapping using the recent HPF models in this territory. However, it should be emphasized that rather than suggesting a new methodology, it was aimed to demonstrate the performance gain of the geopotential models and to highlight the progress of HPF GOCE models from release to release, with the numerical assessment results using terrestrial data from Turkey. The impact of the improvement in geopotential model accuracies on the detailed geoid modeling in the area with rather sparse terrestrial gravity data was investigated as a second aim of this article. In summary, this article was written as a case study for providing a well-coordinated reference to users for geopotential model assessments, addressing the appropriate numerical methods that were suggested in the notable literature (i.e., spectral enhancement method or filtering the terrestrial data), rather than providing an innovative contribution to the theory.

The numerical validations of the GOCE- and GRACE-based geopotential models in the study area are accompanied by the high degree EGM2008 model. EGM2008, with 2190 d/o of spherical harmonic expansion, was developed as a combination of the ITG-GRACE03S (complete to spherical harmonic d/o 180) geopotential model with terrestrial, airborne and altimetry-derived gravity anomalies using a least squares adjustment. Reference [22] explains the computation method of EGM2008 in detail and reports the absolute accuracy of the model as ±5 cm to ±10 cm in areas well covered with high-quality gravity data. Use of EGM2008 in the tests as the baseline model allowed an objective assessment of the level of improvement achieved by the GOCE mission, specifically at the low to middle frequencies of the gravity field spectrum, since this high-resolution model does not include the GOCE mission data. It should be noted that no official report states which terrestrial gravity or GPS/leveling data from Turkey, if any, were included in the EGM2008 computation. The validation results in this study showed that the accuracy of EGM2008 ($\ell = 2190$) is at the decimeter level in Turkey.

In addition, EGM2008 was also employed in this study for spectral enhancement of the validated GGMs in order to approximate the omission error in comparisons with the GPS/leveling data. The spectral content of the tested GGMs is naturally not comparable with the high-frequency GPS/leveling data, because the spherical harmonic expansion is truncated at a finite degree. To recover the omitted part of the signal, the spectral enhancement method (SEM) was applied. The SEM is one of the options given in the literature to overcome this limitation in validations [19,23]. Low-pass filtering of the high-frequency data is another option, recommended by [24–26]. In the numerical tests, the short-wavelength components of the gravity field were taken into account by employing the high-frequency content of EGM2008 in addition to the residual terrain effect (RTE) from satellite-based high-resolution topographic data.

In the following section, the applied methods for validating internal and external accuracies of the tested GGMs, as well as for computing the detailed regional geoid models using the optimum and maximum d/o of the tested GGMs, are explained. A brief explanation of the main characteristics of the models is also provided in this section. The third section introduces the terrestrial data, which were employed in validations and regional geoid model computations. The numerical test results obtained from the validations of the GGMs, as well as the detailed geoid models, are provided using graphics and tables. Results are interpreted and discussed in this section. The fourth section summarizes the major outcomes of the study.

#### **2. Materials and Methods**

#### *2.1. Practical Need of Global Geopotential Model Validation Results*

Testing and evaluating geopotential model performances is important since determining an optimal model, with the highest accuracy and best resolution, has practical consequences in specific applications of the Earth science and engineering disciplines. In the following, we give examples of applications in which GGM performance is crucial, with reference to the numerical tests applied in this study.

The determination of a high-resolution regional geoid using the remove–compute–restore (RCR) method combines low to high-frequency content of the gravity field signal employing the reference GGM, the terrestrial gravity observations and the topographic height and density data [27]. In this strategy, the best-fitting GGM to the gravity field in the region contributes to the accuracy of the high-resolution geoid by improving the long-wavelength component in the error budget [27–29]. The GGM contribution to the regional geoid model accuracy can be even more crucial if there is a lack of high-quality and sufficiently dense terrestrial gravity data under certain modifications of the kernel function and when a few-centimeter accuracy geoid model is desired (see e.g., [30]). Regarding this issue, we evaluated the released GGMs in high-frequency geoid computations using gravity data in Turkey. The computations were carried out with fast Fourier transform (FFT) evaluation of an unmodified Stokes integral. The conclusions depicted a centimeter-level contribution of the GGMs to the accuracy improvement of the regional geoid models.

The independent data provide an objective measure to examine the gravity field mapping performances of the GGMs in a region. Since the gravity field parameters and their gradients are directly estimated with spherical harmonic expansion equations in some applications, their accuracy checks are important. In this regard, [31] uses the deflection of the vertical (DoV) components estimated from GGMs to compare with the astrogeodetic DoVs obtained through observations with the digital zenith camera system (DZCS); [32] exploits gravity gradients derived from GRACE-based GGMs in the interpretation of geophysical or geodynamical patterns in Iran; [32] compares the upward continued airborne gravity data with the GOCE-based gravity gradients in Antarctica; and [33] uses the GGM-derived geoid heights and gravity anomalies to compare with the mean sea surface and free-air anomalies derived from altimetry data in open ocean and coastal regions. In some GNSS applications, GGM-derived geoid undulations/height anomalies are considered for vertical control purposes. Before using the estimated undulations in the transformation of GNSS heights, the accuracies and the fit of the GGMs to the regional vertical datum should also be validated. For such a purpose, [34] evaluates five combined GGMs in a very dense GPS/leveling network on the Greek mainland and tests the capability of parametric models for fitting the GGMs to the regional vertical datum.

The independent validations of the GGMs and test results are also important for the height system unification (HSU) and world height system (WHS) realization studies, because the precise determination of the offsets among the regional vertical datums, as well as with the WHS surface, is critical to establishing a globally unified height datum [23,35]. In this way, using a high-precision GGM as a reference in the calculations allows for higher accuracy calculation of regional height system offsets [35,36]. Although numerical assessments on the determination of the geopotential value (Wo) of regional vertical datum and height system unification are not explicitly undertaken in this article, [37] is recommended for further reading on the detailed numerical evaluations of a number of GGMs on the height system unification between Turkey and Greece.

#### *2.2. Overview of the Tested Global Geopotential Models*

The tested GGMs in this study include the recent models that were calculated with different processing strategies from satellite orbits and GOCE gradiometer observations. These models were released (tagged with the release version from RL1 to RL5) as the products of the GOCE High-Level Processing Facility (HPF) of the ESA project. The first releases of the models (RL1) are based on the initial two months of GOCE observations (between November 2009 and January 2010); the RL2 models are calculated from the eight-month observation cycle (November 2009 to July 2010); and the RL3, RL4 and RL5 releases exploit 18 months, 26 months and the full mission observations, respectively [38]. The sixth releases of the direct and time-wise models, which are also based on the GOCE full mission dataset, were published and made available during the preparation of this article [21]. Therefore, the RL6 models were also tested in this study in order to compare their performances with the RL5 models, which had already exploited the full mission observations.

Essentially, the models were developed according to three HPF gravity field processing approaches, namely, the direct (DIR) [39], time-wise (TIM) [40] and space-wise (SPW) [41] approaches. Each approach follows a different processing strategy in calculating the releases. These strategies include mathematical models, the adopted a priori models (or no a priori model use), reference information in constraints, spherical cap regularization, filter algorithms of the observations, etc. Table 1 provides a detailed comparison of the validated GGMs including a short report on the differences in their processing strategies. References [18,42] provide a detailed description of the basics of these data processing methods. Reference [18] reviews and discusses the DIR, TIM and SPW methods, their processing philosophies and architectures in computations of the GOCE gravity field models using RL1 models. Reference [16] reviews the second, third and fourth generation GOCE-based Earth gravity field models with their computation strategies, and evaluates their performances with spectral sensitivities over different functionals.

The products used in computations include gravity gradients in the gradiometer reference frame (with identifier EGG\_NOM\_2), common-mode accelerations (EGG\_CCD\_2C), attitude quaternions (EGG\_IAQ\_2C), and precise science orbits (including the sub-products: kinematic orbits (SST\_PKI\_2I), variance–covariance information of orbit positions (SST\_PCV\_2I), reduced-dynamic orbits (SST\_PRD\_2I) and quaternions of transformation from Earth-fixed to inertial reference frame (SST\_PRM\_2I)); see [43] for the product descriptions and standards that were applied during the processes. In addition to the observations and products, depending on the applied processing strategy, reference gravity field models were also adopted in some of the solutions. These reference gravity field models were either exploited as a priori information or for regularization, internal assessment and validation purposes. EIGEN5C, EIGEN51C, ITG-GRACE2010S [21] were used in the DIR method for internal validation purposes, whereas the GOCE quick-look gravity field model was used in the first release of the SPW approach as a priori information. EGM2008, EIGEN5C, EIGEN6C3stat were also employed for signal and error covariance computation in SPW solutions [21]. However, as the basic difference among the three methods, no a priori gravity field information entered the solution in the processing strategy of the TIM method and, hence, this method yields pure GOCE-only models. All the information given here was retrieved from the headers of the spherical harmonic coefficients' files and respective data sheets, published by [21]. The main characteristics of the models are summarized as follows:

The DIR method, as the classical approach, is based on the orbit perturbation theory and combines a priori orbits and an a priori gravity field model, hl-SST observations and common-mode accelerations, while constituting the normal equations in determining the gravity field model coefficients in an iterative way:

• DIR models contain GRACE observations in the lower to medium degrees of the expansions (in addition to GOCE gravity gradient data).


The TIM approach exclusively relies on an epoch-wise processing of GOCE SGG and hl-SST data according to the least squares principle. In the method, the SST data analysis is based on the kinematic orbit solutions with the energy integral approach (short-arc integral approach for TIM-RL4, RL5 and RL6). Hence, since TIM solutions are independent of any other gravity field information, they can be used for independent comparisons and combination with other satellite-only models (such as GRACE), terrestrial gravity data and altimetry data on the normal equations' level. However, the TIM solutions are not competitive with the GRACE models in low degrees, since they are based solely on kinematic GOCE orbits but no external (such as GRACE) information:


The revision in 2012 of the processing procedure of the Level-1b products led to the expectation of performance improvement of the GOCE's SGG observations and consequently a quality improvement of the fourth, fifth and sixth releases of the models. Reference [47] reports that the gradiometry-only gravity field estimates show the largest improvement in recovery of low and medium degree coefficients, due to the new Level-1b products (see [16]).

In the SPW method, spatialized observations are produced from SST and SGG data using a Wiener orbital filter [48], and the gravity field coefficients are derived by spherical harmonic analysis after transforming these observations into a spatial grid using least squares collocation [49]:

• The SPW RL1 model includes the first GOCE quick-look model as the a priori information and the EGM2008 was used for error calibration of the estimated gravitational potential along the orbit that affects the low degrees of the solution. However, the later releases, RL2 and RL4, were not corrected using any a priori model, thus they are the GOCE-only models. EIGEN5C and EIGEN6C3stat models were respectively used in SPW RL2 and RL4 models' computations for signal covariance modeling, in addition to FES2004 for ocean tide modeling.

• In all SPW models, the off-diagonal tensor elements Vxz were included.

The maximum d/o of spherical harmonic expansion also changes (see the 2nd column of Table 1). The maximum d/o of the models was determined by the processing teams of HPF, depending on the signal-to-noise ratio, processing issues and the strategy of using additional a priori information [18].

**Table 1.** Description of direct (GO\_CONS\_GCF\_2\_DIR), time-wise (\_TIM), space-wise (\_SPW) in addition to GOCO05S, GOGRA04S and EGM2008 models (data contents in m: month, y: year) [21,50].




#### *2.3. Synthesis of the Gravity Field Parameters Using GGMs Spherical Harmonic Coefficients*

The International Center for Global Earth Models [60] database has published 175 geopotential models so far and almost 30% of these models include GOCE mission data. In order to quantify the quality of the GGMs independently, validating the models using external information is necessary. So-called validations require the level-3 (L3) products from the satellite gravity missions to be compared with terrestrial data [24]. The GPS/leveling data is typically used for independent geoid undulations ($N_{GPS/lev.}$) at the benchmarks to compare with the GGM derived functionals ($N_{GGM}$) using the following equation:

$$N_{GGM} = \frac{GM}{r\,\gamma} \sum_{\ell=2}^{\ell_{max}} \left(\frac{a}{r}\right)^{\ell} \sum_{m=0}^{\ell} \left(\Delta\bar{C}_{\ell m} \cos m\lambda + \bar{S}_{\ell m} \sin m\lambda\right) \bar{P}_{\ell m}(\cos\theta) \tag{1}$$

where (θ, λ, *r*) are the co-latitude, longitude and geocentric radius of the computation point, respectively; *a* is the semi-major axis of the reference ellipsoid; *GM* is the product of the gravitational constant and the mass of the Earth; and γ is the mean normal gravity of the reference ellipsoid. $\Delta\bar{C}_{\ell m} = \bar{C}_{\ell m} - \bar{C}^{ell.}_{\ell m}$ and $\bar{S}_{\ell m}$ are the fully normalized spherical harmonic coefficients, related to the normal potential of the reference ellipsoid, of a specific degree $\ell$ and order *m* (*m* = 0, 1, ..., $\ell$). The coefficients of the ellipsoidal normal potential ($\bar{C}^{ell.}_{\ell m}$) contain only terms for order *m* = 0 (rotational symmetry) and even degree $\ell$ (equatorial symmetry), so that $\Delta\bar{S}_{\ell m} = \bar{S}_{\ell m}$. To calculate the normal potential in practice, it is in most cases sufficient to consider only the coefficients $\bar{C}^{ell.}_{00}$, $\bar{C}^{ell.}_{20}$, $\bar{C}^{ell.}_{40}$, $\bar{C}^{ell.}_{60}$ and $\bar{C}^{ell.}_{80}$ [20]. In the equation, $\bar{P}_{\ell m}(\cos\theta)$ are the fully normalized associated Legendre functions.
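The synthesis in Equation (1) can be sketched in a few lines of code. The example below is a minimal illustration and not the software used in the study: it computes the fully normalized associated Legendre functions with the standard forward recursion over degree, which is adequate for low degrees but would need a more robust scheme for expansions up to d/o 2190. The coefficient arrays, the function names and the numerical values of *GM*, *a* and γ in the usage part are assumptions chosen for illustration.

```python
import math


def normalized_legendre(l_max, t):
    """Fully normalized associated Legendre functions P[l][m] at t = cos(theta)."""
    u = math.sqrt(max(0.0, 1.0 - t * t))  # sin(theta), non-negative on [0, pi]
    P = [[0.0] * (l_max + 1) for _ in range(l_max + 1)]
    P[0][0] = 1.0
    if l_max >= 1:
        P[1][1] = math.sqrt(3.0) * u
    # Sectorial terms P[m][m]
    for m in range(2, l_max + 1):
        P[m][m] = u * math.sqrt((2 * m + 1) / (2 * m)) * P[m - 1][m - 1]
    # Remaining terms: recursion over degree for each fixed order
    for m in range(l_max):
        for l in range(m + 1, l_max + 1):
            a = math.sqrt((2 * l - 1) * (2 * l + 1) / ((l - m) * (l + m)))
            value = a * t * P[l - 1][m]
            if l - m >= 2:
                b = math.sqrt((2 * l + 1) * (l + m - 1) * (l - m - 1)
                              / ((l - m) * (l + m) * (2 * l - 3)))
                value -= b * P[l - 2][m]
            P[l][m] = value
    return P


def geoid_undulation(dC, S, l_max, colat_rad, lon_rad, r, GM, a, gamma):
    """Evaluate Equation (1): geoid undulation from residual coefficients dC and S."""
    P = normalized_legendre(l_max, math.cos(colat_rad))
    total = 0.0
    for l in range(2, l_max + 1):
        inner = 0.0
        for m in range(l + 1):
            inner += (dC[l][m] * math.cos(m * lon_rad)
                      + S[l][m] * math.sin(m * lon_rad)) * P[l][m]
        total += (a / r) ** l * inner
    return GM / (r * gamma) * total


# Usage with assumed constants and a single hypothetical residual coefficient
GM = 3.986004415e14   # m^3/s^2, assumed model constant
A_REF = 6378137.0     # m, assumed reference radius
GAMMA = 9.81          # m/s^2, assumed mean normal gravity
L_MAX = 2
dC = [[0.0] * (L_MAX + 1) for _ in range(L_MAX + 1)]
S = [[0.0] * (L_MAX + 1) for _ in range(L_MAX + 1)]
dC[2][0] = 1.0e-6     # hypothetical value, for illustration only
n = geoid_undulation(dC, S, L_MAX, math.radians(90.0), 0.0, A_REF, GM, A_REF, GAMMA)
print(f"N = {n:.3f} m")
```

With real models, `dC` and `S` would be filled from a coefficient file after subtracting the zonal coefficients of the normal potential, as described above.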

As can be seen, Equation (1) does not include the zero-degree geoid height term ($N_o$), which is represented by the zero-degree harmonic coefficient in the spherical harmonic expansion model and describes the position of the GGM geoid with respect to the geocenter of the Earth. The zero-degree term depends on the difference between the *GM* constant of the GGM and that of the reference ellipsoid ($GM_{GRS80}$), which is GRS-80 for Turkey (see Equation (2)):

$$N_o = \frac{GM - GM_{GRS80}}{r\,\gamma} - \frac{W_o - U_o}{\gamma} \tag{2}$$

where $U_o$ corresponds to the normal gravity potential of the GRS-80 reference ellipsoid with the parameter $GM_{GRS80} = 398{,}600.50 \times 10^{9}$ m³/s²; thus, $U_o$ = 62,636,860.85 m²/s² [61]. Using *r* = 6378.137 km and γ = 981 Gal, the zero-degree geoid height (considering solely the first component on the right-hand side of Equation (2)) corresponds to $N_o$ = −0.936 m, which is to be added to the $N_{GGM}$ derived from Equation (1). The residual zonal coefficients (shown as $\Delta\bar{C}_{\ell m}$ in Equation (1)) account for the difference between the reference radius of the GGM and the semi-major axis of the GRS-80 ellipsoid [62,63]. In the synthesis of the GGMs' spherical harmonic coefficients, the zero-degree term is considered either by taking the zero-degree zonal harmonic coefficient into account ($\bar{C}_{00} = 1.0$; $\bar{S}_{00}$ does not exist; $\bar{C}_{10} = 0.0$; $\bar{S}_{1m} = 0.0$) by starting the computation from $\ell = 0$, or otherwise by separately adding the zero-degree geoid offset calculated with Equation (2). The second component on the right-hand side of Equation (2), $-\frac{W_o - U_o}{\gamma}$, introduces the difference between the gravity potential of the geoid ($W_o$) and the normal gravity potential on the surface of the normal ellipsoid ($U_o$). In the ideal case, without irregularities of topographic densities, the two potentials are equal ($W_o = U_o$). As a primary parameter for the definition of the reference (zero-height) surface with respect to Earth's body (which provides an absolute definition to the vertical datum of a height system), a global estimation of $W_o$ is required [64].
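The first component of Equation (2) can be checked numerically. The sketch below uses the GRS-80 constant, *r* and γ quoted in the text; the *GM* value of the geopotential model is not stated in this section, so the value used here (3.986004415 × 10¹⁴ m³/s², typical of GRACE- and GOCE-era models) is an assumption, as is the function name.

```python
GM_GRS80 = 3.986005e14    # m^3/s^2, GRS-80 value quoted in the text
GM_GGM = 3.986004415e14   # m^3/s^2, ASSUMED model constant (not given in the text)
R = 6378137.0             # m, r = 6378.137 km as in the text
GAMMA = 9.81              # m/s^2, i.e., 981 Gal as in the text
U0 = 62636860.85          # m^2/s^2, normal potential of GRS-80 from the text


def zero_degree_term(gm_model, gm_ref, r, gamma, w0=None, u0=None):
    """Zero-degree geoid height of Equation (2).

    The potential term (W0 - U0) / gamma is applied only if both w0 and u0 are given.
    """
    n0 = (gm_model - gm_ref) / (r * gamma)
    if w0 is not None and u0 is not None:
        n0 -= (w0 - u0) / gamma
    return n0


n0 = zero_degree_term(GM_GGM, GM_GRS80, R, GAMMA)
print(f"N0 (first component only) = {n0:.3f} m")
```

Under these assumptions the first component evaluates to roughly −0.935 m, close to the −0.936 m reported in the text; the small difference presumably reflects rounding of the input constants.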

In addition to the zero-degree term definition, another issue in the synthesis of the gravity field parameters from the GGMs' spherical harmonic coefficients is adopting the proper permanent tide-system to which the GGMs' functionals refer. The permanent tide-systems commonly adopted in studies investigating Earth dynamics include mean-tide, zero-tide and tide-free [65]. In studies related to the Earth's potential field, investigations of the Earth's crust deformations or determination of the best-fitting ellipsoid to the geoid, a proper permanent tide system is adopted. According to Bruns' definition of geoid undulation [61], the geoid undulation changes from one tide system to another depending on the geometric change in the shape of the geoid (but with a constant $W = W_o$ potential for each tide-system) relative to a fixed reference ellipsoid [65]. In other words, the $W_o$ value does not change among the systems, but the shape of the geoid changes. The effect of changing the permanent tide-system is defined basically with the second-degree zonal harmonic coefficient $\bar{C}_{20}$ [65].

In this study, the gravity field parameters were synthesized using the GGM coefficients in a tide-free system. In order to have comparable geoid undulations at the GNSS/leveling benchmarks, the orthometric heights in the regional vertical datum were converted from the mean-tide to the tide-free system according to the following equation [66]:

$$H^{tide\text{-}free} = H^{mean\text{-}tide} - 0.68\left(0.099 - 0.296\,\sin^2\varphi\right) \tag{3}$$
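As a brief illustration of Equation (3), the conversion can be coded as below. The sketch assumes that φ is the latitude of the benchmark and that heights and the numerical constants are in meters (the magnitude of the constants suggests this, but the units are not stated explicitly here); the function name is hypothetical.

```python
import math


def mean_tide_to_tide_free(h_mean_tide_m: float, lat_deg: float) -> float:
    """Convert an orthometric height from the mean-tide to the tide-free system.

    Implements Equation (3); heights are assumed to be in meters and
    lat_deg is assumed to be the benchmark latitude in degrees.
    """
    sin2 = math.sin(math.radians(lat_deg)) ** 2
    return h_mean_tide_m - 0.68 * (0.099 - 0.296 * sin2)


# Example: a benchmark at 39 degrees north with a mean-tide height of 1000 m
h_tf = mean_tide_to_tide_free(1000.0, 39.0)
print(f"correction = {(h_tf - 1000.0) * 100:.2f} cm")
```

At mid-latitudes such as 39°N the correction is on the order of one centimeter, which is comparable to the accuracy of the benchmark heights and therefore not negligible in the comparisons.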

#### *2.4. Assessments of the Global Geopotential Models Using Spectral Enhancement Method (SEM)*

The spectral constituents of Earth's gravity field are denoted with the spectral bands of a GGM [15]. The gravity field quantities as observed on the Earth's surface contain the full spectral signal power, while any global geopotential model is limited by its spectral resolution, which leads to omission error in the output functional. When comparing GGMs with independent observational data, one should take into account the spectral incompatibility of the two data sets. To overcome the spectral band limitation of the model, the so-called spectral enhancement method (SEM) is suggested and applied by [18]. The fundamental principle of the SEM is based on bridging the spectral gap between the GOCE-based GGM functional and ground-truth data as much as possible, using relevant auxiliary data sets (such as a higher resolution geopotential model like EGM2008, and the residual terrain data, to account for the ultra-high frequencies). Hence, the ground-truth data for validation purposes are approximated much better, and the influence of the omission error on the comparison results is minimized. In spectral enhancement, the GGM, expanded to a spherical harmonic degree $\ell_1$ (e.g., $\ell_1 = 240$ for DIR\_RL1), is enhanced using the spectral bands over the degrees from $\ell_1 + 1$ to $\ell_2$ from a higher resolution Earth geopotential model (e.g., $\ell_2 = 2190$ when using EGM2008). The very short-wavelength components of the gravity field, represented with spectral bands beyond degree $\ell_2$ (e.g., $\ell_2 = 2190$ in our case) until the maximum possible degree (e.g., 216,000), are completed as the residual terrain effect (RTE) of topographic data on the estimated geoid heights (see Equation (4)).

$$N_{enh.} = N_o + N\Big|_{2}^{\ell_1} + N\Big|_{\ell_1+1}^{\ell_2} + N_{RTE}\Big|_{\ell_2+1}^{216000} \tag{4}$$

Equation (4) is a symbolic representation of the enhancement procedure in terms of geoid undulation derived using the GGM, where *N*<sub>0</sub> is the zero-degree geoid undulation, and the two band terms covering degrees 2 to ℓ<sub>1</sub> and ℓ<sub>1</sub> + 1 to ℓ<sub>2</sub> together give the geoid undulation calculated with enhanced coefficients according to Equation (1). Here, *N*<sub>RTE</sub> is the residual terrain effect on geoid undulation, calculated using the digital topographic data [27]. As the summation of the components that represent each different spectral constituent of the Earth's gravity field, the full spectral content geoid undulation (*N*<sub>enh.</sub>), which is comparable with ground truth data, is obtained.

In Equation (4), *N*<sub>RTE</sub> was calculated using the residual terrain model (RTM) mass reduction method evaluated in the Gravsoft program package, following the processing methodology outlined by [27], *Section 8.4.4 Terrain Effects and High Resolution Global Geopotential Models*, pp. 366–371. In the computations, 3 arc-second Shuttle Radar Topography Mission (SRTM) topographic data were used for approximating the real terrain and the spherical harmonic expansion of the DTM2006.0 (up to d/o 2190) was used to model the mean elevation surface [27].
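The band bookkeeping behind Equation (4) can be sketched in a few lines of Python. This is only an illustration: the function names `band_limits` and `enhanced_geoid` are hypothetical, and the geoid values used in the usage note are made-up numbers, not results from this study.

```python
def band_limits(l1, l2, l_top=216000):
    """Return the three contiguous degree bands used in the enhancement.

    l1    : maximum degree taken from the tested GGM
    l2    : maximum degree of the fill-in model (e.g., 2190 for EGM2008)
    l_top : nominal upper degree represented by the residual terrain effect
    """
    if not 2 <= l1 < l2 < l_top:
        raise ValueError("expected 2 <= l1 < l2 < l_top")
    return [(2, l1), (l1 + 1, l2), (l2 + 1, l_top)]


def enhanced_geoid(n_zero, n_ggm, n_fill, n_rte):
    """Sum the spectral constituents of Equation (4); all inputs in metres."""
    return n_zero + n_ggm + n_fill + n_rte
```

For example, `band_limits(240, 2190)` returns the bands (2, 240), (241, 2190) and (2191, 216000), which is the partition described for DIR\_RL1 enhanced with EGM2008. The bands are contiguous and non-overlapping by construction, so no degree is counted twice.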

#### *2.5. Spectral Analysis of GGMs*

In the assessment of the quality of the GGM solutions, two types of errors are defined. Firstly, the noise in the observations that are used in the calculation of the GGM coefficients propagates from the measurements to the solutions, which is called the commission error. Secondly, an error arises from estimating the projection of the gravity field functional (taking the anomalous potential as an example) on the finite dimensional subspace (generated by linear combinations of solid spherical harmonics up to a limited degree) instead of deriving its full content. This error is known as the omission error. The omission error has a simple formulation in terms of the coefficients left out of the truncated convergent series: it is the sum of the squares of all coefficients of degree higher than ℓ<sub>max</sub>. This quantity cannot be known a priori. However, the so-called degree variances (i.e., the sum over all orders of the squares of the coefficients at a given degree) can provide an idea of the GGMs' signal decay, which allows one to judge the commission error and compute the omission error according to an adopted law [27]. Accordingly, the signal and error degree variances of the GGMs provide an insight into the omission/commission error content of a GGM under consideration (see [24]). Hence, they are regarded as internal error estimates for the investigated models.

Using the fully normalized spherical harmonic coefficients *C̄*<sub>ℓm</sub>, *S̄*<sub>ℓm</sub> from a specific degree of ℓ<sub>max</sub> and order *m* (*m* = 0, 1, ... ℓ<sub>max</sub>), the signal degree amplitudes σ<sub>ℓ</sub> (or square root of power per degree ℓ) of functions of the disturbing potential *T*(ϕ, λ, *R*) at the Earth's surface are computed as follows [20,21,27]:

$$
\sigma\_{\ell} = \sqrt{\sum\_{m=0}^{\ell} \left( \bar{C}\_{\ell m}^{2} + \bar{S}\_{\ell m}^{2} \right)} \tag{5}
$$

Equation (5) provides the signal degree amplitude σ<sub>ℓ</sub> in terms of unitless coefficients, and σ<sub>ℓ</sub>(*N*) in terms of geoid heights (meter) is shown in Equation (6):

$$
\sigma\_{\ell}(\mathsf{N}) \, := \mathsf{R} \, \sigma\_{\ell} \tag{6}
$$

The error spectra of GGMs are investigated depending on the error degree amplitudes, calculated using the estimation errors of the Stokes' coefficients (σ<sub>C̄ℓm</sub>, σ<sub>S̄ℓm</sub>) of a spherical expansion model:

$$\sigma\_{\varepsilon\ell}(N) = R \sqrt{\sum\_{m=0}^{\ell} \left( \sigma\_{\bar{C}\_{\ell m}}^{2} + \sigma\_{\bar{S}\_{\ell m}}^{2} \right)} \tag{7}$$

As a function of the minimum and maximum degrees ℓ, the accumulated (signal, error) degree amplitudes provide the power spectrum accumulated over a spectral band (between ℓ<sub>1</sub>, usually ℓ<sub>1</sub> = 0 or 2 as the minimum degree of the expansion, and ℓ<sub>2</sub>, usually ℓ<sub>2</sub> = ℓ<sub>max</sub> as the maximum degree of the harmonic expansion) and show the increase in overall power with increasing degree ℓ:

$$\sigma\_{\ell\_1\ell\_2}(\text{accumulated}) = \sqrt{\sum\_{\ell=\ell\_1}^{\ell\_2} \sigma\_\ell^2} \tag{8}$$
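Equations (5)–(8) translate directly into code. The following is a minimal sketch, assuming that the coefficients (or their standard deviations) of one degree are available as two plain lists indexed by order *m*; the function names and the value of the mean Earth radius are assumptions made for illustration only.

```python
import math

R_EARTH_M = 6371000.0  # assumed mean Earth radius in metres


def degree_amplitude(c_row, s_row):
    """Equation (5): square root of the power at a single degree.

    c_row, s_row hold the fully normalized coefficients of that degree
    for orders m = 0..l. Passing the coefficient standard deviations
    instead gives the (unitless) error degree amplitude of Equation (7).
    """
    return math.sqrt(sum(c * c + s * s for c, s in zip(c_row, s_row)))


def geoid_degree_amplitude(c_row, s_row, radius=R_EARTH_M):
    """Equations (6)/(7): scale a unitless degree amplitude to metres."""
    return radius * degree_amplitude(c_row, s_row)


def accumulated_amplitude(amplitudes, l1, l2):
    """Equation (8): root-sum-square of degree amplitudes for l1..l2.

    `amplitudes` maps degree -> degree amplitude.
    """
    return math.sqrt(sum(amplitudes[l] ** 2 for l in range(l1, l2 + 1)))
```

Because the accumulated amplitude is a root-sum-square, it can only grow as ℓ<sub>2</sub> increases, which is why cumulative error curves of this kind rise monotonically with degree.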

#### *2.6. GGMs' Contribution in High-resolution Regional Geoid Modeling*

For the determination of high-resolution precise regional gravity field models, remove–compute–restore (RCR) is a very well-known approach and has been used in calculation of many countries' geoid models to date [27,61]. In the RCR scheme the terrestrial gravity and topography/bathymetry data are used along with a global geopotential model to smooth the observations for data gridding, transformations, predictions and to eliminate the aliasing effects [27]. The computation steps of the RCR procedure are as follows:

$$
\Delta \mathbf{g}\_{\rm res} = \Delta \mathbf{g}\_{FA} - \Delta \mathbf{g}\_{\rm GGM} - \Delta \mathbf{g}\_{tc} \tag{9}
$$

Equation (9) represents the 'remove' step, in which the long-wavelength component of the gravity field (Δ*g*<sub>GGM</sub>, from the GGM according to Equation (10)) and the short-wavelength component (Δ*g*<sub>tc</sub>, the direct topographical effect on gravity) are removed from the free-air gravity anomalies (Δ*g*<sub>FA</sub>). The direct topographical effect is the attraction of all topographic masses above the geoid minus the attraction of the condensed topographic masses; it is calculated from topographic data according to a selected reduction scheme, which in this study was Helmert's Second Method of Condensation (see *Equations (3–97) of Section 3.9* in [61]). The resulting residual gravity anomalies (Δ*g*<sub>res</sub>) on the geoid surface are then evaluated in Stokes' formula.

$$
\Delta \mathbf{g}\_{\rm GGM} = \frac{\mathbf{G}M}{r^2} \sum\_{\ell=2}^{\ell\_{\rm max}} (\ell - 1) \sum\_{m=0}^{\ell} \left( \mathbf{\bar{\Delta C}}\_{\ell m} \cos m\lambda + \mathbf{\bar{S}}\_{\ell m} \sin m\lambda \right) \mathbf{\bar{P}}\_{\ell m} (\cos \theta) \tag{10}
$$

where the terms in Equation (10) are as introduced previously in Equation (1).
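As an illustration of how the double sum in Equation (10) can be evaluated, the sketch below uses the standard forward column recursion for fully normalized associated Legendre functions. It follows Equation (10) exactly as printed here; many formulations also carry a radial attenuation factor inside the degree sum, which is not included. The function names and the nested-list layout of the coefficients (`dC[l][m]`, `S[l][m]`) are assumptions, and the plain recursion is only suitable for low to moderate degrees.

```python
import math


def normalized_legendre(l_max, theta):
    """Fully normalized associated Legendre functions p[l][m] at co-latitude theta (rad)."""
    t, u = math.cos(theta), math.sin(theta)
    p = [[0.0] * (l_max + 1) for _ in range(l_max + 1)]
    p[0][0] = 1.0
    if l_max >= 1:
        p[1][1] = math.sqrt(3.0) * u
    for m in range(2, l_max + 1):  # sectorial terms (l == m)
        p[m][m] = u * math.sqrt((2 * m + 1) / (2 * m)) * p[m - 1][m - 1]
    for m in range(l_max + 1):  # raise the degree for each fixed order
        for l in range(m + 1, l_max + 1):
            a = math.sqrt((4 * l * l - 1) / (l * l - m * m))
            tail = 0.0
            if l - 2 >= m:
                b = math.sqrt(((l - 1) ** 2 - m * m) / (4 * (l - 1) ** 2 - 1))
                tail = b * p[l - 2][m]
            p[l][m] = a * (t * p[l - 1][m] - tail)
    return p


def gravity_anomaly_ggm(gm, r, theta, lam, dC, S):
    """Evaluate Equation (10) as printed; the result is in m/s^2 for SI inputs.

    dC[l][m] are the coefficients reduced by the normal field, S[l][m] the
    sine coefficients; theta is co-latitude and lam longitude, in radians.
    """
    l_max = len(dC) - 1
    p = normalized_legendre(l_max, theta)
    total = 0.0
    for l in range(2, l_max + 1):
        inner = sum(
            (dC[l][m] * math.cos(m * lam) + S[l][m] * math.sin(m * lam)) * p[l][m]
            for m in range(l + 1)
        )
        total += (l - 1) * inner
    return gm / r ** 2 * total
```

Multiplying the result by 10<sup>5</sup> converts m/s² to mGal, the unit in which the terrestrial anomalies of Equation (9) are usually expressed.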

The 'compute' step includes the computation of the residual co-geoid using Stokes' integral (*N*<sub>Δg</sub>), the long-wavelength geoid undulation (*N*<sub>GGM</sub>) using the GGM according to Equation (1) and, finally, the indirect effect of topographic mass attraction on the geoid undulation (*N*<sub>ind</sub>, using topographic data according to Equation (13)). After all constituents have been computed separately, they are combined in the 'restore' step, formulated as follows:

$$N = N\_{\rm GGM} + N\_{\Delta g} + N\_{\rm ind} \tag{11}$$

which yields the high-resolution geoid undulation (*N*).
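The 'remove' and 'restore' steps of Equations (9) and (11) are simple point-wise operations on co-located values. A minimal sketch follows, assuming that every quantity has already been evaluated at the same points and stored as a list; the function names are hypothetical, and the 'compute' step (Stokes' integration and the indirect effect) is deliberately left out.

```python
def remove_step(dg_fa, dg_ggm, dg_tc):
    """Equation (9): residual gravity anomalies, same unit as the inputs (e.g., mGal)."""
    return [fa - ggm - tc for fa, ggm, tc in zip(dg_fa, dg_ggm, dg_tc)]


def restore_step(n_ggm, n_dg, n_ind):
    """Equation (11): geoid undulation from its three constituents (metres)."""
    return [a + b + c for a, b, c in zip(n_ggm, n_dg, n_ind)]
```

Keeping the two steps symmetrical makes it easy to check that whatever long-wavelength model is subtracted from the anomalies is the same model whose geoid contribution is added back.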

The Stokes' formula for calculating *N*<sub>Δg</sub> is:

$$N\_{\Delta g} = \frac{R}{4\pi\gamma} \iint\_{\sigma} S(\psi)\, \Delta g\_{\rm res}\, d\sigma \tag{12}$$

where σ is the integration surface with surface element *d*σ, *R* is the mean Earth radius, γ is the normal gravity and *S*(ψ) is Stokes' function of the angular distance ψ between the computation point and the differential surface element *d*σ. *N*<sub>ind</sub> is:

$$N\_{\rm ind} = \frac{\pi \, G\rho H\_Q^2}{\gamma} - \frac{G\rho R^2}{6\gamma} \iint\_{\sigma} \frac{\left(H\_Q^3 - H\_P^3\right)}{\mathbf{s}^3} d\sigma \tag{13}$$

where ρ is the density of topographic masses; *H*<sub>Q</sub> and *H*<sub>P</sub> are the heights of the running and computation points, respectively; and *s* is the planar distance between *P* and *Q*. Hence, the output of the equation is *N*<sub>ind</sub> according to the reduction scheme of Helmert's Second Method of Condensation [27]. The computation steps followed in this study are detailed in [28], pp. 433–435, and Figure 13 of that reference illustrates the computational scheme adopted here.
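To make Equation (12) more concrete, the sketch below evaluates the closed form of Stokes' function and approximates the integral by a plain sum over grid cells. It is a simplified illustration and not the procedure used in this study: the function names, the default radius and normal gravity, and the cell-list layout are assumptions, and the singularity at ψ = 0 (the innermost zone) is not handled.

```python
import math


def stokes_function(psi):
    """Closed-form Stokes' function S(psi) for 0 < psi <= pi (radians)."""
    s = math.sin(psi / 2.0)
    return (1.0 / s - 6.0 * s + 1.0 - 5.0 * math.cos(psi)
            - 3.0 * math.cos(psi) * math.log(s + s * s))


def stokes_sum(cells, radius=6371000.0, gamma=9.81):
    """Discrete version of Equation (12).

    cells : iterable of (psi, dg_res, area) with psi in radians,
            dg_res in m/s^2 and area as a surface element on the unit sphere.
    Returns the residual co-geoid contribution in metres.
    """
    factor = radius / (4.0 * math.pi * gamma)
    return factor * sum(stokes_function(psi) * dg * area for psi, dg, area in cells)
```

At the antipode, `stokes_function(math.pi)` equals 1 + 3 ln 2, which is a convenient check on the implementation. In practice the cell containing the computation point needs separate treatment because *S*(ψ) grows without bound as ψ approaches zero.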

In the computation of a regional geoid model with hybrid (mixed) data in RCR, the GGM is responsible for the long-wavelength (signal) error content (see Equations (10) and (11)). When a geoid model is calculated over a spatially limited region with sparse, low-accuracy terrestrial data, the quality of the GGM becomes even more important, because it must compensate for the weakness of the terrestrial data in the RCR algorithm with specific modifications [30,67].

#### **3. Results and Discussion**

#### *3.1. Internal Error Estimates of Tested GGMs*

The error degree variances of the tested models were considered for internal uncertainty estimates of the GGMs. The theory and formulations for calculating error degree variances were given in Section 2.5. (see Equations (7) and (8)). Figure 1 shows the degree-wise accumulated error amplitudes of the models. The figures reveal the overall geoid undulation accuracies up to the maximum degree of each model. For d/o 200, which corresponds to the spatial resolution of 100 km, the cumulative errors are 0.8 cm for the DIR RL5 model (see Figure 1a), 2.2 cm for the TIM RL5 model (Figure 1b), 2.7 cm for the SPW RL4 model (Figure 1c), 2.0 cm for the GOCO05S model and 2.9 cm for the GOGRA04S model (Figure 1d), whereas it is 7.1 cm for the EGM2008 model, in terms of geoid undulation.
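The correspondence between degree and spatial resolution quoted above can be reproduced with the commonly used half-wavelength rule of thumb, resolution ≈ π*R*/ℓ. The rule itself and the mean Earth radius of 6371 km are assumptions introduced here for illustration; they are not stated in the text.

```python
import math


def half_wavelength_km(degree, radius_km=6371.0):
    """Approximate half-wavelength spatial resolution (km) of degree `degree`."""
    return math.pi * radius_km / degree
```

With these assumptions, `half_wavelength_km(200)` is about 100 km and `half_wavelength_km(180)` is about 111 km, which matches the resolutions quoted for d/o 200 and d/o 180 in this article.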


**Figure 1.** Error amplitudes (accumulated) as a function of the spherical harmonic degrees "ℓ" in terms of geoid heights: (**a**) direct (DIR RLx) models; (**b**) time-wise (TIM RLx) models; (**c**) space-wise (SPW RLx) models; (**d**) GOGRA04S, GOCO05S and EGM2008 models.

When the cumulative error amplitudes are investigated, it is seen that the improvement with the final releases of the GGMs is up to 80% compared to their predecessors. It is also noteworthy that the cumulative error estimates of TIM RL5 and DIR RL5 differ by a factor of three. However, it is arguable if these cumulative error estimates depending on degree variances are realistic and, on the other hand, whether calibration is required or not. In this manner, in a similar study on GGM validations, [51] suggests applying a calibration factor of two for the DIR RL5 model.

Related to concerns about the uncertainty estimates of the GGMs depending on the error degree variances approach, [68] emphasizes these issues: (1) firstly, the correlations among the coefficients are ignored while evaluating the GGMs with degree variances; (2) the effect of the polar-gap problem on the zonal and near-zonal coefficients of the GOCE-only models obviously influences the provided standard deviations of the spherical harmonic coefficients, specifically for the GOCE-only TIM and SPW models; and (3), on the other hand, dependence of the GOCE models' uncertainties and performances on the geographic location cannot be ignored in evaluations. Regarding the mentioned handicaps, beside the degree variances-based assessments, evaluating the GGMs using geographically dependent standard deviations (if possible) with the propagation of full variance–covariance matrices of the spherical harmonic coefficients is suggested by [68].

In summary, although the accumulated error degree variances provide a global mean of the GGM's performance over the entire sphere, which is significantly influenced by the large uncertainties in the polar gap, deriving concluding measures for its performance in a local region is not straightforward. Therefore, Section 3.2 provides results of local validations of tested GGMs using terrestrial data.

#### *3.2. Overview of the Regional Accuracies of the GGMs*

The results of evaluating the gravity field models using various data sets in different regions are being published regularly as the new models are released. However, the methodologies applied in these assessments vary to differing degrees. Regarding a number of significant studies from the literature, the following can be noted.

Reference [19] provides a detailed assessment of the first generation GOCE-based models using LEO satellite orbit perturbations and GPS/leveling geoid height differences. The conclusions of their study confirm that the orbital perturbations are mostly sensitive to the long-wavelengths of the gravity signal and hence provide clear results in quality assessments of the GGMs in a global manner. In contrast, the geoid comparisons using GPS/leveling data give the opportunity to assess the model performances at the medium to short-wavelengths. In the study, the regional assessments using the GPS/leveling data were carried out in Australia, Europe, North America (US and Canada), Japan and Germany. The preliminary conclusions from this study give a global geoid accuracy of about 5–6 cm with a 111 km spatial resolution (at d/o of 180). Here, it is emphasized that they obtain this level of accuracy from just two months of GOCE data, and the goal of the mission (1–2 cm geoid accuracy with 100 km resolution) could be reached with the availability of more data. In terms of geoid residual RMS values, [19] reports the EGM2008 model accuracies as the most improved in Germany (3.5 cm), Japan (10 cm) and Canada (10.5 cm), whereas weaker fitting performances were found in Australia (24 cm), Europe (21.5 cm) and the US (26.5 cm) with comparatively larger residual RMS values. The non-homogeneous control datasets and vertical datum distortions were cited as possible reasons for the larger error levels in the continent-wide assessments.

Reference [21] provides the validation statistics of the released model for the maximum truncation degrees (without spectral enhancement) using similar datasets. Reference [15] evaluated the first generation GOCE-based models globally using EGM2008 and regionally using terrestrial gravity data and astrogeodetic vertical deflections. The analyses by [15] clearly show the improvement in gravity field mapping with GOCE data contribution and identify the reasons for the reported improvement specifically at certain spectral bands. Reference [69] evaluated the first and second releases of the models by means of gravity anomalies and radial gradients from GOCE gradiometer data using least squares collocation at several regions. The evaluations of the models from the first to third releases were published by [70] using the terrestrial gravity data in Norway and [71] over the Earth by means of gravity gradients from the GOCE gradiometer data. There are also other studies that validate the GOCE-based GGMs over different regions using various terrestrial data sets including [72] (in Sudan), [73] (in Brazil), [74] (in Hungary) and [68] (in Germany). Reference [75] provides the recent results of regional validations of all releases over different countries. However, the literature lacks information by means of GGM validation statistics over many areas of Earth. This may stem from the countries' data restriction policies and/or lack of quality terrestrial data for the region. Accordingly, the numerical section of this study focused on regional evaluation of the GGMs in Turkey.

In the following sections, in addition to reporting the models' formal error estimates, the performances of the GOCE-based GGMs were evaluated in terms of absolute accuracies. The GOCE-based models were compared with EGM2008 at the GPS/leveling benchmarks in Turkey in order to clarify the contribution of the GOCE observations. In addition, the spectral bands of the spherical harmonic expansion models in which the improvement of the tested GGMs is most significant were determined and employed in the regional geoid modeling.

#### 3.2.1. Assessment of GGMs' Accuracies in Turkey

In accuracy assessments of the HPF-released GGMs in Turkey, a number of numerical tests were carried out in the country between latitudes 36°N–42°N and longitudes 26°E–45°E. The test statistics of the geoid height residuals (Δ*N*) between the observations and the models were calculated for two different high-order GPS/leveling network benchmarks, separately. The GPS/leveling networks include 30 and 81 points, respectively. The first dataset covers the whole of Turkey (Dataset I), and the latter is denser but localized in the northwest of the country (Dataset II) [76]. Carrying out the tests individually using both datasets (instead of using a unified dataset) aims to make validation results independent from the effects of possible datum differences between the datasets. A detailed description of the employed data is provided in Table 2. The distributions of the benchmarks on the topographic maps are given in Figures 2 and 3.
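The residual statistics described above can be sketched as follows. The sketch assumes the usual relation that the GPS/leveling geoid height is the ellipsoidal height minus the orthometric height, and it uses the sample standard deviation (divisor *n* − 1); both choices, and the function names, are assumptions rather than details taken from this study.

```python
import statistics


def geoid_residuals(h_ellipsoidal, h_orthometric, n_model):
    """Residuals dN = (h - H) - N_model at co-located benchmarks, in metres."""
    return [(h - H) - n for h, H, n in zip(h_ellipsoidal, h_orthometric, n_model)]


def residual_stats(residuals):
    """Summary statistics of the geoid height residuals."""
    return {
        "min": min(residuals),
        "max": max(residuals),
        "mean": statistics.mean(residuals),
        "std": statistics.stdev(residuals),  # sample standard deviation
    }
```

Evaluating the two datasets with separate calls keeps any datum offset between them out of the standard deviation, since the mean is removed per dataset.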


**Table 2.** GPS/leveling datasets description.

<sup>1</sup> The datasets include first order GPS network points of Turkey National Fundamental GPS Network with Helmert orthometric heights (third-order leveling network) in regional vertical datum. BM: benchmark.

**Figure 2.** Distribution of 30 benchmarks (Dataset I) on Shuttle Radar Topography Mission (SRTM) topography.

**Figure 3.** Distribution of 81 benchmarks, northwest of Turkey (Dataset II) on SRTM topography.

The numerical results of the study were interpreted from different perspectives, including: (1) verifying the role of the data amount and the calculation strategy of the model on its accuracy and performance; (2) determining an optimum d/o of the models available in the study area (although the term "optimum" may sound vague and its definition depends on the application, "optimum d/o" is used here to mean the degree of the spherical harmonic expansion at which the highest accuracy is obtained in the tests performed with the terrestrial data in the study area); and (3) clarifying the contribution of a best-fitting geopotential model with its optimum d/o to the development of a precise regional geoid model using the RCR approach.
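The definition of the "optimum d/o" given above amounts to a minimum search over truncation degrees. A minimal sketch, assuming the geoid height residuals have already been computed for each candidate truncation degree and stored in a dictionary; the function name and the data layout are hypothetical.

```python
import statistics


def optimum_degree(residuals_by_degree):
    """Return (degree, std) for the truncation degree with the smallest
    sample standard deviation of the geoid height residuals.

    residuals_by_degree maps truncation degree -> list of residuals (metres).
    """
    stds = {deg: statistics.stdev(res) for deg, res in residuals_by_degree.items()}
    best = min(stds, key=stds.get)
    return best, stds[best]
```

A search of this kind selects the degree purely from the fit to the control data, so the result depends on the quality and distribution of the benchmarks as much as on the model itself.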

In the following, Figures 4a and 5a show the evaluation results of the DIR RL1, RL2, RL3, RL4 and RL5 models, which are enhanced using the high-resolution EGM2008 model and high-resolution DTM data (SRTM3), at GPS/leveling networks with 30 benchmarks throughout Turkey and 81 benchmarks in the northwest of the country, separately. The graphics show the standard deviations of the geoid height residuals, calculated at the co-located GPS/leveling benchmarks with the GGM-derived geoid undulations, which were truncated and enhanced degree by degree from two to the maximum harmonic expansion d/o of the tested models (see Equations (1) and (4)). Looking at the statistics of the two datasets in the graphics, the improvement of the model performances from the first release to the last can be verified.

Similarly, Figures 4b and 5b show the performances of the TIM models, depending on the standard deviations of the geoid undulation residuals at co-located GPS/leveling benchmarks. Considering the results in graphics, the improvement of the TIM models as the release number increases is also clear.

The SPW models were also compared for the given two geodetic network datasets in Figures 4c and 5c. The statistics of SPW model validations also show the progress with increasing release number.

The numerical tests, which were carried out fulfilling the first objective of the current section, show improvement in the models with the increasing amount of data (by means of the observation cycle; please see Table 1). This conclusion has been drawn independently from the GGM computation strategy. Carefully looking at the graphics given in Figures 4 and 5, the direct models reveal almost similar performances to each other and with the EGM2008 model, of up to approximately 70 d/o of the spherical harmonic expansions; however, this is not the case for the TIM and SPW models. This situation can be explained as a consequence of the differences among their computation strategies (compare the strategies and the a priori information in computations in Table 1). Since EIGEN-51C combined GGM [39] was employed for the regularization of the GOCE-based models in the direct approach, they were reinforced by this background model. In the long-wavelengths, the used background model utilizes the information from GRACE and CHAMP, and the medium to short-wavelengths are provided by DNSC08GRA [77] and thus (indirectly) the terrestrial gravity (as contained by EGM2008) over the land areas. Regarding the information on the determination of the direct models, it is not possible to strictly quantify which spectral bands provide improvement, and the extent to which improvement is provided by the GOCE observations when considering solely these models.

The processing strategy for computation of the time-wise models relies exclusively on GOCE gradiometry and hl-SST observations, and only Kaula's regularization is employed to constrain the model at the short-wavelengths. Hence, the assessments of time-wise models can provide a realistic measure of the GOCE-based contribution to the GGMs without the intervention of other data sets. When the evaluation results of these models are considered, the contribution of the GOCE signal to the improvement of gravity field mapping in Turkey is clarified (see Figures 4b and 5b). The maximum improvement is seen at the spectral bands between d/o 100 and 250 as is promised by the GOCE Earth gravity field satellite mission.

The improvement in gravity field mapping with a GOCE-based contribution is shown to be significant in Figure 5. However, it is not highly significant in the test results obtained with 30 benchmarks, distributed over the country (see Figure 4). This may be because of the insufficient density and distribution of the test benchmarks (as seen in Figure 2) and relatively lower accuracy of the GPS/leveling heights at these benchmarks. Hence, it may be concluded that the first dataset, with 30 benchmarks, is not sufficient in terms of data quality and distribution for carrying out a detailed assessment of the geopotential models regarding different spectral bands. However, the evaluation results using this data set were nonetheless considered as a country-wide look at the accuracy level of the tested GGMs. By comparison, the test results that were obtained through the locally distributed 81 benchmarks (Dataset II) revealed the significant improvement of the GOCE-based models in the medium-wavelength spectral bands.

**Figure 4.** Validation of global geopotential models (GGMs) regarding the standard deviations of the geoid undulation residuals (*N*<sub>GPS/lev.</sub> − *N*<sup>enh.</sup><sub>GGM</sub>) in meters (for Dataset I): (**a**) DIR RLx models; (**b**) TIM RLx models; (**c**) SPW RLx models; (**d**) comparison among the last versions of the satellite-based and combined models.

Considering the entire statistics from the evaluations of direct, time-wise and space-wise models (namely, GO\_CONS\_GCF\_2\_ DIR, TIM, SPW) for the two datasets, it is concluded that the computation strategy of the GGMs does not have a significant role in the improvement of the model accuracies. In this part of the study, in addition to the GOCE-based DIR, TIM and SPW models (actually computed with the contribution of the GOCE and GRACE missions' data), the final releases of the two other combined GGMs, namely, the GOGRA04S and GOCO05S models, were also evaluated. The GOGRA04S model combined approximately four years of GOCE gradiometry data with GRACE observations from a cycle of satellite tracking in excess of 10 years. Despite the complementary advantage of the two missions in the solution, the GOGRA04S model did not reveal significant improvement in mapping the regional gravity field compared to the last releases of the GOCE models (see graphics in Figure 4d). The GOCO05S model combined four years of GOCE gradiometry data with GRACE, CHAMP and SLR observations (with the GRACE normal equations from the ITSG-GRACE2014S model). In the assessment results, the performance of the model is better than that of GOGRA04S and almost equal to the final releases of the GOCE-only models up to d/o of 155 and, in relative assessment, it performs better than the other models for degrees between 230 and 270 (Figures 4d and 5d).

**Figure 5.** Validation of GGMs regarding the standard deviations of the geoid undulation residuals (*N*<sub>GPS/lev.</sub> − *N*<sup>enh.</sup><sub>GGM</sub>) in meters (Dataset II): (**a**) DIR RLx models; (**b**) TIM RLx models; (**c**) SPW RLx models; (**d**) comparison among the last versions of the satellite-based and combined models.

Accordingly, the accuracies of the GGMs in terms of standard deviations of the geoid height residuals, obtained through their maximum and optimum degrees (optimum degree corresponds to spherical harmonic expansion degree of the model with the highest accuracy with a minimum standard deviation of the geoid undulation differences; see Figure 5), are shown in Figure 6a,b, separately computed for each test area, respectively. Given statistics in figures are derived using the enhanced coefficients of the GGMs over optimum (blue bars) and maximum (orange bars) expansion degrees of the models using the EGM2008 model and high-resolution DTM. The accuracies that are obtained through employing the full-expansion of the EGM2008 model are shown with a black color bar at the end of each graphic.

**Figure 6.** Standard deviations of the geoid height residuals in cm, where the geoid heights (*N*<sup>enh.</sup><sub>GGM</sub>) are derived from the enhanced GGMs over optimum (blue bar) and maximum (orange bar) degrees, for (**a**) the first dataset; (**b**) the second dataset.

The given statistics show that the GGMs, enhanced on their optimum d/o, revealed an almost 50% better fit compared with the results obtained from the enhanced models on their maximum degrees. Compared with the EGM2008 model, the maximum improvement was recorded by the SPW RL5 model (a 23.8% improvement with respect to the EGM2008 model) in the test results using the second data set. Regarding the overall test results, obtained using both datasets, separately, it was seen that the GOCE-based GGMs had accuracies between 10 cm and 15 cm in Turkey. While interpreting the reasons for obtaining different fitting statistics of the GGMs in the tests with two datasets, it is worth noting the possible role of the tectonic structure in the region. Turkey is in the collision zone between three plates (African, Arabian and Eurasian) (see [78] for the active faults in Turkey) and hence is continuously exposed to horizontal and vertical crustal movements (see [79,80] for vertical and horizontal velocities in Turkey). Accordingly, the approximate rates in horizontal and vertical positions are reported as 2–3 cm/year in [80]. Moreover, an earthquake of Mw ≥ 6 occurs every 1–2 years, particularly around the North Anatolian Fault Zone. All these activities continuously distort the geodetic networks over time and even have a minor effect on the change of the gravity field; however, the latter is not significant for the scope of this study. The first geodetic network (Dataset I) was exposed to the Izmit earthquake of 1999 (7.6 Mw), and the northwest part of this geodetic network, including the 81 benchmarks of Dataset II, has been remeasured and adjusted since then. Although the networks were positioned in the same geodetic datum, the physical establishments of the network stations were distorted over time and these distortions negatively influence their quality as control data in model evaluations. In conclusion, the ground-truth data employed to validate the qualities and fitting performances of the GGMs should be of high quality, sufficiently dense, homogeneously distributed over the test area and as up-to-date as possible, especially in areas under the influence of dynamic Earth activities, such as Turkey.

In addition, the recently published sixth releases of the direct and time-wise models were considered in the evaluations and compared with the RL5 models by means of standard deviations of the geoid undulations at the second dataset benchmarks. Figure 7 shows the comparisons of the models. According to the obtained statistics, the RL6 models showed similar performances to the RL5 models between d/o 100 and 200 of the spectrum. However, in degrees between 200 and 280, the TIM RL6 model showed slightly better performance compared with the RL5 and DIR RL6 models, with an improvement in standard deviation of less than 1 cm.

**Figure 7.** Comparison of RL5 and RL6 models: (**a**) the standard deviations of the geoid undulation residuals (*N*<sub>GPS/lev.</sub> − *N*<sup>enh.</sup><sub>GGM</sub>) in meters (Dataset II); (**b**) the standard deviations of geoid undulations in cm, where the geoid heights (*N*<sup>enh.</sup><sub>GGM</sub>) are derived from the enhanced GGMs over the optimum (blue bar) and the maximum (orange bar) degrees of the models.

#### 3.2.2. Testing the GGMs Contribution in Detailed Geoid Modeling

A high-resolution geoid model has been an essential component of the modern geodetic infrastructure in many countries, particularly in recent decades (see the geoid models' repository at [81] for some of the released regional geoid models). As a consequence of the progress in global positioning systems, many countries, such as Canada, New Zealand and the US, redefined their vertical datum to refer to a precise geoid model utilizing GNSS techniques. Hence, they also aimed to obtain further benefit from the satellite-based positioning technology for obtaining precise physical point heights in real time in many geodetic applications. Equations (11)–(13) provide the fundamental theory in the determination of a high-resolution geoid model using RCR steps. As was seen, the method combines GOCE-based GGM data with terrestrial data. Thus, the error content of the geoid undulation (Equation (11)) includes the errors from the global geopotential model, as well as the terrestrial gravity data and digital height data. Therefore, in order to clarify how the GGM accuracy affects the determination of the high-resolution geoid model, the numerical results from regional geoid modeling studies in Turkey are shared and discussed in this article. The terrestrial gravity data used in the RCR computations of this study are gravity anomalies at five arc-minute resolution in the IGSN-71 datum. Reference [82] reports that the estimated accuracy of the used gravity anomaly grid is at the level of 7–8 mGal based on cross-validations, as well as validations using point-wise gravity observations.

Using the terrestrial gravity data and the identical computation parameters, twelve experimental detailed geoid models (expTG-) were calculated for Turkey. Each of these models relies on a different GGM with either the optimum (ℓ<sub>opt</sub> = 155) (called expTG-Opt models) or the maximum (called expTG-M models, having ℓ<sub>max</sub> = 330/300/280) degrees of the spherical harmonic expansions. In the tests, in addition to the best performing DIR RL5, TIM RL5, SPW RL5 and GOCO05S models, the EGM2008 coefficients up to the identical expansion degrees (ℓ<sub>max</sub> = 155, 280, 300, 330) of the other tested models were also considered (see the test statistics of the experimental models in Table 3).

The twelve experimental models were validated at 81 GPS/leveling benchmarks of the geodetic network in northwest Turkey. Regarding the obtained results (see Table 3), all the validated experimental geoid models (relying on the reference GGM with optimum d/o of spherical harmonic expansion) provided an accuracy of 16.5 cm in terms of standard deviations of the residual geoid undulations without removing the tilt/trend at GPS/leveling benchmarks. However, employing the GGMs with the maximum degrees of their expansions, the calculated experimental geoid models revealed an accuracy of approximately 20 cm. This means that the use of optimum d/o of the geopotential model led to an improvement in the accuracy of the regional geoid model using Stokes' integral without any modification and any cap size limitation. The table also includes the statistics of the experimental geoid models after de-trending the models' surface simply using a third-order polynomial equation. The cross-validation of the de-trended models gave a so-called improved accuracy of around 9.7 cm for the models. The expTG-Opt-1 geoid model surface is shown in Figure 8.
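The de-trending step mentioned above can be illustrated with an ordinary least-squares fit of a full third-order bivariate polynomial (10 terms) to the residuals at the benchmarks. This is a sketch under stated assumptions and not the procedure of this study: the coordinates are assumed to be centred and scaled to roughly [−1, 1] beforehand, the normal equations are solved with a simple Gaussian elimination, the function names are hypothetical, and the cross-validation mentioned in the text is not included.

```python
def cubic_terms(x, y):
    """The 10 monomials x**i * y**j with i + j <= 3."""
    return [x ** i * y ** j for i in range(4) for j in range(4 - i)]


def solve(a, b):
    """Solve the square system a * x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (m[r][n] - sum(m[r][c] * x[c] for c in range(r + 1, n))) / m[r][r]
    return x


def detrend(xs, ys, residuals):
    """Remove a least-squares third-order polynomial surface from the residuals."""
    rows = [cubic_terms(x, y) for x, y in zip(xs, ys)]
    k = len(rows[0])
    normal = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    rhs = [sum(r[i] * d for r, d in zip(rows, residuals)) for i in range(k)]
    coeffs = solve(normal, rhs)
    return [d - sum(c * t for c, t in zip(coeffs, r)) for d, r in zip(residuals, rows)]
```

A third-order surface has ten free parameters, so with only a few dozen benchmarks it can absorb part of the genuine geoid signal as well as the datum-related trend; this is one reason to read de-trended accuracies with some caution.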

**Figure 8.** The experimental Turkey geoid, calculated using the remove–compute–restore (RCR) approach based on the reference geopotential model DIR-RL5.
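The de-trending step reported in Table 3 can be illustrated with a small least-squares sketch. The example below fits a third-order polynomial surface to residual geoid undulations at 81 points and compares the standard deviation before and after removing it. The coordinates, the trend and the noise are synthetic and purely illustrative (they are not the data of this study), and all function names are hypothetical.

```python
import math
import random

def cubic_terms(x, y):
    """Return the 10 monomials x^i * y^j with i + j <= 3."""
    return [x**i * y**j for i in range(4) for j in range(4 - i)]

def solve_linear(a, b):
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in reversed(range(n)):
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x

def detrend_cubic(xs, ys, residuals):
    """Fit a third-order polynomial surface and return the de-trended residuals."""
    rows = [cubic_terms(x, y) for x, y in zip(xs, ys)]
    k = len(rows[0])
    normal = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    rhs = [sum(r[i] * v for r, v in zip(rows, residuals)) for i in range(k)]
    coeffs = solve_linear(normal, rhs)
    return [v - sum(c * t for c, t in zip(coeffs, r)) for r, v in zip(rows, residuals)]

def std(values):
    """Sample standard deviation."""
    mean = sum(values) / len(values)
    return math.sqrt(sum((v - mean) ** 2 for v in values) / (len(values) - 1))

# Synthetic example (assumption): normalized coordinates in [-1, 1],
# a smooth trend in cm, and 1 cm white noise at 81 benchmarks.
rng = random.Random(0)
xs = [rng.uniform(-1.0, 1.0) for _ in range(81)]
ys = [rng.uniform(-1.0, 1.0) for _ in range(81)]
trend = [5.0 + 12.0 * x - 8.0 * y + 3.0 * x * y**2 for x, y in zip(xs, ys)]
residuals = [t + rng.gauss(0.0, 1.0) for t in trend]
detrended = detrend_cubic(xs, ys, residuals)
print(f"std before fit: {std(residuals):.1f} cm, after fit: {std(detrended):.1f} cm")
```

Normalizing the coordinates to [-1, 1] before the fit keeps the normal equations of the ten-parameter surface well conditioned.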

As also seen in Table 3 (statistics before fit), the detailed geoid models, which were calculated using the classical RCR algorithm with GOCE-based reference models, are not more accurate than the enhanced EGM2008 model. This is a consequence of the insufficient accuracy and density of the mean terrestrial gravity anomalies that were used in the computations. The role of the terrestrial gravity data accuracy for the quality of the calculated geoid model is explained by [83,84]. Accordingly, terrestrial gravity data with 1–2 mGal accuracy and 1–2 km spatial resolution in a homogeneous distribution are required if an accuracy of 1–2 cm is targeted for the geoid undulations. Measured against this criterion, the accuracy and resolution of the mean gravity anomalies used in the computations (6 mGal accuracy and 5′ grid spacing) do not satisfy the requirements for determining a regional geoid with centimeter accuracy. The accuracy of the GPS/leveling data used in the validations (~1.5–2.0 cm three-dimensional position accuracy and ~2.0 cm leveling height accuracy) was sufficient for the purpose of this study. Although it is not officially declared in the technical reports explaining the calculation strategy and data content of EGM2008, if terrestrial gravity data of Turkey or the surrounding area were used in the EGM2008 computations, this could explain the superior performance of this model in the validation results.

It is possible to obtain slightly improved accuracy in regional geoid models by applying modifications to the Stokes' integral in order to achieve a more rigorous combination of the geopotential models with terrestrial data by also considering their variances [38]. Hence, it will be possible to compensate for the random errors that are contained in the terrestrial data with the improved quality of the geopotential models up to a certain amount.


**Table 3.** Validation statistics of experimental geoid models at 81 GPS/leveling benchmarks (in cm).

expTG: experimental Turkey Geoid, Opt: optimal d/o of reference GGM, -M: max. d/o of ref. GGM.

#### **4. Conclusions**

The purpose of this study is to investigate the progress of GOCE High-Level Processing Facility (HPF) released geopotential models that include data from gravity field satellite missions, and provide assessment results of these models for Turkey. The analyses were carried out by means of both spectral and spatial approaches. Regarding the numerical results and obtained statistics through these analyses, the following conclusions were drawn:

GOCE HPF-released GOCE- and GRACE-based global geopotential models were validated using the cumulative degree variances. Regarding the spherical harmonic expansion degrees (between 100 and 200 degrees of SH expansions, corresponding to the medium-wavelengths of the gravity field, where a significant contribution of the GOCE mission is expected), the cumulative formal errors (in terms of the geoid undulations) were 0.8 cm, 2.2 cm and 2.7 cm for DIR RL5, TIM RL5 and SPW RL4 models, respectively, whereas they were 2.0 cm, 2.9 cm and 7.1 cm for GOCO05S, GOGRA04S and EGM2008 models, respectively.

Considering the cumulative error amplitudes at the corresponding degrees for 100 km spatial resolution, the significant improvement toward the final releases of the geopotential models was verified (the improvement of the final releases is almost up to 80% with respect to the first releases). On the other hand, the cumulative formal error estimate of the TIM RL5 model is about three times larger than that of the DIR RL5 model. Although the difference seems quite significant, it is debatable whether such a comparison is objective without any calibration.

The regional validations of the tested geopotential models against the GPS/leveling data were carried out with the spectral enhancement method. Hence, the higher frequencies in the content of the tested GGMs were synthesized from ultra-high degree coefficients of the EGM2008 model and the regional DTM data. As a conclusion, the spectral enhancement method revealed meaningful results and is recommended to be used for GGM validations using terrestrial data. In the results of the comparisons among the model releases (in every group of models as DIR, TIM, SPW), it is concluded that the model accuracy improves as its release increases, and that this improvement is independent of the computation strategy.

The spatial analyses of GGMs were carried out using GPS/leveling data of high-order geodetic networks in Turkey. The accuracy of the control data, their spatial distribution and density, as well as the size of the validation area, are critical for the model validations and affect the reported performances. In this respect, the second test dataset is more suitable, and the validation statistics obtained with it are therefore more realistic for the detailed analyses of the GGMs in the study area. The spatial analysis results clarified the GOCE mission's contribution to the improvement of the GGMs (with respect to the EGM2008 model), especially in the spectral band between degrees 100 and 250. Moreover, by comparing the standard deviations of the geoid undulation differences calculated with an increasing degree of the spherical harmonic expansion, the optimum degree was determined as the one with the minimum standard deviation. Accordingly, the optimal d/o of the spherical harmonic expansion is around 155 in the study area. In the test results explained in Section 3.2.1, the accuracy of the tested GGMs is about 12 cm. The EGM2008 model accuracy is calculated to be around 15 cm in Turkey. This is three times larger than the reported absolute accuracy of this model for areas on Earth that are well covered with quality terrestrial data. This result indicates that either EGM2008 does not include the terrestrial data of Turkey or the quality of the included data is low. In the analyses, the GOGRA04S and GOCO05S models were also validated. However, since the obtained statistics did not reveal any improvement, it can be concluded that the complementary advantage of combining the GOCE gradiometry data with the observations from other gravity field satellite missions and SLR is not significant.

In the final part of the study, the role of the reference geopotential model accuracy, as well as its optimal expansion degree in determination of the detailed hybrid geoid model, were validated based on the RCR computation algorithm.

The obtained results revealed a 15% improvement in the accuracy of the gravimetric geoid model (considering the results without corrector surface fitting) when the GGM is employed with its optimum degree instead of the maximum degree (~16.5 cm vs. 19.5 cm for DIR RL5, as seen in Table 3). This conclusion was drawn using an unmodified Stokes kernel. On the other hand, a combination of the GGMs with terrestrial gravity data is possible using the RCR algorithm with stochastic modification of the Stokes kernel. Various approaches for the modification can be found in the literature. Hence, a more rigorous combination of the data may contribute to further improvement of the detailed geoid accuracy by compensating for the error budget, so that the improvement of the GGMs could be exploited to a certain extent. A recent research project in Turkey, of which the present study is a part (supported by The Scientific and Technological Research Council of Turkey, TUBITAK, under contract number 114Y581), applied different methodologies for detailed geoid modeling using hybrid data, and clarified the role of the modifications in a rigorous combination of GGMs with terrestrial data in order to improve the accuracy of a regional geoid model.
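The improvement of about 15% quoted at the beginning of the preceding paragraph can be checked directly from the two standard deviations given there, assuming that the maximum-degree result is taken as the reference:

$$\frac{19.5\ \text{cm} - 16.5\ \text{cm}}{19.5\ \text{cm}} \approx 0.154 \approx 15\%$$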

As a summary, global geopotential models have significantly progressed due to the data contribution from recent Earth gravity field satellite missions. The improved GGMs, in terms of accuracy and resolution, hence benefit many engineering and Earth science disciplines that use gravity field parameters in their analyses. The most significant impact of the improvement in GGMs' quality is on the vertical datum definition and realization, and global unification of the height systems. This study provided the methodological steps for the assessment of GGMs and quantified the improvement of the models in various regions, with cited statistics from the literature and numerical test results in Turkey.

**Author Contributions:** Conceptualization, B.E. and S.E.; methodology, B.E.; software, M.S.I.; validation, B.E., M.S.I. and S.E.; formal analysis, M.S.I.; investigation, M.S.I.; resources, B.E.; data curation, B.E., M.S.I. and S.E.; writing—original draft preparation, B.E.; writing—review and editing, S.E. and M.S.I.; visualization, M.S.I.; supervision, S.E.; project administration, B.E.; funding acquisition, B.E. All authors have read and agreed to the published version of the manuscript.

**Funding:** This research was funded by The Scientific and Technological Research Council of Turkey (TÜBİTAK), grant number 114Y581, and Istanbul Technical University Scientific Research Projects (ITU BAP) funds, grant number 38828.

**Acknowledgments:** We acknowledge the International Center for Global Earth Models (ICGEM) at the GFZ German Research Center for Geosciences, from whose data center the geopotential models used in this study were obtained.

**Conflicts of Interest:** The authors declare no conflicts of interest. The funders had no role in the design of the study; in the collection, analyses, or interpretation of data; in the writing of the manuscript, or in the decision to publish the results.

#### **References**


© 2020 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).

## *Article* **Next-Generation Gravity Missions: Sino-European Numerical Simulation Comparison Exercise**

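**Roland Pail <sup>1,\*</sup>, Hsien-Chi Yeh <sup>2</sup>, Wei Feng <sup>3</sup>, Markus Hauk <sup>1</sup>, Anna Purkhauser <sup>1</sup>, Changqing Wang <sup>3</sup>, Min Zhong <sup>3</sup>, Yunzhong Shen <sup>4</sup>, Qiujie Chen <sup>4</sup>, Zhicai Luo <sup>5</sup>, Hao Zhou <sup>5</sup>, Bingshi Liu <sup>6</sup>, Yongqi Zhao <sup>6</sup>, Xiancai Zou <sup>6</sup>, Xinyu Xu <sup>6</sup>, Bo Zhong <sup>6</sup>, Roger Haagmans <sup>7</sup> and Houze Xu <sup>3</sup>**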


Received: 10 October 2019; Accepted: 11 November 2019; Published: 13 November 2019

**Abstract:** Temporal gravity retrieval simulation results of a future Bender-type double-pair mission concept, performed by five processing centers of a Sino-European study team, have been inter-compared and assessed. They were computed in a synthetic closed-loop simulation world by five independent software systems applying different gravity retrieval methods, but were based on jointly defined mission scenarios. The inter-comparison showed that the results achieve quite similar performance. For example, the root mean square (RMS) deviations of global equivalent water height fields from their true reference, resolved up to degree and order 30 of a 9-day solution, vary in the order of 10% of the target signal. In addition, the independent daily gravity fields up to degree and order 15, which have been co-estimated by all processing centers, do not show large differences among each other. This positive result is an important prerequisite and basis for future joint activities towards the realization of next-generation gravity missions.

**Keywords:** next-generation gravity mission; temporal gravity field; numerical closed-loop simulation; satellite mission constellations; mass transport

#### **1. Introduction**

Next Generation Gravity Missions (NGGMs), expected to be launched in the midterm future, have raised high expectations for an enhanced monitoring of mass transport in the Earth system, with products that are applicable to new scientific fields and serve societal needs. Past and current gravity missions such as the Gravity Recovery and Climate Experiment (GRACE, [1]) and GRACE Follow-On [2] have improved our understanding of many mass change processes, such as the global water cycle, melting of ice sheets and glaciers, and changes in ocean mass closely related to eustatic sea level rise, which are subtle indicators of climate change, but also gravity changes related to solid Earth processes such as large earthquakes or glacial isostatic adjustment (GIA). The Gravity field and steady-state Ocean Circulation Explorer (GOCE) mission [3] has improved our knowledge of long-term mass distribution and has provided the physical reference surface of the geoid, with a resolution down to 70–80 km.

The main objective of NGGMs is not only the continuation and extension of existing mass transport time series, but also a significant gain in spatial and temporal resolution. Correspondingly, the science and user needs of NGGMs were defined in [4]. In several previous studies, different mission concepts have been studied in detail, with emphasis on orbit design and the resulting spatial-temporal ground track pattern, enhanced processing and parameterization strategies, and improved post-processing/filtering strategies in order to reduce temporal aliasing effects, which are one of the main error sources of current temporal gravity field solutions [5]. The typical GRACE-type concept of two satellites (satellite pair) following each other on the same orbit with an inter-satellite distance of about 200 km observes only the along-track component of the Earth's gravity field, and therefore leads to a very anisotropic error structure and the typical striping patterns in the resulting temporal gravity solutions. An alternative mission concept is a satellite pair in pendulum configuration, where the trailing satellite performs a pendulum motion with respect to the leading satellite, thus observing not only the along-track, but also parts of the cross-track component. This can be realized by a shift of the right ascension of the ascending node of the second satellite, resulting in a shifted orbit plane of the second satellite compared to the first one. Based on this concept, in 2010 the mission proposal "e.motion – Earth System Mass Transport Mission" [6] was submitted in response to ESA's Earth Explorer 8 call. Another promising mission scenario is a double-pair constellation composed of two in-line pairs, the so-called Bender configuration [7]. It consists of a polar pair similar to the GRACE-type concept, and an inclined pair with an inclination of 65–70 degrees. An example of the resulting ground-track pattern of such a Bender constellation is shown in Figure 1.

**Figure 1.** Ground track pattern of Bender configuration of Scenario A (cf. Table 1). The red curve shows the 9-day near-repeat ground track pattern of the polar satellite pair, and the blue curve the ground track of the inclined pair with an inclination of 70°. The zoom-in illustrates the resulting regular spatial sampling pattern as well as the fact that the direction of the inter-satellite ranges is different.


**Table 1.** Main orbit parameters of numerical simulation scenarios.

Extensive numerical simulations have shown that this set-up also results in a significantly improved isotropy and reduction of stripes, e.g., [8,9], which is mainly due to the different directions of inter-satellite ranging observations, and the different orbital frequencies of the two pairs. Based on this concept, in 2016 e.motion<sup>2</sup> was proposed as an ESA Earth Explorer 9 mission [10]. Another innovative observation concept is high-precision high-low inter-satellite ranging, which was proposed as the MOBILE mission in response to ESA's Earth Explorer 10 call, e.g., [11,12]. On the US side, the United States National Academies of Sciences, Engineering, and Medicine published the decadal strategy for Earth observation from space [13], where mass change was identified as one of the top five designated observables to be implemented by future US Earth observation missions, in order to ensure continuity and enable long-term mass budget analyses of the Earth system.

An NGGM concept was also proposed in China, and the key technologies have been developed to meet the requirements of NGGM [14–16]. The objectives of the Chinese NGGM are not only to perform global gravity measurements, but also to verify the maturity of key technologies required for space-based gravitational-wave detection [17] to a certain level of precision. The scientific objectives and the system requirements of the Chinese NGGM are currently under investigation. This mission is expected to be approved soon, and is planned to be launched in the mid-2020s.

This study goes back to a Round Table China–Europe meeting on Satellite Gravity Exploration, which took place in September 2013 in Beijing. There it was agreed to evaluate options for further joint activities on NGGMs, especially science studies on numerical mission simulations. This was eventually realized in the frame of the European Space Agency's "Additional Constellation and Scientific Analysis Studies of the Next Generation Gravity Mission (ADDCON)" study. ESA's ADDCON study team and a Chinese study team joined forces in an initiative to perform numerical studies on NGGMs based on double-pair and multi-pair constellations of GRACE/GRACE-FO-type satellites.

A main prerequisite for the performance assessment and interpretation of joint numerical simulations is to ensure comparability of the results of the contributing groups. Therefore, it was decided early in the project to jointly set up common test scenarios and corresponding data sets, to be processed by all contributing processing centers. This paper reports on the results of this inter-comparison exercise. The contributing processing centers are the Technical University of Munich (TUM), the Institute of Geodesy and Geophysics, Chinese Academy of Sciences (IGG-CAS), Tongji University (Tongji), Wuhan University (WHU), and Huazhong University of Science and Technology (HUST).

The paper is structured as follows. Chapter 2 describes the basic idea of a numerical closed-loop NGGM simulation and the main input data, which have been jointly used by all groups. Chapter 3 provides an overview of the different methods applied by the study teams. Chapter 4 reports the numerical simulation results, together with their inter-comparison and assessment. The final Chapter 5 contains the main conclusions as well as an outlook to future work.

#### **2. Closed-Loop Simulations and Input Data**

The inter-comparison exercise is based on so-called numerical closed-loop simulations. Compared to real-data processing, the advantage of these closed-loop simulations, which are performed in a completely synthetic simulation world, is that the truth is perfectly known. Therefore, the effect of any data set, instrument noise assumption, or processing strategy is directly reflected by the deviation of the final result from the "true world", and can be studied separately from all other contributors to the total error budget.

Figure 2 shows an overview of the numerical simulation set-up. The simulation starts with the definition of the "true world" (upper left), using force models for the static gravity field as well as tidal and non-tidal temporal variations. All gravity-related models are parameterized as coefficients of a spherical harmonic (SH) series expansion. In parallel, specific orbit constellations and the corresponding orbit parameters have to be defined, in our case Bender-type double pairs. The corresponding orbits are generated by numerical integration. Based on the a-priori force models, error-free observations are computed along the orbit tracks, with a sampling period of 5 s. Instrument errors with pre-specified spectral noise characteristics are then superimposed on them, resulting in "true observations", which mimic the real observations as realistically as possible. On the other hand, a reference world is built up (upper right). This is necessary because the functional model of gravity field estimation from low–low inter-satellite ranging observations is highly non-linear, and therefore requires a Taylor point as approximate prior knowledge. In the same way as for the "true observations", so-called "reference observations" are computed, which are based on force models that are usually different from those used for the "true world" scenario. The main goal of the gravity field determination is then to estimate residuals with respect to these a-priori models. In the ideal case (no errors in the system), adding the residuals to the a-priori models should result in the true world models. If errors do exist, such as instrument errors, but also aliasing errors due to the specific space-time sampling properties of the chosen orbit constellation, the estimated gravity field will deviate from the true world, and this deviation provides a clear indication of the achievable performance of the specific mission design.

**Figure 2.** Scheme of a closed-loop simulation set-up. The red blocks are related to orbit design and input data generation, the blue blocks are tightly linked to the gravity parameter estimation process, while the green block refers to the comparison and validation of the resulting gravity models.
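The logic of the closed-loop set-up can be sketched with a deliberately simple toy model. In the example below, the "gravity field" is replaced by two parameters of a straight line and the "observations" by samples of that line every 5 s. The real functional model is non-linear and far larger, so this is only an illustration of the data flow (true world, reference world, residual estimation, comparison with the truth); all names and numbers are assumptions made for the sketch.

```python
import random

def simulate_observations(params, epochs, noise_sigma=0.0, rng=None):
    """Toy observation model y(t) = a + b * t, with optional white noise."""
    a, b = params
    obs = [a + b * t for t in epochs]
    if noise_sigma > 0.0:
        obs = [y + rng.gauss(0.0, noise_sigma) for y in obs]
    return obs

def estimate_residual_params(reduced_obs, epochs):
    """Least-squares estimate of (da, db) via the 2x2 normal equations."""
    n = len(epochs)
    st = sum(epochs)
    stt = sum(t * t for t in epochs)
    sy = sum(reduced_obs)
    sty = sum(t * y for t, y in zip(epochs, reduced_obs))
    det = n * stt - st * st
    da = (stt * sy - st * sty) / det
    db = (n * sty - st * sy) / det
    return da, db

def closed_loop(true_params, reference_params, epochs, noise_sigma, seed=0):
    """Run one closed loop and return the estimate and its error w.r.t. the truth."""
    rng = random.Random(seed)
    # "True observations": true world plus instrument noise.
    true_obs = simulate_observations(true_params, epochs, noise_sigma, rng)
    # "Reference observations": computed from the a-priori (reference) model.
    ref_obs = simulate_observations(reference_params, epochs)
    reduced = [yt - yr for yt, yr in zip(true_obs, ref_obs)]
    da, db = estimate_residual_params(reduced, epochs)
    # Adding the residuals to the a-priori model gives the estimated model.
    estimate = (reference_params[0] + da, reference_params[1] + db)
    errors = tuple(e - t for e, t in zip(estimate, true_params))
    return estimate, errors

epochs = [5.0 * k for k in range(100)]  # 5 s sampling
_, err_clean = closed_loop((1.0, 0.02), (0.9, 0.0), epochs, noise_sigma=0.0)
_, err_noisy = closed_loop((1.0, 0.02), (0.9, 0.0), epochs, noise_sigma=0.1)
print("errors without noise:", err_clean)
print("errors with noise:   ", err_noisy)
```

Without noise, the estimate reproduces the true parameters up to rounding; with noise, the deviation from the truth is the quantity that a closed-loop simulation is designed to expose.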

For the inter-comparison exercise, two Bender double-pair scenarios have been defined (Table 1). They are based on extensive prior studies on the optimum orbit design of double-pair configurations with the goal of optimizing the space-time sampling. The two scenarios A and B mainly differ in the orbit altitude of the polar satellite (390 vs. 340 km) and their repeat cycle (9 vs. 7 days). The orbits were designed such that (a) the ground tracks of the polar and inclined satellite pair drift at the same rate within the sub-cycle period, and (b) the retrieval period is always equal to the length of the sub-cycle, so that the gravity field retrieval is always supported by the densest possible ground track pattern. As a side aspect, these drifting orbits have the additional advantage that they produce a dense ground-track distribution over longer periods of time, which might improve the spatial resolution for the gravity retrieval of long-term static gravity solutions.

Table 2 provides an overview of the force and noise models of the true and reference world. Static, tidal, and non-tidal gravity models are used up to SH degree and order (d/o) 120.

**Table 2.** Force and noise models of the "true" and "reference" world used in the Next Generation Gravity Mission (NGGM) simulations.


The ISL is assumed to be realized by a Laser Ranging Interferometer (LRI). The assumed spectral noise behavior of the LRI in terms of range rate is expressed by the following Amplitude Spectral Density (ASD) [22]:

$$d_{\text{range-rates}} = 2 \times 10^{-8} \times 2\pi f \times \sqrt{\left(\frac{10^{-2}\,\text{Hz}}{f}\right)^{2} + 1}\ \frac{\text{m}}{\text{s}\,\sqrt{\text{Hz}}} \tag{1}$$

where *f* is the linear frequency. The resulting error spectrum is shown in Figure 3a (dark blue curve), and the corresponding noise realization in light blue color.

**Figure 3.** Amplitude spectral density (ASD) of (**a**) relative distance measurement errors in terms of range-rates, and (**b**) accelerometer errors. The dark blue and red curves show the analytical noise models in accordance with Equations (1) to (3), and the light blue and green curves the corresponding noise realizations, which have been generated by scaling the spectrum of normally distributed random time-series with their individual spectral model.
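As a small illustration, Equation (1) can be evaluated directly. The sketch below implements the range-rate ASD as a plain function (the function name is hypothetical) and prints it at a few frequencies. The factor 2π*f* can be read as the conversion from a range spectrum to a range-rate spectrum, so that the model tends to a constant at low frequencies and grows linearly with *f* at high frequencies.

```python
import math

def lri_range_rate_asd(f_hz):
    """ASD of the LRI noise in terms of range rates, Equation (1), in m/s/sqrt(Hz)."""
    if f_hz <= 0.0:
        raise ValueError("frequency must be positive")
    return 2e-8 * 2.0 * math.pi * f_hz * math.sqrt((1e-2 / f_hz) ** 2 + 1.0)

for f in (1e-4, 1e-3, 1e-2, 1e-1):
    print(f"f = {f:.0e} Hz -> {lri_range_rate_asd(f):.3e} m/s/sqrt(Hz)")
```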

The non-gravitational forces are typically sensed by the on-board accelerometers located in the center-of-mass of the satellite. In our simulations, an electrostatic accelerometer is assumed with two high-sensitive axes oriented in the flight direction (largest signal) and radial direction, and one low-sensitive axis in the cross-track direction (see Figure 3b). The accuracy level in terms of accelerations is expressed by [22]:

$$d_{\text{acc. along}} = d_{\text{acc. radial}} = 10^{-11} \sqrt{\frac{\left(\frac{10^{-3}\,\text{Hz}}{f}\right)^{4}}{\left(\frac{10^{-5}\,\text{Hz}}{f}\right)^{4} + 1} + 1 + \left(\frac{f}{10^{-1}\,\text{Hz}}\right)^{4}}\ \frac{\text{m}}{\text{s}^{2}\sqrt{\text{Hz}}} \tag{2}$$

$$d_{\text{acc. across}} = 10 \times d_{\text{acc. along}} \tag{3}$$

It has a noise level of 10<sup>−11</sup> m/s<sup>2</sup> in the measurement bandwidth, and a 1/*f*<sup>2</sup> slope at frequencies of 10<sup>−3</sup> Hz and lower. Equation (3) expresses the fact that electrostatic accelerometers, as they have been used in all previous and current satellite gravity missions, usually have two high-sensitive axes and one less-sensitive axis. In our set-up, we assume a degraded performance of the less-sensitive axis by one order of magnitude, and that it is oriented in the across-track direction. In this study, we further assume that the attitude, which is usually a minor error contributor compared to ISL and ACC, is known perfectly. Orbit errors are simulated by white noise with a standard deviation of 1 cm. The ISL and ACC noise time-series were generated by scaling the spectrum of normally distributed random time-series with their individual analytical spectral noise model, as given in Equations (1) to (3).
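The accelerometer noise model and the generation of a noise realization by spectral scaling can be sketched as follows. The code assumes the fraction form of Equation (2), in which the (10<sup>−3</sup> Hz/*f*)<sup>4</sup> term is divided by ((10<sup>−5</sup> Hz/*f*)<sup>4</sup> + 1), which is consistent with the 1/*f*<sup>2</sup> slope stated above. The normalization of the white-noise spectrum, the naive discrete Fourier transform, the series length and all function names are choices made for this illustration and are not taken from the software of the processing centers.

```python
import cmath
import math
import random

def acc_asd_along(f_hz):
    """ASD of the high-sensitive accelerometer axes, Equation (2), in m/s^2/sqrt(Hz).

    Assumption: the (1e-3 Hz / f)^4 term is divided by ((1e-5 Hz / f)^4 + 1).
    """
    low = (1e-3 / f_hz) ** 4 / ((1e-5 / f_hz) ** 4 + 1.0)
    high = (f_hz / 1e-1) ** 4
    return 1e-11 * math.sqrt(low + 1.0 + high)

def acc_asd_across(f_hz):
    """Equation (3): the less-sensitive axis is ten times noisier."""
    return 10.0 * acc_asd_along(f_hz)

def colored_noise(asd, n, dt, seed=0):
    """Scale the spectrum of white Gaussian noise so that its one-sided ASD follows asd(f)."""
    rng = random.Random(seed)
    white = [rng.gauss(0.0, 1.0) for _ in range(n)]
    fs = 1.0 / dt
    # Forward DFT (naive O(n^2) sum, adequate for a short illustration).
    spectrum = [
        sum(white[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]
    shaped = [0j] * n  # the zero-frequency bin is left at zero
    for k in range(1, n):
        f = min(k, n - k) * fs / n  # absolute frequency of bin k
        # Unit-variance white noise has a one-sided ASD of sqrt(2 / fs).
        shaped[k] = spectrum[k] * asd(f) / math.sqrt(2.0 / fs)
    # Inverse DFT; the result is real because the scaling is symmetric in k.
    return [
        sum(shaped[k] * cmath.exp(2j * math.pi * k * t / n) for k in range(n)).real / n
        for t in range(n)
    ]

# Short accelerometer noise realization at the 5 s sampling of the simulations.
acc_noise = colored_noise(acc_asd_along, n=128, dt=5.0)
print(f"ASD at 1e-2 Hz: {acc_asd_along(1e-2):.3e}, at 1e-4 Hz: {acc_asd_along(1e-4):.3e}")
```

In practice, a fast Fourier transform and much longer time-series would be used; the naive transform is kept here only to make the sketch self-contained.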

#### **3. Methods**

The five contributing processing centers used their specific software systems for NGGM simulation and gravity field recovery. However, as a joint spatial–temporal parametrization, low-degree daily gravity field parameters were co-estimated together with the higher-degree temporal gravity solution, where the latter covers a retrieval period of 9 days (Scenario A) and 7 days (Scenario B). This co-estimation was first proposed by [23] and investigated in detail in [9]. There it could be shown that by this "Wiese parameterization" temporal aliasing can be reduced significantly, because short-term temporal gravity signals (mainly atmosphere and ocean signals), which would alias in standard processing into the solutions, are explicitly estimated, thus avoiding their aliasing effect. On the one hand this "self-dealiasing" approach improves the gravity field solution over the whole spectral range, and on the other hand, as a positive side effect daily gravity field products, which are independent of each other, are produced. This might be an important aspect for operational service applications, where usually a high temporal resolution and short latencies are required [24].
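To give a feeling for the size of the estimation problem implied by this parameterization, the sketch below counts the gravity field unknowns of one retrieval period. It uses the maximum degrees stated in this paper (daily fields up to d/o 15, full solution up to d/o 70). The minimum degree of 2 is an assumption of this example, and arc-wise and empirical parameters, which differ between the processing centers, are not counted.

```python
def n_coefficients(n_min, n_max):
    """Number of coefficients C_nm, S_nm for degrees n_min <= n <= n_max (2n + 1 per degree)."""
    return sum(2 * n + 1 for n in range(n_min, n_max + 1))

def n_gravity_unknowns(n_days, n_daily_max=15, n_max=70, n_min=2):
    """Daily low-degree fields plus one higher-degree field for the whole period."""
    daily = n_days * n_coefficients(n_min, n_daily_max)
    higher = n_coefficients(n_daily_max + 1, n_max)
    return daily + higher

print("Scenario A (9 days):", n_gravity_unknowns(9))
print("Scenario B (7 days):", n_gravity_unknowns(7))
```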

In all five processing centers, the non-tidal gravity field signals along the orbits were derived from the ESA Earth System Model (ESM, [19]) for all five components: atmosphere (A), ocean (O), hydrology (H), ice (I), and solid Earth (S), together abbreviated as AOHIS. Based on the 6-hourly AOHIS input products, the signals at the measurement epochs were derived by linear interpolation. In the recovery, the full AOHIS signal was estimated, i.e., no a-priori non-tidal de-aliasing was performed, which is possible due to the co-parameterization of daily gravity field parameters [8,9]. Therefore, no additional errors were assumed for the AOHIS signal either. The true world is composed of this AOHIS signal itself, and the reference world does not contain any non-tidal temporal gravity signal (cf. Table 2).
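The linear interpolation of the 6-hourly input products to a measurement epoch can be written in a few lines. The sketch below is a minimal version that treats each input field as a flat list of coefficients; the function name, the data layout and the numbers in the example are hypothetical.

```python
def interpolate_coefficients(epoch_h, coeffs_by_epoch, step_h=6.0):
    """Linearly interpolate coefficient sets given every step_h hours to epoch_h (hours).

    coeffs_by_epoch[i] holds the coefficients valid at time i * step_h.
    """
    last_epoch_h = step_h * (len(coeffs_by_epoch) - 1)
    if not 0.0 <= epoch_h <= last_epoch_h:
        raise ValueError("epoch outside the time span of the input fields")
    # Index of the field at or before the epoch (clamped so that index + 1 exists).
    index = min(int(epoch_h // step_h), len(coeffs_by_epoch) - 2)
    weight = (epoch_h - index * step_h) / step_h
    before, after = coeffs_by_epoch[index], coeffs_by_epoch[index + 1]
    return [(1.0 - weight) * b + weight * a for b, a in zip(before, after)]

# Three illustrative 6-hourly "fields" with two coefficients each.
fields = [[0.0, 1.0], [6.0, 4.0], [12.0, 1.0]]
print(interpolate_coefficients(3.0, fields))
```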

In the following, we describe briefly the different methods applied by the five processing centers. Detailed information can be found in the cited references.

TUM: All simulations were executed with a full-scale numerical mission simulator [25,26], which has already been successfully applied in real data applications to recover satellite-only gravitational field models for CHAMP (Challenging Mini-Satellite Payload), GRACE and GOCE. The simulation environment is based on numerical orbit integration following a multistep method [27], which applies a modified divided difference form of the Adams-Bashforth-Moulton Predict-Evaluate-Correct-Evaluate (PECE) formulas and local extrapolation. The adopted gravity field approach is based on a modification of the integral equation approach from [28], where the orbit is divided into continuous short arcs of 6 h length. The position vectors at the arc node points are set up as unknown parameters, which are estimated together with the gravity field coefficients. As a stochastic observation model, autocovariances computed from an auto-regressive moving average (ARMA) filter model, approximating the spectral characteristics of the residuals of instrument errors, are used [9].

IGG-CAS: The simulation work was done with a full numerical mission simulator, which is modified from the software tool developed at IGG-CAS for recovering satellite-only gravitational field models from GRACE Level-1b observations [29]. The simulation environment is based on numerical orbit integration applying the multistep Gauss-Jackson method [30], with start-up values obtained from a single-step Runge-Kutta integration. The gravity field recovery is based on the variational equation approach or dynamic method, where the orbit is divided into continuous arcs of 6 h length, and initial state vectors at the beginning of each arc are set up as unknown parameters, which are estimated together with the gravity field coefficients. The 1-cpr effect in the residuals of the range-rate observations is modelled [31], and the corresponding parameters are also estimated together with the gravity field coefficients.

Tongji: The simulations were carried out by using the Satellite Gravimetry Analysis Software (SAGAS) developed by Tongji University [32–35], which has been successfully applied to determine gravity field models from GRACE Level-1b observations. The simulation environment is based on numerical orbit integration, where the Adams and Krogh-Shampine-Gordon (KSG) numerical integration methods are jointly used. The adopted gravity field approach is the modified short-arc approach [32–34], where the observation equation is directly linearized with respect to the real orbit observations and the integral arc is divided into continuous short arcs of 2 h in length. Non-conservative acceleration observation errors are estimated together with the gravity field coefficients according to [33,34]. Variance–covariance matrices for both orbit and range-rate observations are constructed from residuals of the instrument errors on the basis of [35].

WHU: The WHU team has developed a software platform which has been used in the real data processing of modern satellite gravity missions such as GRACE and GOCE [36,37]. For the evaluation of the NGGM schemes described in this paper, the brute-force dynamic method was selected as the basic approach in the numerical simulations [38]. For the non-conservative forces, only accelerometer noise was provided, so the orbit was integrated with the non-gravitational forces set to zero. In the parameter estimation step, the arc length was set to 1.5 h for an optimal estimation. The initial satellite states of every arc and the low-degree geopotential coefficients up to d/o 15 were treated as local parameters, and the other geopotential coefficients up to degree 70 were estimated as global ones [23]. Regarding the stochastic model, a filtering strategy for processing band-limited measurements proposed by Schuh [39] was applied to deal with the colored observation noise.

HUST: All simulations were implemented with a full numerical mission simulator, which has been successfully used to develop the HUST gravity field models [40,41]. The simulation environment is based on numerical orbit integration. The Gauss-Jackson multistep method is used for the numerical integration, with start-up values obtained from a single-step Runge-Kutta integration. The adopted gravity field approach is based on the variational equation approach or dynamic method. The orbit is divided into continuous arcs of 6 h length, and initial state vectors at the beginning of each arc are set up as unknown parameters, which are estimated together with the gravity field coefficients. Kinematic empirical parameters for range-rates are estimated to reduce the 1-cpr effect in the residuals of the range-rate observations. The filtered predetermined strategy (FPS) was applied to process these kinematic empirical parameters [40]. No stochastic observation model was used in this simulation work.

Table 3 gives an overview of these methods and the specific settings chosen in the gravity retrieval.



All processing centers exploit both orbit data for Global Positioning System (GPS) satellite-to-satellite tracking (SST) in high-low mode and the inter-satellite ranging observations (satellite-to-satellite tracking in low-low mode) by adding the corresponding normal equation (NEQ) systems. No additional constraints such as regularization are applied to the NEQ. As shown in Table 3, the methods mainly differ in the co-estimation of additional parameters and the stochastic modelling of instrument errors. WHU, HUST, and IGG-CAS co-parameterize, in addition to daily gravity parameters, only initial state vectors (orbit positions and velocities) per arc. At TUM, additional boundary conditions between arcs are introduced in order to avoid jumps between consecutive arcs. At Tongji, additional daily parameters (polynomials, bias, scale) are co-estimated.

Table 3 shows that the processing centers have used different maximum degree and order for the computation of the observations (70 vs. 120), but all of them have resolved gravity fields up to d/o 70. However, the spectral leakage effect of the signals beyond d/o 70 is minor compared to the other error sources. If there were a significant spectral leakage effect, it would clearly show up in a severely degraded performance of the very highest resolved degrees [42], which is obviously not the case, as will be shown in Figure 5.

#### **4. Results and Discussions**

The simulation scenarios described in Chapter 2 have been processed by all contributing groups applying the methods described in Chapter 3. In this chapter, the results will be compared, assessed, and discussed.

Figure 4 shows the differences of the gravity field solutions of Scenarios A and B to their corresponding reference solutions, which are represented by AOHIS mean fields over the 9- and 7-day periods, respectively. They are expressed in terms of the degree root mean square (RMS) error of equivalent water heights (EWH), which is computed from the fully normalized coefficients *C<sub>nm</sub>*, *S<sub>nm</sub>* of a spherical harmonic series expansion of degree *n* and order *m* by [43]

$$
\sigma_n(EWH) = \frac{a\rho_e}{3\rho_w} \frac{2n+1}{1+k_n} \sqrt{\sum_{m=0}^n \left(\overline{C}_{nm}^2 + \overline{S}_{nm}^2\right)} \tag{4}
$$

where *ρ<sub>w</sub>* and *ρ<sub>e</sub>* represent the average densities of water and Earth, respectively, *a* the semi-major axis of the Earth, and *k<sub>n</sub>* the load Love number of degree *n*. The coefficients *C<sub>nm</sub>*, *S<sub>nm</sub>* can either represent the full signal or the differences between estimated and true coefficients. The EWH expresses the height of a mass-equivalent column of water per unit area. It is used here for comparing the results of the five processing centers, because it is one of the standard quantities to express mass change processes in the Earth system, and was also used in the reference document [4] for this purpose. Figure 4 shows all 9- and 7-day solutions obtained by the TUM processor for a two-month period as thin lines, as well as the mean of the two-month period as solid lines. Even though Scenario B is based on an inclined satellite pair with a lower orbit altitude, the performance of these two scenarios is very similar. This can be explained by the fact that Scenario A comprises 9-day solutions, while in Scenario B the solutions are based only on 7 days each. This lower number of observations seems to counteract the lower orbit altitude of the second pair. Since the performance of these two scenarios is very similar, in the following inter-comparison of the solutions of the five processing centers we will concentrate on Scenario A.
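As a minimal sketch of how Equation (4) could be evaluated, the following function converts coefficient triangles into EWH per degree. The numerical constants are commonly used values and are assumptions here, as are the function name and the data layout (`C[n][m]`, `S[n][m]`). The load Love numbers are not given in this paper and must be supplied by the user.

```python
import math

A_EARTH = 6378136.3   # m, assumed semi-major axis of the Earth
RHO_EARTH = 5517.0    # kg/m^3, assumed average density of the Earth
RHO_WATER = 1000.0    # kg/m^3, assumed density of water


def degree_rms_ewh(C, S, love_k):
    """Evaluate Equation (4) for every degree n.

    C[n][m], S[n][m]: fully normalized coefficients (full signal or
    differences between estimated and true coefficients), m = 0..n.
    love_k[n]: load Love number of degree n (user supplied).
    Returns a list with one EWH value in meters per degree.
    """
    result = []
    for n in range(len(C)):
        power = sum(C[n][m] ** 2 + S[n][m] ** 2 for m in range(n + 1))
        factor = A_EARTH * RHO_EARTH / (3.0 * RHO_WATER) * (2 * n + 1) / (1.0 + love_k[n])
        result.append(factor * math.sqrt(power))
    return result
```

Passing coefficient differences between two solutions gives curves of the kind shown in the degree RMS figures; passing the coefficients of the reference model gives the corresponding signal curve.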

**Figure 4.** Degree root mean square (RMS) of Scenarios A and B computed with the TUM simulator. The light blue and green lines show the individual 9- and 7-day solutions, respectively, and the blue and red solid lines are the average performance over the two-month period.

Figure 5a shows the degree RMS of the solutions with respect to the AOHIS input signal of the five processing centers for the first 9-day period. Even though there are several differences, they perform reasonably similarly. Apart from a slightly different error level over the whole SH spectrum, evidently the Tongji solutions perform better in the higher degrees, followed by TUM and WHU. A possible reason could be that both processing centers are using full covariance information as a stochastic model (cf. Table 3). In order to analyze the variability of the solutions among themselves, Figure 5b shows again the degree RMS, but now with the TUM solution as the reference. Evidently, the internal variability of the solutions is almost at the same level as their deviation from the input AOHIS signal.

**Figure 5.** Degree RMS of first 9-day period of Scenario A of the five numerical simulators: (**a**) degree RMS with respect to the input AOHIS signal; (**b**) degree RMS with respect to the TUM solution.

In order to analyze these differences in more detail, Figure 6 shows the SH coefficients of the input AOHIS signal averaged over the first 9 days of the analysis period (Figure 6a), as well as the coefficient differences of the five individual solutions from this AOHIS signal (Figure 6b–f). In general, there is a weakness in the estimation of the sectorial and near-sectorial coefficients, which is intrinsic to the in-line inter-satellite ranging concept. It measures mainly in a North–South direction, while the cross-track direction, which is mainly represented by the sectorial coefficients, is determined worse. This general weakness of in-line pairs is also nicely reflected in the estimated formal coefficient standard deviations (Figure 7), representing the main diagonal elements of the variance–covariance matrix.

**Figure 6.** (**a**) Spherical harmonic (SH) coefficients of the AOHIS model, and SH differences to the AOHIS model of the (**b**) TUM; (**c**) IGG-CAS; (**d**) Tongji; (**e**) WHU; (**f**) HUST solutions. Shown is the first 9-day period of Scenario A. Colorbar scale: log10(| ... |).

**Figure 7.** Estimated formal error standard deviations of SH coefficients of the (**a**) TUM; (**b**) IGG-CAS; (**c**) Tongji; (**d**) WHU; (**e**) HUST solutions (empirically scaled). Shown is the first 9-day period of Scenario A. Colorbar scale: log10(| ... |).

A typical feature of Bender-type solutions is the error bands forming an inner triangle, which is directly related to the inclination of the second pair, expressing the fact that the near-polar areas are covered only by observations from the polar satellite pair. Since this feature results from the orbit configuration of the double pair, it is also visible in the estimated error standard deviations (Figure 7). In general, these error standard deviations mainly reflect the observation type, the location of the observations, and the stochastic model of observation errors, but they are blind with respect to signal-related temporal aliasing effects, which affect only the right-hand side of the NEQ system. An exception would be if these systematic effects are included in the stochastic observation model, as is done in the Tongji solution. Therefore, the coefficient differences to the true solutions (Figure 6) show a higher error level than the formal error estimates and also additional features, because they are affected by temporal aliasing effects.

In Figure 7, the formal error estimates were scaled by empirically derived factors to make them comparable among each other. This was necessary because the processing centers followed different strategies regarding stochastic modelling (cf. Table 3). Of course, in principle the scaling (calibration) of the formal errors could have been done also by means of a-posteriori variance estimates from the post-fit residuals. However, on the one hand this information was not available for all processing centers, and on the other hand these variance estimates would be hampered by the systematic errors related to temporal aliasing effects anyway.
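The two points made above can be illustrated with a toy adjustment: the formal covariance depends only on the design and weight matrices, not on the observed values, and an a-posteriori variance factor from the post-fit residuals can be used to calibrate it. The straight-line model, the function names, and the two-parameter closed-form solution are assumptions for illustration only; this is a sketch and not the calibration used by any of the processing centers.

```python
def fit_line(times, obs, weights):
    """Weighted least-squares fit of obs = a + b * t.

    Returns the estimates (a, b), the formal covariance matrix N^-1,
    and the a-posteriori variance factor v^T P v / (n - u).
    """
    n11 = sum(weights)
    n12 = sum(w * t for w, t in zip(weights, times))
    n22 = sum(w * t * t for w, t in zip(weights, times))
    r1 = sum(w * y for w, y in zip(weights, obs))
    r2 = sum(w * t * y for w, t, y in zip(weights, times, obs))
    det = n11 * n22 - n12 * n12
    a = (n22 * r1 - n12 * r2) / det
    b = (n11 * r2 - n12 * r1) / det
    # The formal covariance contains no observed values at all.
    formal_cov = [[n22 / det, -n12 / det], [-n12 / det, n11 / det]]
    residuals = [a + b * t - y for t, y in zip(times, obs)]
    dof = len(times) - 2  # number of observations minus number of unknowns
    variance_factor = sum(w * v * v for w, v in zip(weights, residuals)) / dof
    return (a, b), formal_cov, variance_factor


def calibrate(formal_cov, variance_factor):
    """Scale the formal covariance by the a-posteriori variance factor."""
    return [[variance_factor * c for c in row] for row in formal_cov]
```

Calling `fit_line` twice with the same `times` and `weights` but different `obs` returns identical formal covariances, which is why systematic signal-related effects only show up in the residual-based variance factor.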

Analyzing the coefficient differences (Figure 6) and the corresponding scaled statistical error estimates (Figure 7) of the various processing centers in more detail, a striking fact is that the triangular error structure related to the Bender-type constellation of the Tongji solution in Figure 6d is much weaker than for the other processing centers, even though it is clearly visible in the error estimate (Figure 7c). This error triangle of the Tongji solution is hardly visible in the coefficient differences to the true AOHIS model of the Tongji solution and looks generally very different from the other solutions (Figure 7c), with much larger degradation of the sectorial and near-sectorial coefficients. As already addressed above, this could be explained by the fact that the covariance information is derived from the post-fit residuals, which contain also temporal aliasing effects. In the IGG-CAS error estimates (Figure 7b), the contrast between the well-estimated near-zonals of lower SH degrees and worse-estimated near-sectorials of high degree is much larger than in the TUM (Figure 7a), WHU (Figure 7d), and HUST (Figure 7e) solutions. Another issue to mention in the error standard deviations is the transition at harmonic degree 15, which is related to the maximum degree of the daily co-parameterization, which is slightly visible in the IGG-CAS formal errors (Figure 7b), while the other four solutions show a very smooth transition. Analyzing the error estimates of the low-degree SH coefficients also reveals significant differences. The error estimates of the IGG-CAS (Figure 7b) and HUST (Figure 7e) solutions predict much lower errors of the low-degree components relative to the high-degree coefficients than the other solutions, which might indicate that the GPS-SST high-low component has been given a higher relative weight when combining the NEQs. 
In this sense, the analysis of the error estimates allows deeper insights into the gravity recovery strategies, leading, however, as discussed above to very comparable results in terms of SH coefficient retrieval.

Figure 8a shows the global EWH field derived from the full AOHIS signal based on the ESA AOHIS model [19] (cf. Table 2) of the first 9-day period of Scenario A, and Figure 8b–f show EWH difference grids associated with the coefficient differences shown in Figure 6. These global grids have been evaluated up to the full SH resolution of d/o 70. All EWH difference fields show the typical striping pattern, which mainly results from residual temporal aliasing. The Tongji and TUM solutions show the weakest striping. In order to quantify these results, Table 4 provides an overview of the main statistical parameters of these EWH (difference) fields. Also here, the Tongji solution shows the smallest RMS deviation from the "true" AOHIS model, followed by the TUM result. Evidently, these statistics are dominated by the remaining high-frequency errors expressed by stripes in the EWH fields. Therefore, we repeat the analysis for fields resolved only up to SH d/o 30. According to Figure 5, at this degree the signal-to-noise ratio is safely above one for all five solutions. The results are shown in Figure 9 and Table 5. Please note that the scale is reduced from 20 cm to 10 cm for the EWH (difference) fields. In this d/o 30 analysis, now the IGG-CAS solution performs best, followed by the results of TUM, WHU, and HUST. These three solutions are practically on the same error level. This is also consistent with the degree RMS results shown in Figure 5.

**Figure 8.** (**a**) Equivalent water heights (EWH) (cm) grid related to the input AOHIS model up to d/o 70, and EWH (cm) differences to this AOHIS model of the (**b**) TUM; (**c**) IGG-CAS; (**d**) Tongji; (**e**) WHU; (**f**) HUST solutions. Shown is the first 9-day period of Scenario A.


**Table 4.** Main statistical parameters of EWH input and EWH difference fields up to SH d/o 70 of the first 9-day period of Scenario A.

**Figure 9.** (**a**) EWH (cm) grid related to the input AOHIS model up to d/o 30, and EWH (cm) differences to this AOHIS model of the (**b**) TUM; (**c**) IGG-CAS; (**d**) Tongji; (**e**) WHU; (**f**) HUST solutions. Shown is the first 9-day period of Scenario A.


**Table 5.** Main statistical parameters of EWH input and EWH difference fields up to SH d/o 30 of the first 9-day period of Scenario A.

As a general conclusion, at degree 30 the RMS variation among the five solutions is in the order of 10% of the total signal RMS. Solutions for other time periods show very similar behavior to the ones presented above, and are therefore not explicitly presented and discussed in this paper.

As already mentioned in Chapter 3, all processing centers co-estimated daily gravity field parameters up to SH d/o 15 in order to reduce temporal aliasing effects. Figure 10 shows the resulting degree error RMS curves, which represent differences of daily estimates to the daily AOHIS "true" signals (black curve), averaged over the whole recovery period. Also here, the performance of these daily solutions is very comparable among the processing centers, where IGG-CAS shows a slightly better performance than TUM, HUST, and WHU, and only the Tongji daily estimates are worse.

**Figure 10.** Degree RMS of co-parameterized daily gravity solutions up to d/o 15, averaged over the whole recovery period.

#### **5. Conclusions and Outlook**

Temporal gravity retrieval results of a future Bender-type double pair mission concept performed by five processing centers of a Sino-European NGGM study team based on jointly defined mission scenarios have been inter-compared and assessed. In spite of some remaining differences, the solutions of the contributing teams TU Munich, Institute of Geodesy and Geophysics, Tongji University, Wuhan University, and Huazhong University of Science and Technology show quite similar performances. In general, these results demonstrate that one of the main goals of this inter-comparison exercise, the consistency of simulation results among the contributing groups, could be achieved to a large extent. Remaining differences might be explained by different processing strategies and partially different parameterizations, such as the co-parameterization of accelerometer biases and drifts, or empirical parameters to compensate for long-wavelength errors in the observation time series. Some remaining differences also exist in the error level of the co-estimated daily solutions.

As already mentioned above, the main study goal was not to identify a "winner" among the contributing groups, but rather to achieve consistency of numerical simulation results of NGGM concepts to the best possible extent, even though five independent software packages, which are based partly on different gravity retrieval methods, have been applied. This positive result is an important pre-requisite and basis for future joint activities towards the realization of NGGMs.

Based on this successful inter-comparison exercise of numerical simulators and in view of the fact that in the future several satellite pairs might be in orbit in parallel, further joint studies on optimized constellations of triple- and multi-pairs will be performed, and their impact on the main fields of applications, such as hydrology, glaciology, and solid Earth physics, will be quantified. The results of these studies will be presented in a follow-up paper.

**Author Contributions:** Conceptualization, R.P., H.-C.Y., W.F., Z.L., Y.S., M.Z., H.X.; Methodology, R.P., Q.C., M.H., A.P., X.X., B.Z., H.Z., X.Z.; Software, Q.C., M.H., A.P., X.X., B.Z., H.Z., B.L., Y.Z., X.Z.; Validation of results, A.P., R.P.; Writing—original draft preparation, R.P.; Writing—review and editing, R.P., Q.C., W.F., R.H., M.H., Z.L., A.P., Y.S., C.W., X.X., B.Z., H.Z., B.L., Y.Z., X.Z., H.Y., M.Z.; visualization, R.P.

**Funding:** This research was performed in the framework of the study "Assessment of satellite constellations for monitoring the variations in Earth gravity field (ADDCON)", ESA-ESTEC, Contract AO/1-7317/12/NL/AF funded by the European Space Agency. This study was also funded by the National Natural Science Foundation of China (projects no. 41704012, 41931074 and 41731069), and the Strategic Priority Research Program of the Chinese Academy of Sciences (grant nos. XDB23030100 and XDA15017700).

**Acknowledgments:** We further acknowledge the contribution of the ADDCON project partners Pieter Visser, Nicolas Sneeuw, Wei Liu, Johannes Engels, Qiang Chen, Tonie van Dam and Thomas Gruber for a successful completion of the ESA project, as well as Christian Siemes and Luca Massotti for their valuable support during this project.

**Conflicts of Interest:** The authors declare no conflict of interest.

#### **References**


© 2019 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).

## *Article* **Combination Analysis of Future Polar-Type Gravity Mission and GRACE Follow-On**

#### **Yufeng Nie 1, Yunzhong Shen 1,\* and Qiujie Chen 1,2**


Received: 20 December 2018; Accepted: 18 January 2019; Published: 21 January 2019

**Abstract:** Thanks to the unprecedented success of the Gravity Recovery and Climate Experiment (GRACE), its successor mission GRACE Follow-On (GFO) has been in orbit since May 2018 to continue measuring the Earth's mass transport. In order to possibly enhance GFO in terms of mass transport estimates, four orbit configurations of a future polar-type gravity mission (FPG) (with the same payload accuracy and orbit parameters as GRACE, but differing in orbit inclination) are investigated by full-scale simulations, both standalone and jointly with GFO. The results demonstrate that the retrograde orbit modes used in FPG are generally superior to the prograde ones in terms of gravity field estimation in the case of a joint GFO configuration. Considering the FPG's independent capability, the orbit configurations with 89- and 91-degree inclinations (namely FPG-89 and FPG-91) are further analyzed by joint GFO monthly gravity field models over a period of one year. Our analyses show that FPG-91 generally outperforms FPG-89 in mass change estimates, especially in the medium- and low-latitude regions. Compared to GFO & FPG-89, a noise reduction of about 22% over the ocean areas and 17% over land areas is achieved by the GFO & FPG-91 combined model. Therefore, FPG-91 is recommended for the further orbit design of FPGs.

**Keywords:** gravity field recovery; GRACE Follow-On; orbit configuration; synergistic observation

#### **1. Introduction**

The Earth is undergoing complicated and continuous mass transport [1]. To better quantify the mass changes, the Gravity Recovery and Climate Experiment (GRACE) mission was launched in 2002, consisting of twin satellites flying in the same near-polar orbit at about 500 km altitude and separated by 220 km [2]. The key technology that enables GRACE to observe large-scale mass variations near the Earth's surface is the onboard K-band microwave ranging system (KBR), which delivers inter-satellite range measurements with micron-level accuracy to precisely capture the orbit perturbation differences between the two GRACE satellites caused by the heterogeneous mass distribution of the Earth [2,3]. Along with the KBR, other important sensors onboard the GRACE satellites include Global Positioning System (GPS) receivers for precise orbit determination (POD), high-precision accelerometers for measuring non-gravitational forces, and star cameras for attitude determination [4–6]. Based on the precise data collected by the above sensors, many scientific institutions have produced high-quality static and temporal gravity field models [7–11], which have been successfully adopted for quantitative research on the Earth's mass changes, such as global mean sea-level variations [12,13], continental water balance [14,15], and the regional ice losses of the Tibetan Plateau and Antarctica [16,17].

In view of the great success achieved by GRACE in terms of the Earth's mass change exploration, its successor GRACE Follow-On (GFO) was launched on 22 May 2018 to continue providing the geodetic observation records [18,19]. As a demonstrator, GFO carries a laser ranging interferometer (LRI) for measuring inter-satellite ranges with an accuracy level of a few tens of nanometers in addition to the sensors onboard the GRACE satellites [18–20]. However, even with such a dramatic accuracy improvement for the range observations, some full-scale pre-launch simulations indicated that only a moderate improvement in gravity field estimation can be reached by GFO as compared to GRACE [18,21]. The dominant barrier to further improvement in gravity field determination by GFO is the temporal aliasing issue, which is more intractable than other limiting factors in the GFO error budget [18,21].

From a digital signal processing perspective, aliasing is due to undersampling of the signal to be recovered. According to the Nyquist sampling theorem, aliasing occurs when the sampling frequency is smaller than twice the highest frequency of the original signal [22]. In terms of GRACE monthly gravity field recovery, temporal aliasing is mainly attributed to high-frequency signals originating from oceanic tidal variations at daily or sub-daily scales, as well as the short-term energy exchange processes (generally within a few hours) between atmosphere and ocean [23,24]. Because of the limited observation bandwidth and sparse ground track of GRACE, most of these signals typically need to be reduced by a priori background models during the gravity field recovery process [25], which are, however, subject to errors [26,27]. Therefore, the remaining high-frequency signals result in temporal aliasing errors in the monthly-average gravity solutions, which manifest themselves as the typical north-south stripes in the spatial pattern of unfiltered GRACE models [28,29].
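The undersampling argument can be made concrete with a small sketch. The two helper functions below are hypothetical names introduced for illustration, and the frequencies in the usage note are arbitrary example values rather than actual tidal or orbital frequencies.

```python
import math


def is_undersampled(f_signal, f_sample):
    """True if the sampling frequency is below twice the signal frequency."""
    return f_sample < 2.0 * f_signal


def alias_frequency(f_signal, f_sample):
    """Apparent frequency of a sinusoid of frequency f_signal when it is
    sampled at frequency f_sample (both in the same unit)."""
    return abs(f_signal - f_sample * round(f_signal / f_sample))


if __name__ == "__main__":
    f_signal, f_sample = 1.9, 1.0  # cycles per day, illustrative values only
    f_alias = alias_frequency(f_signal, f_sample)
    for k in range(5):
        t = k / f_sample  # sampling epochs in days
        true_value = math.cos(2.0 * math.pi * f_signal * t)
        alias_value = math.cos(2.0 * math.pi * f_alias * t)
        print(f"t={t:.0f} d  signal={true_value:+.6f}  alias={alias_value:+.6f}")
```

At the sampling epochs the fast signal and its slow alias are indistinguishable, which is the mechanism by which unmodelled sub-daily mass variations leak into a monthly mean solution.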

To mitigate the troublesome temporal aliasing effects, post-processing filtering techniques are generally applied as a standard procedure [28–30]. However, the fundamental solution should be based on enhanced signal sampling and gravity field modelling strategies [31–34]. One of the most straightforward and efficient ways is to increase the number of satellites in space for denser sampling of the gravity signals. Accordingly, many researchers have investigated novel concepts of satellite formation flight (SFF) or satellite constellations dedicated to gravity field recovery, including the cartwheel, pendulum, and Bender-type configurations [35–39]. Among these, the Bender-type SFF, formed by two pairs of GRACE-type inline satellites (in polar and inclined orbits), is the most promising from both technological and economic aspects. Therefore, it has been selected as the primary candidate for the Next Generation Gravity Mission (NGGM) concepts [40,41] by the European Space Agency (ESA). During the service period of GFO, another Future Polar-type Gravity mission (FPG) will probably be launched [42]. Therefore, the primary motivation of this article is to investigate the optimal FPG orbit configuration in combination with GFO, under the precondition that the FPG has the same payload accuracy as GRACE and a capability of recovering global static and temporal gravity fields comparable to GRACE. With this in mind, the search space for the FPG's key orbit elements (i.e., orbit altitude and inclination) is greatly reduced. Considering a five-year nominal mission lifetime for FPG, its orbit altitude is similar to that of GRACE. As such, the emphasis of the FPG orbit adjustment will be put on the selection of the orbit inclination.

The rest of this article is organized as follows. Section 2 presents the details of full-scale simulations for FPG orbit selection and performance evaluation. Gravity field results from the standalone FPG, with different inclinations, and jointly with GFO are demonstrated in Section 3. In Section 4, we explain the underlying mechanism of the presented results and further discuss the practical consideration of FPG orbit design. Conclusions are given in Section 5.

#### **2. Numerical Simulation Scheme**

#### *2.1. Mathematical Model for Gravity Field Recovery*

The dynamic approach is adopted in our gravity field model recovery [43], which is widely used in real GRACE data processing by different scientific institutions [7,8,44–46]. In this section, we outline basic formulas of the dynamic approach used in the article, as well as relevant references for more detailed descriptions.

The Earth's gravity field model can be expressed via a set of Spherical Harmonic Coefficients (SHC) $\mathbf{u} = \{\overline{C}_{lm}, \overline{S}_{lm}\}$ with the following Spherical Harmonic (SH) expansion,

$$V(\rho,\phi,\lambda;\mathbf{u}) = \frac{GM}{\rho} \left[ 1 + \sum_{l=2}^{L_{MAX}} \sum_{m=0}^{l} \left( \frac{R_E}{\rho} \right)^l \left( \overline{C}_{lm} \cos m\lambda + \overline{S}_{lm} \sin m\lambda \right) \cdot \overline{P}_{lm}(\sin \phi) \right] \tag{1}$$

where *V* is the gravitational potential at an arbitrary point on or outside the Earth with the spherical coordinates radius *ρ*, latitude *ϕ*, and longitude *λ* in the Earth-fixed frame. *GM* is the product of the gravitational constant and the Earth's mass, and $R_E$ is the Earth's mean equatorial radius; *l* and *m* stand for the degree and order of the SH expansion, respectively, with $L_{MAX}$ being the maximum degree of the expansion; $\overline{P}_{lm}$ is the fully normalized associated Legendre function of degree *l* and order *m*.
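A minimal sketch of how Equation (1) might be evaluated is given below. The fully normalized associated Legendre functions are computed with the standard forward column recursion, which is adequate for low to moderate degrees. The constants `GM` and `R_E`, the function names, and the coefficient layout (`C[l][m]`, `S[l][m]`) are assumptions made for illustration and are not taken from the software used in this study.

```python
import math

GM = 3.986004415e14  # m^3/s^2, assumed value
R_E = 6378136.3      # m, assumed mean equatorial radius


def normalized_legendre(lmax, sin_phi):
    """Fully normalized associated Legendre functions P[l][m](sin_phi)."""
    cos_phi = math.sqrt(max(0.0, 1.0 - sin_phi * sin_phi))
    P = [[0.0] * (lmax + 1) for _ in range(lmax + 1)]
    P[0][0] = 1.0
    # Sectorial terms P[m][m]
    for m in range(1, lmax + 1):
        factor = math.sqrt(3.0) if m == 1 else math.sqrt((2 * m + 1) / (2 * m))
        P[m][m] = factor * cos_phi * P[m - 1][m - 1]
    # Remaining terms, recursion in degree l for each fixed order m
    for m in range(lmax):
        P[m + 1][m] = math.sqrt(2 * m + 3) * sin_phi * P[m][m]
        for l in range(m + 2, lmax + 1):
            a = math.sqrt((4 * l * l - 1) / (l * l - m * m))
            b = math.sqrt((2 * l + 1) * (l + m - 1) * (l - m - 1)
                          / ((l - m) * (l + m) * (2 * l - 3)))
            P[l][m] = a * sin_phi * P[l - 1][m] - b * P[l - 2][m]
    return P


def potential(rho, lat, lon, C, S):
    """Gravitational potential of Equation (1).

    rho in meters, lat and lon in radians; C[l][m], S[l][m] are fully
    normalized coefficients for l = 0..L_MAX (degrees 0 and 1 are ignored).
    """
    lmax = len(C) - 1
    P = normalized_legendre(lmax, math.sin(lat))
    total = 1.0
    for l in range(2, lmax + 1):
        scale = (R_E / rho) ** l
        for m in range(l + 1):
            total += scale * (C[l][m] * math.cos(m * lon)
                              + S[l][m] * math.sin(m * lon)) * P[l][m]
    return GM / rho * total
```

The factor $(R_E/\rho)^l$ in the loop makes the attenuation of high-degree signals with increasing radius directly visible, which is relevant for the orbit altitude discussion.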

The estimation of the SHC $\mathbf{u}$ from satellite observations is based on Newton's second law of motion [47], which is formulated as a three-dimensional second-order nonlinear ordinary differential equation (ODE),

$$
\ddot{\mathbf{r}}(t) = \mathbf{a}(t; \mathbf{r}(t), \dot{\mathbf{r}}(t), \mathbf{p}, \mathbf{u}) \tag{2}
$$

where $\mathbf{r}(t)$, $\dot{\mathbf{r}}(t)$, and $\ddot{\mathbf{r}}(t)$ denote the satellite's position, velocity, and acceleration vectors at epoch *t*, respectively. $\mathbf{a}$ is the force acting on a unit mass of the satellite, and $\mathbf{p}$ represents the parameter vector of the force models other than the Earth's gravity field. In the case of GRACE, $\mathbf{p}$ refers to the accelerometer calibration parameters (e.g., bias and scale parameters) [48]. When the force models are given, the position and velocity of the satellite's orbit are determined as follows [49],

$$\begin{cases}
\mathbf{r}(t) = \mathbf{r}(t_0) + (t - t_0) \cdot \dot{\mathbf{r}}(t_0) + \int_{t_0}^{t} \int_{t_0}^{\tau} \mathbf{a}(\tau'; \mathbf{r}(\tau'), \dot{\mathbf{r}}(\tau'), \mathbf{p}, \mathbf{u})\, d\tau'\, d\tau \\
\dot{\mathbf{r}}(t) = \dot{\mathbf{r}}(t_0) + \int_{t_0}^{t} \mathbf{a}(\tau; \mathbf{r}(\tau), \dot{\mathbf{r}}(\tau), \mathbf{p}, \mathbf{u})\, d\tau
\end{cases} \tag{3}$$

where $(\mathbf{r}^T(t_0), \dot{\mathbf{r}}^T(t_0))^T$ denotes the position and velocity vectors at the initial epoch $t_0$ of an orbit arc. According to Equation (3), the position and velocity vectors at any epoch *t* within the orbit arc are functions of the parameter vector $\mathbf{x} = (\mathbf{w}^T, \mathbf{u}^T)^T$, where $\mathbf{w} = (\mathbf{r}^T(t_0), \dot{\mathbf{r}}^T(t_0), \mathbf{p}^T)^T$ is the vector of arc-specific parameters. Linearizing Equation (3) with respect to the approximate parameter vector $\mathbf{x}_0$, we have,

$$\begin{cases}
\mathbf{r}(t) = \mathbf{r}^{REF}(t) + \dfrac{\partial \mathbf{r}^{REF}(t)}{\partial \mathbf{x}_0}\, \delta \mathbf{x} \\
\dot{\mathbf{r}}(t) = \dot{\mathbf{r}}^{REF}(t) + \dfrac{\partial \dot{\mathbf{r}}^{REF}(t)}{\partial \mathbf{x}_0}\, \delta \mathbf{x}
\end{cases} \tag{4}$$

where the position vector $\mathbf{r}^{REF}(t)$ and the velocity vector $\dot{\mathbf{r}}^{REF}(t)$, referred to as the reference orbit, are numerically integrated with Equation (3) by replacing $\mathbf{x}$ with $\mathbf{x}_0$, and $\delta\mathbf{x}$ is the correction vector to $\mathbf{x}_0$. Substituting $\mathbf{r}(t) = \mathbf{r}^{OBS}(t) + \mathbf{v}_r(t)$ into the first equation of Equation (4), we get the observational equation for the position at epoch *t*,

$$
\mathbf{v}_r(t) = \frac{\partial \mathbf{r}^{REF}(t)}{\partial \mathbf{x}_0}\, \delta \mathbf{x} - \left[\mathbf{r}^{OBS}(t) - \mathbf{r}^{REF}(t)\right] = \frac{\partial \mathbf{r}^{REF}(t)}{\partial \mathbf{x}_0}\, \delta \mathbf{x} - \Delta \mathbf{r}(t) \tag{5}
$$

where $\mathbf{r}^{OBS}(t)$ and $\mathbf{v}_r(t)$ represent the position observation and its correction at epoch *t*. The inter-satellite range rate $\dot{\rho}(t)$ between GRACE-A and B can be expressed as,

$$
\dot{\rho}(t) = \mathbf{e}_{AB}^T(t)\, \dot{\mathbf{r}}_{AB}(t) \tag{6}
$$

where the subscripts A and B denote the quantities of GRACE-A and B, $\dot{\mathbf{r}}_{AB}(t) = \dot{\mathbf{r}}_B(t) - \dot{\mathbf{r}}_A(t)$ is the velocity difference vector, and $\mathbf{e}_{AB}(t) = \mathbf{r}_{AB}(t) / \sqrt{\mathbf{r}_{AB}^T(t)\, \mathbf{r}_{AB}(t)}$ is the unit vector along the line of sight, with $\mathbf{r}_{AB} = \mathbf{r}_B - \mathbf{r}_A$. By substituting $\dot{\rho}(t) = \dot{\rho}^{OBS}(t) + v_{\dot{\rho}}(t)$ and the second equation of Equation (4) into Equation (6) for both GRACE-A and B, we get the following observational equation for the range-rate measurement at epoch *t*,

$$v_{\dot{\rho}}(t) = \frac{\partial \dot{\rho}^{REF}(t)}{\partial \mathbf{x}_0}\, \delta \mathbf{x} - \left[\dot{\rho}^{OBS}(t) - \dot{\rho}^{REF}(t)\right] = \frac{\partial \dot{\rho}^{REF}(t)}{\partial \mathbf{x}_0}\, \delta \mathbf{x} - \Delta \dot{\rho}(t) \tag{7}$$

where $\dot{\rho}^{REF}(t)$ is computed from the reference orbit, and $\dot{\rho}^{OBS}(t)$ and $v_{\dot{\rho}}(t)$ stand for the range-rate observation and its correction, respectively. The partial derivatives $\partial \mathbf{r}^{REF}(t) / \partial \mathbf{x}_0$ and $\partial \dot{\mathbf{r}}^{REF}(t) / \partial \mathbf{x}_0$ are numerically integrated by solving the so-called 'variational equation', and $\partial \dot{\rho}^{REF}(t) / \partial \mathbf{x}_0$ is a combination of them; see [50] (pp. 240–243), [51] (pp. 30–32), and [49] for detailed descriptions. The observational equations in the form of Equations (5) and (7) for all epochs within one arc, i.e., arc *j*, are collected and briefly expressed in the following form,

$$
\mathbf{v}_j = \mathbf{A}_j\, \delta \mathbf{x}_j - \mathbf{y}_j = \begin{pmatrix} \mathbf{A}_{w_j} & \mathbf{A}_{u_j} \end{pmatrix} \begin{pmatrix} \delta \mathbf{w}_j \\ \delta \mathbf{u} \end{pmatrix} - \mathbf{y}_j \tag{8}
$$

where $\mathbf{v}_j$ denotes the vector of all corrections, $\mathbf{A}_j$, $\mathbf{A}_{w_j}$, and $\mathbf{A}_{u_j}$ are the design matrices, and $\mathbf{y}_j$ is the vector of all constant terms of arc *j*. The Normal Equation (NEQ) of the arc-based least-squares adjustment is derived as [48],

$$
\begin{pmatrix}
\mathbf{N}_{w_j w_j} & \mathbf{N}_{w_j u_j} \\
\mathbf{N}_{w_j u_j}^{T} & \mathbf{N}_{u_j u_j}
\end{pmatrix}
\begin{pmatrix}
\delta \mathbf{w}_j \\
\delta \mathbf{u}
\end{pmatrix} = \begin{pmatrix}
\mathbf{R}_{w_j} \\
\mathbf{R}_{u_j}
\end{pmatrix}
\tag{9}
$$

where $\mathbf{N}_{w_j w_j} = \mathbf{A}_{w_j}^T \mathbf{P}_j \mathbf{A}_{w_j}$, $\mathbf{N}_{u_j u_j} = \mathbf{A}_{u_j}^T \mathbf{P}_j \mathbf{A}_{u_j}$, $\mathbf{N}_{w_j u_j} = \mathbf{A}_{w_j}^T \mathbf{P}_j \mathbf{A}_{u_j}$, $\mathbf{R}_{w_j} = \mathbf{A}_{w_j}^T \mathbf{P}_j \mathbf{y}_j$, and $\mathbf{R}_{u_j} = \mathbf{A}_{u_j}^T \mathbf{P}_j \mathbf{y}_j$. $\mathbf{P}_j$ denotes the weight matrix constructed based on the precision of the observations. To recover $\delta\mathbf{u}$ from the observations of multiple arcs, e.g., one month, the arc-specific parameters $\delta\mathbf{w}_j$ should be pre-eliminated arc-wise, forming the reduced normal equation. After that, the reduced normal equations are summed over all *J* arcs, and the SHC correction $\delta\mathbf{u}$ is then obtained by solving Equation (10),

$$
\tilde{\mathbf{N}}_J \cdot \delta \mathbf{u} = \tilde{\mathbf{R}}_J \tag{10}
$$

where $\tilde{\mathbf{N}}_J = \sum_{j=1}^{J} \tilde{\mathbf{N}}_j$ and $\tilde{\mathbf{R}}_J = \sum_{j=1}^{J} \tilde{\mathbf{R}}_j$. $\tilde{\mathbf{N}}_j$ and $\tilde{\mathbf{R}}_j$ are the coefficient matrix and the right-hand side of the reduced normal equation of arc *j*, with $\tilde{\mathbf{N}}_j = \mathbf{N}_{u_j u_j} - \mathbf{N}_{w_j u_j}^T \mathbf{N}_{w_j w_j}^{-1} \mathbf{N}_{w_j u_j}$ and $\tilde{\mathbf{R}}_j = \mathbf{R}_{u_j} - \mathbf{N}_{w_j u_j}^T \mathbf{N}_{w_j w_j}^{-1} \mathbf{R}_{w_j}$ [48].

The GFO and FPG combination can be carried out at the NEQ level by Equation (11),

$$(\tilde{\mathbf{N}}_J^{GFO} + \tilde{\mathbf{N}}_J^{FPG}) \cdot \delta \mathbf{u} = (\tilde{\mathbf{R}}_J^{GFO} + \tilde{\mathbf{R}}_J^{FPG}) \tag{11}$$

where $\tilde{\mathbf{N}}_J^{GFO}$, $\tilde{\mathbf{R}}_J^{GFO}$ and $\tilde{\mathbf{N}}_J^{FPG}$, $\tilde{\mathbf{R}}_J^{FPG}$ are established by Equation (10) for GFO and FPG, respectively. Note that the same variance of unit weight must be used in constructing the weight matrices of GFO and FPG; otherwise, the two normal equations cannot simply be summed as in Equation (11).
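The pre-elimination and NEQ-level combination of Equations (9)–(11) can be illustrated with a deliberately small sketch. It is not the processing software used in this study: it assumes one scalar arc parameter and one scalar global parameter per arc, so every normal-equation block is a plain number, and the observation model `y = w_j + u * t`, the function names, and the toy values are hypothetical.

```python
# Minimal sketch of arc-wise pre-elimination (Equations (9)-(11)).
# Assumption: one scalar arc parameter w_j and one scalar global parameter u,
# so every normal-equation block is a plain number.

def reduced_neq(a_w, a_u, y, weight):
    """Return (N_reduced, R_reduced) of one arc after eliminating w_j."""
    n_ww = sum(weight * aw * aw for aw in a_w)
    n_wu = sum(weight * aw * au for aw, au in zip(a_w, a_u))
    n_uu = sum(weight * au * au for au in a_u)
    r_w = sum(weight * aw * yi for aw, yi in zip(a_w, y))
    r_u = sum(weight * au * yi for au, yi in zip(a_u, y))
    # Schur complement: N_uu - N_wu^T N_ww^-1 N_wu, and likewise for R.
    return n_uu - n_wu * n_wu / n_ww, r_u - n_wu * r_w / n_ww


def accumulate(arcs, sd, sigma0=1.0):
    """Sum the reduced normal equations of all arcs of one mission."""
    weight = sigma0 ** 2 / sd ** 2  # the same sigma0 is used for every mission
    n_sum = r_sum = 0.0
    for a_w, a_u, y in arcs:
        n_red, r_red = reduced_neq(a_w, a_u, y, weight)
        n_sum += n_red
        r_sum += r_red
    return n_sum, r_sum


def make_arc(w_true, u_true, epochs=(0.0, 1.0, 2.0, 3.0)):
    """Noise-free toy arc with y = w_j + u * t (hypothetical observation model)."""
    a_w = [1.0 for _ in epochs]
    a_u = list(epochs)
    y = [w_true + u_true * t for t in epochs]
    return a_w, a_u, y


U_TRUE = 2.0
mission_a = [make_arc(w, U_TRUE) for w in (1.0, -0.5, 3.0)]
mission_b = [make_arc(w, U_TRUE) for w in (0.2, 4.0)]

n_a, r_a = accumulate(mission_a, sd=1.0)
n_b, r_b = accumulate(mission_b, sd=0.1)

# Equation (11): add the two reduced systems, then solve for the global parameter.
u_hat = (r_a + r_b) / (n_a + n_b)
print(u_hat)
```

Because both toy missions share the same `sigma0`, their reduced systems can be added directly; the mission with the smaller SD contributes the larger share of the summed normal matrix.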

#### *2.2. Orbit Configuration Selection for FPG*

In gravity field recovery, the orbit altitude and inclination play a key role [52] (pp. 41–43). Since gravity signals attenuate rapidly with increasing orbit altitude [47], an ideal gravimetric satellite should fly as low as possible in order to be sufficiently sensitive to the signals. However, the lower the altitude of the satellite, the stronger the atmospheric drag acting on it, which requires more fuel to maintain the mission operation and thus shortens its lifetime [37]. As regards the orbit inclination, a 90-degree inclination would be ideal for global coverage of the Earth. However, due to launch constraints, it is a challenge to put satellites in an exactly polar orbit. For example, GRACE flew in a near-polar orbit with 89-degree inclination, which left two unobserved 1-degree-radius circles around the south and north poles (namely the polar gaps). In general, a large polar gap will significantly impact the estimation of zonal SHC and eventually corrupt the estimated gravity field. The maximum recoverable degree and order (d/o) of SHC for a given *g*-degree polar gap can be roughly calculated by dividing 180 (degrees) by *g* [51] (p. 205).

The twin satellites of GFO are separated by 220 km and both fly in a near-circular orbit of 490-km altitude and 89-degree inclination [18]; the initial GFO orbit configuration is therefore kept fixed in the following simulations, as listed in Table 1. In consideration of a nominal five-year mission lifetime, the initial FPG orbit altitude is currently assumed to be the same as that of GRACE (namely 500 km) [2]. With regard to the inclination choice of FPG, the relationship between the polar gap and the maximum recoverable SHC d/o, as mentioned above, should be taken into consideration. A typical monthly temporal gravity field model up to 60 d/o requires a polar gap smaller than 3 degrees, while for static gravity field models up to 180 d/o, the polar gap should in theory be within 1 degree. Once the polar gap size is determined, there exist two kinds of orbits, namely the prograde orbit (inclination smaller than 90 degrees) and the retrograde orbit (inclination larger than 90 degrees) [53] (p. 99). For example, the previous dedicated gravity missions CHAllenging Mini-satellite Payload (CHAMP) (87-degree inclination) [54] and GRACE (89-degree inclination) flew in prograde orbits, while the Gravity field and steady-state Ocean Circulation Explorer (GOCE) (96.5-degree inclination) [55] flew in a retrograde orbit. In general, retrograde-orbit satellites require more fuel than prograde ones during launch in order to overcome the Earth's rotation speed, but it is not a technical problem to launch either prograde or retrograde satellites.
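The rule of thumb relating the polar gap to the maximum recoverable d/o can be written down in a few lines. The sketch below is only an illustration of that rule; the function names are hypothetical, and the polar gap is taken as the absolute difference between the inclination and 90 degrees.

```python
def polar_gap_deg(inclination_deg):
    """Radius (in degrees) of the unobserved polar cap for a given inclination."""
    return abs(90.0 - inclination_deg)


def max_recoverable_degree(inclination_deg):
    """Rule of thumb from the text: L_max is roughly 180 / g for a g-degree polar gap."""
    gap = polar_gap_deg(inclination_deg)
    if gap == 0.0:
        return float("inf")  # exactly polar orbit: no limit imposed by a gap
    return 180.0 / gap


# The four FPG inclinations considered in this study.
for inc in (87.0, 89.0, 91.0, 93.0):
    print(inc, polar_gap_deg(inc), max_recoverable_degree(inc))
```

Prograde and retrograde orbits with the same offset from 90 degrees give the same gap and hence the same approximate limit, which is why FPG-89 and FPG-91 (and likewise FPG-87 and FPG-93) are paired in the comparisons.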

In the following simulation study, we preliminarily select four sets of FPG configurations, which differ only in orbit inclination. Their initial orbit elements are listed in Table 1 and denoted as FPG-87, FPG-89, FPG-91, and FPG-93 (corresponding to inclinations of 87, 89, 91, and 93 degrees, respectively). Note that the initial Right Ascension of the Ascending Node (RAAN) is set to 0 degrees for GFO, FPG-87, FPG-91, and FPG-93 but to 180 degrees for FPG-89, which avoids a nearly complete overlap of the ground tracks of GFO and FPG-89 when combining the two [32]. The initial inter-satellite distances for all GFO and FPG configurations are fixed at 220 km.


**Table 1.** Initial orbit configurations of GFO and FPG for simulation study

#### *2.3. Numerical Simulation Procedures*

The simulation procedure, consisting of a forward and a backward simulation step [18], is shown in Figure 1. In the forward simulation step, we apply a set of true background models to generate the simulated observations. The true background models are listed in Table 2, where EIGEN6C4 [56] is used as the true static gravity field model and EOT11a (including main waves and interpolated secondary waves) [57] is treated as the true ocean tide model. For non-tidal variation signals, the AOHIS component of ESM is adopted to represent the integrated signal of atmosphere (A), oceans (O), terrestrial hydrological water (H), continental ice-sheets (I), and the solid earth (S) [24]. The maximum degree and order for all models is chosen as 100 [18]. Given the initial orbit elements defined in Table 1, the satellite orbits for both GFO and FPG can be generated by numerical integration [50] (pp. 117–146). A combined eighth-order Runge–Kutta and eighth-order Gauss–Jackson integrator with an integration step of 5 s is applied to generate the error-free data, which include satellite orbit positions and inter-satellite range rates. To produce more realistic observations, sensor noise is added to the error-free data. In this work, the sensor noise is treated as Gaussian white noise with zero expectation [31,32]; more sophisticated noise models can be found in [58]. The standard deviations (SD) of the noise of the major onboard sensors, listed in Table 3, are based on their performance in real GRACE data. As shown, an SD of 1 cm is selected for orbit positions [31–33]. For the range-rate data, we add noise with an SD of 0.2 μm/s to the KBR measurements of GFO and FPG [18,45], and noise with an SD of 10 nm/s to the LRI measurements of GFO [18]. Although the non-gravitational forces are not simulated during orbit integration, the accelerometer noise with an SD of 0.3 nm/s² is directly simulated during the orbit integration [52] (pp. 50–51). We simulate full-year observations in 2006 for all orbit configurations, which are subsequently used in the backward simulation step for gravity field model recovery.
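The noise step of the forward simulation can be sketched as follows. This is an illustrative example rather than the simulation software itself: the SD values are those quoted in the text, while the function name, the seeds, and the constant error-free series are hypothetical placeholders.

```python
import random

# Standard deviations quoted in the text (Table 3), converted to SI units.
SD_ORBIT_M = 0.01           # orbit positions: 1 cm
SD_KBR_M_PER_S = 0.2e-6     # KBR range rate: 0.2 um/s
SD_LRI_M_PER_S = 10e-9      # LRI range rate: 10 nm/s
SD_ACC_M_PER_S2 = 0.3e-9    # accelerometer: 0.3 nm/s^2


def add_white_noise(values, sd, seed=0):
    """Return the values plus zero-mean Gaussian white noise with the given SD."""
    rng = random.Random(seed)  # seeded so that a run can be reproduced
    return [v + rng.gauss(0.0, sd) for v in values]


# Hypothetical error-free range rates (a constant series, for illustration only).
error_free = [0.1] * 20000
noisy_kbr = add_white_noise(error_free, SD_KBR_M_PER_S, seed=1)
noisy_lri = add_white_noise(error_free, SD_LRI_M_PER_S, seed=2)
```

In an actual simulation the error-free series would come from the integrated orbits, and coloured noise models such as those in [58] would replace the white-noise generator.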

**Figure 1.** Processing flow of the forward and the backward simulations.

**Table 2.** Background models for numerical simulation. True models are used for simulated observations generation while the reference models are applied to gravity field recovery.


In the backward simulation step, we recover the monthly gravity field model up to 100 d/o [18] by the dynamic approach using the simulated observations above. For the parameterization, we choose an orbit arc length of 6 h for the initial position and velocity estimation [44–46], and estimate the accelerometer parameters (bias and linear drift) every 1.5 h (approximately one orbital period) [45]. The weight matrices of the observations in Equation (9) are constructed from the corresponding SD values listed in Table 3 with the same variance of unit weight. The reference background models, used for the integration of the reference orbit and partial derivatives in Equations (5) and (7), are listed in Table 2. As shown, FES2004 [59] is applied as the reference ocean tide model, and the DEAL plus AOerr components of ESM [27] are used to represent the AO components of AOHIS in the true background models [24]. Since the emphasis of the article is on temporal gravity model recovery, the true and reference static models are identical [21,33,35]. The differences between the true and reference background models are designed to introduce temporal aliasing errors into the recovered gravity model [18,21,33,37]. As shown in Figure 1, the temporal signal we want to recover in each monthly model is the average HIS of that month, i.e., the integrated signal of terrestrial hydrological water (H), continental ice-sheets (I), and the solid earth (S) [18,24].
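To give a feeling for the size of the adjustment problem, the following sketch counts the global SHC unknowns and the number of arcs per month implied by the parameterization above. It assumes that the estimated coefficients start at degree 2 (as the summations in the evaluation metrics do) and that the coefficients with *m* = 0 have no sine term; the helper names are hypothetical.

```python
def count_shc(l_max, l_min=2):
    """Number of C_lm and S_lm coefficients for degrees l_min..l_max (S_l0 excluded)."""
    return sum(2 * l + 1 for l in range(l_min, l_max + 1))


def arcs_in_month(days, arc_hours=6.0):
    """Number of orbit arcs in a month of the given length."""
    return int(days * 24.0 / arc_hours)


def accelerometer_segments_per_arc(arc_hours=6.0, segment_hours=1.5):
    """Number of accelerometer bias/drift segments inside one arc."""
    return int(arc_hours / segment_hours)


print(count_shc(100))                    # global unknowns up to 100 d/o
print(arcs_in_month(31))                 # 6 h arcs in a 31-day month
print(accelerometer_segments_per_arc())  # 1.5 h segments per 6 h arc
```

Under these assumptions a 31-day month contains 124 arcs sharing 10,197 global unknowns, which is why the arc-specific parameters are pre-eliminated before the normal equations are accumulated.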


**Table 3.** The standard deviations (SD) of white Gaussian sensor noises used for the generation of simulated observations

#### *2.4. Evaluation Metrics for Recovered Gravity Field Model*

In a simulated world, the quality of the recovered gravity field model can be easily evaluated by comparing the true and the recovered fields. In this article, the evaluation metrics for recovered gravity field models are derived based on monthly SHC differences {Δ*Clm*, Δ*Slm*}, which are calculated by subtracting the recovered model from the true static model (EIGEN6C4 100 d/o) plus the true monthly HIS signal (100 d/o) as shown in Equation (12) according to [18],

$$
\Delta C_{lm} = \overline{C}_{lm}^{EIGEN6C4+HIS} - \overline{C}_{lm}^{REC}, \quad
\Delta S_{lm} = \overline{S}_{lm}^{EIGEN6C4+HIS} - \overline{S}_{lm}^{REC} \tag{12}
$$

where the superscripts *EIGEN6C4*, *HIS*, and *REC* represent the SHC from EIGEN6C4, monthly HIS and the recovered models in simulations. Given SHC differences, the degree geoid height error (DGHE) and cumulative geoid height error (CGHE) of the recovered model can be calculated by

$$\begin{aligned} DGHE(n) &= R_E \cdot \sqrt{\sum_{m=0}^{l} (\Delta C_{lm}^2 + \Delta S_{lm}^2)}, \quad (l = n) \\ CGHE(n) &= R_E \cdot \sqrt{\sum_{l=2}^{n} \sum_{m=0}^{l} (\Delta C_{lm}^2 + \Delta S_{lm}^2)} \end{aligned} \tag{13}$$

where *DGHE*(*n*) represents the DGHE of degree *n* and *CGHE*(*n*) represents the CGHE up to degree *n*. Spatial-domain metrics can also be constructed by means of the spatial differences plot (SDP) [60] (pp. 24–25). For the *k*-th grid point on the Earth's surface, the geoid height difference can be calculated by Equation (14) [60] (p. 22), together with Equations (1) and (12),

$$
\Delta N_k(\varphi_k, \lambda_k) = R_E \cdot \sum_{l=2}^{L_{MAX}} \sum_{m=0}^{l} \left( \Delta C_{lm} \cos m\lambda_k + \Delta S_{lm} \sin m\lambda_k \right) \cdot \overline{P}_{lm}(\sin \varphi_k) \tag{14}
$$

where $\Delta N_k$ stands for the geoid height difference, and the other symbols are the same as those defined in Equation (1); $L_{MAX}$ is chosen as 100. Based on Equation (14), the SDP can be created by plotting $\Delta N_k$ of all grid points within a specific spatial range **Ω** on the world map at their geographic locations. **Ω** can be global or regional, e.g., the oceans or the Amazon basin. The grid size (interval) is chosen as 1-by-1 degree in the following. Additionally, the latitude-weighted root mean square (wRMS) [60] (pp. 24–25) of the above SDP can be calculated by Equation (15),

$$wRMS(\Omega) = \sqrt{\frac{\sum_{\forall k \in \Omega} (\Delta N_k^2 \cdot \cos \varphi_k)}{\sum_{\forall k \in \Omega} \cos \varphi_k}}\tag{15}$$

where all grid points within **Ω** are summed with area weights given by the cosine of their latitudes.
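Equations (13) and (15) translate almost directly into code. The sketch below is a minimal illustration under stated assumptions: the coefficient differences are stored in a dictionary keyed by (degree, order), the grid is a list of (latitude, geoid height difference) pairs, and the value of `R_E` is an assumed reference radius that should be replaced by the one belonging to the gravity field model in use. The toy numbers are hypothetical.

```python
import math

R_E = 6378136.3  # assumed reference radius in metres; take it from the model in use


def dghe(diff, n, r_e=R_E):
    """Degree geoid height error of degree n (first line of Equation (13))."""
    total = sum(dc * dc + ds * ds for (l, m), (dc, ds) in diff.items() if l == n)
    return r_e * math.sqrt(total)


def cghe(diff, n, r_e=R_E):
    """Cumulative geoid height error for degrees 2..n (second line of Equation (13))."""
    total = sum(dc * dc + ds * ds for (l, m), (dc, ds) in diff.items() if 2 <= l <= n)
    return r_e * math.sqrt(total)


def wrms(grid):
    """Latitude-weighted RMS of geoid height differences (Equation (15)).

    grid: iterable of (latitude_deg, delta_n) pairs inside the region of interest.
    """
    num = den = 0.0
    for lat_deg, delta_n in grid:
        w = math.cos(math.radians(lat_deg))  # area weight of the grid point
        num += delta_n * delta_n * w
        den += w
    return math.sqrt(num / den)


# Hypothetical coefficient differences (dC, dS), keyed by (degree, order).
toy_diff = {(2, 0): (3e-10, 0.0), (2, 1): (0.0, 4e-10), (3, 0): (12e-10, 0.0)}
print(dghe(toy_diff, 2), cghe(toy_diff, 3))
print(wrms([(0.0, 1.0), (60.0, 0.0)]))
```

The cosine weighting keeps the densely spaced high-latitude grid points of a regular 1-by-1 degree grid from dominating the global statistic.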

Since the metrics in Equations (13) and (15) represent absolute errors (positive values) of the recovered model, any two recovered models can be compared, in terms of DGHE, CGHE, or wRMS, by calculating the error reduction rate (ERR) of model *i* over model *j* as shown in Equation (16),

$$ERR_j^i(q) = \frac{q_j - q_i}{q_j} \times 100\% \tag{16}$$

where *q* can be any value of DGHE, CGHE and wRMS.
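As a small illustration of Equation (16), the sketch below computes the ERR for two hypothetical pairs of error values; the function name and the numbers are not taken from the results of this study.

```python
def error_reduction_rate(q_i, q_j):
    """ERR of model i over model j in percent (Equation (16))."""
    if q_j <= 0.0:
        raise ValueError("q_j must be a positive error measure")
    return (q_j - q_i) / q_j * 100.0


# Hypothetical error values, for illustration only (the units cancel).
print(error_reduction_rate(q_i=0.8, q_j=1.0))  # model i is better: positive ERR
print(error_reduction_rate(q_i=1.1, q_j=1.0))  # model i is worse: negative ERR
```

A positive ERR therefore means that model *i* has smaller errors than model *j*, while a negative ERR indicates a degradation.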

#### **3. Results and Analysis**

#### *3.1. Standalone Models of FPG*

We first evaluate the standalone models recovered by the different FPG configurations listed in Table 1. The results for January 2006 are shown in Figure 2; other months show similar results. In Figure 2, the black curve represents the true average HIS signal in January (an integrated signal of terrestrial hydrological water, continental ice-sheets, and the solid earth [24]), while the other curves are the degree geoid height errors (DGHE) of the different FPG standalone models in this month. Comparing the signal curve (black) with the error curves (colored), we find that the recovered models are gradually dominated by noise beyond degree *n* = 26, as indicated by the intersection of the signal curve and the error curves around degree 26. Among the four FPG standalone models, FPG-91 in general produces smaller errors, especially for degrees beyond 60. Looking at the low-degree SHC near the intersection point, we find that the retrograde FPGs (FPG-91 and FPG-93) outperform the prograde ones (FPG-89 and FPG-87). The expected degradations of FPG-87 and FPG-93 due to their larger polar gaps compared to FPG-89 and FPG-91, however, are not apparent in Figure 2. This can be explained by the denser ground tracks of FPG-87 and FPG-93 in the mid and low latitudes, which compensate for the negative effects of the larger amount of missing polar data, as discussed in [51] (pp. 204–214).

**Figure 2.** True HIS signal and degree geoid height errors (DGHE) of different FPG standalone models in January 2006.

Further comparisons are carried out in terms of the absolute SHC differences ($|\Delta C_{lm}|$, $|\Delta S_{lm}|$) of each degree *l* and order *m*, as drawn in Figure 3. Subfigures (a), (b), (c), and (d) show the results of FPG-87, FPG-89, FPG-93, and FPG-91, respectively. Here we can clearly observe the deficiencies of FPG-87 and FPG-93 in the estimation of the zonal SHC, i.e., $C_{l0}$, which are revealed by larger $|\Delta C_{l0}|$ values in (a) and (c) compared to those in (b) and (d). Recalling the approximate relationship between the polar gap size and the maximum recoverable gravity signal in Section 2.2, the weaknesses of FPG-87 and FPG-93, with their 3-degree polar gaps, are expected to become much more pronounced in high-degree, e.g., 180 d/o [11], static gravity model recovery.

**Figure 3.** Absolute spherical harmonic coefficients (SHC) differences of different FPG standalone models in January 2006: (**a**) FPG-87; (**b**) FPG-89; (**c**) FPG-93; (**d**) FPG-91.

#### *3.2. Combined Models of GFO-LRI and FPG*

Since GFO carries both LRI and KBR, the combinations of GFO and FPG are carried out in GFO-LRI & FPG and GFO-KBR & FPG modes, respectively. In the GFO-LRI & FPG mode, the information from GFO-LRI will dominate the combined model because, based on Table 3, it has a much larger weight in Equations (9) and (11) than the KBR of FPG. Therefore, we can expect only minor differences among the different GFO-LRI & FPG combined models. Even so, comparisons can still be carried out by calculating the error reduction rate (ERR) of the different GFO-LRI & FPG models with respect to the GFO-LRI standalone model. Here, we use the ERR in terms of DGHE, and therefore set $q_i = DGHE_{GFO\text{-}LRI\,\&\,FPG}$ and $q_j = DGHE_{GFO\text{-}LRI}$ in Equation (16). Figure 4 plots the degree-wise $ERR^{GFO\text{-}LRI\,\&\,FPG}_{GFO\text{-}LRI}(DGHE)$ for all four GFO-LRI & FPG models, which can be regarded as the contributions of adding the information of the different FPGs to the GFO-LRI data.
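The dominance of the LRI in the combination follows directly from the range-rate SD values quoted in Section 2.3. As a rough sketch, assuming diagonal weight matrices with weights proportional to the inverse variance (the function name is hypothetical):

```python
def weight(sd, sigma0=1.0):
    """Weight of one observation for a common variance of unit weight sigma0^2."""
    return (sigma0 / sd) ** 2


SD_KBR = 0.2e-6  # m/s, KBR range-rate noise quoted in the text
SD_LRI = 10e-9   # m/s, LRI range-rate noise quoted in the text

# Relative weight of one LRI range-rate observation versus one KBR observation.
ratio = weight(SD_LRI) / weight(SD_KBR)
print(ratio)
```

The SD ratio of 20 translates into a weight ratio of about 400 per observation, so the FPG KBR data can only add a small contribution on top of the GFO-LRI data.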

As expected, the contributions are not significant. Among the four combined models, GFO-LRI & FPG-93 has the largest ERR for the majority of SHC degrees, followed by GFO-LRI & FPG-91. Moreover, for the same polar gap, the retrograde FPGs (FPG-91 and FPG-93) outperform the prograde ones (FPG-89 and FPG-87). This can be attributed to the more isotropic integrated system of GFO & retrograde-FPG compared with GFO & prograde-FPG, as learnt from the Bender-type SFF design [37]. More specifically, the orbit inclination differences between GFO (in a prograde orbit) and the retrograde FPGs are larger than those between GFO and the prograde FPGs, which enables the GFO & FPG integrated system to observe the gravity signals in a more isotropic sense. However, degradations occur at some very low degrees of the combined models, which is likely due to imperfect relative weighting of the GFO-LRI and FPG data [61].

**Figure 4.** Error reduction rate (ERR) of different GFO-LRI & FPG combined models with respect to the GFO-LRI standalone model in terms of degree geoid height error (DGHE) in January 2006.

#### *3.3. Combined Models of GFO-KBR and FPG*

More representative results are obtained with the GFO-KBR & FPG combined models. In Figure 5, we plot the true HIS signal (black curve) of January 2006, as well as the error curves (colored) of the four GFO-KBR & FPG models in terms of DGHE in this month. The results are broadly comparable with those in Figure 4, that is, the retrograde FPGs generally outperform the prograde ones with lower DGHE. Table 4 records the ERR of the GFO-KBR & FPG combined models with respect to the GFO-KBR standalone model in terms of the cumulative geoid height error (CGHE) in Equation (13), which is calculated by setting $q_i = CGHE_{GFO\text{-}KBR\,\&\,FPG}$ and $q_j = CGHE_{GFO\text{-}KBR}$ in Equation (16). It shows that the advantages of the retrograde FPGs become apparent beyond degree 50.

**Figure 5.** True HIS signal and degree geoid height errors (DGHE) of different GFO-KBR & FPG combined models in January 2006.

**Table 4.** Error reduction rate (ERR) of different GFO-KBR & FPG combined models with respect to the GFO-KBR standalone model in terms of cumulative geoid height error (CGHE) in January 2006


To observe the error patterns in the spatial domain, we draw the global spatial differences plots (SDP), introduced in Section 2.4, of the four GFO-KBR & FPG combined models above. As shown in Figure 6, noticeably fewer stripes are found in the retrograde-orbit-based GFO-KBR & FPG combined models, as seen by comparing subfigure (a) GFO-KBR & FPG-87 with (c) GFO-KBR & FPG-93, or (b) GFO-KBR & FPG-89 with (d) GFO-KBR & FPG-91. These typical north–south stripes are, to a large extent, the product of temporal aliasing errors, which are caused by errors in both the ocean tide model and the Atmosphere and Ocean De-Aliasing (AOD) product [25,26]. It seems that the retrograde FPGs, in combination with GFO, are more effective at reducing the temporal aliasing errors than the prograde ones, especially in low-latitude areas within (30°S, 30°N). However, only marginal improvement can be observed over high-latitude regions in Figure 6.

**Figure 6.** Global spatial differences plots (SDP) of different GFO-KBR & FPG combined models in January 2006: (**a**) GFO-KBR & FPG-87; (**b**) GFO-KBR & FPG-89; (**c**) GFO-KBR & FPG-93; and (**d**) GFO-KBR & FPG-91.

#### *3.4. One-Year GFO-combined Model Series of FPG-89 and FPG-91*

In view of the deficiencies of FPG-87 and FPG-93 in zonal SHC determination, we choose FPG-89 and FPG-91 for further GFO-joint gravity model recovery over the entire year 2006. We produce 12 monthly GFO-KBR & FPG models for FPG-89 and FPG-91, respectively. The monthly wRMS values of their global SDPs are shown in Figure 7. As expected, GFO-KBR & FPG-91 outperforms GFO-KBR & FPG-89, with lower wRMS values over the whole year. Furthermore, we observe slight variations in the wRMS values of the GFO-KBR & FPG-91 and GFO-KBR & FPG-89 models from month to month. April and October are therefore selected for further illustration, with the corresponding results plotted in Figure 8.

**Figure 7.** Monthly latitude-weighted root mean square (wRMS) of global spatial differences plots (SDP) for GFO-KBR & FPG-89 and GFO-KBR & FPG-91 in 2006.

**Figure 8.** True monthly HIS signal spatial plots and global spatial differences plots (SDP) of GFO-KBR & FPG-89 and GFO-KBR & FPG-91 in April and October 2006: (**a**) HIS signal in April; (**b**) HIS signal in October; (**c**) SDP of GFO-KBR & FPG-89 in April; (**d**) SDP of GFO-KBR & FPG-89 in October; (**e**) SDP of GFO-KBR & FPG-91 in April; (**f**) SDP of GFO-KBR & FPG-91 in October.

On a global scale, the results for April and October are plotted in the left and right columns of Figure 8, respectively. The subfigures in the top row show the spatial plots of the true monthly HIS signals, which are created in the same way as the SDP except that $(\Delta C_{lm}, \Delta S_{lm}) = (\overline{C}_{lm}^{HIS}, \overline{S}_{lm}^{HIS})$ in Equation (14); the middle and bottom rows show the SDPs of GFO-KBR & FPG-89 and GFO-KBR & FPG-91, respectively. As shown, GFO-KBR & FPG-91 clearly suffers less from the temporal aliasing errors, especially in low- and medium-latitude areas. The error reductions in these areas are beneficial to the scientific study of the global water cycle, since the world's major river basins (e.g., the Amazon and Yangtze Rivers), as well as the majority of the oceans, are located in the low and medium latitudes.

On a regional scale, the yearly-average wRMS values of GFO-KBR & FPG-89 and GFO-KBR & FPG-91 are calculated from their regional SDPs. The selected regions represent typical basins in low, medium, and high latitudes, as marked in Figure 9. In addition, the ocean and land areas are analyzed separately. As listed in Table 5, the superiority of GFO-KBR & FPG-91 over GFO-KBR & FPG-89 is mainly seen in the low- and medium-latitude regions, including the Amazon basin, the North Murray–Darling basin and the North China Plain (NCP). The last column of Table 5 shows the ERR of GFO-KBR & FPG-91 over GFO-KBR & FPG-89 in terms of regional wRMS, obtained by setting $q_i = wRMS^{\Omega}_{GFO\text{-}KBR\,\&\,FPG\text{-}91}$ and $q_j = wRMS^{\Omega}_{GFO\text{-}KBR\,\&\,FPG\text{-}89}$ in Equation (16). It shows that GFO-KBR & FPG-91 reduces the errors by about 22% over the oceans and 17% over land areas with respect to GFO-KBR & FPG-89. However, the ERR generally decreases toward higher latitudes and is even negative in Greenland. The series of GFO-LRI & FPG-89 and GFO-LRI & FPG-91 models are not shown here because their differences in the spatial domain are negligible.

**Figure 9.** Typical regions selected for GFO & FPG-89 and GFO & FPG-91 models' comparisons.

**Table 5.** Yearly-average latitude-weighted root mean squares (wRMS) of regional spatial differences plots (SDP) in GFO-KBR & FPG-89 and GFO-KBR & FPG-91 models. The final column shows the error reduction rate (ERR) of GFO-KBR & FPG-91 with respect to GFO-KBR & FPG-89 in terms of regional wRMS.


#### **4. Discussion**

The advantages of GFO & retrograde-FPGs over GFO & prograde-FPGs are believed to stem from two aspects. On the one hand, since prograde and retrograde orbits sample the Earth in different directions, combining the two yields an integrated system that observes the Earth from more perspectives. Furthermore, as GFO is flying in a prograde orbit, adding retrograde FPGs (rather than prograde ones) makes the GFO & FPG integrated system more isotropic [37]. On the other hand, according to Kaula's linear perturbation theory [47], the RAANs of prograde and retrograde orbits precess in opposite directions in the inertial frame. Therefore, when projected into the Earth-fixed frame, the satellite ground tracks of the two evolve at different rates since the Earth rotates eastwards. As such, the signal sampling of the two differs in the time domain. Once combined, they contribute to a more homogeneous system for signal sampling in terms of temporal resolution.
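The opposite nodal precession of prograde and retrograde orbits can be checked with the well-known first-order secular $J_2$ rate, $\dot{\Omega} = -\tfrac{3}{2}\, n\, J_2\, (R_E/a)^2 \cos i \,/\, (1-e^2)^2$. The sketch below uses this textbook formula as a stand-in for the full perturbation theory; the constants are assumed standard values and the function name is hypothetical.

```python
import math

GM = 3.986004418e14  # m^3/s^2, assumed standard geocentric gravitational constant
R_E = 6378137.0      # m, assumed equatorial radius
J2 = 1.08263e-3      # assumed (unnormalized) second zonal harmonic


def raan_rate_deg_per_day(altitude_m, inclination_deg, eccentricity=0.0):
    """First-order secular RAAN drift due to J2, in degrees per day."""
    a = R_E + altitude_m
    n = math.sqrt(GM / a ** 3)  # mean motion in rad/s
    cos_i = math.cos(math.radians(inclination_deg))
    rate = -1.5 * n * J2 * (R_E / a) ** 2 * cos_i / (1.0 - eccentricity ** 2) ** 2
    return math.degrees(rate) * 86400.0


# RAAN drift of a 500 km orbit for the two inclinations compared in this study.
for inc in (89.0, 91.0):
    print(inc, raan_rate_deg_per_day(500e3, inc))
```

The two rates have equal magnitude and opposite sign: the node of the prograde orbit drifts westwards and that of the retrograde orbit eastwards, so the two ground-track patterns evolve differently relative to the rotating Earth.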

Because polar-type satellites fly in near-polar orbits, the low- and medium-latitude areas are covered by much sparser ground tracks than the polar regions. The former, however, are exactly where most of the temporal aliasing errors appear. Therefore, improved ground-track coverage in these areas matters greatly for the final gravity solution, and that is where most of the gains of GFO & FPG-91 come from. However, we also observe a degradation of GFO & FPG-91 compared with GFO & FPG-89 at high latitudes such as Greenland, which is possibly due to the change in the pattern of the joint GFO & FPG ground tracks near the intersection of the prograde and retrograde tracks. Similar degradations at the intersection of two Bender-type orbits are also found in [60] (pp. 108–111), and this needs further investigation.

#### **5. Conclusions**

In this paper, we investigate the orbit configurations for Future Polar-type Gravity missions (FPG), with emphasis on the selection of the orbit inclination, via full-scale simulations based on both standalone and joint GFO constellations. Four monthly models in January 2006 under FPG configurations with varying inclinations (87, 89, 91, and 93 degrees) show similar performance, except that FPG-87 and FPG-93 suffer from relatively serious deficiencies in the zonal SHC estimation. To improve the GFO-only monthly models, the information from the different FPGs is added to them in turn, leading to different GFO & FPG combined models. Among the combined models, the retrograde-orbit FPGs (FPG-91 and FPG-93) generally outperform the prograde ones (FPG-89 and FPG-87). The advantages are revealed not only in the spectral domain by smaller degree geoid height errors (DGHE) and cumulative geoid height errors (CGHE), but also in the spatial domain by noticeably fewer north–south stripes, especially at low and medium latitudes. A more comprehensive investigation is carried out for FPG-89 and FPG-91 by analyzing one-year time series of GFO-KBR joint monthly models, which further highlights the superiority of FPG-91 over FPG-89. Detailed regional analysis shows that the improvements mainly reside in the low- and medium-latitude regions. The underlying mechanism is explained by the more isotropic GFO & FPG observation system that integrates two different orbit types (prograde and retrograde). Based on our study, the FPG configuration with 91-degree inclination is therefore recommended for the further orbit design of FPG.

**Author Contributions:** Y.S. conceived and designed the experiments; Y.N. and Q.C. performed the experiments; Y.N. and Q.C. analyzed the data; Y.N. wrote the paper; Y.S. and Q.C. revised the manuscript.

**Funding:** This research was funded by the Natural Science Foundation of China (41731069) and Alexander von Humboldt Foundation.

**Acknowledgments:** We are grateful to Wei Feng for the GRACE Matlab Toolbox (GRAMAT) [62] used in some figures' plotting of the article. The first author would like to thank Weiwei Li from Shandong University of Science and Technology and Tianyi Chen from Tongji University for their valuable suggestions on preparing the article. Two anonymous reviewers and the academic editor are sincerely appreciated for improving the article.

**Conflicts of Interest:** The authors declare no conflict of interest.

#### **References**


© 2019 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).

## *Article* **Gravity Field Recovery Using High-Precision, High–Low Inter-Satellite Links**

#### **Markus Hauk \* and Roland Pail**

Institute of Astronomical and Physical Geodesy, Technical University of Munich, Arcisstrasse 21, 80333 Munich, Germany; roland.pail@tum.de

**\*** Correspondence: markus.hauk@tum.de

Received: 5 February 2019; Accepted: 27 February 2019; Published: 5 March 2019

**Abstract:** Past temporal gravity field solutions from the Gravity Recovery and Climate Experiment (GRACE), as well as current solutions from GRACE Follow-On, suffer from temporal aliasing errors due to undersampling of the signal to be recovered (e.g., hydrology), which manifest as stripes caused by the north–south observation direction. In this paper, we investigate the potential of the proposed mass variation observing system by high–low inter-satellite links (MOBILE) mission. We quantify the impact of instrument errors of the main sensors (inter-satellite link and accelerometer) and of high-frequency tidal and non-tidal gravity signals on the achievable performance of the temporal gravity field retrieval. The multi-directional observation geometry of the MOBILE concept, with a strong dominance of the radial component, results in a close-to-isotropic error behavior, and the retrieved gravity field solutions show temporal aliasing errors reduced by at least 30% for non-tidal, as well as tidal, mass variation signals compared to a low–low satellite pair configuration. The quality of the MOBILE range observations enables the application of extended alternative processing methods, leading to a further reduction of temporal aliasing errors. The results demonstrate that such a mission can help to improve the understanding of different components of the Earth system.

**Keywords:** mass transport in the Earth system; GRACE and GRACE follow-on mission; current and future observation concepts and instruments

#### **1. Introduction**

In times of a changing climate, the need for innovative observation techniques for capturing geophysical processes in the Earth system becomes increasingly urgent. In this context, the observation of the temporal gravity field by satellites plays an important role when investigating, e.g., rapid changes in the cryosphere, oceans, water cycle, and solid Earth processes on a global scale. For the determination of temporal gravity fields, in the last decade satellite missions such as GRACE [1] or Challenging Minisatellite Payload (CHAMP) [2,3] orbited the Earth and helped to gain a better understanding of the Earth's mass flux signals. CHAMP was based on high–low satellite-to-satellite tracking (SST) exploiting the Global Positioning System (GPS) [4] over a time span of 10 years. The accuracy of the CHAMP orbit information of 2–3 cm [5] derived from GPS allowed for resolving only the long-wavelength part of the time-varying gravity field, with spatial scales of ≈1000 km, e.g., Baur 2013 [6]. Analyzing the perturbed orbit of other Low Earth Orbiters (LEO), such as the Swarm satellites, allows for a similar performance [7]. The GRACE mission reached spatial scales of the temporal gravity field of ≈300 km and below due to a combination of K-band microwave low–low inter-satellite ranging with micrometer precision between two identical satellites following each other in the same orbit at a distance of about 220 km, and high–low GPS satellite-to-satellite tracking plus accelerometer observations. These missions improved our knowledge of water mass variations on the continents, in the oceans, and in the atmosphere to a great extent. Additionally, the static gravity field retrieved from the Gravity field and steady-state Ocean Circulation Explorer (GOCE) [8] mission has improved our knowledge of the long-term static mass distribution, and has provided the physical reference surface of the geoid with centimeter precision at a spatial resolution down to 70–80 km.

The observation of the Earth's gravity field will be continued by the GRACE Follow-On mission [9], which was successfully launched in May 2018. The instruments have been slightly modified compared to those used in GRACE, and additionally the GRACE Follow-On mission includes an inter-satellite laser ranging interferometer as a demonstrator [10], resulting in an increased accuracy of the SST observations to a few nanometers.

One of the main error contributions when observing the time variable gravity field results from geophysical signals with periods shorter than the temporal resolution of the satellite mission, as they alias into the solutions. Traditionally, the temporal resolution of the GRACE data products has been 30 days [1], leading to temporal aliasing effects due to non-tidal mass variations, especially atmospheric, oceanic, and hydrological signals having large amplitudes in the high-frequency range [11], as well as tidal signals with mainly semi-diurnal and diurnal periods. The temporal aliasing errors in GRACE appear in terms of striping effects, which are caused by the anisotropic error behavior resulting from along-track inter-satellite ranging. This type of error pattern also remains for the GRACE Follow-On mission because temporal aliasing errors clearly dominate the error budget of gravity field retrieval [12–15], while instrument errors of the laser interferometer only play a minor role in the total error budget.

In general, there are different approaches to dealing with temporal aliasing errors. The gravity field solutions retrieved by GRACE are typically treated with de-striping and filtering techniques (e.g., References [16–19]), which are applied a posteriori in order to reduce the striping effects. A further de-aliasing method was proposed by Watkins et al. 2015 [20], where observations from GRACE are processed using spherical cap mascons, resulting in higher resolution for smaller spatial regions. In the context of a Next Generation Gravity Field Mission (NGGM), various satellite constellations enabling a self-de-aliasing of high-frequency signals were investigated, such as the dual-tandem Bender-type mission [21], consisting of one near-polar pair and one inclined pair. Such a constellation allows the combination of two anisotropic measurements taken in different directions, which increases the isotropy of the combined system. A further concept is the pendulum formation, where two satellites are on slightly shifted orbit planes in such a way that the line of sight between the satellites contains not only along-track components, but also cross-track components [22]. Such innovative satellite constellations allow the application of improved gravity field processing methodologies in order to exploit the full potential of gravity field solutions with enhanced spatial and temporal resolution. Wiese et al. 2011 [23] proposed a method where low-resolution gravity field solutions are co-parameterized for short periods (e.g., daily) together with the long-term solutions (e.g., monthly) in order to mitigate the non-tidal high-frequency signals (especially atmosphere, ocean, and to some extent, also hydrology).
A method to reduce tidal aliasing errors proposed by Hauk & Pail 2018 [24] aims at a co-parameterization of ocean tide parameters over time spans of several years, where the estimated tide model is used for de-aliasing during gravity field retrieval in a second processing step.

In this paper, we take up the idea of new satellite constellations using an innovative observation concept of high-precision high–low inter-satellite ranging, which was proposed as the mass variation observing system by high–low inter-satellite links (MOBILE) mission [25] in response to the Earth Explorer 10 call of the European Space Agency (ESA). The observation geometry and the measuring principle of this satellite constellation are based on the Geodesy and Time Reference in Space (GETRIS) [26] concept, which describes a global space-borne infrastructure for data transfer, clock synchronization, and ranging, where gravity field recovery can be one of the first beneficiary applications of such an advanced geodetic space infrastructure.

In this study, we analyze the potential of the MOBILE mission qualitatively and quantitatively, and compare gravity field solutions with GRACE Follow-On-like solutions by means of full-scale numerical simulations. Furthermore, the potential of an extended processing method for the reduction of temporal aliasing is investigated for the MOBILE concept. In the following Section 2, the MOBILE observation configuration is described, while Section 3 gives an overview of the simulation environment including observation equations and stochastic modelling. In Section 4 the estimated gravity field solutions are analyzed and assessed. The main conclusions are summarized in Section 5, and in Section 6, a short outlook is given.

#### **2. Mission Concept**

#### *2.1. Observation Geometry*

In contrast to the past GRACE and the current GRACE Follow-On missions, which are based on LEO satellites (altitudes of several hundred km), the MOBILE minimum configuration consists of a constellation of two high and one low orbiting satellites. As for GRACE and GRACE Follow-On, the main observable is the gravity-induced inter-satellite distance change. In the case of MOBILE, however, it is measured between medium orbiting satellites (MEO; altitudes of several thousand km) and LEO satellites. As a second gravity observation type, high-precision orbit positions based on Global Navigation Satellite System (GNSS) orbit determination are used. This idea of high-precision high–low tracking was first investigated by Hauk et al. 2017 [26] using the inter-satellite link technique as part of the payload on board future-generation Galileo satellites in connection with LEO satellites, where the main error sources and the corresponding achievable performance were analyzed. Dedicated MEO satellites were included in the concept for two reasons: the MOBILE constellation is a stand-alone concept that does not require an additional payload on another space infrastructure, and the very large distance between the high and the low orbiting satellites plays a crucial role in high-precision high–low tracking. It should be emphasized that, alternatively, a constellation of one MEO and two LEO satellites could be envisaged, which turns out to provide nearly the same performance as the proposed one, but might be more expensive due to the need to build and maintain at least two LEO satellites in orbit.

Figure 1 shows a schematic overview of the MOBILE satellite formation. The orbit parameters, which are used in the simulation environment to build up different satellite constellations, are listed in Table 1. The MEO satellites orbit at an altitude of about 10,150 km in the same orbital plane, separated by a 180-degree mean anomaly as alternating targets of the LEO satellite, in order to maximize the visibility and thus the observation time. The LEO satellite orbits at an altitude of about 360 km. Both LEO and MEO satellites fly in polar orbits in order to maintain a long-term stable formation (no relative drifts of the orbit planes). Additionally, two LEO satellites in near-polar orbits at an altitude of about 470 km with an inter-satellite distance of 200 km are set up in order to perform comparability studies between the MOBILE constellation and a GRACE Follow-On-like mission. All orbits have certain repeat cycles, after which the satellites reach the same position relative to the Earth again, in order to maintain a stable ground track pattern and a correspondingly stable gravity model quality. The choice of the orbit height of the MEO satellites is subject to three major constraints: (1) A high altitude of several thousand kilometers is necessary in order to ensure long observation periods and preferably measurements of multi-directional distance variations, with a strong dominance of the radial component, resulting in a close to isotropic error behavior of the retrieved gravity field solution (see Section 4). (2) The distance between a MEO–LEO pair must not be too large, because the larger the distance, the more difficult it is to fulfill the 1-μm accuracy requirement for the inter-satellite link established by a laser range interferometer (see Section 2.2). (3) The third constraint is driven by the radiation belts encircling the Earth, in which energetic charged particles are trapped by the Earth's magnetic field [27] and whose intensity depends on the solar cycle, altitude, and inclination of the satellite orbit. As a result, altitude ranges of several thousand kilometers below the chosen orbit height are ruled out. These conditions, combined with the repeat orbit requirement, lead to an altitude of about 10,000 km for the MEO satellites.
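As a rough plausibility check of the altitudes quoted above, the revolution period of a circular orbit follows from Kepler's third law. The sketch below is only an illustration under simplifying assumptions (spherical Earth, unperturbed two-body motion); the constants and the function name `orbital_period` are chosen here for illustration and are not part of the simulator used in this study. The repeat cycles of the simulated orbits additionally depend on perturbations that are ignored in this sketch.

```python
import math

MU_EARTH = 3.986004418e14   # m^3/s^2, gravitational parameter of the Earth
R_EARTH = 6378.137e3        # m, equatorial radius used as altitude reference (assumption)


def orbital_period(altitude_m: float) -> float:
    """Keplerian period in seconds of a circular orbit at the given altitude."""
    semi_major_axis = R_EARTH + altitude_m
    return 2.0 * math.pi * math.sqrt(semi_major_axis**3 / MU_EARTH)


# Altitudes as quoted in the text (rounded values)
for name, altitude_km in [("MEO", 10150.0), ("MOBILE LEO", 360.0), ("low-low pair", 470.0)]:
    period = orbital_period(altitude_km * 1e3)
    print(f"{name}: period = {period / 60.0:.1f} min, {86400.0 / period:.2f} rev/day")
```

Under these assumptions, a MEO completes roughly four revolutions per day and the LEO satellites between 15 and 16, which illustrates why a single LEO sees the two MEOs as alternating targets several times per MEO revolution.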

The high–low tracking concept enables a multi-directional observation geometry with elevation angles ranging from 3° (assumed minimum elevation angle of visible MEOs observed by the LEO) up to a near-radial direction. However, due to the observation geometry of the MEO–LEO satellite pairs and the switching of the satellite link from one MEO to the other, data gaps arise for every satellite pair, leading to a non-continuous measurement time series of these pairs. For the simulated MOBILE constellation, this results in a maximum ranging window of 45 min and a maximum data gap of 18 min. The separation of the two MEO satellites by a 180-degree mean anomaly is chosen to keep the data gaps as short as possible. In Figure 2, the LEO ground track of the MOBILE concept is displayed for 1 day together with the corresponding elevation angles.


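**Table 1.** Orbit parameters of the simulated satellites.

| Satellite | Altitude (km) | Inclination (°) | Repeat cycle (revolutions/days) | Mean anomaly (°) |
|---|---|---|---|---|
| MEO 1 | 10,149 | 90 | 124/30 | 0 |
| MEO 2 | 10,149 | 90 | 124/30 | 180 |
| Low–low 1 | 467 | 89 | 412/27 | 53.21 |
| Low–low 2 | 467 | 89 | 412/27 | 51.51 |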


**Figure 2.** One-day LEO ground track of the MOBILE mission constellation together with the elevation angle of visible MEOs observed by the LEO.

#### *2.2. Instrumentation*

The main observables of the MOBILE mission are range measurements from the LEO to the MEOs, where the MEOs are alternating targets. The ranging accuracy must be at the micrometer level in order to be sensitive to gravitational forces and their changes on Earth. For distances of several thousand kilometers, a laser-based distance measurement system can reach such an accuracy. The laser range interferometer is placed on the LEO satellite, while the MEOs are equipped with passive reflectors or transponders. In the case of the GRACE Follow-On mission, the measurement of inter-satellite ranges by laser range interferometry (LRI) has been successfully established. The link between the two satellites is generated with an active laser on one satellite and a phase-locked amplifying transponder on the second spacecraft [10]. For the MOBILE concept, the laser ranging instrument needs to be adapted due to the very large distance and the relative motion of the LEO and MEO satellites. In contrast to GRACE Follow-On, the large distance and the relative speed lead to Doppler shifts of several GHz compared to a few MHz, which requires a reference laser source with a larger range of reference frequencies and a faster phase-tracking capability than implemented for GRACE Follow-On. The required parameters (<10 GHz range, <10 MHz/s tracking) are within the range of existing, space-qualified reference lasers (e.g., the one used for the ATmospheric LIDar (ATLID) instrument on the Earth Clouds, Aerosols and Radiation Explorer (EarthCARE) mission) [28], but their compatibility with the needs of an interferometric instrument has to be the subject of further studies. The relative motion of the LEO and MEO satellites also calls for pointing and tracking capabilities, which require a modified link implementation. The LEO satellite is selected to play the active part in the tracking mechanism, while the partner satellites (MEOs) are equipped with passive retroreflectors.
This type of laser tracking and ranging has been successfully performed for decades with active laser systems on the ground and passive retroreflectors on satellites in orbit (e.g., Laser Geodynamics Satellite (LAGEOS), Ball Lens In The Space (BLITS)) [29,30]. The scientific benefit of deploying a passive payload in space is the significantly increased mission duration when compared to complex active payloads. The main technological challenge in utilizing this setup for an LRI instrument is the need to achieve a sufficiently high level of received power without amplification between the two passes, ideally close to the 80 pW received by the GRACE Follow-On implementation, but at least above ≈1 pW in order to allow phase tracking. The main design factors impacting the received power are the initial output power, the size of the retroreflector, and the size of the receiving telescope.
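The orders of magnitude of the Doppler shifts mentioned above can be made plausible with the first-order Doppler relation. The following sketch is a back-of-the-envelope illustration only: the laser wavelength of 1064 nm, the line-of-sight velocities, and the function name `doppler_shift_hz` are assumptions made here, and a round-trip (retroreflector or transponder) link is assumed, which doubles the one-way shift.

```python
def doppler_shift_hz(range_rate_m_s: float,
                     wavelength_m: float = 1064e-9,
                     two_way: bool = True) -> float:
    """First-order Doppler shift of a laser link for a given line-of-sight velocity.

    wavelength_m: assumed laser wavelength (1064 nm, illustrative default)
    two_way:      True for a round-trip link, where the shift is doubled
    """
    factor = 2.0 if two_way else 1.0
    return factor * range_rate_m_s / wavelength_m


# Illustrative line-of-sight velocities (assumptions, not mission requirements)
print(f"high-low link, 5 km/s: {doppler_shift_hz(5000.0) / 1e9:.1f} GHz")
print(f"low-low link,  1 m/s : {doppler_shift_hz(1.0) / 1e6:.1f} MHz")
```

A line-of-sight velocity of a few kilometers per second thus maps to several GHz, whereas the meter-per-second relative velocities of a low–low pair stay in the MHz range.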

Satellite on-board sensors play an important role in the gravity field retrieval because their correlated noise affects the satellite observations. In our study, the error assumptions used for the laser ranging instrument in the MOBILE concept are based on the time-series provided by Schäfer et al. 2013 [31], which originated in connection with ESA's GETRIS study, and show micrometer ranging accuracy around 1 MHz. For simulation purposes, this time-series was adapted by means of a cascaded second-order Butterworth autoregressive moving-average (ARMA) filter model. The spectral behavior of the LRI is shown in Figure 3 (light green curve) in terms of an amplitude spectral density (ASD). The relative distance measurement errors assumed for the low–low satellite pair are identical to those used in the frame of the ESA project Assessment of Satellite Constellations for Monitoring the Variations in Earth's Gravity Field (SC4MGV) [32], provided by the consultancy support of Thales Alenia Space Italia, and show a performance of several tens of nanometers. The corresponding analytical noise model of the laser interferometer is given by the ASD in terms of range-rates (Figure 3, light blue curve):

$$d_{\text{range-rates}} = 2 \times 10^{-8} \cdot 2\pi f \cdot \sqrt{\left(\frac{10^{-2}\,\text{Hz}}{f}\right)^2 + 1}\ \frac{\text{m}}{\text{s}\sqrt{\text{Hz}}}. \tag{1}$$

The generation of all noise time-series was done by scaling the spectrum of normally distributed random time-series with their individual spectral model.
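The spectral scaling described above can be sketched as follows. This is a minimal illustration, not the code used in this study: white Gaussian noise is transformed to the frequency domain, multiplied by the target ASD of Equation (1), and transformed back. The function names, the one-day length of the series, and the normalization convention (one-sided ASD) are assumptions of this sketch.

```python
import numpy as np


def laser_range_rate_asd(f):
    """Equation (1): ASD of the low-low range-rate noise in m/s/sqrt(Hz)."""
    f = np.asarray(f, dtype=float)
    return 2e-8 * 2.0 * np.pi * f * np.sqrt((1e-2 / f) ** 2 + 1.0)


def colored_noise(asd_func, n, dt, rng):
    """Scale the spectrum of white noise so that its one-sided ASD follows asd_func."""
    white = rng.standard_normal(n)             # unit-variance white noise
    spectrum = np.fft.rfft(white)
    freqs = np.fft.rfftfreq(n, d=dt)
    scale = np.zeros_like(freqs)
    scale[1:] = asd_func(freqs[1:])            # skip f = 0, so the mean becomes zero
    # Unit-variance white noise sampled at dt has a one-sided ASD of sqrt(2*dt);
    # dividing by it makes the ASD of the output equal to asd_func.
    spectrum *= scale / np.sqrt(2.0 * dt)
    return np.fft.irfft(spectrum, n=n)


rng = np.random.default_rng(0)
noise = colored_noise(laser_range_rate_asd, n=17280, dt=5.0, rng=rng)  # one day at 5 s
print(f"standard deviation of the range-rate noise: {noise.std():.2e} m/s")
```

The same helper can be reused for any other sensor by passing a different ASD function, which mirrors the statement that each noise time-series is scaled with its individual spectral model.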

**Figure 3.** Amplitude spectral density (ASD) of the relative distance measurement errors in terms of range-rates for the low–low pair formation (dark blue) and for MOBILE (dark green). The generation of the noise time-series was done by scaling the spectrum of normally distributed random time-series with their individual spectral model (light blue and light green curves).

The non-gravitational forces are typically sensed by the on-board accelerometers located in the center-of-mass of the satellite. In case of the LEO satellites, the implementation of an accelerometer is absolutely necessary due to air drag as the main contributor. For the low–low pair a GRACE-like electrostatic accelerometer is assumed with two highly sensitive axes oriented in the flight direction (largest signal) and in the radial direction, and one low-sensitive axis in the cross-track direction (see Figure 4, blue and red curves). The accuracy level in terms of accelerations is derived by Iran Pour et al. 2015 [32], and is expressed by:

$$d_{\text{acc},x} = d_{\text{acc},z} = 10^{-11} \sqrt{\frac{\left(\frac{10^{-3}\,\text{Hz}}{f}\right)^4}{\left(\frac{10^{-5}\,\text{Hz}}{f}\right)^4 + 1} + 1 + \left(\frac{f}{10^{-1}\,\text{Hz}}\right)^4}\ \frac{\text{m}}{\text{s}^2 \sqrt{\text{Hz}}} \tag{2}$$

$$d_{\text{acc},y} = 10 \cdot d_{\text{acc},x} \tag{3}$$

with x denoting along-track, y across-track, and z (close to) the radial direction. Based on the heritage of previous gravity missions, we seek for MOBILE a resolution at the level of 10<sup>−11</sup> m/s<sup>2</sup>, which is the same as assumed for the along-track and radial axes of the accelerometer on board the low–low satellite pair, but ideally with the same performance in all three directions. Furthermore, the slope at frequencies of 10<sup>−3</sup> Hz and lower is reduced from 1/f<sup>2</sup> for the low–low pair to 1/f for MOBILE. The performance of the relative acceleration measurement error is displayed in Figure 4 (green curve). While an accelerometer is mandatory for the MOBILE LEO satellite, less stringent requirements might apply for the MEOs because of the substantially smaller amplitude of the signal and the fact that non-conservative forces can be modelled much more accurately at high altitudes. Also, the design of the MEOs could be optimized for a high predictability of non-gravitational forces, e.g., by implementing very simple geometrical surfaces wherever radiation pressure is relevant. Nevertheless, the implementation of accelerometers is proposed in the MOBILE concept.
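To make the shape of the accelerometer noise model of the low–low pair tangible, the sketch below evaluates an ASD of the type given in Equations (2) and (3). It rests on an assumption about the functional form: a 1/f<sup>2</sup> rise below 10<sup>−3</sup> Hz that levels off below 10<sup>−5</sup> Hz, a flat floor of 10<sup>−11</sup> m/s<sup>2</sup>/√Hz, and a rise above 10<sup>−1</sup> Hz. The function names are hypothetical, and the MOBILE variant with its 1/f slope is not modelled here.

```python
import numpy as np


def acc_asd_sensitive(f):
    """Assumed ASD (m/s^2/sqrt(Hz)) of the sensitive along-track and radial axes."""
    f = np.asarray(f, dtype=float)
    low = (1e-3 / f) ** 4 / ((1e-5 / f) ** 4 + 1.0)   # 1/f^2 rise in the ASD, flattening below 1e-5 Hz
    high = (f / 1e-1) ** 4                            # rise above 1e-1 Hz
    return 1e-11 * np.sqrt(low + 1.0 + high)


def acc_asd_cross_track(f):
    """Less sensitive cross-track axis: ten times the sensitive-axis noise."""
    return 10.0 * acc_asd_sensitive(f)


for freq in (1e-4, 1e-3, 1e-2):
    print(f"f = {freq:.0e} Hz: x/z {acc_asd_sensitive(freq):.2e}, "
          f"y {acc_asd_cross_track(freq):.2e} m/s^2/sqrt(Hz)")
```

In this form the noise floor is reached around 10<sup>−2</sup> Hz, while one decade below the 10<sup>−3</sup> Hz corner the noise is already about two orders of magnitude larger, which is the behavior the reduced low-frequency slope for MOBILE is meant to improve.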

**Figure 4.** Amplitude spectral density (ASD) of the relative acceleration measurement error in terms of accelerations for the low–low pair formation (dark blue and magenta) and for MOBILE (dark green). The generation of the noise time-series was done by scaling the spectrum of normally distributed random time-series with their individual spectral model (light blue, red, and light green curves).

Geo-location of satellite observations, as well as gravity retrieval, requires highly accurate continuous orbit determination, making GNSS space receivers on all satellites obligatory. In our simulations, we assume absolute kinematic positioning at the cm level. Using a laser ranging instrument as the main measurement system requires exact pointing of the tracking antenna on the order of 10 μrad or better, and therefore the implementation of systems for attitude determination and control. We assume star camera sensor errors for all satellites represented as rotation angles around the along-track (roll), cross-track (pitch), and radial (yaw) axes, expressed by the ASD of the following analytical noise models [32]:

$$d_{\text{roll}} = 10^{-5} \sqrt{\frac{\left(\frac{10^{-5}\,\text{Hz}}{f}\right)^4}{\left(\frac{10^{-5}\,\text{Hz}}{f}\right)^4 + 1} + 1}\ \frac{\text{rad}}{\sqrt{\text{Hz}}} \tag{4}$$

$$d_{\text{pitch}} = d_{\text{yaw}} = 2 \times 10^{-6} \sqrt{\left(\frac{10^{-2}\,\text{Hz}}{f}\right)^2 \left(\left(\frac{10^{-1}\,\text{Hz}}{f}\right)^2 + 1\right) + 1}\ \frac{\text{rad}}{\sqrt{\text{Hz}}}. \tag{5}$$

In addition, for the MOBILE LEO satellite, a drag-reduction system needs to be implemented in order to maintain the orbit, and not to saturate the accelerometers due to non-gravitational accelerations. The MEO satellites will very likely require an electrical propulsion system to move to their target orbit from the lower separation altitude achievable with a low-cost launcher.

#### **3. Simulation Environment**

All simulations were executed with a full numerical mission simulator [33,34], which has already been successfully applied to recover satellite-only gravitational field models from GOCE data [35]. The simulation environment is based on numerical orbit integration, following a multistep method for the numerical integration according to Shampine & Gordon 1976 [36], which applies a modified divided difference form of the Adams predict-evaluate-correct-evaluate (PECE) formulas and local extrapolation. According to this method, the order and the step size are adjusted to control the local error per unit step in a generalized sense. The generation of "true" dynamic orbits and, subsequently, the "true" GNSS high–low SST and low–low laser ranging SST observations, is done by adding different force models according to the "true" world of Table 2. The impact of orbit errors on the gravity field processing is taken into account as well by propagating 1 cm white noise of the integrated orbit positions of each satellite. The resulting erroneous dynamic orbits serve as computational points for the reference values of the observations and enable the computation of the GNSS high–low SST observations in three directions. In order to ensure the accuracy of the inter-satellite link, error-free dynamic orbits are used for the reference values of the low–low SST observations from the laser interferometer system, which are expressed in terms of range-rates.
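The predict-evaluate-correct-evaluate cycle mentioned above can be illustrated with a strongly simplified integrator. The sketch below uses a fixed-step, fixed-order (fourth-order) Adams–Bashforth–Moulton pair with a Runge–Kutta start-up, applied to unperturbed two-body motion. It is not the variable-order, variable-step divided-difference formulation of Shampine & Gordon used in the simulator; all names, constants, and the force model are assumptions for illustration.

```python
import numpy as np

MU = 3.986004418e14  # m^3/s^2, gravitational parameter of the Earth


def two_body(t, y):
    """Right-hand side of the unperturbed equations of motion, y = [r, v]."""
    r = y[:3]
    return np.concatenate((y[3:], -MU * r / np.linalg.norm(r) ** 3))


def rk4_step(f, t, y, h):
    """Single classical Runge-Kutta step, used only to start the multistep scheme."""
    k1 = f(t, y)
    k2 = f(t + h / 2, y + h / 2 * k1)
    k3 = f(t + h / 2, y + h / 2 * k2)
    k4 = f(t + h, y + h * k3)
    return y + h / 6 * (k1 + 2 * k2 + 2 * k3 + k4)


def abm4_pece(f, t0, y0, h, n_steps):
    """Fixed-step 4th-order Adams-Bashforth-Moulton integrator in PECE mode."""
    t, y = t0, np.asarray(y0, dtype=float)
    derivs = [f(t, y)]
    startup = min(3, n_steps)
    for _ in range(startup):                 # start-up values from RK4
        y = rk4_step(f, t, y, h)
        t += h
        derivs.append(f(t, y))
    for _ in range(n_steps - startup):
        f0, f1, f2, f3 = derivs[-1], derivs[-2], derivs[-3], derivs[-4]
        y_pred = y + h / 24 * (55 * f0 - 59 * f1 + 37 * f2 - 9 * f3)   # Predict
        f_pred = f(t + h, y_pred)                                      # Evaluate
        y = y + h / 24 * (9 * f_pred + 19 * f0 - 5 * f1 + f2)          # Correct
        t += h
        derivs.append(f(t, y))                                         # Evaluate
        derivs.pop(0)
    return t, y


# Circular orbit at 360 km altitude, integrated over one revolution (about 5 s steps)
r0 = 6378.137e3 + 360e3
v0 = np.sqrt(MU / r0)
period = 2.0 * np.pi * np.sqrt(r0**3 / MU)
n = 1100
t_end, y_end = abm4_pece(two_body, 0.0, [r0, 0.0, 0.0, 0.0, v0, 0.0], period / n, n)
closure = np.linalg.norm(y_end[:3] - np.array([r0, 0.0, 0.0]))
print(f"position closure error after one revolution: {closure:.2e} m")
```

The closure error after one revolution gives a quick impression of the integration accuracy; a production integrator would adapt order and step size to keep the local error below a prescribed tolerance.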


**Table 2.** Force and noise models of the "true" and "reference" world used in the simulations.

The adopted gravity field approach is based on a modification of the integral equation approach from Schneider 1969 [37] where the orbit is divided into continuous short arcs of 6 h length, and the position vectors at the arc node points are set up as unknown parameters, which are estimated together with the gravity field coefficients. This technique has already been successfully applied in real data applications to recover satellite-only gravitational field models for CHAMP and GRACE [38] (ITSG-Grace2016) [39]. The functional model follows the typical formulation used for low–low SST missions like GRACE, which comprises a high–low SST and a low–low SST component. Position differences between two satellites are used for the computation of the reference values for the high–low SST part of the observation system, whereas the reference values for the low–low SST part are derived by projecting position and velocity differences between two satellites onto the line-of-sight, leading to the computation of inter-satellite range-rates. Table 2 gives an overview of the force and noise models used in the processing for the "true" and "reference" world. The static gravity field model is represented by the GOCO03s model, which is a satellite-only gravity field model based on GRACE, GOCE, and LAGEOS [40]. In order to simulate geophysical signals, ESA's updated Earth system model [41] has been used, which contains the five main geophysical signal components atmosphere (A), ocean (O), hydrology (H), ice (I), and solid Earth (S) with a time resolution of six hours, linearly interpolated to the epochs. The Earth system model covers the time period 1995–2006, and contains plausible variability and trends in both low-degree coefficients and the global mean eustatic sea level. 
It depicts reasonable mass variability all over the globe at a wide range of frequencies including multi-year trends, year-to-year variability, and seasonal variability, even at very fine spatial scales, which is important for a realistic representation of spatial aliasing and leakage. The impact of ocean tide model errors is assessed by taking the difference of two tide models, EOT11a [42], and GOT4.7 [43].
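The projection of position and velocity differences onto the line of sight, which yields the inter-satellite range-rate used as reference value for the low–low SST part, can be sketched in a few lines. The function name and the example state vectors are illustrative assumptions.

```python
import numpy as np


def range_and_range_rate(r1, v1, r2, v2):
    """Inter-satellite range and range-rate from two satellite state vectors.

    The range-rate is the relative velocity projected onto the line-of-sight
    unit vector pointing from satellite 1 to satellite 2.
    """
    dr = np.asarray(r2, dtype=float) - np.asarray(r1, dtype=float)
    dv = np.asarray(v2, dtype=float) - np.asarray(v1, dtype=float)
    rho = np.linalg.norm(dr)
    e12 = dr / rho                      # line-of-sight unit vector
    return rho, float(np.dot(dv, e12))


# Simple example: 5 m separation, relative velocity of 1 m/s along the x-axis
rho, rho_dot = range_and_range_rate([0, 0, 0], [0, 0, 0], [3, 4, 0], [1, 0, 0])
print(rho, rho_dot)
```

Only the velocity component along the line of sight contributes, which is why a purely along-track link samples the gravity field anisotropically, whereas links with changing directions collect information in several directions.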

The total stochastic model for the observations is approximated individually for both satellite formations by means of a cascade of digital Butterworth ARMA filters [44,45]. Filter coefficients are chosen in such a way that the cascade's frequency response optimally matches the inverse of the amplitude spectrum of the previously generated pre-fit residuals. They are estimated as a result of the computation of the linearized normal equations, which include differences between the "true" (only the static GOCO03s gravity field model and sensor noise are included) and the reference observations (only the static GOCO03s gravity field model is included), such that the error sources from the sensors are considered exclusively. Assuming uncorrelated high–low and low–low SST observations, weighting matrices are set up for all observation components separately.

The goal is the retrieval of all spherical-harmonic (SH) coefficients up to a maximum SH degree of 100 from observations sampled every 5 seconds for the first 30 days of the year 2001. Because the observation equations are non-linear, they are linearized, and the "reference" observations are subtracted from the "true" observations. The gravity field parameters are estimated by solving the full normal equations of a least-squares system based on a standard Gauss–Markov model, using weighted least squares with stochastic models in accordance with the simulated instrument noise levels. The resulting gravity field coefficients are analyzed and compared regarding quality and performance in terms of retrieval errors, obtained by removing a monthly average of the true mass transport model from the recovered signal.
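The weighted least-squares estimation in a Gauss–Markov model can be illustrated with a toy problem. The sketch below is not the processing chain of this study: the design matrix is random instead of being derived from orbit and gravity field partials, the weight matrix is diagonal instead of being built from ARMA filter cascades, and all names and numbers are assumptions.

```python
import numpy as np


def weighted_least_squares(A, l, P):
    """Solve the Gauss-Markov model l = A x + e with weight matrix P."""
    N = A.T @ P @ A                   # normal equation matrix
    n = A.T @ P @ l                   # right-hand side
    x_hat = np.linalg.solve(N, n)     # estimated parameters
    formal_errors = np.sqrt(np.diag(np.linalg.inv(N)))  # assumes unit variance factor
    return x_hat, formal_errors


# Toy example: 200 observations, 3 parameters, two observation groups with
# different noise levels (illustrative values only)
rng = np.random.default_rng(42)
A = rng.standard_normal((200, 3))
x_true = np.array([1.0, -2.0, 0.5])
sigma = np.where(np.arange(200) < 100, 1e-3, 1e-1)    # standard deviation per observation
l = A @ x_true + sigma * rng.standard_normal(200)
P = np.diag(1.0 / sigma**2)                           # weights = inverse variances
x_hat, formal_errors = weighted_least_squares(A, l, P)
print(x_hat, formal_errors)
```

The square roots of the diagonal of the inverse normal matrix are the formal errors of the parameters; the more precise observation group dominates the solution through its larger weights.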

#### **4. Results**

#### *4.1. Gravity Field Retrieval Performance Due to Instrument Errors*

First, the impact of the instrument errors on the gravity field retrieval was quantified. For this task, we performed simulations where each error source according to the assumptions described in Section 2.2 was treated individually. Figure 5 shows the gravity field retrieval performance in terms of equivalent water height (EWH) errors per SH degree per coefficient for the low–low pair constellation and the MOBILE concept. Furthermore, the results were quantified using global RMS values of the errors in the recovered signal expressed in cm of EWH, listed in Table 3 (see part: instrument errors). If only white-noise positioning errors were considered (Figure 5, green curves), the gravity field retrieval performance mainly depended on the observation geometry. The comparison between both satellite concepts revealed strongly reduced retrieval errors for MOBILE, which benefited from multi-directional observations. In the case of accelerometer noise in combination with star camera errors (Figure 5, blue curves), the MOBILE constellation showed reduced errors compared to the low–low pair as well. This was mainly caused by the observation geometry, but also in part by the improved accelerometers (≈23%) with 3D capabilities in the case of MOBILE. In contrast, the retrieval performance of the low–low pair benefited from the nanometer accuracy of the laser interferometer compared to the micrometer accuracy of MOBILE's laser link sensor for most of the spectrum (>SH degree 10). This became evident when only laser interferometer noise was considered (Figure 5, red curves). However, in the very low degrees, the MOBILE concept performed better than the GRACE-like low–low pair, which was again due to the improved observation geometry.
When considering all instrument error sources together, the retrieval errors of the low–low satellite pair were dominated by the accelerometer plus star camera sensor performance, while for the MOBILE constellation, the laser link error was the dominating error source for SH degrees higher than 20, and the accelerometer plus star camera noise only dominated the spectrum in the lower degrees. These results led to the conclusion that the gravity field retrieval based on instrument error sources showed smaller errors below SH degree 40 for the MOBILE concept compared to the low–low pair, but increased errors in the higher-degree part of the spectrum due to the lower accuracy of the laser interferometer.
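Degree error curves in terms of EWH per coefficient are commonly obtained by scaling SH coefficient errors with a degree-dependent factor and averaging over all orders of a degree. The sketch below illustrates this conversion using the widely used relation between SH coefficients and surface mass density. It is an illustration under assumptions: the constants, the array layout `dC[n, m]`, and the function names are chosen here, and the load Love numbers are set to zero as a placeholder instead of being taken from an Earth model, so the resulting numbers are not those of this study.

```python
import numpy as np

A_EARTH = 6378136.3    # m, reference radius (assumed value)
RHO_EARTH = 5517.0     # kg/m^3, mean density of the Earth (assumed value)
RHO_WATER = 1000.0     # kg/m^3, density of water (assumed value)


def ewh_factor(degree, love_k):
    """Factor converting a dimensionless SH coefficient of a given degree to EWH in m."""
    return A_EARTH * RHO_EARTH / (3.0 * RHO_WATER) * (2 * degree + 1) / (1.0 + love_k)


def degree_rms_ewh(dC, dS, love_k):
    """RMS EWH error per coefficient for each degree from coefficient errors dC[n, m], dS[n, m]."""
    nmax = dC.shape[0] - 1
    rms = np.zeros(nmax + 1)
    for n in range(nmax + 1):
        power = np.sum(dC[n, : n + 1] ** 2) + np.sum(dS[n, : n + 1] ** 2)
        rms[n] = ewh_factor(n, love_k[n]) * np.sqrt(power / (2 * n + 1))
    return rms


# Toy example: a single error of 1e-12 in the zonal coefficient of degree 2
nmax = 4
dC = np.zeros((nmax + 1, nmax + 1))
dS = np.zeros_like(dC)
dC[2, 0] = 1e-12
love_k = np.zeros(nmax + 1)    # placeholder: load Love numbers are NOT realistic here
rms = degree_rms_ewh(dC, dS, love_k)
print(rms)
```

Because the conversion factor grows with 2n + 1, coefficient errors of equal size translate into larger EWH errors at higher degrees, which is one reason why error curves expressed in EWH rise towards short wavelengths.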

**Figure 5.** Degree (error) standard deviations after 1 month of full AOHIS signal (black), low–low pair formation (dashed curves), and MOBILE constellation (continuous curves), when including only distance measurement errors (red), acceleration plus star camera errors (blue), and orbit errors (green).

**Table 3.** Global RMS values in cm EWH for the low–low pair and MOBILE constellation considering different isolated error sources. RMS values are computed up to SH degree 100 for instrument errors, and up to SH degree 50 for temporal aliasing errors. Additionally, the global RMS value of the monthly averaged AOHIS and HIS signal is given (computed up to SH degree 50 and 100).


In addition to the SH coefficients, we estimated their formal errors, shown in Figure 6. The noise of the different sensors in combination with the observation geometry reveals the performance of a specific satellite concept. In our case, the formal errors demonstrated the impact of the MOBILE high–low tracking concept by showing an almost uniform (isotropic) error spectrum and a high sensitivity in the sectorial coefficients (SH degree equal to SH order). In the case of the low–low pair configuration, the sectorial coefficients in particular were less well determined than the zonal coefficients (SH order equal to zero). Figure 6b,d gives a closer view of the formal errors in the long-wavelength (low-degree) spectrum. The comparison between MOBILE and the low–low pair suggested that the very low SH coefficients could be determined with a higher sensitivity through the MOBILE concept. In contrast to the observations in the along-track direction of the low–low pearl-string configuration, the multi-directional observations of the high–low tracking concept with a strong dominance of the radial component enabled an improved estimation of the very low SH coefficients. The close to radial observation geometry of MOBILE is comparable to satellite laser ranging (SLR) observations, which show superior performance in observing the very long wavelength gravity field variations, in particular the zonal SH coefficient of degree 2, which physically represents the Earth's dynamic oblateness [46].

**Figure 6.** Formal error triangle plots in log10 up to SH degree and order 100 (**a**,**c**), and up to SH degree and order 10 (**b**,**d**), for the low–low pair formation (**a**,**b**), and the MOBILE constellation (**c**,**d**).

In order to make the effect of the different error spectra of both satellite concepts even more visible, spatial covariance functions were computed for a position at the equator and at 45° latitude (see Figure 7). They describe the correlation of the computation point with its neighborhood in the normal equation system due to the stochastic model used and the observation geometry. The spatial characteristics and the pattern of the covariances provide information about the spatial behavior of the retrieved signals. In our case, the figures show the typical stripes for the low–low tracking concept caused by the north–south observation direction, which are known from the GRACE temporal gravity models, while the MOBILE concept exhibits an isotropic error structure at both latitudes.

**Figure 7.** Spatial covariance functions in m<sup>2</sup> for the equator (**a**,**c**), and for 45° latitude (**b**,**d**), for the low–low pair formation (**a**,**b**), and the MOBILE constellation (**c**,**d**).

#### *4.2. Temporal Gravity Field Retrieval*

The retrieval of the temporal gravity field is dominated by temporal aliasing errors due to the undersampling of high frequency geophysical signals and imperfect de-aliasing models, which has already been shown by, e.g., References [47–49] for the GRACE mission. In order to analyze the impact of different time-varying mass signals on gravity field retrieval, we performed simulations by using signals that were subdivided into non-tidal AOHIS, HIS, and tidal signals, including the instrument errors described in Section 2.2. Figure 8 displays the corresponding retrieval errors for both satellite concepts. The results indicate that the errors with the highest signal amplitudes were related to AOHIS signals (Figure 8, red curves), and in particular to atmospheric and oceanic signals. Tidal aliasing effects played a key role in the total error budget as well (Figure 8, blue curves) by representing the highest aliasing errors next to non-tidal atmosphere and ocean aliasing errors. In this context it is important to mention that errors in ocean tide models are considered as one of the major sources of error in the determination of temporal gravity field models from GRACE data [50,51]. Our simulations show that the MOBILE configuration can reduce non-tidal aliasing errors (≈45%) as well as tidal aliasing errors (≈30%) over the whole spectrum significantly (see also Table 3, part: temporal aliasing errors). Despite the fact that for the high–low tracking concept the assumed arrangement of satellites causes incomplete data time series, the multi-directional observation geometry enables the sampling of time varying signals with reduced aliasing errors compared to the low–low pair configuration.

**Figure 8.** Degree (error) standard deviations after 1 month of full AOHIS signal (black), low–low pair formation (dashed curves), and MOBILE constellation (continuous curves), when including only AOHIS signals + instrument errors (red), ocean tide signals + instrument errors (blue), HIS signals + instrument errors (green), and HIS signals + instrument errors using the extended processing method (magenta).

Usually, high-frequency mass signals are reduced a priori based on atmosphere and ocean de-aliasing (AOD) products [52] and ocean tide de-aliasing models. The resulting temporal gravity field models thus contain mainly information on sub-seasonal, seasonal, and secular continental hydrological mass variations and ice mass variations on Earth [53,54], and solid Earth signals related to glacial isostatic adjustment (GIA) and co- and post-seismic gravity changes of big earthquakes. For the analysis of such mass flux signals, we performed simulations using only the HIS signal. The resulting gravity field retrieval errors (Figure 8, green curves) again revealed smaller aliasing effects for the MOBILE concept (≈60%), which is even more clearly visible in the spatial domain, shown in Figure 9. As already suggested by Figure 7, the retrieved HIS fields demonstrate that the error pattern of MOBILE was much more homogeneous, and the typical striping of a low–low along-track ranging system was significantly reduced, particularly in the equatorial regions where the orbit ground tracks were less dense. This resulted in a clearly improved representation of hydrological and ice mass signals for the MOBILE concept.

**Figure 9.** Global grids of EWH (cm) up to SH degree and order 50 after 1 month. The grids show the true HIS signal (**a**), the recovered signal (**b**,**d**,**f**), and the differences of the true HIS signal and the recovered signal (**c**,**e**,**g**), for the low–low pair configuration (**b**,**c**), for the MOBILE concept (**d**,**e**), and for the MOBILE concept using the extended processing method (**f**,**g**).

The high quality of multi-directional observations of the high–low tracking concept allows the application of an extended alternative processing method first proposed by Wiese et al. 2011 [23], which enables the mitigation of temporal aliasing effects due to non-tidal time varying signals, as stated in Section 1. Wiese et al. 2011 [23] demonstrated the benefit of the co-parameterization of additional daily low degree and order gravity field coefficients for Bender-type satellite constellations. We investigated the potential of this methodology regarding the MOBILE concept by simulating a monthly solution while co-estimating daily gravity fields up to SH degree and order 10, including HIS signal plus instrument errors. The resulting retrieval errors are displayed in Figure 8 (magenta curve). They revealed an error reduction of about 40% compared to the nominal solution (green curve), which led to an increased spatial resolution of about SH degree 50 (≈400 km) instead of 40 (≈500 km). The corresponding spatial plot (Figure 9f,g) shows a globally reduced aliasing pattern, especially in higher latitudes. The comparison between the true HIS signal and the MOBILE recovered signal (nominal and extended processed) displayed in Figure 9 shows that the quality of the solutions could be improved to such a level that de-striping and smoothing the solutions was no longer necessary when examining signals to degree and order 50. Therefore, a possible loss of signal by a posteriori filtering of the gravity field solutions recovered by MOBILE could be avoided.

#### **5. Conclusions**

In this study, we investigated the gravity field retrieval performance of the novel and innovative MOBILE high–low satellite tracking concept and compared it with a low–low GRACE Follow-On-like configuration qualitatively and quantitatively. Based on full numerical simulations, gravity field parameters were estimated in terms of SH coefficients by solving a least-squares system by inverting full normal equations over a time span of 1 month. The most important error sources affecting the gravity field retrieval performance, key instruments on-board the satellites, as well as time varying mass flux signals, were included in order to assess their impact on gravity field retrieval for both mission concepts.

The results regarding the instrumental impact on the gravity field solution show that the performance of the MOBILE configuration was mainly limited by the assumed micrometer accuracy of the laser interferometer, especially in the short wavelength spectrum, while the performance in the long wavelengths of the gravity field benefited from the multi-directional observation geometry and optimized 3D accelerometer. In contrast, the gravity field retrieval of the low–low pair constellation was limited mainly by the accelerometer, whose errors dominated over the nanometer accuracy of the assumed laser interferometer. The multi-directional observations of MOBILE mentioned above included a strong radial component and led to an almost uniform (isotropic) error spectrum, while the low–low tracking concept showed the typical stripes caused by the north–south observation direction. However, the high accuracy of the low–low satellite pair's inter-satellite link led to an improved gravity field performance from SH degree 40 and higher compared to MOBILE, which performed better in the long wavelength spectrum where the largest amplitudes of time varying gravity field signals occurred.

The benefit of MOBILE's multi-directional observation geometry became apparent when including tidal and non-tidal mass variation signals in the simulation process. The results revealed significantly reduced temporal aliasing errors in the recovered gravity field signal compared to the low–low tracking concept over the whole spectrum. In the case of the separate treatment of the HIS signal, the resulting gravity field error performance of MOBILE improved by as much as about 60%, and the application of an extended processing method to reduce temporal aliasing errors by co-estimation of daily gravity field parameters led to a further reduction of retrieval errors of about 40%. Furthermore, the results show that the quality of recovered MOBILE gravity field solutions could make a treatment of such solutions using a posteriori filtering techniques obsolete.

The gravity field solutions retrieved using MOBILE can contribute to an improved understanding of different components of the Earth system, such as the estimation of continental water storage and freshwater fluxes, the quantification of large-scale flood and drought events and their monitoring and forecasting, or understanding the mass balance of ice sheets and larger glacier systems, just to name a few. The application of the extended processing method implied the co-parameterized gravity field parameters (which, in our case, are daily gravity fields) as a side product. These daily solutions with low spatial resolution could aid in improving atmospheric models, and possibly be beneficial to the oceanography community as well, as many of these short-term signals have large spatial scales.

#### **6. Outlook**

The gravity field solutions of the high–low tracking concept presented in this paper were based on the minimal configuration of MOBILE. On top of this scenario, the optional implementation of a third or fourth MEO satellite, but also a second LEO, could be considered to further increase the mission performance, but also significantly improve the temporal resolution. Due to the largely passive instrumentation of this mass transport mission, the function of the MEO satellites could be implemented as a backpack application of other MEO missions, such as the Galileo next-generation satellites, in order to extend and maintain the infrastructure for laser ranging payloads. Besides, one of the most important fields of research is the mitigation of temporal aliasing errors. In this context it is important to mention that aliasing effects due to imperfect ocean tide models represent one of the largest error sources in temporal gravity field retrieval. The capability of the multi-directional observations of MOBILE to co-parameterize tidal parameters over long time spans, as proposed in Reference [24], in order to improve current ocean tide models will be the subject of further study.

**Author Contributions:** Conceptualization, M.H. and R.P.; methodology, M.H.; software, M.H.; validation, M.H. and R.P.; formal analysis, M.H.; investigation, M.H.; resources, M.H. and R.P.; data curation, M.H.; writing—original draft preparation, M.H.; writing—review and editing, R.P.; visualization, M.H.; supervision, R.P.; project administration, R.P.; funding acquisition, R.P.

**Funding:** This research was funded by *Deutsche Forschungsgemeinschaft (DFG)*, grant number *PA 1543/8-2*, and the APC was funded by the *Institutional Open Access Program of the Technical University of Munich (TUM)*.

**Acknowledgments:** A large part of the investigations presented in this paper was performed in the framework of the study "Two-way satellite tracking to provide a basis for gravity field mission scenarios—a simulation study with detailed error analysis II," Deutsche Forschungsgemeinschaft (DFG), Contract No. *PA 1543/8-2* funded by DFG.

**Conflicts of Interest:** The authors declare no conflict of interest. The funders had no role in the design of the study; in the collection, analyses, or interpretation of data; in the writing of the manuscript, or in the decision to publish the results.

#### **References**


© 2019 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).

## *Article* **High-Resolution Mass Trends of the Antarctic Ice Sheet through a Spectral Combination of Satellite Gravimetry and Radar Altimetry Observations**

#### **Ingo Sasgen <sup>1,\*</sup>, Hannes Konrad <sup>1,2</sup>, Veit Helm <sup>1</sup> and Klaus Grosfeld <sup>1</sup>**


Received: 5 November 2018; Accepted: 9 January 2019; Published: 14 January 2019

**Abstract:** Time-variable gravity measurements from the Gravity Recovery and Climate Experiment (GRACE) and GRACE-Follow On (GRACE-FO) missions and satellite altimetry measurements from CryoSat-2 enable independent mass balance estimates of the Earth's glaciers and ice sheets. Both approaches vary in terms of their retrieval principles and signal-to-noise characteristics. GRACE/GRACE-FO recovers the gravity disturbance caused by changes in the mass of the entire ice sheet with a spatial resolution of 300 to 400 km. In contrast, CryoSat-2 measures travel times of a radar signal reflected close to the ice sheet surface, allowing changes of the surface topography to be determined with about 5 km spatial resolution. Here, we present a method to combine observations from both sensors, taking into account the different signal and noise characteristics of each satellite observation that are dependent on the spatial wavelength. We include uncertainties introduced by the processing and corrections, such as the choice of the re-tracking algorithm and the snow/ice volume density model for CryoSat-2, or the filtering of correlated errors and the correction for glacial-isostatic adjustment (GIA) for GRACE. We apply our method to the Antarctic ice sheet and the time period 2011–2017, in which GRACE and CryoSat-2 were simultaneously operational, obtaining a total ice mass loss of 178 ± 23 Gt yr<sup>−1</sup>. We present a map of the rate of mass change with a spatial resolution of 40 km that is evaluable across all spatial scales, and more precise than estimates based on a single satellite mission.

**Keywords:** Mass balance; Ice Sheets; Sea-level Rise; Antarctica; GRACE; CryoSat-2; GRACE-Follow On; GRACE-FO; downward continuation; spectral methods

#### **1. Introduction**

The Antarctic ice sheet is the largest reservoir of non-oceanic water mass. Projections estimate the ice sheet's potential to raise sea levels by 15 m by the year 2500 for scenarios of unabated climate change [1]. It is currently in a state of overall decline [2,3], but regional differences are very prominent. Most of East Antarctica is relatively stable [4], while many glaciers in West Antarctica and the Antarctic Peninsula are losing mass due to enhanced discharge [5,6] and retreat [7] following the basal melt or structural collapse of ice shelves [8]. Some of these West Antarctic glaciers represent a serious threat to global sea levels as they could collapse within a few centuries, raising global mean sea level by centimeters to meters [9–11].

Satellite technology enabled much better coverage of mass balance estimates of the Antarctic and Greenland ice sheets than could be achieved with in situ or airborne measurements [2]. Three different techniques have been commonly employed to assess ice sheet mass balance: (1) The mass budget method subtracts ice discharge into the oceans and ice shelves from the net mass flux from the atmosphere onto the ice sheet's upper surface [12]. (2) Satellite gravimetry observations (mostly by the Gravity Recovery and Climate Experiment, GRACE [13]) measure the perturbations in the gravitational potential of the Earth caused by redistribution of Earth's mass. (3) Satellite altimetry enables repeat measurements of surface elevation, which can be used to quantify the total ice volume change and estimates of total mass change if the density at which volume changes occur can be inferred [14].

The gravimetric and altimetric approaches can both quantify the mass changes as spatial fields. However, their characteristics are very different. For example, the CryoSat-2 radar altimetry mission offers highly resolved maps of height changes [15]; however, it is influenced by uncertainties from the backscattering properties of snow and firn, as well as limited resolvability of terrain with steep gradients in coastal or mountainous areas [16,17]. In addition, the conversion of volume to mass changes requires knowledge of the contributions to height changes by depositional processes, which occur at the density of snow, by ice-dynamical imbalance, which occurs at the density of ice, and by firn densification, which implies a height change without a mass change [18].

GRACE, on the other hand, measures mass change of the whole ice column. Although it is not affected by the presence of different processes in the snow, firn, and ice column, GRACE has a significantly lower effective spatial resolution and needs to be corrected for all other mass change processes not related to the present-day ice mass change (for example caused by short-term mass variability in the atmosphere and the ocean). GRACE mass trends need to be corrected for the Earth's viscoelastic response to past ice changes (glacial isostatic adjustment, GIA), which is a prominent trend signal in polar regions and superimposes with the present-day mass balance (for example [19]).

Here we present a novel method to derive spatial maps of ice sheet mass change that exploits the advantages of these two techniques, while minimizing the uncertainties that are associated with each type of observation. Satellite gravimetry and laser altimetry observations of ice change have been combined before in order to separate the signals of different processes (for example [20–23]) and also to increase the spatial resolution [24]. Here, we perform for the first time, to our knowledge, a combination in the spectral domain (i.e., in terms of the representation of the spatial fields as spherical harmonic coefficients), the common representation in which the GRACE gravity fields are estimated and distributed (Level 2 data). The combination approach can be regarded as a downward-continuation of GRACE gravity measurements to mass changes on the Earth's surface, using mass change fields derived from CryoSat-2 data. We show that our approach allows overcoming the limited resolution of the GRACE data, producing a field of ice mass trends that is evaluable across all spatial scales, and more precise than that recovered by a single sensor. The method is applicable to other regions and components of the Earth system and may be useful to join GRACE/GRACE-FO data and additional measurements into new combined Level 4 data products.

#### **2. Data and Methods**

#### *2.1. GRACE Satellite Gravimetry*

GRACE observations of the gravitational potential are typically released by the respective processing centers as monthly sets of spherical harmonics coefficients for integer degree *j* and order −*j* ≤ *m* ≤ *j*. Coefficients of degree *j* = 1 have to be obtained from different observations (see below), the coefficient of *j* = 0 is constant by definition (conservation of total mass). The spatial representation of the GRACE geoid height observation at colatitude *θ*, longitude *ϕ*, and time *t* can be expressed as

$$g(\theta,\varphi,t) = \sum_{j=0}^{j_{\text{max}}} \sum_{m=-j}^{j} C_{jm}(t) \, Y_{jm}(\theta,\varphi) \tag{1}$$

with

$$Y_{jm}(\theta,\varphi) = \begin{cases} P_{jm}(\cos\theta)\,\cos(m\varphi) & \text{for } m \geq 0 \\ P_{j|m|}(\cos\theta)\,\sin(|m|\varphi) & \text{for } m < 0 \end{cases}$$

and $C_{j0} \equiv 0$ for $m < 0$, where $P_{jm}$ is the normalized associated Legendre polynomial of degree *j* and order *m* [25,26]. We utilize monthly coefficients $C_{jm}(t)$ of release 5 by the Center for Space Research, The University of Texas at Austin from *j* = 2 up to $j_{\text{max}}$ = 90 (CSR RL05; [27]). The GRACE coefficients (*j*, *m*) = (2, 0) are replaced by values from satellite laser ranging [28]. The GRACE spectrum is completed with coefficients of *j* = 1 estimated by the approach of Swenson et al. [29]. A pole drift correction has not been applied.

We decompose the time series of each coefficient by adjusting, in a least squares sense, a temporal model consisting of the most pronounced oscillations (an annual cycle at 1 year period, the S2 tidal alias with a 161-day period, and the P1 tidal alias at 171 days), as well as a linear and quadratic trend in time. For consistency with CryoSat-2 data, the adjustment period is from February 2011 to June 2017, the latter being the last available GRACE monthly solution. We create an ensemble of the linear trend $\partial g/\partial t$ by coefficient-wise random perturbation of the fitted linear trend according to the propagated standard deviation of the trend estimate (30 realizations). We subtract GIA contributions according to the three different estimates IJ05r2 [30], AGE1a [31] (this is the version independent of GRACE observations), and ICE6G [32]. Thus, we obtain 90 ensemble members representing both the uncertainties of the GRACE coefficients and the uncertainties of the GIA correction. Figure 1 shows the mean and standard deviation of the ensemble, as well as the uncertainty components of each ensemble as a degree-power spectrum. It is visible that the GIA-induced uncertainty peaks in the range of degree 5 to 15, while the measurement uncertainty of the coefficients gradually increases.
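The adjustment described above can be illustrated with a minimal sketch for a single coefficient time series. It assumes a design matrix with a constant offset (not stated in the text), linear and quadratic terms, and cosine/sine pairs at periods of 365.25, 161, and 171 days; the function names, the synthetic series, and the random seed are hypothetical and do not reproduce the processing used in this study.

```python
import numpy as np

def design_matrix(t_years, periods_days=(365.25, 161.0, 171.0)):
    """Columns: offset (assumed), linear, quadratic, then a cos/sin pair per period."""
    t = np.asarray(t_years, dtype=float)
    cols = [np.ones_like(t), t, t**2]
    for p in periods_days:
        omega = 2.0 * np.pi / (p / 365.25)  # angular frequency in rad per year
        cols += [np.cos(omega * t), np.sin(omega * t)]
    return np.column_stack(cols)

def fit_trend(t_years, series):
    """Least-squares fit; returns the linear trend and its propagated standard deviation."""
    A = design_matrix(t_years)
    x, *_ = np.linalg.lstsq(A, series, rcond=None)
    resid = series - A @ x
    dof = len(series) - A.shape[1]
    sigma2 = resid @ resid / dof               # post-fit variance of unit weight
    cov = sigma2 * np.linalg.inv(A.T @ A)      # covariance of the estimated parameters
    return x[1], np.sqrt(cov[1, 1])

def perturbed_trends(trend, trend_sd, n=30, seed=0):
    """Ensemble of n trend realizations drawn around the fitted trend."""
    rng = np.random.default_rng(seed)
    return trend + trend_sd * rng.standard_normal(n)

# Synthetic monthly series (77 epochs, time centred on zero), for illustration only
t = np.arange(77) / 12.0 - 3.2
series = 5.0 - 2.0 * t + 0.3 * t**2 + 1.5 * np.cos(2.0 * np.pi * t)
trend, trend_sd = fit_trend(t, series)
ensemble = perturbed_trends(trend, trend_sd)
```

Combining 30 such realizations with each of the three GIA corrections would give the 90 members mentioned above.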

**Figure 1.** Degree-power spectrum of the (**a**) rate of mass change (kg<sup>2</sup> m<sup>−4</sup> yr<sup>−2</sup>) for the ensemble mean (solid) and ensemble standard deviation (dashed) of CryoSat-2 (blue), GRACE (red), and the combined solution (green). The vertical dashed line indicates the degree at which the power of the GRACE and CryoSat-2 uncertainties are equal, separating the spectrum into a GRACE and CryoSat-2 dominated part. Light shading shows the ad hoc quadratic transition to the GRACE related weights of zero for *j* ≥ 90, meaning that degrees and orders 90 to 512 are only supplied by CryoSat-2. (**b**) Degree-power spectrum of the ensemble standard deviation for CryoSat-2 (dark blue) and GRACE (light blue), along with the uncertainty components re-tracker and adjustment method (red), snow/ice density model (dark green) for CryoSat-2, and the glacial-isostatic adjustment (GIA) correction (orange) and uncertainty of the GRACE trend coefficients (light green). Note that a different range of degrees is plotted in (**a**) and (**b**).

Note that regionally optimized models exist for the Amundsen Sea Embayment [33], the Antarctic Peninsula [34], and the Siple Coast [35], yielding GIA-induced apparent mass changes of 17 Gt yr<sup>−1</sup>, 3 Gt yr<sup>−1</sup>, and a range of ±6–8 Gt yr<sup>−1</sup>, respectively. Using the model for the Amundsen Sea Embayment will increase the mass loss from GRACE and the discrepancy to the CryoSat-2 estimate shown later. However, these models are computed with Earth structure models optimized for these regions and should not be simply superimposed with continent-wide GIA simulations adopting an average Earth structure for Antarctica.

Each member of the geoid height trend $\partial g/\partial t$ ensemble is then converted to mass on the Earth's surface $\partial\sigma/\partial t$ according to Wahr et al. [36], using load Love numbers corresponding to an elastic Earth represented by the Preliminary Reference Earth Model [37]. For clarity, we refer from now on to this surface-mass density change simply as mass change or surface load. Then, a mask is applied to the spherical harmonic spectrum using the transform method of Martinec [38] in order to reduce the far-field signal and obtain a spectral representation of the mass changes over Antarctica only. The mask is initially designed in space, *M*(*θ*, *ϕ*), and then analysed in terms of spherical harmonics coefficients up to $j_{\text{max}}$ = 90,

$$M_{jm} = \int_0^{2\pi} \mathrm{d}\varphi \int_0^{\pi} \mathrm{d}\theta \, \sin\theta \, M(\theta, \varphi) \, Y_{jm}(\theta, \varphi) \tag{2}$$

according to the relation between coefficients and the spatial function that they represent, as in Equation (1). *M*(*θ*, *ϕ*) is zero at any point that is more than 300 km away from the Antarctic grounding line (from Rignot et al. [39]), one at any point that is less than 200 km away from it, and linearly interpolated in between. The spectral multiplication increases the necessary degrees for the representation of the masked field to $j_{\text{max}}$ = 180 [38].
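The spatial design of the mask can be written as a simple linear taper. The sketch below assumes that the distance is measured seaward of the grounding line and that grounded-ice cells are assigned a distance of zero (the text does not spell this out); the function name and its default arguments are hypothetical.

```python
import numpy as np

def mask_value(distance_km, inner_km=200.0, outer_km=300.0):
    """Taper: 1 within inner_km, 0 beyond outer_km, linear in between.

    distance_km is assumed to be the seaward distance from the grounding line,
    with cells on grounded ice set to 0 km.
    """
    d = np.asarray(distance_km, dtype=float)
    return np.clip((outer_km - d) / (outer_km - inner_km), 0.0, 1.0)

example = mask_value([0.0, 150.0, 250.0, 350.0])  # 1, 1, 0.5, 0
```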

Later, in the spectral combination, we will refer to the respective spherical harmonic coefficients of the $\partial\sigma/\partial t$ ensemble member as $C^{\text{GRACE},k}_{jm}$, where *k* specifies the ensemble member. From $C^{\text{GRACE},k}_{jm}$, we calculate the ensemble mean $C^{\text{GRACE}}_{jm}$ (indicated by the dropped superscript *k*) and the ensemble standard deviation $\Delta C^{\text{GRACE}}_{jm}$. For each degree *j*, the respective degree powers (Figure 1) are proportional to $\sum_{m=-j}^{j} \left(C^{\text{GRACE}}_{jm}\right)^{2}$ and $\sum_{m=-j}^{j} \left(\Delta C^{\text{GRACE}}_{jm}\right)^{2}$. The magnitudes of means and standard deviations of the individual coefficients are shown in Figure 2. The spatial representations are calculated after the synthesis of the spherical harmonic spectrum of each ensemble member, as per-grid cell rates of mass change $\partial\sigma/\partial t$ and their respective standard deviation $\Delta\,\partial\sigma/\partial t$. Unless stated otherwise, we express the mass changes in the unit kg m<sup>−2</sup> yr<sup>−1</sup>, or equivalently in mm of the water column of density 1000 kg m<sup>−3</sup>, referred to as mm water equivalent, per year (mm we yr<sup>−1</sup>).
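As an illustration of these ensemble statistics, the following sketch computes the coefficient-wise mean and standard deviation and the resulting degree powers for a toy ensemble. The array layout (`ensemble[k, j, m + jmax]`, unused entries zero), the use of the sample standard deviation, and all numerical values are assumptions made for this example only.

```python
import numpy as np

def ensemble_statistics(ensemble):
    """ensemble[k, j, m + jmax] holds C_jm of member k; returns mean and sample std."""
    ensemble = np.asarray(ensemble, dtype=float)
    return ensemble.mean(axis=0), ensemble.std(axis=0, ddof=1)

def degree_power(coeffs):
    """Sum of squared coefficients over all orders m for each degree j."""
    return np.sum(np.asarray(coeffs, dtype=float) ** 2, axis=1)

# Toy ensemble with 3 members and jmax = 1
jmax = 1
ensemble = np.zeros((3, jmax + 1, 2 * jmax + 1))
ensemble[:, 0, jmax] = [1.0, 2.0, 3.0]                   # C_00 of each member
ensemble[:, 1, :] = [[1, 1, 1], [2, 2, 2], [3, 3, 3]]    # C_1,-1, C_10, C_11

mean, sd = ensemble_statistics(ensemble)
power_mean = degree_power(mean)  # degree 0: 2^2 = 4, degree 1: 3 * 2^2 = 12
power_sd = degree_power(sd)      # degree 0: 1,       degree 1: 3 * 1 = 3
```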

Note that after isotropic filtering of the ensemble mean spatial field, the typical North-South oriented pattern of uncertainties [40] is still present in the ensemble mean, but is successfully removed by our spectral combination with CryoSat-2 observations (see Section 2.3). Our analysis shows that de-striping the gravity field observation before the combination using the filter of Swenson and Wahr [40] does not markedly change our combination results and can be omitted. Furthermore, we find that accounting for co-variances in the monthly GRACE coefficients using the *m*-block approximation [41] does not significantly alter the linear-trend estimate. We therefore adopt the variances of GRACE coefficients estimated from the post-fit residual.

**Figure 2.** Spectral magnitude of the mass change coefficients (kg m<sup>−2</sup> yr<sup>−1</sup>) for the ensemble mean of (**a**) CryoSat-2, (**c**) GRACE, and the (**e**) combined solution and the respective ensemble standard deviation in (**b**), (**d**), and (**f**). Shown are coefficients up to degree and order 90. The complementary uncertainty characteristics of GRACE and CryoSat-2, and the overall reduced uncertainty of the combined fields are visible. Time period is February 2011 to June 2017.

#### *2.2. CryoSat-2 Satellite Radar Altimetry*

The initial measurements of the radar altimeter SIRAL on board CryoSat-2 are radar echoes, or waveforms, from prominent reflectors in the uppermost part of the firn body, from which we eventually derive elevation rates over the period 2011–2017, following the processing scheme of Helm et al. [42], with some modifications (see Appendix A). We create an ensemble of CryoSat-2 $\partial h/\partial t$ estimates with the aim of representing uncertainties arising from methodical differences, as well as uncertainties from the influence of volume scatter on the elevation estimates. The ensemble consists of seven re-tracker solutions, i.e., algorithms to detect the timing of the incoming wave from the reflector, and thus elevation, as well as four least-squares space-time fitting methods for aggregating the elevation measurements into linear rates of elevation change. Permutation of these processing choices gives a total of 28 independent ensemble members. Formal measurement uncertainties are not considered in the ensemble, as our analysis indicated that uncertainties from the re-tracker and plane fitting choices are dominant. Details on the processing of the CryoSat-2 data, the re-trackers and fitting schemes are provided in Appendix A.

For the conversion of mass changes $\partial\sigma/\partial t$ in the spatial domain into the spherical harmonic spectrum, all gaps in the respective height change field $\partial h/\partial t$ must be filled, due to the global integration over the spherical harmonic base functions (for example Equation (2)). We fill the gaps in the $\partial h/\partial t$ fields by interpolating using inverse distance weighting. Other, more elaborate techniques, such as kriging, are rejected here, because they are unlikely to improve the filling of the gaps, which often occur in the Antarctic Peninsula and the Transantarctic Mountains as the complex terrain distorts waveforms, and thus causes the conventional re-tracking algorithms to fail [42]. Yet it is worth keeping in mind that the CryoSat-2 field is consequently less accurate in these areas. Surface elevation changes outside the grounding line are set to zero, as changes in floating ice are not directly visible in the gravity field or in the derived mass change $\partial\sigma/\partial t$. We do not correct for GIA in the CryoSat-2 data, as bedrock topography changes are well below 5 mm yr<sup>−1</sup> in most places [22], which is small in comparison with the measured elevation rates. Higher values have been reported in the Amundsen Sea Embayment [33], but this is also where the surface elevation rates are highest, and so the relative uncertainty remains very low. We note that the possible bias and uncertainty induced by neglecting GIA-induced crustal displacements in the CryoSat-2-only estimate is around 9 ± 6 Gt yr<sup>−1</sup> [23].
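A minimal sketch of gap filling by inverse distance weighting is given below. It assumes gaps are marked as NaN, that all valid cells contribute, and that the weights follow an inverse-square law; the exponent, the absence of a search radius, and the function name are choices made for this illustration and are not taken from the processing described here.

```python
import numpy as np

def idw_fill(x, y, values, power=2.0):
    """Fill NaN entries with the inverse-distance-weighted mean of all valid entries."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    values = np.asarray(values, dtype=float)
    known = ~np.isnan(values)
    filled = values.copy()
    for i in np.flatnonzero(~known):
        dist = np.hypot(x[known] - x[i], y[known] - y[i])
        weights = dist ** -power                 # closer cells receive larger weights
        filled[i] = np.sum(weights * values[known]) / np.sum(weights)
    return filled

# Three cells on a line; the middle one is a gap equidistant from its neighbours
filled = idw_fill([0.0, 1.0, 2.0], [0.0, 0.0, 0.0], [1.0, np.nan, 3.0])
```

In this toy case the gap receives the average of its two neighbours, because both are at the same distance.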

We convert each of the above 28 realizations of CryoSat-2 surface elevation trends $\partial h/\partial t$ into trends of mass change $\partial\sigma/\partial t$, using four different assumptions on the significance of snow/ice processes, which generates a total of 112 ensemble members. Three of these methods are based on the grid-wise multiplication of $\partial h/\partial t$ with the density associated with the assumed most dominant process [15,43]. The fourth method is based on the output of a regional climate model [44]. Figure 1 shows the uncertainty associated with the density models, as well as with the re-tracker and adjustment method, as a degree-power spectrum.
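The ensemble sizes quoted in Sections 2.1 and 2.2 follow from simple permutation of the processing choices, which can be checked with a short sketch. The labels for re-trackers, fitting methods, and density models are placeholders introduced here; only the counts and the three GIA model names are taken from the text.

```python
from itertools import product

retrackers = [f"retracker_{i}" for i in range(1, 8)]    # 7 placeholder labels
fit_methods = [f"fit_{i}" for i in range(1, 5)]         # 4 placeholder labels
density_models = [f"density_{i}" for i in range(1, 5)]  # 4 placeholder labels
gia_models = ["IJ05r2", "AGE1a", "ICE6G"]               # GIA corrections named in Section 2.1

elevation_members = list(product(retrackers, fit_methods))              # 7 * 4 = 28
mass_members = list(product(retrackers, fit_methods, density_models))   # 28 * 4 = 112
grace_members = list(product(range(30), gia_models))                    # 30 * 3 = 90
```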

The CryoSat-2 fields, being available at a resolution of 5 km here, would, in principle, allow a spherical harmonic expansion to degree $j_{\text{max}} \approx 20000\ \text{km} / 5\ \text{km} = 4000$. For the sake of efficiency, we opt for a maximum spherical harmonic degree of $j_{\text{max}}$ = 512, equivalent to about 40 km spatial resolution in latitudinal direction, which is sufficient to demonstrate the feasibility of our approach, but is somewhat coarser than the spatial resolutions < 32 km adopted by many continent-wide ice sheet models [45]. We transfer the $\partial\sigma/\partial t$ from the grid equidistant in polar stereographic coordinates to a grid equidistant in latitude and longitude (0.2°) by bilinear interpolation to the polar stereographic projected latitude and longitude nodes. The spherical harmonics spectra of the 112 CryoSat-2 ensemble members are then generated from these re-gridded fields, according to the relation given in Equation (2). In accordance with our nomenclature for spherical harmonic coefficients of the mass change field $\partial\sigma/\partial t$ from GRACE, we will refer to the respective ensemble member *l*, represented by its coefficients, as $C^{\text{CS2},l}_{jm}$, where superscript 'CS2' stands for CryoSat-2. Likewise, the ensemble mean and standard deviations are $C^{\text{CS2}}_{jm}$ and $\Delta C^{\text{CS2}}_{jm}$, respectively.
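The rule of thumb used above, relating spatial resolution to maximum spherical harmonic degree via a half circumference of about 20,000 km, can be evaluated directly. The helper names are hypothetical; the relation itself is the one given in the text.

```python
HALF_CIRCUMFERENCE_KM = 20000.0  # approximate value used in the text

def max_degree(resolution_km):
    """Maximum spherical harmonic degree supported by a given spatial resolution."""
    return HALF_CIRCUMFERENCE_KM / resolution_km

def resolution_km(degree):
    """Spatial resolution (half wavelength) corresponding to a given degree."""
    return HALF_CIRCUMFERENCE_KM / degree

degree_for_5km = max_degree(5.0)            # 4000
resolution_at_512 = resolution_km(512)      # 39.0625 km, i.e., about 40 km
resolution_at_50 = resolution_km(50)        # 400 km
```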

#### *2.3. Spectral Combination*

The combination of GRACE and CryoSat-2 exploits the different noise characteristics of each satellite observation, which depend on the spatial wavelength of the mass fields. While GRACE uncertainties are known to increase with spatial resolution, due to the ill-posed and unstable nature of the gravimetric inversion problem (for example [46]), uncertainties of CryoSat-2 are expected to be sensitive to large-scale offsets or regional uncertainties in the snow/ice density necessary for the conversion from volume rates to mass rates. The GRACE coefficients at degree and order larger than 50 (spatial resolution of ca. 400 km) are typically dominated by noise, and coefficients beyond degree and order ca. 90 (spatial resolution of 220 km) are often not provided in the GRACE gravity field solutions. In this sense, our combination can be interpreted as augmenting the low-frequency GRACE spherical-harmonic spectrum with higher frequencies provided by CryoSat-2, using an uncertainty weighted, optimal blending of both data sets in the spectral range *j* ≤ 90. From a geophysical point of view, our combination is a downward-continuation of the GRACE-measured gravitational perturbation at satellite altitude to the sources of mass change on the Earth's surface with CryoSat-2.

For the combined spectrum of GRACE and CryoSat-2, we find weights for GRACE ($w^{\text{GRACE}}_{jm}$) and CryoSat-2 ($w^{\text{CS2}}_{jm}$) spherical harmonic coefficients according to the standard deviation of the ensemble for each mission in each coefficient

$$w_{jm}^{\text{GRACE}|\text{CS2}} = N_{jm}^{i} \left(\Delta C_{jm}^{\text{GRACE}|\text{CS2}}\right)^{-2}, \tag{3}$$

with the normalization factor

$$N_{jm}^{i} = \left(\left(\Delta C_{jm}^{\text{GRACE}}\right)^{-2} + \left(\Delta C_{jm}^{\text{CS2}}\right)^{-2}\right)^{-1}. \tag{4}$$

Due to the decreasing signal-to-noise ratio in the GRACE observations with increasing degrees (for example [36]), we set the GRACE weights to zero beyond degree 90, and ensure a smooth transition within 5 degrees by the ad hoc multiplication of $\Delta C^{\text{GRACE}}_{jm}$ by $\left(1 - \frac{j-85}{5}\right)^{2}$ for 85 ≤ *j* ≤ 90 in Equations (3) and (4). For GRACE ensemble member *k* and CryoSat-2 ensemble member *l*, the resulting coefficients of the combined field (superscript 'Comb.') are then given as

$$C^{\text{Comb.},k,l}_{jm} = C^{\text{GRACE},k}_{jm} \, w^{\text{GRACE}}_{jm} + C^{\text{CS2},l}_{jm} \, w^{\text{CS2}}_{jm}. \tag{5}$$

This means that the weighting factors calculated from Equation (3) are the same for all ensemble members *k* and *l* in the combination of Equation (5). The resulting full spectrum up to degree and order 512 (not shown) is transferred into the spatial domain (see Section 3.2) for the respective ensemble mean and standard deviation. Note that due to the noise characteristics of GRACE and CryoSat-2, $w^{\text{GRACE}}_{jm}$ represents a low-pass filter, while $w^{\text{CS2}}_{jm}$ represents a high-pass filter, as seen by the uncertainty characteristics shown in Figures 1 and 2.
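The inverse-variance weighting of Equations (3) to (5) can be sketched for flat arrays of coefficients as follows. The sketch sets the GRACE weight to zero beyond degree 90 but omits the ad hoc transition between degrees 85 and 90; the function names, array layout, and numerical values are assumptions made for illustration.

```python
import numpy as np

def combination_weights(sd_grace, sd_cs2, degrees, grace_max_degree=90):
    """Inverse-variance weights; the GRACE weight is zero beyond grace_max_degree."""
    sd_grace = np.asarray(sd_grace, dtype=float)
    sd_cs2 = np.asarray(sd_cs2, dtype=float)
    degrees = np.asarray(degrees)
    inv_var_grace = np.where(degrees <= grace_max_degree, sd_grace ** -2.0, 0.0)
    inv_var_cs2 = sd_cs2 ** -2.0
    norm = 1.0 / (inv_var_grace + inv_var_cs2)          # Equation (4)
    return norm * inv_var_grace, norm * inv_var_cs2     # Equation (3)

def combine(c_grace, c_cs2, w_grace, w_cs2):
    """Weighted sum of the two coefficient sets, Equation (5)."""
    return np.asarray(c_grace) * w_grace + np.asarray(c_cs2) * w_cs2

# Three coefficients at degrees 10, 60, and 120 with illustrative uncertainties
degrees = np.array([10, 60, 120])
w_grace, w_cs2 = combination_weights([1.0, 2.0, 1.0], [2.0, 1.0, 1.0], degrees)
combined = combine([10.0, 10.0, 10.0], [20.0, 20.0, 20.0], w_grace, w_cs2)
```

Because the two weights sum to one for every coefficient, the combined value always lies between the GRACE and CryoSat-2 values, and it equals the CryoSat-2 value wherever the GRACE weight is zero.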

#### *2.4. Limitations of the Spectral Combination*

The spectral combination makes use of complementary wavelength-dependent noise characteristics and resolution capabilities of GRACE and CryoSat-2. However, as a downside of the spectral combination, artefacts may appear if both spectral parts are not fully consistent, which is likely, as they are obtained from two observing systems sensitive to different processes. For example, Figure 1 shows lower degree-power for CryoSat-2 in the spectral range *j* < 50 compared to GRACE, which, in our case, translates into a lower magnitude of total mass balance (see Section 3.3). This spectral difference will, in combination with GRACE, lead to an inconsistent spherical-harmonic representation of the true (unknown) mass field. Therefore, signal artefacts may appear, visible in our combined field as minor mass changes over ocean areas that were previously set to zero by masking (Sections 2.1 and 2.2). For example, a slightly negative signal is visible in the ocean part of the Amundsen Sea Sector (see Section 3); here, the high-frequency signal supplied by CryoSat-2 does not cancel ocean leakage from GRACE completely. Note that similar, albeit less obvious, issues arise when inconsistent gridded fields are averaged.

#### *2.5. GRACE and CryoSat-2 Contributions*

We quantify how much signal power GRACE and CryoSat-2 contribute to the combined field at each spatial scale, i.e., up to each harmonic degree *j*. For this, we compute the cumulative sum of the degree power according to $p\_j^{\text{GRACE}|\text{CS2}} \propto \sum\_{j'=0}^{j} \sum\_{m=-j'}^{j'} \left( \mathcal{C}^{\text{GRACE}|\text{CS2}}\_{j'm} \, w^{\text{GRACE}|\text{CS2}}\_{j'm} \right)^2$ and evaluate this quantity relative to the cumulative degree power of the mean combined spectrum:

$$P\_j^{\text{GRACE}|\text{CS2}} = p\_j^{\text{GRACE}|\text{CS2}} \ / \left( p\_j^{\text{GRACE}} + p\_j^{\text{CS2}} \right) \tag{6}$$

This choice of relating $P\_j^{\text{GRACE}|\text{CS2}}$ to the overall cumulative power of the combined field has the advantage that $P\_j^{\text{GRACE}} + P\_j^{\text{CS2}}$ = 100%, and the disadvantage that the term $p\_j^{\text{GRACE}} + p\_j^{\text{CS2}}$ does not fully represent the cumulative power of the mean combined field, $p\_j^{\text{Comb.}}$, due to the quadratic term in the computation of degree power. However, $p\_j^{\text{GRACE}} + p\_j^{\text{CS2}}$ and the actual cumulative degree power of the mean combined spectrum, $p\_j^{\text{Comb.}}$, differ by a maximum of ~6% of $p\_j^{\text{Comb.}}$ (reached at *j* = 90), indicating that our approach is valid. In turn, this means that we can determine the optimum mix of the two sensors' observations for a targeted spatial resolution.
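The cumulative degree power and the relative contributions of Equation (6) can be sketched as follows. This is a minimal illustration under stated assumptions: coefficients and weights are stored in dictionaries keyed by (degree, order), the proportionality constant of the degree power is dropped, and the toy spectra are hypothetical.

```python
def cumulative_power(coeffs, weights, j_max):
    """Cumulative sum over degrees 0..j of (C_jm * w_jm)^2, up to a constant factor."""
    per_degree = [0.0] * (j_max + 1)
    for (j, m), c in coeffs.items():
        if j <= j_max:
            per_degree[j] += (c * weights[(j, m)]) ** 2
    cumulative, running = [], 0.0
    for p in per_degree:
        running += p
        cumulative.append(running)
    return cumulative


def relative_contributions(p_grace, p_cs2):
    """Equation (6): share of each sensor in p_GRACE + p_CS2 at every degree."""
    shares = []
    for pg, pc in zip(p_grace, p_cs2):
        total = pg + pc
        shares.append((pg / total, pc / total) if total > 0.0 else (0.0, 0.0))
    return shares


# Toy zonal-only spectra (m = 0) for degrees 0..3; all numbers are made up.
c_grace = {(0, 0): 1.0, (1, 0): 0.5, (2, 0): 0.2, (3, 0): 0.1}
c_cs2 = dict(c_grace)
w_grace = {(0, 0): 0.9, (1, 0): 0.8, (2, 0): 0.3, (3, 0): 0.1}
w_cs2 = {jm: 1.0 - w for jm, w in w_grace.items()}

p_grace = cumulative_power(c_grace, w_grace, j_max=3)
p_cs2 = cumulative_power(c_cs2, w_cs2, j_max=3)
shares = relative_contributions(p_grace, p_cs2)
```

By construction, the two shares add up to one at every degree, and in this toy case the GRACE share decreases as higher degrees are included.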

#### *2.6. Basin Averages and Transects*

For comparison with studies providing GRACE-only estimates (for example [3]), we provide the integrated mass balance for 25 commonly used Antarctic drainage basins, shown in Figure 3 (after Rignot et al. [12]; used in Sasgen et al. [23]). We quantify the mass balance within each basin based on the spatial representation for each *k* (GRACE), *l* (CryoSat-2), and (*k*, *l*) (combined) ensemble member according to $m\_N = \sum\_n \frac{\partial \sigma\_n}{\partial t} A\_n$, where *n* is the running index of grid elements within a given basin *N*, and $A\_n$ is the area of grid element *n*. Based on these ensembles, we compute the mean and standard deviation of the integrated basin mass balances for GRACE, CryoSat-2, and the combined solution. In the following, the uncertainties provided represent one standard deviation.
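The basin integration and the ensemble statistics can be illustrated with a short sketch. The grid cells, rates, and three-member ensemble below are hypothetical, and the use of the sample standard deviation is an assumption, since the text does not state which estimator is used.

```python
from statistics import mean, stdev

KG_PER_GT = 1.0e12  # 1 Gt = 10^12 kg


def basin_mass_balance(rates, areas):
    """m_N = sum_n (d sigma_n / dt) * A_n, converted from kg/yr to Gt/yr.

    rates: surface-density rates in kg m^-2 yr^-1 for the grid elements of basin N
    areas: areas in m^2 of the same grid elements
    """
    return sum(r * a for r, a in zip(rates, areas)) / KG_PER_GT


# Hypothetical basin made of three 5 km x 5 km cells.
areas = [25.0e6, 25.0e6, 25.0e6]
# Hypothetical three-member ensemble of rate fields over those cells.
ensemble_rates = [
    [-100.0, -200.0, -300.0],
    [-110.0, -190.0, -310.0],
    [-90.0, -210.0, -290.0],
]

balances = [basin_mass_balance(r, areas) for r in ensemble_rates]
basin_mean = mean(balances)
basin_std = stdev(balances)  # ASSUMPTION: sample standard deviation
```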

In addition, we assess the spatial resolution and decorrelation effects of our combined estimate by evaluating the *∂*σ/*∂*t and Δ*∂*σ/*∂*t fields along three 2000 km long transects, which run approximately parallel to the grounding line but are shifted inland by approximately 200 km (Figure 3). We have selected three regions of particular interest: (1) Wilkes Land, East Antarctica (transect AA'), where very localized ice-dynamic imbalance has been noted at the Totten glacier system (for example [47]); (2) Dronning Maud and Enderby Land (transect BB'), where strong accumulation variations are observed [48]; and (3) the Amundsen Sea and Bellingshausen Sea Sectors (transect CC'), where the largest ice-dynamic losses for Antarctica are recorded (for example [49]). The transects are chosen approximately across the ice-dynamic flow lines to assess whether glacial entities smaller than the typical basin scale can be resolved. In addition, we assess the mass rate fields locally perpendicular to the transects described above, namely along Totten Glacier (aa'), Shirase Glacier (bb'), and Pine Island Glacier (cc'). Crossing the division between the Antarctic continent and the surrounding ocean or ice shelf areas allows us to assess the signal leakage beyond the coastline into the open ocean. Note that we do not attempt to adopt the exact grounding and flow line positions of the ice streams, which is beyond the capabilities of CryoSat-2, and thus of the combined product.

**Figure 3.** Definition of Antarctic drainage basins and transects used for the evaluation of the mass balance fields. We adopt 25 drainage basins merged from the basin division. In addition, we assess our results along three 2000 km long transects AA', BB', and CC', approximately parallel to and 200 km inland of the grounding line [34], as well as three 1000 km long transects locally perpendicular to the grounding line (aa', bb', and cc'). The projection is Polar Stereographic centered at 90°S and 0°E, with the true latitude of 71°S (applies to scale) and WGS84 (EPSG:3031).

#### **3. Results**

#### *3.1. Spectral Representation*

The magnitude of the coefficients of the ensemble mean and the ensemble standard deviations of CryoSat-2, GRACE, and the combined solution are shown in Figure 2 for *j* ≤ 90. The magnitude per coefficient is centered close to the zonal coefficients (*m* = 0), with lower values in the sectorial coefficients (*j* = *m*). This is a result of Antarctica's geographical position approximately centered on the South Pole, with data coverage only south of 60°S. Note that the spectrum of CryoSat-2 is more focused towards the zonal coefficients than GRACE, owing to the mask buffer zone of 300 km adopted for GRACE (Section 2.1) and the north-south oriented noise pattern. The overall magnitude of GRACE coefficients is larger in the low degrees and orders, and noise starts to dominate around degree and order 50.

The ensemble per-coefficient standard deviations (Figure 2, right panels) confirm the noise structure shown in Figure 1; the variability caused by the choice of the re-tracker and density model in CryoSat-2 creates uncertainties in the lower-degree part of the spectrum, similar to the characteristics of the signal spectrum itself. For GRACE, the well-known increase of the noise with degree and order is visible, suggesting an onset of the noise-dominated regime at about degree and order 60. The combined solution retains the spectral magnitude in the low degrees and orders, while reducing noise in the high degrees and orders.

#### *3.2. Spatial Representation*

Figure 4 shows the spatial representation of the CryoSat-2, GRACE, and combined fields together with their respective uncertainties. For reference, we have labeled prominent features of mass change in the spatial representation of the ensemble mean CryoSat-2 field (Figure 4). These include the well-known hotspots of ice-dynamic losses in the Amundsen Sea Embayment (Figure 4, Label 1; for example [49]), the slowing of Ice Stream C (Label 2; for example [50]), and the Totten glacier system (Label 3; for example [47]). In addition, CryoSat-2 shows a large-scale, low-magnitude mass loss signal in the interior part of Wilkes Land (Label 4), and a prominent accumulation-driven mass gain along the Antarctic Peninsula (Label 5; [6]).

**Figure 4.** Spatial rate of mass change (kg m<sup>−2</sup> yr<sup>−1</sup>) for the ensemble mean of (**a**) CryoSat-2, (**c**) GRACE, and (**e**) the combined solution and the respective ensemble standard deviation in (**b**), (**d**), and (**f**). Note that the saturation of the color bar in (**a**), (**c**), and (**f**) enhances signals of relatively low magnitude. Note that the GRACE trend in (**c**) is filtered with a Gaussian filter of half-width 1.3°, to reduce noise for visualization. The time period is February 2011 to June 2017. The projection is Polar Stereographic centered at 90°S and 0°E, with the true latitude of 71°S (applies to scale) and WGS84 (EPSG:3031).

The uncertainty of the CryoSat-2 ensemble mostly correlates with the signal structure, as the variability of the density models has the largest effects where the CryoSat-2 *∂*h/*∂*t signal is strongest. However, in the interior of Wilkes Land (Label 4), uncertainties are induced by differences in the re-trackers due to varying backscattering properties of the snow and ice (for example [42]). Another remarkable feature in the CryoSat-2 uncertainty is CryoSat-2's mode mask boundary south of the Filchner Ice Shelf. Note that some Gibbs artefacts (for example [51]) are also present, mostly beyond the ice sheet boundaries, for example in the Amundsen Sea, caused by the representation of the *∂*σ/*∂*t field by a finite spherical harmonic expansion and synthesis.
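The origin of such Gibbs artefacts can be illustrated with a one-dimensional analogue. The sketch below is not a spherical harmonic synthesis; it truncates the Fourier series of a unit square wave, which stands in for a sharp ice/ocean boundary, and measures the overshoot next to the jump. The function names and the sampling density are illustrative choices.

```python
import math


def square_wave_partial_sum(x, n_terms):
    """Truncated Fourier series of a unit square wave (odd harmonics only)."""
    return (4.0 / math.pi) * sum(
        math.sin((2 * k - 1) * x) / (2 * k - 1) for k in range(1, n_terms + 1)
    )


def peak_value(n_terms, n_samples=5000):
    """Largest value of the truncated series on (0, pi/2]; the exact wave equals 1."""
    step = (math.pi / 2) / n_samples
    return max(
        square_wave_partial_sum((i + 1) * step, n_terms) for i in range(n_samples)
    )


peak = peak_value(n_terms=50)
```

The truncated series exceeds the true plateau value of 1 next to the discontinuity, and this overshoot does not vanish when more terms are added; it only moves closer to the jump. A finite spherical harmonic expansion of a field with sharp boundaries behaves analogously.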

The GRACE *∂*σ/*∂*t field (Figure 4) shows the prominent mass loss in the Amundsen Sea Embayment. For visualization, we present the GRACE trends after smoothing with a Gaussian filter of 1.3° (for example [36]), reducing most of the noise and revealing the mass change anomalies. The GRACE noise field is not filtered for correlated north-south striping errors. The uncertainty represented by the ensemble standard deviation shows a striped pattern, caused by the correlation of uncertainties in the GRACE coefficients, the uncertainty due to the polar gap of ±0.5°, as well as an uncertainty that increases towards the equator, which is due to the decreasing ground track density of the GRACE near-polar orbits (for example [52]). It is worth noting that CryoSat-2 and GRACE show very different, and, to some extent, complementary patterns of the uncertainty, for example in the Amundsen Sea Embayment.

In the combined field, some important differences are visible. First, the magnitude of mass losses in the Amundsen Sea Embayment is increased with respect to CryoSat-2 only (Figure 4, Label 1), as the signal magnitude is adjusted towards GRACE based on each sensor's uncertainty. In contrast, the magnitudes for Ice Stream C (Label 2) and Totten (Label 3) are unchanged, suggesting initial consistency between GRACE and CryoSat-2. However, the mass loss signal and its uncertainty in the interior of Wilkes Land (Label 4) are strongly reduced, mitigating the artefacts caused by the re-trackers, and possibly an overestimation of mass loss by CryoSat-2 caused by a depletion in snow. Similarly, the mass increase along the Antarctic Peninsula (Label 5) visible in CryoSat-2 is reduced by combining with GRACE. The height change in CryoSat-2 is likely caused by snow accumulation (more than is assumed in the density models), and thus detected only at lower magnitudes by GRACE. Compared to an individual sensor, the uncertainty of the combined solution (Figure 4) is considerably reduced, removing the sensor-specific patterns of regional or zonal uncertainty.

#### *3.3. Basin Averages*

Next, we evaluate the GRACE, CryoSat-2, and combined fields, as integrated over the 25 Antarctic drainage basins shown in Figure 3, which are considered independently resolvable by GRACE (for example [53,54]). Table 1 (color enhanced) lists the mass balances along with the respective uncertainties (see Section 2.6). Note that the total value for the Antarctic ice sheet of GRACE was estimated, including the buffer zone of the mask described above. The values for individual basins do not include any correction for leakage to the ocean.

**Table 1.** Mass balance of the 25 Antarctic drainage basins shown in Figure 3. Listed are basin area, mass balance, and uncertainty of mass balance as a result of this study, as well as the mass balance from the GRACE Level 3 mascon product of the Center for Space Research, University of Texas (CSR RL05 M) and the gridded product of the Climate Change Initiative (CCI) of the European Space Agency (ESA). Color coding is the same for the mass balances and the uncertainties, respectively. Red colors denote rates of mass gain, blue colors rates of mass loss. The color range is mapped to the value range from zero to the largest negative and positive values, respectively.


† January 2011 to June 2017 (same as this study); ‡ January 2011 to June 2016.

Overall, the characteristics of positive and negative mass balances are similar for GRACE and CryoSat-2, and this is preserved in the combination. However, it is apparent that the combination is not merely a weighted average of the individual inputs, as GRACE recovers signals at or below the spatial extent of the basins, while CryoSat-2 de-correlates signals for higher resolutions. This is visible in the relative cumulative contribution of GRACE and CryoSat-2 calculated according to Equation (6) and shown in Figure 5. The combined solution (Figure 4) features the high-resolution characteristics of the CryoSat-2 input field, because short wavelength features are dominated by CryoSat-2. Figure 5 shows that about 74% of the signal power in the combined field, integrated up until degree and order 512, is contributed by the CryoSat-2 data. However, GRACE remains the dominating source for our combined product (88%) if limited to the spatial scale of 500 km (degree and order 40).

As a consequence, for example, the mass balance for basin 1 turns positive in the combination, even though the GRACE and CryoSat-2 inputs are both negative in sign. Another effect is the localization of the signal (reduction of leakage), visible for basin 21. Here, the combined solution shows higher mass loss (−61.7 Gt yr<sup>−1</sup>) compared to CryoSat-2 (−52.1 Gt yr<sup>−1</sup>), which tends to be less negative in the entire Amundsen Sea Embayment, but also compared to GRACE (−52.6 Gt yr<sup>−1</sup>), for which some signal is lost due to leakage. Another example is the southern Antarctic Peninsula (basin 24), where strong mass gains inferred from CryoSat-2 (10.5 Gt yr<sup>−1</sup>) are entirely suppressed by the GRACE contribution in the combination (−1.3 Gt yr<sup>−1</sup>), resulting in a combined estimate close to GRACE.

**Figure 5.** Relative cumulative contribution of the GRACE and CryoSat-2 data to the degree power of the combined mass balance field. GRACE provides the dominant contribution for scales up to 500 km (typical for the spatial extent of the 25 basins; Figure 3), i.e., *j* ≤ 40. Equal contributions are obtained at *j* ≈ 150, or 133 km, while for features at the spatial scale of the nominal resolution corresponding to our maximum degree *j* = 512 (40 km), GRACE and CryoSat-2 contribute ca. 26% and 74% of the power, respectively. Time period is January 2011 to June 2017. Note that due to non-zero weights, CryoSat-2 contributes about 10% of the power also in the low degrees *j* < 32.
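The pairs of harmonic degree and spatial scale quoted here (degree 40 and 500 km, degree 150 and 133 km, degree 512 and about 40 km) are consistent with the common rule of thumb that degree *j* corresponds to a half wavelength of about πR/*j*, i.e., roughly 20,000 km divided by *j*. A minimal sketch of this conversion, assuming a mean Earth radius of 6371 km and using hypothetical function names:

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius (assumed value)


def half_wavelength_km(degree):
    """Approximate spatial scale (half wavelength) resolved at harmonic degree j."""
    return math.pi * EARTH_RADIUS_KM / degree


def degree_for_scale(scale_km):
    """Inverse mapping: harmonic degree whose half wavelength is about scale_km."""
    return math.pi * EARTH_RADIUS_KM / scale_km


# Scales for the three degrees discussed in the text.
scales = {j: half_wavelength_km(j) for j in (40, 150, 512)}
```

Under this rule of thumb, the three degrees map to roughly 500 km, 133 km, and 39 km, in line with the rounded values given in the text.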

In terms of the uncertainty, the combined solution is similar to or lower than GRACE only, by about 27% on average (basins 20, 21, and 22 excluded). Note that the GRACE uncertainty is underestimated, since signal (and noise) leakage between basins is not accounted for in our uncertainty estimate at basin scale. As a consequence, the combined estimates for basins 20, 21, and 22 show an increased uncertainty compared to the GRACE-only estimates, but also significant changes in the mass loss values, particularly basin 23. Also, note that the combined results feature the full resolution of 40 km, meaning that the averages are calculated using the full spectrum up to degree and order 512. Compared to CryoSat-2, the uncertainties reduce for all basins (except basin 19), typically by more than half. The dominant uncertainties for basins 11, 12, and 13, caused by the sensitivity of the re-tracker (see Figure 4: uncertainty in Wilkes Land, Label 4), are drastically reduced. The uncertainty of the ice mass change for the entire Antarctic ice sheet is reduced from ±58.2 Gt yr<sup>−1</sup> for CryoSat-2 only to ±22.6 Gt yr<sup>−1</sup> for the combined solution (GRACE only is ±24.9 Gt yr<sup>−1</sup>). This again exemplifies that up to the basin level, GRACE improves CryoSat-2 estimates, and for higher resolutions, it is the other way around.

#### *3.4. Transects*

The evaluation of the spatial fields along the transects shown in Figure 6 demonstrates that our combined field of mass changes provides a much finer resolution than GRACE on its own. For example, the combined field resolves highly localized hotspots of mass change at the Totten Glacier snout (around 1000 km of the tangential profile AA' and around 500 km along the perpendicular profile aa') or along the various Amundsen Sea glaciers (several very prominent spots in CC', and the strong signal along Pine Island Glacier in cc'). The magnitude of the associated peak signals is clearly underestimated by GRACE on its own. However, even smaller and less prominent spots, like the ~75 kg m<sup>−2</sup> yr<sup>−1</sup> mass gain around 500 km of the orthogonal transect at Shirase Glacier (bb'), far beyond the detection capabilities of GRACE, are clearly resolved by the combination, due to the high-resolution CryoSat-2 input. The transect along Pine Island Glacier (cc') highlights how the combination of GRACE with CryoSat-2 rectifies one of the major shortcomings of GRACE. Early truncation of the GRACE spherical harmonic series results in a smooth decline of the signal close to the grounding-line position (around 500 km in cc') into the ice shelf and open ocean (around 800–900 km in cc'). This leakage is not only problematic at the ice/ocean boundary (see aa', bb' and cc' in Figure 6), but also transverse to the flow (see AA', BB' and CC' in Figure 6), because the mass imbalance should be focused within the shear margins of the ice streams (for example [55]). This is visible, for example, in transect CC', where between 1000 km and 2000 km, the highly resolved CryoSat-2 only and combined mass losses peak only where the ice flows relatively fast, whereas the GRACE-only signal leaks over a larger part of the section.

The advantage of supporting GRACE with CryoSat-2 for enhancing the level of spatial detail unresolvable with GRACE on its own is complemented by the ability of GRACE to mitigate large-scale ambiguities in the CryoSat-2 data and a smaller uncertainty than a single CryoSat-2 or GRACE solution in most locations. The long-wavelength contribution that comes mostly from GRACE adjusts the regional mean of the CryoSat-2 only solution. For example, the addition of GRACE to the CryoSat-2 signal increases mass loss rates along transect CC' (Amundsen Sea Embayment) from less than 100 kg m<sup>−2</sup> yr<sup>−1</sup> around 0 km up to an additional −250 kg m<sup>−2</sup> yr<sup>−1</sup> at the Haynes, Pope, Smith, and Kohler glacier systems (HSK) (around 1650 km). The long-wavelength offset is also visible in the mass balance of the entire region (basins 20 through 23), increasing the mass balance from 101 ± 10 Gt yr<sup>−1</sup> for CryoSat-2 to 129 ± 4 Gt yr<sup>−1</sup> for the combined field (Table 1). In addition, systematic noise in the GRACE data (striping) carried by individual coefficients is efficiently suppressed by the combination, as seen in transect BB'.

**Figure 6.** Profiles along (left) and across (right) the coastline in (**a**) Wilkes Land, (**b**) Dronning Maud and Enderby Land, and (**c**) the Bellingshausen Sea and Amundsen Sea Sector. Shown are rates of mass change (kg m<sup>−2</sup> yr<sup>−1</sup>) along the transects indicated in the map shown in Figure 3, for GRACE (unfiltered), CryoSat-2, and the combined field (middle panel). The basin attribution and the surface-ice velocity (m yr<sup>−1</sup>) are shown in the top panel. The bottom inset of the lower panel shows the same transect evaluated for the GRACE Level 3 data products of the CSR and the CCI of ESA (see main text). Note that the Level 3 curves are offset (right scale applies); the scale, however, is unchanged, and thus directly comparable to our combined solution. The glacier and ice stream labels refer to: Budd Coast (BUD), Denman Glacier (DEN), Dibble Ice Stream (DIB), English Coast (ENG), Ferrigno Ice Stream (FER), Frost Glacier (FRO), Glaciers flowing into Getz Ice Shelf (GET), Haynes, Pope, Smith, and Kohler Glaciers (HSK), Inter-stream ridge (INT), Lidke and other glaciers (LID), Moscow University Ice Shelf (MOS), Pine Island Glacier (PIG), Queen Maud Land (QML), Rayner and Thyer Glacier (RAY), Shirase Glacier (SHI), Thwaites Glacier (THW), Totten Glacier (TOT) and Glaciers flowing into Venable Ice Shelf (VEN). The projection is Polar Stereographic centered at 90°S and 0°E, with the true latitude of 71°S and WGS84 (EPSG:3031).

#### **4. Discussions**

#### *4.1. Comparison with GRACE Level 3 Data*

We compare our combined estimate to the gridded mass rate fields from the Level 3 mascon product of the Center for Space Research, University of Texas (CSR RL05 M; [56]) and the gridded product of the Climate Change Initiative (CCI) of the European Space Agency (ESA) [57]. Note that both gridded mass balance products rely on GRACE data only (Level 3 data), with some assumptions on geographic boundaries, as well as signal and noise characteristics. In the logic of the product hierarchy, our combined solution should be considered Level 4 data, as it involves ancillary data compared to GRACE-only mass balance grids. Note that the data sets differ in the underlying GRACE data (CSR solutions for CSR RL05 M and ITSG-Grace2016 [58] for ESA CCI Antarctica) and adopt different corrections of GIA, both of which are part of our ensemble (ICE6G [32] computed by A et al. [59] and IJ05r2 [30], respectively). These post-processing choices may cause considerable differences in the total mass change, but are less important for the decorrelation of basin-scale and local mass rates assessed here. Also, the ESA CCI data set is based on the time span February 2011 to June 2016, while CSR RL05 M is based on the same interval as our data (February 2011 to June 2017). Note that our combined estimate of −178 ± 23 Gt yr<sup>−1</sup> lies well within the range of estimates obtained in the inter-comparison exercise presented in Shepherd et al. [3]; for comparison, the mean and standard deviation of multiple GRACE analysts are −179 ± 43 Gt yr<sup>−1</sup>.

#### 4.1.1. Basin Averages

Figure 7 and Table 1 present basin-average mass rates for the combined, the GRACE-only, and the CryoSat-2 only estimates. The estimates of mass change at basin level of our combined estimate (also GRACE and CryoSat-2 only) and the CSR RL05 M and ESA CCI data products are generally in agreement (Table 1). However, the variation of mass change from basin to basin is greater in our combined field compared to the GRACE-only estimates (i.e., our GRACE estimates, CSR RL05M, and ESA CCI), suggesting a higher level of decorrelation already at basin scale. Differences between GRACE, CryoSat-2, and the combined estimates arise from a stronger decorrelation (for example basin 9 for GRACE), a suppression of uncertainties (for example basin 12 for CryoSat-2), and presumably, a particularly high sensitivity to snow accumulation (for example CryoSat-2 in basin 24). The total mass balance of the Antarctic Ice Sheet of our combined estimate, −178 ± 23 Gt yr<sup>−1</sup>, is about 31 Gt yr<sup>−1</sup> more negative than that obtained from CSR RL05M for the same time span (February 2011 to June 2017). The lower mass loss estimates from ESA CCI (−129 Gt yr<sup>−1</sup>) are most likely a consequence of the shorter time span covered by ESA CCI. Similarly, the distribution of mass change among the drainage basins of our combination is more similar to the CSR RL05 M than to the ESA CCI product. However, the combined mass loss estimate for drainage basin 21 (Thwaites glacier; −62 ± 3 Gt yr<sup>−1</sup>), which shows the strongest mass loss in Antarctica, is in better agreement with ESA CCI. In contrast, our GRACE-only estimate of −53 ± 1 Gt yr<sup>−1</sup> for the same basin is in agreement with CSR RL05M. A similar pattern is observed for basin 18 (Ice Stream C), for which CSR RL05M matches our GRACE-only estimate (no leakage correction applied), but is lower in magnitude than both the ESA CCI and our combined estimate. A likely cause for this difference could be signal loss due to leakage, but a more standardized inter-comparison at different processing levels is needed to provide a definitive answer. Note again that ESA CCI adopts the GIA correction IJ05r2 [30], while CSR RL05 M adopts ICE6G [32] computed by A et al. [59], which produces a 7 to 17 Gt yr<sup>−1</sup> greater apparent mass change [3]. Both GIA corrections are part of our GRACE ensemble. Nevertheless, the comparison shows that basin estimates of our combined field are consistent with other GRACE data sets, with some improvement in the decorrelation of basin-scale estimates and the reduction of signal leakage.

**Figure 7.** Basin-integrated rate of mass change (Gt yr<sup>−1</sup>) for the ensemble mean of (**a**) CryoSat-2, (**c**) GRACE, and (**e**) the combined solution, and the respective ensemble standard deviation in (**b**), (**d**), and (**f**). Numbered labels refer to the basins shown in Figure 3. Note that the saturation of the color bar in (**a**), (**c**), and (**f**) enhances signals of relatively low magnitude. The time period is February 2011 to June 2017. The projection is Polar Stereographic centered at 90°S and 0°E, with the true latitude of 71°S (applies to scale) and WGS84 (EPSG:3031).

#### 4.1.2. Transects

The lower panels of Figure 6 show the evaluation of the Level 3 data products along the transects in Wilkes Land, Dronning Maud and Enderby Land, and the Amundsen Sea Embayment. It is visible that the noise present in the GRACE data is successfully reduced and the geographic boundaries (continent/ocean) are implemented, even though the exact location of the coastline differs between the data products. The wavelengths of the resolved patterns of the Level 3 products are similar, at 200 to 400 km, and correspond to those of our profile based on GRACE only. However, the magnitude of the mass changes is considerably lower than that of our combined solution, and the signals remain highly correlated between independent glacial entities. For example, while the combined solution is able to resolve individual signals for Thwaites (THW) and Haynes, Pope, Smith, and Kohler Glaciers (HSK) in Figure 6 (CC'), these signals are merged into one anomaly for the Level 3 data. This inter-basin leakage is an unresolved problem of the GRACE-only gridded data sets, limiting their use for basin integrals or for assimilation into glaciological models. Even though local mass rates may be recovered well enough with CryoSat-2 data alone, the combination with GRACE leads to reduced uncertainties across all spatial scales.

#### *4.2. Remaining Inconsistencies*

To achieve optimum results, we determine the difference between the combined mass balance and the uncertainty-weighted mean field for GRACE (including a buffer zone) and CryoSat-2. We apply this value as an ad hoc correction term, distributing the mismatch evenly over the Antarctic ice sheet (−3.4 kg m<sup>−2</sup> yr<sup>−1</sup>), i.e., well below the mean uncertainty of the combined solution of 9.9 kg m<sup>−2</sup> yr<sup>−1</sup> (peak uncertainty is 318.5 kg m<sup>−2</sup> yr<sup>−1</sup>).
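The arithmetic behind spreading a total mismatch evenly over the ice sheet can be sketched as follows. The mismatch, the ice sheet area, and the grid-cell rates below are made-up inputs chosen only to show the unit conversion; they are not values from this study, and the sign convention of the mismatch is an assumption.

```python
KG_PER_GT = 1.0e12  # 1 Gt = 10^12 kg


def uniform_correction(mismatch_gt_per_yr, ice_sheet_area_m2):
    """Spread a total mass-rate mismatch evenly over the ice sheet (kg m^-2 yr^-1)."""
    return mismatch_gt_per_yr * KG_PER_GT / ice_sheet_area_m2


def apply_correction(rates, correction):
    """Add the same correction to every grid-cell rate on the ice sheet."""
    return [r + correction for r in rates]


# Hypothetical inputs: a -20 Gt/yr mismatch and a 10 million km^2 ice sheet.
correction = uniform_correction(-20.0, 10.0e12)
corrected_rates = apply_correction([-150.0, 0.0, 25.0], correction)
```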

In an additional analysis, we tried to make use of the spectral mismatch between GRACE and CryoSat-2 to identify the ensemble member (*k*, *l*) of the combination that minimizes the artefacts. Within the range of the GRACE and CryoSat-2 ensemble spread, the artefact could be reduced to some extent (by ca. 10 to 20%), but could not be removed completely. We infer that the ensemble spread still underestimates the true uncertainty in one or both data sets. Possible candidates for underestimated uncertainties are the influence of far-field signals in the Northern Hemisphere on the GRACE signal over Antarctica, or the range of schemes for converting CryoSat-2 elevation rates to mass rates. For the time being, however, we accept the inconsistency between both data sets and resolve it as stated above. Nevertheless, improvements to our approach could be made by introducing an a posteriori weighting of the ensemble members, for example according to the inverse of the signal artefacts that are created, instead of adopting an unweighted ensemble mean as we did in this study.
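One conceivable form of such an a posteriori weighting is sketched below. It is not part of this study: the artefact measure (root-mean-square signal over masked ocean cells), the inverse weighting, and all names and numbers are hypothetical choices made for illustration.

```python
import math


def ocean_rms(field, ocean_cells):
    """Artefact measure (ASSUMPTION): RMS of the signal over masked ocean cells."""
    return math.sqrt(sum(field[i] ** 2 for i in ocean_cells) / len(ocean_cells))


def weighted_ensemble_mean(fields, artefacts, eps=1.0e-12):
    """Weight each ensemble member by the inverse of its artefact magnitude."""
    inverse = [1.0 / (a + eps) for a in artefacts]  # eps guards against zero
    norm = sum(inverse)
    weights = [w / norm for w in inverse]
    n_cells = len(fields[0])
    mean_field = [
        sum(w * f[i] for w, f in zip(weights, fields)) for i in range(n_cells)
    ]
    return mean_field, weights


# Two hypothetical combined members on three cells; the last cell is open ocean.
fields = [[-10.0, -20.0, 1.0], [-12.0, -18.0, 3.0]]
ocean_cells = [2]
artefacts = [ocean_rms(f, ocean_cells) for f in fields]
mean_field, weights = weighted_ensemble_mean(fields, artefacts)
```

In this toy case, the member with the smaller ocean artefact receives three times the weight of the other member, so the weighted mean is pulled towards it.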

#### **5. Conclusions**

We have presented an approach for combining GRACE and CryoSat-2 data in the spectral domain, resulting in a spatially highly resolved mass balance of the Antarctic ice sheet. We treat the combination as a downward continuation of the GRACE coefficients with CryoSat-2 data, accounting for the respective wavelength-dependent noise characteristics. We obtain a total ice mass balance for Antarctica of −178 ± 23 Gt yr<sup>−1</sup> for the time period of February 2011 to June 2017; basin-averaged mass rates are presented in Figure 7. Based on the analysis of statistical ensembles, we have shown that GRACE and CryoSat-2 have complementary characteristics regarding the noise power at different spatial wavelengths. Thus, up to degree and order *j* = 40 (500 km), GRACE contributes about 88% (CryoSat-2 is 12%) to the power of the mass rate field, while at the maximum cut-off degree of 512 (40 km), the cumulative GRACE contribution is reduced to about 26% (CryoSat-2 is 74%; see Figure 5). The combined mass rate field has the resolution of the CryoSat-2 field (here, 40 km, due to cut-off degree *j* = 512), and therefore provides independent mass rate estimates beyond the typical basin scale (25 Antarctic drainage basins). The combined field exhibits smaller uncertainties compared to estimates based on single sensors for all spatial scales and successfully reduces systematic noise patterns in GRACE and CryoSat-2. Compared to alternative gridded mass products from GRACE data alone (Level 3 data from CSR and ESA CCI), our combined GRACE and CryoSat-2 estimate is higher resolved, more accurate, and largely suppresses leakage to the ocean and between basins. Further developments will help identify the optimal scheme for converting elevation rates to surface load rates based on the mismatch of the GRACE and CryoSat-2 spectra. Beyond improving ice sheet mass balances, the spectral combination method may offer the possibility to merge GRACE/GRACE Follow-On data and other water storage measurements into a combined Level 4 data product.

**Author Contributions:** I.S. conceived the study with contributions by H.K. and K.G. H.K. programmed most of the spherical harmonic combination algorithm and pre-processed the input satellite data with contributions and supervision by I.S. V.H. calculated the ensemble of CryoSat-2 for different re-trackers and fitting methods. I.S. provided the graphical displays. All authors discussed the results and contributed to the writing of the manuscript.

**Funding:** This research was funded through, and is a contribution to, the Regional Climate Change Initiative REKLIM of the Helmholtz Association.

**Acknowledgments:** We are grateful to F. Simons for providing scripts for handling spherical harmonics at http://geoweb.princeton.edu/people/simons/software.html. We would like to thank the German Space Operations Center (GSOC) of the German Aerospace Center (DLR) for continuously providing nearly 100% of the raw telemetry data of the twin GRACE satellites, and ESA for providing Level-1B/2I Baseline\_C CryoSat-2 data. In addition, we wish to thank those who have made the following ancillary data available: GRACE Mascons CSR RL05M (http://www2.csr.utexas.edu/grace/RL05\_mascons.html), ESA CCI Antarctica ice mass product (http://esa-icesheets-antarctica-cci.org/), and RACMO2/ANT (https://www.projects.science.uu.nl/iceclimate/publications/data/2018/index.php).

**Conflicts of Interest:** The authors declare no conflict of interest.

#### **Appendix A**

#### **Processing of CryoSat-2 Level 1 and 2 data**

We use the Level-1B/2I Baseline\_C CryoSat-2 data provided by ESA as the initial data product, from which we eventually derive elevation rates. For each waveform (i.e., the radar echo detected by the satellite), the range is estimated using four re-tracker algorithms implemented in our processing scheme: TFMRA, AWI\_OCOG, AWI\_ICE2, and EC\_TFMRA. The three remaining re-tracker solutions are extracted from the Level\_2I (LRM) product: ESA\_OCOG, ESA\_ICE, and ESA\_OCEAN. In the SARIn case, only one ESA re-tracker solution is provided, which we used in combination with the three different ESA LRM products.

The AWI\_ICE2 follows the Ice2 re-tracker designed by LEGOS to process ERS1 data over continental ice sheets [60,61], whereas AWI\_OCOG uses a modified version of the algorithm developed by Wingham et al. [62]. The leading edge of the waveform is re-tracked at the first intersection with 30% of the OCOG amplitude. EC\_TFMRA uses the TFMRA solution corrected by the leading-edge width. The leading-edge width of the waveform is estimated by linear regression between 15% and 80% of the leading edge of the first maximum. In general, EC\_TFMRA and TFMRA are the least sensitive to volume scattering, followed by AWI\_OCOG and ESA\_OCOG. The three model-based re-tracker solutions AWI\_ICE2, ESA\_OCEAN, and ESA\_ICE are more sensitive to contributions of volume scatter, as they generally re-track at 50% or more of the maximum power. This has been demonstrated for Antarctica by Helm et al. [42] and for Greenland by Nilsson et al. [16].

All seven LRM solutions are further corrected for slope, estimating the point of closest approach (POCA) using the refined relocation method [42,63]. For SARIn, the POCA is determined using the interferometric phase at the re-tracked position (only for the four AWI solutions). For the SARIn ESA solution, we use the POCA given in the L2I product. Each of these seven independent elevation products is finally used to obtain Antarctic-wide *∂*h/*∂*t estimates, using four different least-squares fitting methods (M1 to M4). In all cases, we apply the fit to all data points of the 2011 to 2017 time series falling within a pixel size of 2 km. This intermediate *∂*h/*∂*t raster is interpolated using inverse distance weighting with a radius of 25 km to obtain the final *∂*h/*∂*t grid with a 5 km pixel spacing. The differences between M1 to M4 are due to the unknown topography within a 2 km pixel, which needs to be considered, as the elevation trend is estimated at the center of each pixel. In case M1, we use an external DEM to estimate the subpixel topography. To estimate *∂*h/*∂*t, the topography is subtracted using bilinear interpolation of a DEM [42], before a linear regression on the elevation residuals is applied. For M2, M3, and M4, the topography is estimated in combination with the elevation trend as a polynomial, linear, and quadratic surface fit, respectively.
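The M1 variant described above can be illustrated with a minimal sketch. It assumes that the DEM height has already been interpolated to each footprint (the bilinear interpolation step is not shown), and all data are synthetic; the function name `elevation_trend_m1` is hypothetical and is not taken from the processing chain used here.

```python
import numpy as np

def elevation_trend_m1(t_years, h_obs, h_dem):
    """Linear elevation trend (m/a) from DEM-reduced elevations in one pixel.

    t_years : observation epochs in decimal years
    h_obs   : re-tracked, slope-corrected surface elevations (m)
    h_dem   : DEM elevation already interpolated to each footprint (m)
    """
    t = np.asarray(t_years, dtype=float)
    residual = np.asarray(h_obs, dtype=float) - np.asarray(h_dem, dtype=float)
    # Design matrix: offset and rate, referenced to the mean epoch
    A = np.column_stack([np.ones_like(t), t - t.mean()])
    coeffs, *_ = np.linalg.lstsq(A, residual, rcond=None)
    return coeffs[1]

# Synthetic 2 km pixel: sloping topography plus a -0.5 m/a thinning signal
rng = np.random.default_rng(0)
t = rng.uniform(2011.0, 2017.0, 200)
x = rng.uniform(-1000.0, 1000.0, 200)        # across-pixel position (m)
dem = 1500.0 + 0.02 * x                      # hypothetical DEM heights (m)
h = dem - 0.5 * (t - 2014.0) + rng.normal(0.0, 0.1, 200)
rate = elevation_trend_m1(t, h, dem)
```

In this synthetic case the recovered rate is close to the prescribed −0.5 m/a, because the DEM removes the subpixel topography exactly; with real data, DEM errors would map into the residuals.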

#### **Conversion of CryoSat-2 elevation rates into rates of surface loading**

A simple method of converting elevation rates, $\partial h/\partial t$, into rates of surface loading, $\partial \sigma/\partial t$, is to assume that one can identify the main process causing surface elevation change (ice dynamical imbalance vs. snow fall anomalies) and multiply surface height trends with the respective density (ice vs. snow, for example [15,43]). To account for some of the uncertainties introduced by the assumptions, we utilize this approach in three different realizations: (i) All surface elevation change is due to ice dynamical imbalance (i.e., $\frac{\partial \sigma}{\partial t} = 910\,\frac{\text{mm we}}{\text{m}}\,\frac{\partial h}{\partial t}$); (ii) surface elevation change is homogeneously 50% due to ice dynamical imbalance and 50% due to snow fall anomaly, and the two processes act in the same direction (in- or deflation of the surface; factor 650 mm we/m instead of 910 mm we/m); (iii) areas of ice dynamical imbalance are associated with fast-flowing or fast lowering regions (factor 910 mm we/m), while in remaining areas the density of snow is assumed, modulated by the changes in firn densification (factor 300 mm we/m to 500 mm we/m; [23], Section 3.4.1 and Figure 7 therein). Each of these assumptions will be valid in some places and wrong in others, but the ensemble of all of them together probably contains the actual situation in most places. Note that opposing anomalies of ice-dynamical imbalance and accumulation may lead to apparent density changes beyond the physical range of the respective end members of 300 mm we/m and 910 mm we/m.
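The three realizations can be written as a small ensemble computation. The sketch below is illustrative only: the mask of fast-flowing regions and the single snow factor of 400 mm we/m (chosen from within the quoted range of 300 to 500 mm we/m) are assumptions, and the function name is hypothetical.

```python
import numpy as np

ICE_FACTOR = 910.0     # mm w.e. per m of height change, realization (i)
MIXED_FACTOR = 650.0   # mm w.e. per m, realization (ii)

def loading_rate_ensemble(dh_dt, fast_mask, snow_factor=400.0):
    """Return three surface-loading rate fields (mm w.e./a) from dh/dt (m/a).

    fast_mask   : True where ice-dynamical imbalance is assumed (hypothetical mask)
    snow_factor : assumed snow/firn factor in mm w.e. per m; 400 is only an
                  illustrative value inside the quoted 300 to 500 range
    """
    dh_dt = np.asarray(dh_dt, dtype=float)
    r1 = ICE_FACTOR * dh_dt                                    # (i) all ice dynamics
    r2 = MIXED_FACTOR * dh_dt                                  # (ii) 50/50 mixture
    r3 = np.where(fast_mask, ICE_FACTOR, snow_factor) * dh_dt  # (iii) region dependent
    return np.stack([r1, r2, r3])

dh_dt = np.array([-1.0, -0.2, 0.1])        # m/a, synthetic pixels
fast = np.array([True, False, False])      # only the first pixel counts as "fast"
ensemble = loading_rate_ensemble(dh_dt, fast)
spread = ensemble.max(axis=0) - ensemble.min(axis=0)
```

The per-pixel `spread` gives a simple measure of how strongly the density assumption affects the derived loading rate.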

Additionally, as a fourth method of converting from $\partial h/\partial t$ to $\partial \sigma/\partial t$, we remove modelled trends in surface height due to surface processes (snow fall variability, firn densification; $\partial h_S/\partial t$) over the same period, leaving supposedly ice-dynamical height changes as the residual. After the conversion from height changes to mass changes (factor 910 mm we/m), the trend in the surface mass balance (after removal of the long-term average) over the same period, $\partial \sigma_{\text{SMB}}/\partial t$, is added back to restore mass variability due to snowfall. Both fields are available locally based on the Regional Climate Model RACMO2/ANT [44]. In this realization, the volume to mass conversion is

$$\frac{\partial \sigma}{\partial t} = 910 \frac{\text{mm we}}{\text{m}} \left( \frac{\partial h}{\partial t} - \frac{\partial h_S}{\partial t} \right) + \frac{\partial \sigma_{\text{SMB}}}{\partial t} \tag{A1}$$
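As a minimal numerical illustration of Equation (A1), the following sketch evaluates the conversion for one pixel. The input values are invented for the example and do not come from RACMO2/ANT; the function name is hypothetical.

```python
def loading_rate_a1(dh_dt, dhs_dt, dsmb_dt, ice_factor=910.0):
    """Equation (A1).

    dh_dt   : observed elevation rate (m/a)
    dhs_dt  : modelled height rate from surface processes (m/a)
    dsmb_dt : modelled surface-mass-balance trend (mm w.e./a)
    """
    return ice_factor * (dh_dt - dhs_dt) + dsmb_dt

# Invented example: 0.30 m/a lowering, of which 0.10 m/a is a firn/snow signal
rate = loading_rate_a1(dh_dt=-0.30, dhs_dt=-0.10, dsmb_dt=-40.0)
```

Here the residual lowering of 0.20 m/a is converted with the ice factor and the snowfall-related mass trend is added back, giving about −222 mm we/a.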

#### **References**


© 2019 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).

## *Article* **The Rapid and Steady Mass Loss of the Patagonian Icefields throughout the GRACE Era: 2002–2017**

**Andreas Richter <sup>1,2,3,</sup>\*, Andreas Groh <sup>1</sup>, Martin Horwath <sup>1</sup>, Erik Ivins <sup>4</sup>, Eric Marderwald <sup>2,3</sup>, José Luis Hormaechea <sup>5</sup>, Raúl Perdomo <sup>2</sup> and Reinhard Dietrich <sup>1</sup>**


Received: 18 March 2019; Accepted: 10 April 2019; Published: 14 April 2019

**Abstract:** We use the complete Gravity Recovery and Climate Experiment (GRACE) Level-2 monthly time series to derive the ice mass changes of the Patagonian Icefields (Southern Andes). The glacial isostatic adjustment is accounted for by a regional model that is constrained by global navigation satellite system (GNSS) uplift observations. Further corrections are applied concerning the effect of mass variations in the ocean, in the continental water storage, and of the Antarctic ice sheet. The 161 monthly GRACE gravity field solutions are inverted in the spatial domain through the adjustment of scaling factors applied to a-priori ice mass change patterns based on published remote sensing results for the Southern and Northern Patagonian Icefields, respectively. We infer an ice mass change rate of −24.4 ± 4.7 Gt/a for the Patagonian Icefields between April 2002 and June 2017, which corresponds to a contribution to the eustatic sea level rise of 0.067 ± 0.013 mm/a. Our time series of monthly ice mass changes reveals no indication of an acceleration in ice mass loss. We find indications that the Northern Patagonian Icefield contributes more to the integral ice loss than previously assumed.

**Keywords:** ice mass; satellite gravimetry; Patagonia; GRACE

#### **1. Introduction**

The Southern Patagonian Icefield (SPI; 12,500 km<sup>2</sup>) and Northern Patagonian Icefield (NPI; 4000 km<sup>2</sup>) constitute, together, the largest temperate ice mass in the Southern Hemisphere. They are aligned in a north–south (N–S) direction along the crest of the Southern Andes (Figure 1). The icefields are strongly influenced by the Westerlies and the transport of wet air masses from the Pacific Ocean, receiving precipitation of up to 10 m/a water equivalent [1]. The quantification of present-day ice mass changes contributes to our understanding of ongoing climate change in the southern mid-latitudes and the southeastern Pacific. An observational assessment of the demise of the two Patagonian Icefields provides robust information for predicting the future of the regional cryosphere and the water cycle, as well as its contribution to the global sea level. Previous works suggest a disproportionally large contribution of the Patagonian Icefields to eustatic sea level rise when compared to their area, for example, 0.067 ± 0.004 mm/a [2], 0.105 ± 0.011 mm/a [3], and 0.059 ± 0.005 mm/a [4]. Bedrock global navigation satellite system (GNSS) observations adjacent to and within the SPI reveal an uplift as high as 4 cm/a [5–7] related to glacial isostatic adjustment (GIA), which combines the elastic response to present-day ice load change, and the viscoelastic adjustment to the ice mass gain and loss during and following the Little Ice Age (LIA) [8,9]. We make use of recently determined GIA models, which are fitted to the uplift observations [6] in order to compute the solid Earth contribution to the observed mass changes. The contrast between two recent regional GIA models highlights the degree of uncertainty in the regional ice load history, especially for the most recent time slice 1995–2013 [6] (Table 2).

**Figure 1.** Maps of the region under investigation. (**a**) Location of the Northern (NPI) and Southern Patagonian Icefields (SPI) in southernmost South America. Color scale shows the ocean mass change rates contributing to the relative sea level according to our correction model (see text); yellow circle: data grid node for which time series are shown in Figure 2; red box indicates the extension of the grid used in the computations and corresponds to the map outline of Figure 3. AR—Argentina; CL—Chile; AP—Antarctic Peninsula. (**b**) Detailed map of the NPI and SPI area. Thin polygons on the icefields outline the accumulation and ablation zones of 87 glacier basins. Colored dots show the a-priori mass-rate pattern derived from ice elevation rates [2] at the barycenter coordinates for each zone. Lakes mentioned in the text: B—Lago Buenos Aires/General Carrera; S—Lago San Martín/O'Higgins; V—Lago Viedma; A—Lago Argentino; R—Brazo Rico and Brazo Sur.

Southern Patagonia is characterized by a series of distinct Neoglacial advances and retreats during mid-to-late-Holocene times (0–5 ka before present) [10,11]. The most recent advance corresponds to the regional LIA. This history is inferred from datable moraine materials [11] and is consistent with the regional paleo-climatological record [12,13]. While the climatic origins of these fluctuations are unknown, they are thought to be linked by global teleconnections [14,15]. More importantly, they establish a pattern of natural variability that is now perhaps in sync with anthropogenically driven warm late-summer conditions [16,17]. The latter exacerbate the negative net mass balance of the mountain glacier systems over the past 30–40 years, which we know to be global in nature [18]. The future of the Patagonian Icefields may well be determined by the climatic conditions generated by the evolving state of the atmosphere and ocean on their western flank, as these are, in turn, linked to changes in both the El Niño–Southern Oscillation (ENSO) and the Pacific Decadal Oscillation (PDO), phenomena that we now suspect have been modulated by anthropogenic forcing since the 1990s [19].

Our study of the 15-year mass evolution of the Patagonian Icefields provides new clues about the present-day land-ice trajectory of the higher latitude Southern Hemisphere. We discover that glacier mass loss is dominated by a strong, and relatively uninterrupted, linear trend, possibly lending new insights to some of the standing questions regarding the future of the Southern Ocean and its environs [20].

#### **2. Data and Methods**

Natural conditions at the Patagonian Icefields limit the applicability of observational techniques that can be used to determine ice mass changes. Table 1 summarizes the previously published present-day ice mass change rates for the Patagonian Icefields. Conventional satellite radar altimetry is not appropriate because of the large footprint size compared to the narrow, fragmented, and often steep ice surface. Satellite laser altimetry provides insufficient data because of the frequent cloud cover and the unfavorable N–S orientation of the icefields. Recent advances using CryoSat-2 swath radar altimetry [4] are promising, but are limited in time to 2010 onward. The surface mass balance estimates derived from atmospheric modelling [17,21,22], or the changes derived from field glaciological measurements, suffer from a sparse distribution of meteorological stations and sampling sites that do not adequately capture the large variability over very short distances and rugged topography. Remote sensing techniques based on both optical [2,3,23–25] and radar imagery [26–28] provide valuable data for quantifying the ice volume and ice mass changes of the SPI and NPI, in spite of having to deal with cloud cover or uncertain radar signal penetration. All of the remote sensing results (Table 1) agree, at least in part, on a net ice mass loss of both icefields. However, the published rates can differ by as much as a factor of 2, a discrepancy that cannot be explained by the differing imagery acquisition time spans. Mouginot and Rignot [29] present a comprehensive determination of the ice flow velocities over SPI and NPI based on satellite data covering 30 years (1984–2014). They reveal a heterogeneous behavior over time among the individual glaciers: some of the major glaciers accelerate significantly, whereas others decelerate or show a reversal in acceleration/deceleration over a decade or so.

Time-variable global gravity field solutions provided by the Gravity Recovery and Climate Experiment (GRACE; [30]) satellite mission have proven an efficient basis for the quantification of ice mass changes (e.g., [31–35], as well as the references therein). Previous inversions of GRACE data explicitly targeting the SPI and NPI [36,37] had to rely on relatively short time spans compared to the complete mission record through 2017, along with larger uncertainties in the earlier GRACE Level-2 data releases and less refined correction models available at that time (e.g., for correcting the effects of GIA and the surface mass changes, and for improving the low-degree Stokes coefficients, i.e., C10, C11, S11, C20, C21, and S21). More recent ice mass change rates for southern Patagonia are presented as part of the global inversions that account for all mountain glacier and ice cap contributions to sea level rise [32,38–40].

Here, we derive the ice mass changes of the Patagonian Icefields between April 2002 and June 2017 from the monthly ITSG-Grace2016 [41] gravity field solutions. The monthly SPI and NPI ice mass changes are derived from a GRACE Level-2 data analysis approach similar to the regional point mass inversion presented in Forsberg and Reeh [42] for Greenland. This approach has been refined and modified (e.g., [43–45]), and it has also been used to derive ice-mass change time series for Antarctica and Greenland for a reconciled estimate of ice sheet mass balance [34,35]. In our case, we used a-priori information on the location of the point masses and on their changes with time. This allows for the set-up of a very dense pattern of masses (SPI: 98, NPI: 76, in total 174), and we determined a common scaling parameter for the a-priori mass change rates.

There is no sharp spatial resolution limit of GRACE. The ability to separate the ice mass changes of the SPI and the NPI (with a 320 km distance between their areal barycenters and an 80 km minimum distance between their areas) depends on a number of aspects, namely: (i) the noise level of the used GRACE products with respect to the specific geographic situation, (ii) the strength of the signal, and (iii) the exploitation of the a-priori information. A-priori information on the geographic patterns of the mass change has been widely used to separate nearby signals. This has been shown to have great advantages for dealing with coastlines [46]. We explore the separability of the NPI and SPI signals, because each of these three factors may be readily exploited. (i) ITSG-Grace2016 has a reduced noise level with respect to the previous GRACE Level-2 releases [47]. The higher frequency and shorter wavelength noise of the GRACE gravity field solutions is inherently lower for the along-track (N–S) accelerations detected by the K-Band Ranging (KBR) system, relative to the noise in an east–west (E–W) (track-normal) direction, wherein the KBR system does not directly provide gravity information [48]. (ii) The sustained mass change of the Patagonian Icefields is among the largest spatially concentrated long-term signals worldwide. (iii) We can exploit the a-priori information on the distribution of the glacier mass loss of the two icefields along with the extensive a-priori information on the disturbing signals outside the icefields, as explained below and in Appendix A.

**Table 1.** Summary of ice mass change rates. The "comment" indicates the spatial extent for analyses using gravity recovery and climate experiment (GRACE) and Cryosat-2, and the data sources in the case of the remote sensing studies. SPI—Southern Patagonian Icefield; NPI—Northern Patagonian Icefield; SRTM—Shuttle Radar Topography Mission.


\* Converted to an ice mass change rate from the originally published ice volume change rate, applying an ice density of 0.9 g/cm<sup>3</sup>.

In the following section, our inversion approach is described, and further details of the analysis procedure are given in Appendix A. We used unconstrained monthly gravity field solutions of release ITSG-Grace2016 provided by TU Graz [41]. These are provided as Stokes coefficients up to a spherical harmonic degree and order of 120. This time series comprises 161 monthly solutions that cover the period April 2002–June 2017. We added degree one coefficients (C10, C11, and S11) derived from the ITSG-Grace2016 solutions [49]. The C20 term was replaced by the values obtained by satellite laser ranging [50]. The C21 and S21 coefficients were corrected for the pole tide, as recommended by Wahr et al. [51]. The monthly Stokes coefficients were corrected for GIA by applying the regional model A according to Lange et al. [6] (henceforth referred to as La-A), after removing the elastic contribution to the uplift. The Stokes coefficients were then further corrected for the effect of elastic loading deformation [44] (Equation (2)).

We retrieved GRACE pseudo-observations on a regional grid at the approximate GRACE satellite orbit altitude (500 km) from each monthly set of corrected Stokes coefficients [42–45]. Figure 1a shows the extension of this grid. The spacing is approximately 20 km, with a constant longitude increment and latitude increments that provide for an equal area of all of the grid cells. We chose to generate our pseudo-observations at the orbit altitude in order to avoid the amplification of noise in the downward continuation. No explicit filtering was applied to the GRACE data. For each grid node and month, 3D Cartesian gravity vector components were computed (instead of the scalar gravity disturbances used by Forsberg et al. [44] as pseudo-observations). These monthly sets of 3D gravity vectors were interpreted with respect to the mass redistributions at the Earth's surface. For comparison, an additional mass corresponding to a uniform layer of 1 cm water equivalent of global extent at the surface (sea level) implies a change in the length of our gravity vectors at the orbit altitude of +7.22 nm s<sup>−2</sup>.
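The order of magnitude of the uniform-layer value can be checked with a back-of-the-envelope computation. The sketch below treats the layer as a thin spherical shell on a rigid, spherical Earth; the mean radius of 6371 km and the water density of 1000 kg m<sup>−3</sup> are assumptions, so the result is expected to agree with the quoted number only approximately.

```python
import math

G = 6.67430e-11        # gravitational constant, m^3 kg^-1 s^-2
R_EARTH = 6.371e6      # mean Earth radius, m (assumed)
ALTITUDE = 500e3       # approximate GRACE orbit altitude, m
RHO_WATER = 1000.0     # kg m^-3 (assumed)

def shell_gravity_at_altitude(layer_thickness_m):
    """Attraction (m s^-2) of a uniform global water layer, seen at orbit altitude."""
    sigma = RHO_WATER * layer_thickness_m              # surface density, kg m^-2
    shell_mass = 4.0 * math.pi * R_EARTH**2 * sigma    # total mass of the shell, kg
    r = R_EARTH + ALTITUDE
    # Outside a spherical shell, the field equals that of a point mass at the center
    return G * shell_mass / r**2

delta_g_nm = shell_gravity_at_altitude(0.01) * 1e9     # nm s^-2 for 1 cm w.e.
```

With these assumptions, the result is about 7.2 nm s<sup>−2</sup>, close to the value given above; the small remaining difference presumably reflects the exact radius and altitude adopted in the analysis.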

The monthly gravity vector grids were corrected explicitly for five surface mass variation processes: the Antarctic ice sheet (AIS) [52], ocean mass change, changes in continental water storage [53,54], ice mass change rates in South America outside Patagonia [26], and water mass changes in the great Patagonian lakes [55]. Our ocean mass correction consisted of a gravitationally self-consistent distribution of the mass discharge from five major ice sheets/ice caps (AIS, Greenland ice sheet, Patagonian Icefields, Alaska, and the Canadian Arctic) and continental water storage over the ocean [56] (using ice mass change estimates [52,57] and continental water storage [53] as input), complemented by an empirical regional ocean mass change rate increment. The mass signals from outside our data grid domain may affect our pseudo-observations through leakage [44]. Our approach, with the global extent of the correction models (ocean mass and continental water storage) plus the additional solved-for ocean mass change rate increment, ensures that possible leakage effects are minimized. The time series of the pseudo-observations before and after the correction, as well as of the individual mass-variation corrections, are shown for one data grid node in Figure 2. The individual impact of the applied corrections on the SPI and NPI ice mass change rate is given in Table 2. Figure 3 shows maps of the gravity rate components before (top) and after (center) the application of all corrections.

**Figure 2.** Time series of the gravity signal and the applied corrections for a selected data grid node. The relative change in gravity at the approximate GRACE orbit altitude before (uncorrected, grey) and after (corrected, black) the application of the corrections described in the text is shown in the radial (top), north (center), and east (bottom) vector components. The applied corrections for the five mass-variation processes are shown in different colors. The location of this grid node is indicated in Figure 1 (yellow circle).

**Figure 3.** Map representation of gravity rate vector components at the approximate GRACE orbit altitude. Topocentric vector components are shown: radial (left), north (center), and east (right). Top: gravity rate derived from the original monthly GRACE solutions prior to the corrections; center: gravity rate after application of the corrections described in the text; bottom: residuals after adjustment of the ice mass scale factors.

**Table 2.** Impact of individual corrections applied in the GRACE data analysis scheme on the derived SPI and NPI ice mass change rate. Column 2 is the difference between our solution, including all of the corrections minus a modified solution, including all but the correction (or employing the alternative data set) specified in column 1. Column 3 indicates the percentage of column 2 with respect to the rate including all of the corrections (−24.4 Gt/a). Column 4 shows, for each modified solution, the ratio of the gravity vector RMS over the regional grid between the residuals after the adjustment and the corrected observations (for comparison, this ratio amounts to 0.152 for the solution including all corrections).


We synthesized an a-priori pattern of the ice mass change at SPI and NPI. A scaling factor for this mass change pattern was derived from each monthly field of the corrected gravity vectors, simultaneously with the parameters of the empirical regional ocean mass change rate increment, by a least-squares adjustment without any regularization. In this way, a time series of the integral ice mass change was obtained (Figure 4). First, an SPI and NPI a-priori pattern consisting of 174 point masses was derived from the average ice surface elevation rates for the accumulation and ablation zones of 87 glacier basins [2] (see Appendix A). However, these ice surface elevation rates were based on slightly different imagery acquisition time spans and processing algorithms for NPI [23] and SPI [2]. Therefore, a systematic bias between the icefields cannot be ruled out. We used the GRACE data to optimize the a-priori ice mass change pattern by deriving an individual scaling factor for each of the two subsets of point masses that represent SPI and NPI, respectively. Thus, two separate scaling factors were determined simultaneously, for which our adjustment yields a minimum residual gravity rate RMS. The monthly RMS values of the residual misfit between the gravity effect of the scaled optimum ice mass change pattern and the corrected gravity vector fields are included in Figure 4 (bottom).
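The simultaneous adjustment of two scaling factors can be sketched with a simplified forward model. The example below is not the analysis code of this study: it uses a flat local Cartesian geometry instead of the spherical grid, random synthetic point-mass patterns instead of the patterns from [2], a coarser 50 km grid, and it omits the empirical ocean mass parameters. All names and numbers in the code are hypothetical.

```python
import numpy as np

G = 6.67430e-11   # gravitational constant, m^3 kg^-1 s^-2
GT = 1.0e12       # kg per gigatonne

def point_mass_gravity(obs_xyz, src_xyz, src_mass):
    """3D gravity vectors (m s^-2) at obs_xyz (n_obs, 3) from point masses."""
    d = src_xyz[None, :, :] - obs_xyz[:, None, :]      # vectors observer -> source
    dist3 = np.linalg.norm(d, axis=2) ** 3
    return G * np.sum(src_mass[None, :, None] * d / dist3[:, :, None], axis=1)

rng = np.random.default_rng(1)

# Observation grid at 500 km altitude (y points north)
axis = np.arange(-500e3, 500e3 + 1.0, 50e3)
gx, gy = np.meshgrid(axis, axis)
obs = np.column_stack([gx.ravel(), gy.ravel(), np.full(gx.size, 500e3)])

def make_pattern(n, center_y, half_len, total_rate_gt):
    """Synthetic a-priori pattern: n point masses with a given total rate (Gt/a)."""
    xyz = np.column_stack([rng.uniform(-30e3, 30e3, n),
                           center_y + rng.uniform(-half_len, half_len, n),
                           np.zeros(n)])
    w = rng.uniform(0.5, 1.5, n)
    return xyz, total_rate_gt * GT * w / w.sum()

spi_xyz, spi_rate = make_pattern(98, -110e3, 170e3, -15.0)
npi_xyz, npi_rate = make_pattern(76, 210e3, 50e3, -5.0)

# One design-matrix column per icefield, in nm s^-2 a^-1, all three components
A = 1e9 * np.column_stack([
    point_mass_gravity(obs, spi_xyz, spi_rate).ravel(),
    point_mass_gravity(obs, npi_xyz, npi_rate).ravel(),
])

true_scale = np.array([1.0, 1.5])                       # invented "truth"
observed = A @ true_scale + rng.normal(0.0, 0.05, A.shape[0])

# Unregularized least-squares adjustment of the two scaling factors
scale, *_ = np.linalg.lstsq(A, observed, rcond=None)
residual_rms = np.sqrt(np.mean((observed - A @ scale) ** 2))
```

Because both patterns enter as separate columns of the design matrix, the adjustment returns one scaling factor per icefield, and the residual RMS plays the role of the misfit measure shown in Figure 4 (bottom).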

A weighted fit of a linear plus periodic (annual and semi-annual sinusoid) model (red in Figure 4) to the time series yields the ice mass change rate for the SPI and NPI region. The GRACE Level-2 data quality degrades after March 2016, especially since the decommissioning of the accelerometer on the GRACE-B satellite in November 2016 [58] (see also Appendix A). We took this heterogeneity into account in the ice mass change rate estimation by an empirical downweighting of the monthly ice mass change values after March 2016. For this purpose, the uncertainty of the monthly ice mass change estimates was calculated separately for the periods up to and after that month. We obtained uncertainties (1σ) of ±22.7 Gt and ±53.5 Gt, respectively. This documents a noise increase by a factor of 2.4. The reciprocal of this value was introduced as the weight for the monthly ice mass change values after March 2016 in the rate estimation. In addition, the Level-2 data for February 2015 were impaired by a two-day repeat orbital resonance [59]. Therefore, rate and accuracy estimates do not include this month.
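A weighted fit of this kind can be sketched as follows. The time series is synthetic (a gap-free monthly sampling, a prescribed rate of −24.4 Gt/a, and an annual amplitude of 54.6 Gt), and the weighting is implemented by scaling each equation with the reciprocal of its standard deviation, which is one possible reading of the downweighting described above rather than a statement of the exact scheme used.

```python
import numpy as np

def fit_trend_seasonal(t, y, sigma):
    """Weighted least-squares fit of offset, rate, annual and semi-annual terms.

    Returns the rate and the annual and semi-annual amplitudes.
    """
    t = np.asarray(t, dtype=float)
    omega = 2.0 * np.pi                              # one cycle per year
    A = np.column_stack([
        np.ones_like(t), t - t.mean(),
        np.cos(omega * t), np.sin(omega * t),
        np.cos(2 * omega * t), np.sin(2 * omega * t),
    ])
    # Scaling each row by 1/sigma turns ordinary into weighted least squares
    x, *_ = np.linalg.lstsq(A / sigma[:, None], y / sigma, rcond=None)
    return x[1], np.hypot(x[2], x[3]), np.hypot(x[4], x[5])

rng = np.random.default_rng(42)
t = 2002.0 + (np.arange(183) + 3.5) / 12.0           # mid-month epochs, Apr 2002 to Jun 2017
late = t > 2016.25                                   # months after March 2016
sigma = np.where(late, 53.5, 22.7)                   # Gt, uncertainties quoted in the text
y = -24.4 * (t - t.mean()) + 54.6 * np.cos(2 * np.pi * t) + rng.normal(0.0, sigma)

rate, annual_amp, semiannual_amp = fit_trend_seasonal(t, y, sigma)
noise_ratio = 53.5 / 22.7                            # about 2.4, as stated in the text
```

The ratio of the two uncertainties reproduces the quoted noise increase of about 2.4, and the fitted rate and annual amplitude recover the prescribed synthetic values within the noise.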

**Figure 4.** Time series of the ice mass change in the SPI and NPI region. Black: unfiltered time series of monthly relative ice mass change throughout the GRACE data period; red: linear plus periodic (annual and semi-annual) signal fitted to this time series; grey: low-pass filtered time series applying a running annual mean; green: residual root mean square (RMS) of the monthly gravity vectors after the adjustment of the ice mass scale factors. Dashed vertical line marks the month of March 2016, after which the GRACE data quality deteriorates and the monthly ice mass estimates were downweighted in the ice mass change rate determination.

#### **3. Uncertainty Estimation**

The uncertainty of the derived ice mass change rate is composed of the uncertainty of the individual monthly ice mass estimates and the uncertainties in the applied corrections. The residuals from the linear plus periodic model are used to estimate the precision of the ice mass change values [60]. The individual precision estimates for the two sub-periods, before and after March 2016, lead to an uncertainty of the derived ice mass change rate of ±2.3 Gt/a.

The effect of all of the individual corrections applied on the ice mass change rate is summarized in Table 2. As a conservative estimate, we assume a 100% relative uncertainty for all of the corrections, that is, the corrections of the low-degree Stokes coefficients (lines 1–3 in Table 2) and the corrections for the surface mass change processes (lines 7–14 in Table 2). The total uncertainty of these corrections and the uncertainty of the monthly ice mass estimates (±2.3 Gt/a) is calculated as the root sum square of the individual components.

A crucial source of the uncertainty for the derived ice mass change rate is the applied GIA model. The GIA model La-A adopted in our correction is constrained by the GNSS observed uplift rates [6,7], and implies a (rock) mass gain equivalent to 9.1 Gt/a (Table 2). Employing the alternative GIA model La-B [6] (which also satisfies the uplift data; 10.5 Gt/a) in our GIA correction results in a decrease in the derived mass change rate by −1.4 Gt/a (Table 2). We adopted the standard deviation between both of the ice mass change rates as an estimate of the systematic impact of the uncertainty in the GIA correction. Adding this uncertainty contribution of ±1.0 Gt/a leads to a total uncertainty (1σ) of the ice mass change rate for the SPI and NPI region of ±4.7 Gt/a.
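Two elements of this budget can be verified with a few lines of arithmetic. The sketch below assumes that the "standard deviation between both of the ice mass change rates" is the sample standard deviation of the two values, and it shows the root-sum-square combination only generically, with placeholder inputs, since the individual entries of Table 2 are not reproduced here.

```python
import math
import statistics

rate_la_a = -24.4                 # Gt/a, solution with GIA model La-A
rate_la_b = rate_la_a - 1.4       # Gt/a, solution with GIA model La-B (differs by 1.4 Gt/a)

# Sample standard deviation of two values equals |difference| / sqrt(2)
gia_sigma = statistics.stdev([rate_la_a, rate_la_b])

def root_sum_square(components):
    """Combine independent uncertainty contributions."""
    return math.sqrt(sum(c * c for c in components))

# Placeholder inputs, only to show the combination rule
example_total = root_sum_square([3.0, 4.0])
```

The value of `gia_sigma` is about 0.99 Gt/a, consistent with the ±1.0 Gt/a quoted for the GIA contribution.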

#### **4. Results and Discussion**

We obtained an ice mass change rate in the SPI and NPI region of −24.4 ± 4.7 Gt/a. The RMS of the residual gravity rate vectors over our grid domain (Figure 3, bottom) amounted to 0.38 nm s<sup>−2</sup> a<sup>−1</sup> compared with the 2.52 nm s<sup>−2</sup> a<sup>−1</sup> RMS of the observed gravity vectors with all of the corrections applied (Figure 3, center). This corresponds to a signal variance reduction of 98% and demonstrates the efficiency of the scaled ice mass change pattern to explain the observed gravity changes. The determination of a common factor for the initial SPI and NPI a-priori pattern as described in Section 2 yields a residual RMS value of 0.44 nm s<sup>−2</sup> a<sup>−1</sup>. A significant reduction of the RMS (down to the above-mentioned value of 0.38 nm s<sup>−2</sup> a<sup>−1</sup>) is achieved when scaling the a-priori patterns of SPI and NPI separately. It is well known that GRACE's ability to resolve masses close to each other is limited, and leakage effects may occur. In order to explore the sensitivity of the GRACE data concerning the separation of the SPI and NPI ice mass loss, we carried out a forward computation of gravity rate vector components in our data grid according to the a-priori ice mass change pattern derived from Willis et al. [2]. This forward computation is repeated with systematically altered scaling factors for both SPI and NPI. For each pair of scaling factors (corresponding to a pair of mass change rates of the NPI and the SPI), the RMS of the residuals (gravity rate vector observed by GRACE after application of all of the corrections minus the computed gravity rate vector) is calculated. The results are shown in Figure 5. The RMS levels exhibit an elliptical shape with an optimum fit in the center. A movement from the RMS minimum along the direction of the semiminor axis of the ellipses would change the sum of both ice mass change rates and lead to a fast degradation of the fit. It is therefore an indication of the stability of our estimation of the sum of both ice mass change rates. A movement from the minimum along the semimajor axis of the ellipses would keep the sum of both of the mass rates constant, but would change the relation between the rates of the SPI and NPI. Here, a larger range of pairs of mass rates shows a reasonable fit. This demonstrates the degree of separability of the SPI and NPI ice mass change rates.
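Both the variance reduction and the elongated shape of the misfit surface can be illustrated numerically. In the sketch below, the variance reduction uses the two RMS values quoted above, whereas the grid search runs on an invented two-column design matrix whose columns are strongly correlated, as a stand-in for the forward-modelled SPI and NPI gravity rate patterns; none of the synthetic numbers are taken from this study.

```python
import numpy as np

# Variance reduction from the RMS values quoted in the text
variance_reduction = 1.0 - (0.38 / 2.52) ** 2

def rms_surface(A, observed, s_spi, s_npi):
    """Residual RMS for every pair of scaling factors (rows: SPI, columns: NPI)."""
    rms = np.empty((s_spi.size, s_npi.size))
    for i, a in enumerate(s_spi):
        for j, b in enumerate(s_npi):
            resid = observed - A @ np.array([a, b])
            rms[i, j] = np.sqrt(np.mean(resid ** 2))
    return rms

# Synthetic stand-in: two correlated "gravity patterns" and noise-free observations
rng = np.random.default_rng(7)
common = rng.normal(size=600)
A = np.column_stack([common + 0.3 * rng.normal(size=600),
                     common + 0.3 * rng.normal(size=600)])
observed = A @ np.array([1.0, 1.4])

grid = np.linspace(0.0, 2.0, 21)                     # scaling factors 0.0, 0.1, ..., 2.0
rms = rms_surface(A, observed, grid, grid)
i_best, j_best = np.unravel_index(np.argmin(rms), rms.shape)
```

In this toy setting, the minimum lies at the prescribed pair of scaling factors, and the RMS grows much more slowly when one factor is increased and the other decreased than when both are changed in the same direction, which mirrors the elliptical misfit contours described above.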

We interpret the tendency of our optimization, along with computational experiments applying different alternative a-priori ice mass change rate patterns, presented in Table A1 in Appendix A, as an indication that the ice mass loss rates derived from the remote sensing results [2,26,28] (Figure 5) underestimate the ice loss contribution from NPI. Indeed, for the Benito glacier on the western flank of the NPI, a combination of field and satellite-derived elevation data suggests a dramatic surface lowering, attributed to a negative surface mass balance, with a mean rate of −3.0 ± 0.2 m/a between 1973 and 2017 [61].

The time series of the monthly ice mass change in the SPI and NPI region (Figure 4, top) is dominated by a steady decrease and an annual cycle. We estimated the annual and semi-annual amplitudes at 54.6 Gt and 0.8 Gt, respectively. The ice mass time series smoothed by the running annual mean shows small inter-annual variations and is dominated by a linear decrease. The long-term ice mass change is very well approximated by the linear model; in particular, we found no indication of an acceleration of the ice mass loss in the SPI and NPI region throughout the GRACE operation period (Figure 4). In fact, when a quadratic term was additionally included in the model fitted to the ice mass change time series, this acceleration term was statistically not different from zero, according to a *t*-test at the 95% confidence level. On the other hand, minimum observation periods of 10 years and 20 years are necessary for a statistically significant determination of ice mass loss acceleration of the AIS and Greenland ice sheet, respectively [62]. This makes a statistically sound inference of an acceleration of ice loss of the smaller Patagonian Icefields over the 15-year-long GRACE observation period rather unlikely.

Prominent features in the time series of the residual gravity misfit (Figure 4, bottom) can be primarily explained by the orbital resonance appearing in the Level-2 solution for the month of February 2015 (RMS of almost 100 nm s<sup>−2</sup>). A slight increase in misfit after March 2016 (on average 23.1 nm s<sup>−2</sup>) compared to previous (on average 16.2 nm s<sup>−2</sup>) months also becomes evident. This, together with the increase in the deviation from the linear plus periodic fit, gives a quantitative measure of the inevitable degradation of the Level-2 GRACE product emerging from the Level-1A and 1B data obtained during the final year of the GRACE mission. In our study, the data quality was interpreted to mean the ability to fit our mass-change pattern, and depends only on those combinations of spherical harmonic coefficients that dominate shaping the monthly gravity changes within our grid domain. Such mainly short-wavelength signals correspond to a great extent to coefficients of a higher degree and order. A degradation concentrated on low-degree coefficients leads to a long-wavelength bias, which is largely mitigated by our joint adjustment of the empirical ocean mass increment and the scaling factor for the ice mass change pattern with the full exploitation of the directional information of the 3D gravity vectors. A more detailed discussion of the GRACE Level-2 data degradation and its impact on the SPI and NPI ice mass estimation is given in Appendix A.

**Figure 5.** Separation of the ice mass loss at SPI and NPI. After the application of all corrections to the pseudo-observations derived from the GRACE Level-2 data, the residual gravity root mean square (RMS) is computed for pairs of ice mass change rates for SPI and NPI that are varied over plausible ranges. The residual RMS is shown in colors. White circle: ice mass change rates according to our pattern optimization; black symbols: ice mass change rates according to previous works; triangle: Braun et al. [26]; inverted triangle: Jaber [28]; diamond: Foresta et al. [4]; square: Willis et al. [2].

Table 2 shows the impact of the individual corrections applied in the course of our GRACE data analysis on the ice mass change rate for the SPI and NPI region. The GIA correction has by far the largest impact, contributing about one third to the derived ice mass change rate. Further corrections significant for the ice mass change rate are those for the mass changes in the ocean and the AIS. The ice mass loss and elastic solid Earth response at SPI and NPI cause a local lowering of the sea level adjacent to southern Patagonia through a loss of gravitational attraction coincident with the ice mass losses on land (Figure 1a). If this effect is not accounted for, the negative gravity pattern over the nearby ocean is absorbed into the ice mass estimate, while the long-wavelength sea level change is accommodated by the empirical regional ocean mass change rate correction. This effect has also been discussed in the context of the Antarctic Peninsula ice loss [63]. The far-field effect of the ice mass loss of the AIS manifests itself in a decrease in the N–S directed gravity component, with increasing intensity towards the south (Figure 2). Our correction model for the continental hydrology produces a signature of opposite orientation (intensity increasing towards N), yet of smaller magnitude. Glaciers in the vicinity of the SPI and NPI are included in the derived ice mass change rate. The ice mass change of the extra-Patagonian glaciers in South America [26] results in a slight decrease in the SPI and NPI ice mass change rate, dominated by the glacial systems further south (Cordillera Darwin, Gran Campo Nevado). Mass variations in the adjacent lakes have a minor influence on the ice mass change rate, but are included because of their impact on the seasonal ice mass signal. The small magnitude of the empirical regional ocean mass change rate increment testifies to a high consistency of our explicit ocean mass correction with GRACE gravity.
The last column in Table 2 demonstrates that the omission of any of the significant surface mass corrections degrades the fit of the adjusted model to the observations. The application of an identical data correction and analysis to alternative GRACE Level-2 CSR-RL06 harmonics (Center for Space Research, University of Texas [64]) over January 2003 to August 2016 (maximum degree and order 96) yields an SPI and NPI ice mass change rate within 3% of that derived from the ITSG-Grace2016 harmonics for the same period.

Other GRACE-derived ice mass change rates, as well as those based on CryoSat-2 swath altimetry, are essentially in agreement with our results (Table 1). A more substantive disagreement beyond the stated error bars is found with loss rates derived from remote sensing, especially for the SPI. The rates determined over acquisition time spans that do not overlap with the GRACE data period [3,25] are not discussed here. Furthermore, while remote sensing techniques are spatially limited, with separate results published for the two icefields, the GRACE-derived rates represent integrals over a broader region. Therefore, the GRACE-derived ice mass change rates can be compared only with those remote sensing results that consistently sum the contributions of both icefields [2,26,28]. Synthetic-aperture radar interferometry (TanDEM-X) [26–28], in fact, seems to yield systematically smaller ice mass losses than optical imagery (ASTER, SPOT) [2,23,24]. Our results support the larger trend in ice loss as derived by Willis et al. [2] from the ASTER imagery.

#### **5. Conclusions**

Our analysis of the complete GRACE satellite gravimetry record between April 2002 and June 2017 yields an SPI and NPI ice mass change rate of −24.4 ± 4.7 Gt/a. The time series of the monthly ice mass change does not indicate an acceleration in ice mass loss throughout the GRACE period. This ice mass change rate corresponds to an average contribution to the eustatic sea level rise of 0.067 ± 0.013 mm/a, or about 3% of the mass contribution to the sea level implied by our ocean mass correction model. This confirms the disproportionately large impact of the Patagonian Icefields on sea level rise. The almost constant ice mass loss derived from GRACE over 15 years is also interesting in the context of the changes in ice flow velocities over the same period. In light of the very heterogeneous evolution in the glacier flow velocities [29], our result of a stable mass loss rate suggests that the differences among the individual glacier basins in the non-linear mass balance variations cancel out to a great extent. We found strong indications that the NPI has a larger relative contribution to the total ice mass loss than previous remote sensing results have suggested [2,23,26,28].
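The conversion from a mass rate in Gt/a to a eustatic sea level contribution in mm/a can be checked with a few lines of arithmetic. The global ocean area (3.62 × 10<sup>14</sup> m<sup>2</sup>) and the water density (1000 kg m<sup>−3</sup>) used below are assumed round values and are not taken from the paper; with them, the quoted 0.067 ± 0.013 mm/a and the share of about 3% of the 2.13 mm/a ocean mass signal (Appendix A) are reproduced.

```python
OCEAN_AREA_M2 = 3.62e14   # assumed global ocean surface area
WATER_DENSITY = 1000.0    # kg/m^3, assumed density of the added water


def gt_per_year_to_mm_per_year(rate_gt):
    """Spread a mass rate (Gt/a) uniformly over the ocean; return mm/a."""
    mass_kg = rate_gt * 1e12  # 1 Gt = 1e12 kg
    height_m = mass_kg / (WATER_DENSITY * OCEAN_AREA_M2)
    return height_m * 1000.0


sea_level = gt_per_year_to_mm_per_year(24.4)      # ice loss rate magnitude
sea_level_unc = gt_per_year_to_mm_per_year(4.7)   # its uncertainty
share = sea_level / 2.13                          # relative to the ocean mass model
print(f"{sea_level:.3f} +/- {sea_level_unc:.3f} mm/a, share {100 * share:.1f}%")
```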

We also show that the Level-2 data of the latest months of the GRACE mission provide valuable information on Patagonian ice mass changes. The derived ice mass change rate depends sensitively on the corrections for GIA and non-uniform ocean mass changes. Our new estimate of the SPI and NPI ice mass change rate is a step towards an improvement of the recent ice-unloading history, providing a strong link to the GIA and crustal deformation in this tectonically complex region, and, possibly, to a better determination of the effective viscoelasticity beneath southernmost South America.

Sustained and unabated ice mass loss over the GRACE observing period is remarkable, for there seem to be climate change processes (surface mass balance and ice discharge) that play out over a long time scale, especially when we consider the losses that can be traced into the middle of the 20th Century [65]. This fact forebodes ill for the environment of southernmost South America in nearly every way. The loss of mountain glacier water resources will lead, inevitably, to irreversible changes in the habitability of the regional flora and fauna. This is the most important societal ramification that may be derived from the 15 year-long GRACE record in Patagonia.

**Author Contributions:** Conceptualization, A.R. and R.D.; data curation, A.R., A.G., E.M. and J.L.H.; formal analysis, A.R., A.G. and E.M.; investigation, A.R., A.G., M.H., E.M., J.L.H., R.P. and R.D.; methodology, A.G., M.H., E.I. and R.D.; resources, M.H. and R.P.; software, A.R., A.G. and E.M.; validation, A.G., E.M. and J.L.H.; writing (original draft), A.R.; writing (review and editing), A.G., M.H., E.I., R.P. and R.D.

**Funding:** Part of this research was funded by the German Research Foundation (DFG), grant number RI 2340/1-1. The APC was funded by the Open Access Publication Funds of the SLUB/TU Dresden. Part of the research was carried out at the Jet Propulsion Laboratory, California Institute of Technology, under a contract with the National Aeronautics and Space Administration, using ROSES awards 105526-967701.02.01.81 and 105526-281945.02.47.03.86 from the Earth Surface and Interior Program and the GRACE Science Team, respectively.

**Acknowledgments:** We thank Petra Döll for providing updated grids of the continental water storage of the WaterGAP 2.2c model. The lake-level time series of the Patagonian lakes was retrieved from the BDHI data base of the Argentine Subsecretaría de Recursos Hídricos. We thank the German Space Operations Center (GSOC) of the German Aerospace Center (DLR) for providing, continuously, nearly 100% of the raw telemetry data of the twin GRACE satellites.

**Conflicts of Interest:** The authors declare no conflict of interest. The funders had no role in the design of the study; in the collection, analyses, or interpretation of data; in the writing of the manuscript; or in the decision to publish the results.

#### **Appendix A. Extended Description of GRACE Data Correction and Analysis Methods**

#### *GIA Correction*

The monthly GRACE gravity fields are corrected for GIA according to the regional model La-A [6]. This model has been shown to best fit the surface deformation observed by GNSS [7]. The GIA uplift models are given as regional grids and include the elastic response to present-day ice mass changes [6]. This elastic contribution has to be removed before deriving the GIA correction models for GRACE. We applied surface densities according to the a-priori ice mass change rate pattern [2] as a load model, together with Green's functions of an elastic Earth model [66]. The elastic contribution to the vertical deformation is calculated for the grid nodes of the GIA model and is subtracted from the model uplift rates. This reduced GIA model is expanded in spherical harmonics (degree/order 360). The deformation coefficients are converted to Stokes coefficients by an approximation [37] (Equation 9), enforcing mass conservation through a global layer of uniform mass outside Patagonia. These GIA Stokes rate coefficients are truncated to degree/order 2 to 120, multiplied by the time since the initial GRACE epoch (April 2002), and subtracted from the monthly GRACE Stokes coefficient sets.

Lange et al. [6] present, in addition, a second regional GIA model, termed model La-B, which fits, similarly well, the GNSS uplift observations, and has been deemed more plausible by those authors. An alternative GIA correction was derived from this GIA model La-B by the same procedure in order to quantify the impact of the uncertainty of the GIA models on the derived ice mass change (Table 2).
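The last step of the GIA correction (truncate the rate coefficients, scale them by the elapsed time, and subtract them from a monthly solution) can be sketched as follows. The dictionary layout keyed by degree and order, the function name, and the decimal-year reference epoch are hypothetical choices made for illustration only.

```python
def remove_gia(monthly, gia_rate, epoch, ref_epoch=2002.29, nmin=2, nmax=120):
    """Subtract a linear GIA signal from one monthly set of Stokes coefficients.

    monthly, gia_rate: dicts {(degree, order): value}; rates are per year.
    epoch, ref_epoch: decimal years (ref_epoch ~ April 2002 is an assumed value).
    """
    dt = epoch - ref_epoch
    corrected = dict(monthly)
    for (n, m), rate in gia_rate.items():
        # Only degrees inside the truncation band are corrected.
        if nmin <= n <= nmax and (n, m) in corrected:
            corrected[(n, m)] -= rate * dt
    return corrected


# Toy numbers, not real coefficients.
solution = {(2, 0): 1.0e-10, (60, 30): 4.0e-12}
rates = {(2, 0): 1.0e-12, (60, 30): 2.0e-13, (200, 0): 5.0e-13}
print(remove_gia(solution, rates, epoch=2012.29))
```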

#### *Surface Mass Variation Corrections*

The monthly gravity vector grids are corrected for surface mass-variations of the AIS, the global ocean, continental water storage, and the great Patagonian lakes. In order to remove the far-field effect of ice mass change in Antarctica, the GRACE-derived time series of gridded Antarctic ice mass changes of the AIS\_cci Gravimetric Mass Balance Product [52], generated in the scope of the European Space Agency's Climate Change Initiative project for the Antarctic Ice Sheet (AIS\_cci), are applied. The latest months of the GRACE data period (November 2016 to June 2017) are not included in this product. These time series are therefore extrapolated for each AIS\_cci grid cell, applying a linear plus periodic (periods: annual, semi-annual, and 161 days) model.

Our ocean mass correction consists in an updated solution of the gravitationally self-consistent distribution over the ocean of the water mass exchange with five major ice sheets/ice caps and the global continental hydrosphere [67] (following the pseudo-spectral approach [56]). The monthly ice mass change patterns of the AIS [52], Greenland ice sheet [52], and the Patagonian Icefields (this study, iterative solution), complemented by the ice mass rate patterns for Alaska and the Canadian Arctic [57], as well as the monthly patterns of the continental water storage according to the WaterGAP 2.2c model [53], are used as an input for the solution of the sea level equation. Because of the limited extent of the forcing models [52,53], the monthly ocean water mass distributions were computed through November 2016, and then extrapolated over the last GRACE mission months applying a linear plus periodic (annual, semi-annual) model. On a global average, this ocean mass model implies a relative sea level rise of 2.13 mm/a.

The effects of the changes in the continental water storage are removed by using total water storage time series on a global grid of the WaterGAP 2.2c model [53] with climate forcing WFDEI-CRU [54]. As these time series are available only until December 2015, they were extrapolated through June 2017, applying a linear plus periodic (annual and semi-annual) model. We took particular care of the effects of the hydrological mass variations associated with lake-level changes in Lago Buenos Aires/General Carrera (BAGC), Lago San Martín/O'Higgins, Lago Viedma, and Lago Argentino (Figure 1b). These large lakes receive the meltwater runoff from the adjacent NPI (BAGC) and SPI [55]. Brazo Rico and Brazo Sur of Lago Argentino were treated as an additional individual lake, because of the repeated damming by the Perito Moreno glacier during the GRACE data period. On short and seasonal time scales, we expect these lakes to produce a mass signal opposite to that of the icefields, but too close to be resolved separately by GRACE. We used tide gauge data in the lakes provided by the BDHI data base to derive the time series of the monthly mean lake levels. Before 2008, regular tide-gauge data are available only for Lago Argentino and Brazo Rico/Sur; the time series of the other three lakes were extrapolated backwards in time until April 2002 by stacking the annual lake-level signal. The mass signal of the lake-level changes was subtracted in the corresponding nodes of the global hydrological model grid prior to the application of the corrections to the GRACE data.
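One plausible reading of "stacking the annual lake-level signal" is to average the observed levels per calendar month and to use this mean annual cycle for the months without observations. The sketch below follows that reading; it is an assumption about the procedure, and the data layout and function names are hypothetical.

```python
def monthly_climatology(levels):
    """Mean level per calendar month from a dict {(year, month): level}."""
    sums, counts = {}, {}
    for (_, month), value in levels.items():
        sums[month] = sums.get(month, 0.0) + value
        counts[month] = counts.get(month, 0) + 1
    return {m: sums[m] / counts[m] for m in sums}


def extend_backwards(levels, first_year, first_month):
    """Fill months before the first observation with the stacked annual cycle.

    Assumes the observed series covers every calendar month at least once.
    """
    clim = monthly_climatology(levels)
    start = min(levels)  # earliest observed (year, month)
    filled = dict(levels)
    year, month = first_year, first_month
    while (year, month) < start:
        filled[(year, month)] = clim[month]
        month += 1
        if month > 12:
            year, month = year + 1, 1
    return filled


# Toy lake levels in metres for 2008 and 2009 (illustrative values only).
observed = {(y, m): m + 2.0 * (y - 2008) for y in (2008, 2009) for m in range(1, 13)}
series = extend_backwards(observed, 2002, 4)
print(len(series), series[(2002, 4)])
```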

All of these surface mass-variation processes are represented by monthly time series of the surface density on global (ocean, hydrology) or regional (AIS, lakes) grids. For each of these processes and for each month, the corresponding 3D gravity vector components are derived for our regional grid and subtracted from the grids of the pseudo-observations.

#### *Empirical Regional Ocean Mass Rate Correction*

Our explicit ocean mass correction accounts for the mass exchange with five major glaciated regions and continental water storage. Additional sources may contribute to ocean mass variations in the region under investigation. Therefore, we apply an empirical, additional correction in order to accommodate the residual imperfections of our explicit ocean mass model. A set of point masses homogeneously distributed over the map extension in Figure 1a, spaced every 0.1° in longitude and in latitude increments that provide for an equal area between the point masses, is used for this purpose. Over all of the point masses falling within the ocean, the residual linear mass changes are modeled by three parameters describing a mean mass change rate and mass-rate gradients in a north and east direction, respectively. These three parameters are solved, simultaneously with the monthly scaling factors for the ice mass change pattern, in a joint least-squares adjustment to the monthly gravity vector grids.

In fact, this empirical correction would accommodate long-wavelength effects not only originating from regional deviations from our explicit ocean mass correction, but also due to residual imperfections in other correction models or low-degree GRACE data. The empirical regional ocean mass rate increment produces only a small change in the ice mass change rate (1%, see Table 2), indicating a high consistency between the applied correction models and GRACE data.
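The geometry of the point-mass lattice and the three-parameter rate model can be sketched as follows. Equal cell areas at a fixed longitude spacing are obtained from equal steps in the sine of latitude on a sphere. The latitude bounds, reference point, and parameter values are hypothetical, since the map extension of Figure 1a is not restated here, and the functions are illustrative rather than the authors' implementation.

```python
import math


def equal_area_latitudes(lat_south, lat_north, n_rows):
    """Band edges (degrees) with equal steps in sin(lat), i.e. equal cell area."""
    s0 = math.sin(math.radians(lat_south))
    s1 = math.sin(math.radians(lat_north))
    return [math.degrees(math.asin(s0 + (s1 - s0) * k / n_rows))
            for k in range(n_rows + 1)]


def cell_area_km2(lat_a, lat_b, dlon_deg, radius_km=6371.0):
    """Area of a spherical cell between two latitudes and a longitude width."""
    ds = abs(math.sin(math.radians(lat_b)) - math.sin(math.radians(lat_a)))
    return radius_km ** 2 * math.radians(dlon_deg) * ds


def ocean_rate(lat, lon, mean_rate, grad_north, grad_east, lat0, lon0):
    """Three-parameter model: mean rate plus gradients towards north and east."""
    return mean_rate + grad_north * (lat - lat0) + grad_east * (lon - lon0)


edges = equal_area_latitudes(-56.0, -44.0, 12)  # hypothetical map bounds
areas = [cell_area_km2(a, b, 0.1) for a, b in zip(edges, edges[1:])]
print(min(areas), max(areas))
```

In an adjustment, the three parameters would enter as unknowns alongside the scaling factors of the ice mass pattern; the sketch only evaluates the model.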

#### *A-Priori Mass Change Pattern*

An a-priori SPI and NPI ice mass change pattern (Wi12) is derived based on published remote sensing results [2]. We use ice surface elevation rates dh/dt, determined by differencing ASTER digital elevation models (DEM) from 2012 (SPI) and 2011 (NPI), relative to a DEM derived from the SRTM in February 2000 [2]. The dh/dt are given as the average values for accumulation and ablation zones of 87 (SPI: 49; NPI: 38) glacier basins. We applied an ice density of 0.9 g cm<sup>−3</sup> to convert the dh/dt to mass change rates. Glacier boundary polygons from the GLIMS data base [68] were used together with equilibrium line altitudes [2,23] and an SRTM-derived DEM [69] in order to compute the barycenter coordinates for the 174 accumulation/ablation zones. The ice mass change rates are treated as point masses located in the barycenter of the corresponding accumulation/ablation zone (Figure 1b). This preliminary ice mass change rate pattern is then optimized by searching separate scaling factors for SPI and NPI that minimize the residual gravity RMS in our GRACE data adjustment.
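The conversion from elevation rates to point-mass rates, and the application of separate scaling factors for the two icefields, can be sketched as below. The zone areas and dh/dt values are invented for illustration, and the tuple layout and function names are hypothetical; only the ice density of 0.9 g cm<sup>−3</sup> (900 kg m<sup>−3</sup>) comes from the text.

```python
ICE_DENSITY = 900.0  # kg/m^3, i.e. 0.9 g/cm^3 as stated in the text


def zone_mass_rate_gt(dhdt_m_per_yr, area_km2, density=ICE_DENSITY):
    """Mass change rate (Gt/a) of one accumulation or ablation zone."""
    volume_rate_m3 = dhdt_m_per_yr * area_km2 * 1e6  # km^2 -> m^2
    return volume_rate_m3 * density / 1e12           # kg -> Gt


def scaled_total(zones, scale_spi, scale_npi):
    """Total rate after scaling SPI and NPI zones by separate factors.

    zones: list of (icefield, dhdt_m_per_yr, area_km2), icefield 'SPI' or 'NPI'.
    """
    scale = {"SPI": scale_spi, "NPI": scale_npi}
    return sum(scale[name] * zone_mass_rate_gt(dhdt, area)
               for name, dhdt, area in zones)


# Two invented zones per icefield (the a-priori pattern has 174 zones in total).
zones = [("SPI", -2.0, 500.0), ("SPI", 0.1, 800.0),
         ("NPI", -1.5, 300.0), ("NPI", 0.0, 400.0)]
print(scaled_total(zones, scale_spi=1.0, scale_npi=1.0))
```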

Our approach is largely insensitive to the choice of the a-priori mass pattern, because the upward continuation to orbit altitude acts as a low-pass filter in the space domain, damping the resolution and detail of the surface mass configuration. This is demonstrated by computations employing the alternative a-priori patterns Ja16, Br19, and Fo18 (Table A1). All three of these alternative a-priori patterns consist of only two point masses each, in which the ice mass change rates derived from the remote sensing results [26,28] or swath altimetry [4] are concentrated. Although these models represent only 73%, 75%, and 91%, respectively, of the Wi12 total ice mass change rate, their impact on the adjusted SPI and NPI ice-mass rate does not reach 1.5% (Table A1). However, the application of either of the alternative patterns reduces the residuals, despite their coarser resolution. The systematic decrease in the residual gravity RMS among the different a-priori patterns with a decreasing SPI/NPI ice mass rate ratio is interpreted as an indication that the Wi12 model underestimates the ice-mass loss at the NPI relative to that at the SPI, and motivates our optimization of the Wi12 pattern by two separate scaling factors for SPI and NPI.
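The low-pass effect of upward continuation can be made concrete: in a spherical harmonic expansion, the degree-*n* term of the potential decays as (R/r)<sup>n+1</sup> between the reference radius R and the radius r. The sketch below evaluates this factor; the altitude of 450 km is an assumed, representative value rather than one taken from the paper, and gravity vector components carry one additional power of R/r.

```python
def attenuation(degree, altitude_km, radius_km=6371.0):
    """Upward-continuation factor (R/r)^(n+1) for the degree-n potential term."""
    return (radius_km / (radius_km + altitude_km)) ** (degree + 1)


# Damping grows quickly with degree, i.e. with finer spatial detail.
for n in (2, 10, 30, 60, 120):
    print(n, attenuation(n, altitude_km=450.0))
```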


**Table A1.** A-priori ice mass change rate patterns used to adjust GRACE gravity rate vectors.

#### *Impact of GRACE Level-2 Data Quality Degradation on SPI and NPI Ice Mass Determination*

In order to assess the global temporal behavior of the GRACE data accuracy, we computed the residual equivalent water height anomalies for each spherical harmonic degree and order with respect to a model that includes linear, quadratic, and periodic terms. The standard deviation of these temporal residuals is interpreted as a measure of GRACE observation uncertainty. Figure A1 shows the degree amplitudes of these equivalent water height anomalies for the individual monthly Level-2 solutions. As expected, the maximum uncertainty is found at the highest degrees, while minimum standard deviations are found between degrees 10–30.
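Degree amplitudes condense a set of spherical harmonic coefficients into one value per degree by taking the root of the summed squares over all orders. The sketch below assumes the coefficients are already expressed in the desired unit (for example, equivalent water height) and uses a hypothetical dictionary layout; the conversion from Stokes coefficients to equivalent water height is not shown.

```python
import math


def degree_amplitudes(coeffs):
    """Root of summed squared coefficients per degree.

    coeffs: dict {(degree, order): (c_nm, s_nm)}.
    """
    power = {}
    for (n, _), (c, s) in coeffs.items():
        power[n] = power.get(n, 0.0) + c * c + s * s
    return {n: math.sqrt(p) for n, p in sorted(power.items())}


# Toy coefficients (illustrative values only).
toy = {(2, 0): (3.0, 0.0), (2, 1): (0.0, 4.0), (3, 0): (1.0, 0.0)}
print(degree_amplitudes(toy))
```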

Among these degree amplitude curves, two outliers are observed, corresponding to the months of February 2015 (two-day repeat orbit; red) and March 2017. In addition, the deterioration of the data accuracy after March 2016 becomes evident, with degree amplitudes above the median (among all months of the entire GRACE data period), and substantially increased energy on the low degrees (<10).

In order to investigate the sensitivity of the SPI and NPI ice mass time series to this GRACE uncertainty evolution, the SPI and NPI region function (as applied in the classical regional integration approach) is expanded in the spherical harmonic coefficients. Unlike AIS and the Greenland ice sheet, which concentrate much of their signal on low degrees (<10), the small SPI and NPI region requires a substantially broader band of degrees. Therefore, the uncertainty of those higher degrees is also more important. We derive degree amplitudes of the region functions for SPI and NPI, AIS, and the Greenland ice sheet, which are included in Figure A1. In general, only the portion of the error spectrum retained after filtering is integrated with the region function. In the present case, the filtering corresponds to the effect of using pseudo-observations at the orbit altitude. As the SPI and NPI region function distributes the energy over higher degrees, these are also less damped by the filtering (compared to AIS and Greenland). A comparison of the degree amplitudes of the observation uncertainty and the regional function reveals that the SPI and NPI region function has its strongest energy on the degrees for which the errors are smallest (degrees 10 to 30). Because of the small energy on the low degrees (<10), the accuracy degradation of the solutions after March 2016 has not had as strong an effect on the SPI and NPI ice mass estimation as, for example, on that for AIS or Greenland. Comparing the two outlier months reveals that February 2015 (repeat orbit) results in much larger residuals (green time series in Figure 4) than March 2017. This demonstrates that for SPI and NPI, the uncertainty on the higher degrees is much more important than the long-wavelength errors (unlike AIS or Greenland), but these higher degrees are not as severely affected by the accuracy degradation.

We conclude that the GRACE Level-2 data accuracy degradation affects mainly low degrees (<50), but the small SPI and NPI target is less sensitive to this low-degree degradation. Thus, the latest months of the GRACE mission also hold useful information about the Patagonian ice mass changes. In exploiting them, we took the data quality degradation into account by a cautious empirical downweighting. An exclusion of the solutions affected by the data quality degradation has little impact on our final result: our time series over the period of April 2002 through March 2016 yields an ice mass change rate for the SPI and NPI region of −23.9 Gt/a.

A strategy is now in place at the Jet Propulsion Laboratory to model each of the thrusting events on the satellite lacking accelerometer data [70]. We therefore anticipate improvements in the Level-2 harmonics upon the next release that contains such modeling, and it will be interesting to further evaluate our scheme used here for the analysis of the late months of the GRACE mission.

**Figure A1.** Degree amplitudes of equivalent water height anomalies for all of the solutions of the GRACE Level-2 data period (left axis). Light grey: monthly solutions from April 2002 through March 2016; orange: from April 2016 through June 2017; red: February 2015 (two-day repeat orbit); magenta: March 2017; dark grey: median among all monthly solutions. Dashed curves show degree amplitudes for filtered region functions (right axis) for SPI and NPI (black); AIS (blue) and Greenland ice sheet (green).

#### **References**


© 2019 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).

## *Article* **Downscaling GRACE TWSA Data into High-Resolution Groundwater Level Anomaly Using Machine Learning-Based Models in a Glacial Aquifer System**

**Wondwosen M. Seyoum <sup>1,</sup>\*, Dongjae Kwon <sup>1</sup> and Adam M. Milewski <sup>2</sup>**


Received: 4 February 2019; Accepted: 2 April 2019; Published: 5 April 2019

**Abstract:** With continued threat from climate change and human impacts, access to high-resolution and continuous hydrologic data is of paramount importance for predicting trends and availability of water resources. This study presents a novel machine learning (ML)-based downscaling algorithm that produces a high spatial resolution groundwater level anomaly (GWLA) from the Gravity Recovery and Climate Experiment (GRACE) data by utilizing the relationship between Terrestrial Water Storage Anomaly (TWSA) from GRACE and other land surface and hydro-climatic variables (e.g., vegetation coverage, land surface temperature, precipitation, streamflow, and in-situ groundwater level data). The predicted downscaled GWLA data were tested using monthly in-situ groundwater level observations. Of the 32 groundwater monitoring wells available in the study site, 21 wells were used to develop the ML-based downscaling model, while the remaining 11 wells were used to assess the performance of the ML-based downscaling model. The test results showed that the model satisfactorily reproduces the spatial and temporal variation of the GWLA in the area, with acceptable correlation coefficient and Nash-Sutcliffe Efficiency values of ~0.76 and ~0.45, respectively. GRACE TWSA was the most influential predictor variable in the models, followed by stream discharge and soil moisture storage. Though model limitations and uncertainty could exist due to high spatial heterogeneity of the geologic materials and omission of human impact (e.g., abstraction), the significance of the result is undeniable, particularly in areas where in-situ well measurements are sparse.

**Keywords:** GRACE TWSA; groundwater level anomaly; downscaling; machine learning; boosted regression trees; glacial sediment

#### **1. Introduction**

With continued threat from climate change and human impacts, high-resolution and continuous hydrologic data availability is crucial for predicting trends and water resource availability. Given an increase in global population and water demand, measuring water storage and trends is valuable for water resources management, hazard analysis and mitigation, and food security [1–4]. These issues are exacerbated by the lack of hydrologic data, which leads to increased uncertainty in estimated water resource variables (e.g., river discharge, groundwater recharge) and hampers accurate predictions of water storage and trends [5–7]. Unfortunately, global in-situ data availability is poor due to malfunctioning of existing gauges, economic constraints, and the lack of data sharing among nations [8–11]. Conversely, the capability and demand of satellite applications in hydrology and water resources has grown in the past decades. Particularly, the availability of Terrestrial Water Storage Anomaly (TWSA) data from the Gravity Recovery and Climate Experiment (GRACE) satellite since 2002 is revolutionizing the way we measure Earth's water storage variations.

GRACE has twin satellites that measure gravity. Through a series of processing steps, the gravity anomaly measured by the GRACE satellites is converted into monthly TWSA, with a spatial resolution of several hundred kilometers. GRACE has provided unprecedented opportunities for the science community despite the limitations imposed by its coarse spatial scale (e.g., 200,000 km<sup>2</sup>). It has been used to understand regional and global trends in water storage changes [12–15], assess hydrologic extremes [16–18], and investigate water deficit, drought management [19–21], and surface reservoirs and lakes [22]. Combined with other satellite data (e.g., water level from satellite altimetry, vegetation from Moderate Resolution Imaging Spectroradiometer (MODIS)), GRACE has been used to understand the water dynamics of several river basins [23,24].

Given the spatial resolution challenges of GRACE, researchers have attempted to downscale the data to local or sub-regional scales using techniques such as data assimilation [25] and empirical methods [26–28]. Sun [26] used GRACE and Artificial Neural Networks (ANN) to predict monthly and seasonal water level changes for wells located in the US. This approach can be used to predict temporal changes in groundwater; however, it has limited skill in predicting groundwater change in space. Seyoum and Milewski [27] developed a robust approach using an ANN model that downscales TWSA from the GRACE scale (~200,000 km<sup>2</sup>) to a watershed scale (watershed areas range between 5000 and 20,000 km<sup>2</sup>) in the Northern High Plains Aquifer. They predicted spatial TWSA from GRACE and other hydro-climate variables (e.g., land surface temperature, precipitation, soil moisture, etc.); however, extracting high-resolution groundwater storage information requires further disaggregation of the downscaled product. As a result, the uncertainty of the derived product (e.g., groundwater storage anomaly) is amplified. In addition, the ability of the ANN downscaling model to capture the spatial variability within the watershed is limited. In both the Seyoum and Milewski [27] and Miro and Famiglietti [28] methods, because the models are empirical, the replicability of the results in other regions may be limited. Unlike physical models, the models in both of the above approaches do not capture information about the spatial variation of groundwater, which results in lower prediction values of water storage anomaly.

The objective of this study was to test the efficacy of downscaling groundwater level anomaly from GRACE TWSA and hydro-climate and land surface variables in a glacial system, in order to further improve the approach by Seyoum and Milewski [27], as well as to minimize existing limitations of downscaling methods (e.g., method replicability). This work presents a novel downscaling approach by using multiple variables and a two-stage Machine Learning (ML) method. A two-stage Boosted Regression Tree (BRT) ML method was used to develop the downscaling model. BRT is less sensitive to large magnitude differences between input variables, is flexible in working with datasets with missing data (e.g., GRACE), and is suitable for developing multi-stage models and ensembles to create one representative model.
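For readers unfamiliar with boosted regression trees, the core idea is to add many small trees, each fitted to the residuals of the current prediction and damped by a learning rate. The toy sketch below uses depth-one trees (stumps) and squared-error loss in plain Python. It is a didactic simplification under these assumptions, not the software or configuration used in this study, and the predictor values are invented.

```python
def fit_stump(X, residuals):
    """Find the single-feature threshold split that minimizes squared error."""
    best = None
    for j in range(len(X[0])):
        values = sorted(set(row[j] for row in X))
        for lo, hi in zip(values, values[1:]):
            thr = 0.5 * (lo + hi)
            left = [r for row, r in zip(X, residuals) if row[j] <= thr]
            right = [r for row, r in zip(X, residuals) if row[j] > thr]
            mean_l, mean_r = sum(left) / len(left), sum(right) / len(right)
            sse = (sum((r - mean_l) ** 2 for r in left)
                   + sum((r - mean_r) ** 2 for r in right))
            if best is None or sse < best[0]:
                best = (sse, j, thr, mean_l, mean_r)
    if best is None:
        raise ValueError("all predictors are constant; no split possible")
    return best[1:]


def boost(X, y, n_trees=100, learning_rate=0.1):
    """Fit an additive model of stumps to y by repeatedly fitting residuals."""
    base = sum(y) / len(y)
    pred = [base] * len(y)
    stumps = []
    for _ in range(n_trees):
        residuals = [yi - pi for yi, pi in zip(y, pred)]
        j, thr, mean_l, mean_r = fit_stump(X, residuals)
        stumps.append((j, thr, mean_l, mean_r))
        pred = [pi + learning_rate * (mean_l if row[j] <= thr else mean_r)
                for row, pi in zip(X, pred)]
    return base, learning_rate, stumps


def predict(model, row):
    """Evaluate the boosted model for one row of predictor values."""
    base, learning_rate, stumps = model
    return base + learning_rate * sum(
        mean_l if row[j] <= thr else mean_r for j, thr, mean_l, mean_r in stumps)


# Invented predictors (e.g. a storage anomaly and a constant covariate).
X = [[0.0, 5.0], [1.0, 5.0], [2.0, 5.0], [3.0, 5.0]]
y = [0.0, 0.0, 1.0, 1.0]
model = boost(X, y)
print(predict(model, [0.5, 5.0]), predict(model, [2.5, 5.0]))
```

Because each stump splits on one variable at a time, the method does not depend on the relative magnitudes of the predictors, which is consistent with the insensitivity to scale differences noted above.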

In order to incorporate seasonal variability in both the training and testing sets, the entire dataset, which ranges from 2002 to 2016, was split into two subsets: training (2002 to 2014) and testing (2015 to 2016). This represents an improvement to the previous ANN approach by [27], where the data for training and testing were randomly selected from the entire dataset. Lastly, variable importance and comparison analysis were conducted to understand groundwater variation in the glacial system. Furthermore, a spatial randomization of the GRACE pixels was tested to check the effect of GRACE grid size on the overall accuracy. The results will improve the use of coarse resolution GRACE data in small-scale studies (e.g., groundwater trend analysis in localized aquifer systems) and facilitate integration of GRACE data into local water resources management applications. This is especially important in data-scarce regions where in-situ monitoring of the groundwater system is problematic [11]. Further, promising results have been reported involving assessment of groundwater resources and depletion using global scale hydrologic models and data assimilation [29] such as using the PCR–GLOBWB model [30] and the WaterGAP model [31], and regional scale groundwater models using MODFLOW [32] in data-scarce regions. Integration of the proposed study in such large-scale models, for example, to improve model performances is vital. In addition, it can be used to improve model performances in data assimilation techniques. GRACE TWSA data have proven to be good supplements and calibration data for the models, but their use has been restricted due to coarse spatial resolution [33].
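The temporal split described above (training on 2002 to 2014, testing on 2015 to 2016) keeps whole years together, so every calendar month appears in both subsets. A minimal sketch, with a hypothetical record layout of (year, month, value) and an assumed start in April 2002:

```python
def split_by_year(records, last_training_year=2014):
    """Split (year, month, value) records into training and testing subsets."""
    train = [r for r in records if r[0] <= last_training_year]
    test = [r for r in records if r[0] > last_training_year]
    return train, test


# Monthly placeholder records from April 2002 through December 2016.
records = [(y, m, 0.0) for y in range(2002, 2017) for m in range(1, 13)
           if (y, m) >= (2002, 4)]
train, test = split_by_year(records)
print(len(train), len(test))
```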

#### **2. Methods and Data**

This study presents a machine learning approach to map high-resolution groundwater level anomaly (GWLA) from GRACE TWSA and other land surface and hydro-climate data (e.g., vegetation coverage, land surface temperature, precipitation, streamflow). A two-stage ML model, consisting of (1) individual ML models for each groundwater level measurement station and (2) an ensemble model that combines the multiple ML models representing each well, was designed to predict GWLA at a finer resolution than GRACE's current resolution. Various publicly available satellite and model-based datasets were used to set up the downscaling model and were tested in time (2002–2016) and space (at 11 different locations) using in-situ groundwater level measurement data, which were excluded from model training. The importance of the predictor variables, their interactions, and their influence on the predictand were analyzed in this study. In addition, GRACE TWSA averaged over the entire study area, TWSA from each 1° × 1° grid, and TWSA averaged over a 3° × 3° moving window were used, and spatial randomization of the grids was implemented to test the effect of scale of GRACE data.
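On a 1° × 1° grid, a 3° × 3° moving-window average corresponds to the mean over each cell and its eight neighbours. The sketch below uses an unweighted mean, shrinks the window at the grid edges, and skips missing cells marked as `None`; these choices, as well as the omission of latitude-dependent area weighting, are assumptions made for illustration.

```python
def moving_window_mean(grid, half_width=1):
    """Unweighted mean over a (2*half_width+1)^2 window around each cell."""
    rows, cols = len(grid), len(grid[0])
    out = [[None] * cols for _ in range(rows)]
    for i in range(rows):
        for j in range(cols):
            vals = [grid[a][b]
                    for a in range(max(0, i - half_width), min(rows, i + half_width + 1))
                    for b in range(max(0, j - half_width), min(cols, j + half_width + 1))
                    if grid[a][b] is not None]
            out[i][j] = sum(vals) / len(vals) if vals else None
    return out


# Toy TWSA grid (values are placeholders, not GRACE data).
twsa = [[1.0, 2.0, 3.0],
        [4.0, 5.0, 6.0],
        [7.0, 8.0, 9.0]]
smoothed = moving_window_mean(twsa)
print(smoothed[1][1], smoothed[0][0])
```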

Previous research has demonstrated that storage in the vadose zone contributes to the total GRACE signal in some areas [34]; however, due to the poor hydraulic characteristics of the glacial till, we assumed that the TWSA signal from GRACE is primarily dominated by groundwater. Therefore, the GRACE TWSA was directly mapped to groundwater storage anomaly. This minimizes further disaggregation of GRACE TWSA and reduces accumulation of errors associated with processing during the disaggregation of GRACE signal. Further, the downscaling approach was developed only considering the glacial aquifer system found covering the bedrocks in the study area, thus excluding the bedrock aquifers. This is because the bedrock aquifers are deep confined aquifer systems in most parts of the study area with little or no modern recharge, specifically in the central and southern parts [35]. Therefore, the deep bedrock aquifers have less influence on the total water storage change from GRACE compared to the surficial glacial sediments.

#### *2.1. Study Area*

The study area is located in the state of Illinois, a part of the Midwestern U.S., and covers an area of ~150,000 km<sup>2</sup> (Figure 1). The surficial geology is mostly composed of glacial drift sediments, while the land use is primarily rainfed agricultural row crops (corn and soybeans). The major soil type in Illinois has a silty loam texture with a porosity of around 50%. Topographically, Illinois is characterized by mainly flat land with alternating ridges and plains (end moraines and till plains) in some areas that formed as a result of the retreating Wisconsin Episode glaciers [36]. A continental climate with cold winters and warm summers characterizes the region. The long-term average annual precipitation of Illinois is ~960 mm, whereas the long-term mean, minimum, and maximum annual temperatures of Illinois are 11 ◦C, 5 ◦C, and 17 ◦C, respectively [37].

**Figure 1.** Location map of the study area and location of stream gauging stations and shallow groundwater wells (green: training wells, and red: testing wells) used in the study.

#### *2.2. Data Source and Processing*

Ten predictor variables (e.g., TWSA, precipitation, vegetation cover) and one predictand variable (well-based groundwater level anomaly) were used to train and test the multi-stage ML models. The variables, data sources, and processing steps are summarized in Figure 2, with details provided in the subsections below. All the data sources, which have variable spatial and temporal resolutions, were converted into uniform temporal (monthly) and spatial (0.25◦ × 0.25◦) resolutions. After this resampling, a distance-weighted average scheme, based on the location of the groundwater level measurement stations, was implemented to extract time series input data from the gridded raster datasets (Figure 3). If a given groundwater level measurement station was within the threshold distance of 0.1◦ from a grid cell node (Figure 3a), the value of that grid cell node was used as input data to the model. However, if a groundwater measurement station was located outside of the threshold distance from the grid cell node, a distance-weighted average of the nearest four grid cell nodes surrounding the station was used (Figure 3b).
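The extraction rule described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function name `extract_at_station`, the grid layout (row index increasing with latitude, column index with longitude), the use of inverse-distance weights, and the plain Euclidean distance in degrees are all assumptions made here for illustration, since the text does not specify the weighting function.

```python
import math

def extract_at_station(grid, lat0, lon0, step, lat, lon, threshold=0.1):
    """Extract a value for a station from a regular grid of node values.

    grid[i][j] holds the value at node (lat0 + i*step, lon0 + j*step).
    Assumption: inverse-distance weights, distances measured in degrees.
    """
    i0 = math.floor((lat - lat0) / step)
    j0 = math.floor((lon - lon0) / step)
    if not (0 <= i0 < len(grid) - 1 and 0 <= j0 < len(grid[0]) - 1):
        raise ValueError("station must lie inside the grid")

    # Distance from the station to each of the four surrounding nodes.
    candidates = []
    for i in (i0, i0 + 1):
        for j in (j0, j0 + 1):
            d = math.hypot(lat - (lat0 + i * step), lon - (lon0 + j * step))
            candidates.append((d, grid[i][j]))

    # Case (a): the nearest node is within the threshold, so use it directly.
    d_min, v_min = min(candidates)
    if d_min <= threshold:
        return v_min

    # Case (b): distance-weighted average of the four surrounding nodes.
    weights = [1.0 / d for d, _ in candidates]
    return sum(w * v for w, (_, v) in zip(weights, candidates)) / sum(weights)
```

For a station at the center of a 0.25◦ cell, all four nodes are about 0.18◦ away, so case (b) applies and the result is the plain average of the four node values.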


**Figure 2.** Summary of variables, data type and sources, and processing applied on the dataset used in the first and second stages of the downscaling model.

**Figure 3.** Distance-weighted average scheme showing how input data were extracted from grid cell nodes (symbol: red dot) based on the location of groundwater measurement station (symbol: green star) and a distance threshold value of 0.1◦ (**a**) when a groundwater measurement station is within the threshold distance from grid cell node and (**b**) a groundwater measurement station is located outside of the threshold distance from the grid cell node.

#### 2.2.1. Terrestrial Water Storage Anomaly (TWSA)

The GRACE mission, launched in 2002, was designed to track changes in the Earth's gravity field using two identical satellites separated by 220 km at a 500 km orbital altitude [38,39]. GRACE detects changes in the gravity anomaly using a high-precision onboard K-band ranging system capable of measuring changes in the inter-satellite distance due to gravity variations (or mass changes) with an accuracy of 1 micron [39]. After atmospheric and oceanic effects are accounted for, the remaining signal at a monthly timescale on land is mostly related to variations of terrestrial water storage [40]. An ensemble mean of the latest release (RL-05 gridded; 1◦ × 1◦) of the level-3 GRACE product from three processing centers, the Center for Space Research at the University of Texas, Austin (CSR), the Jet Propulsion Laboratory (JPL), and the GeoForschungsZentrum Potsdam (GFZ), was used to ensure the highest level of accuracy [27,40]. To restore the signal attenuated by filtering and truncation of the GRACE TWSA observations during processing [40], the GRACE TWSA was multiplied by the scaling factor supplied with the GRACE data. The study time span is restricted by GRACE data availability, which ranges from April 2004 to July 2016. GRACE land data are available at http://grace.jpl.nasa.gov, supported by the NASA MEaSUREs Program [40,41].
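As a simple illustration of the ensemble averaging and signal restoration described above, the following Python sketch combines three monthly TWSA series for one grid cell and applies a scaling factor. The function name `ensemble_twsa` and the argument layout are hypothetical, and the sketch assumes that the three series are already aligned in time and that a single scaling factor applies to the grid cell.

```python
def ensemble_twsa(csr, jpl, gfz, scale):
    """Scaled ensemble-mean TWSA for one grid cell.

    csr, jpl, gfz: monthly TWSA series from the three processing centers,
    assumed to be aligned month by month and in the same units.
    scale: scaling (gain) factor supplied with the gridded product.
    """
    if not (len(csr) == len(jpl) == len(gfz)):
        raise ValueError("the three series must have the same length")
    return [scale * (a + b + c) / 3.0 for a, b, c in zip(csr, jpl, gfz)]
```

Because both operations are linear, applying the scaling factor before or after averaging gives the same result.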

#### 2.2.2. Precipitation (P)

Precipitation data used in the model were obtained from the Tropical Rainfall Measuring Mission (TRMM) satellite. The TRMM Multisatellite Precipitation Analysis (TMPA) 3B43 product, which provides monthly global (between 50 ◦N and 50 ◦S) precipitation intensity at 0.25◦ × 0.25◦ (~27.8 km) spatial resolution [42], was used in the study. The TRMM 3B43 product is available on the web from the NASA Giovanni (Geospatial Interactive Online Visualization and Analysis Infrastructure) service (URL: https://giovanni.gsfc.nasa.gov/giovanni/; access date: 26 September 2017).

#### 2.2.3. Stream Flow (Q)

Stream flow data used in the model were collected from the U.S. Geological Survey (USGS) streamflow database, the National Water Information System (NWIS) [43]. The USGS dataRetrieval R package was used for retrieving daily stream flow data from the study area [44,45]. The daily flow rate (in cubic meters per second, cms) for each station was converted into monthly discharge per unit area, with simplified units of mm per month, by considering the contributing drainage area of each gaging station. Further, these point values of discharge per unit area were linearly interpolated and resampled to obtain gridded raster discharge data at the appropriate scale. The locations of the gages used to obtain stream flow data are shown in Figure 1.
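The unit conversion described above amounts to dividing the monthly flow volume by the contributing drainage area. A minimal Python sketch is given below; the function name `monthly_runoff_depth_mm` is hypothetical, and the sketch assumes one flow value per day of the month with no missing days.

```python
SECONDS_PER_DAY = 86400.0

def monthly_runoff_depth_mm(daily_flow_cms, drainage_area_km2):
    """Convert daily mean flows (m^3/s) for one month into a depth in mm/month.

    Assumes one value per day and no gaps in the daily record.
    """
    if drainage_area_km2 <= 0:
        raise ValueError("drainage area must be positive")
    volume_m3 = sum(q * SECONDS_PER_DAY for q in daily_flow_cms)  # monthly volume
    area_m2 = drainage_area_km2 * 1.0e6                           # km^2 to m^2
    return volume_m3 / area_m2 * 1000.0                           # m to mm
```

For example, a constant flow of 10 m<sup>3</sup>/s over 30 days from a 2592 km<sup>2</sup> drainage area corresponds to 10 mm per month.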

#### 2.2.4. Vegetation Coverage (VEC)

As vegetation plays a role in terrestrial water storage, the Moderate Resolution Imaging Spectroradiometer (MODIS) MOD13A3 (MODIS/Terra Vegetation Indices Monthly L3 Global 1 km SIN Grid V006) product, which provides a monthly normalized difference vegetation index (NDVI) at a 1 km spatial resolution, was used in the study [46]. Four scenes (h10v04, h10v05, h11v04, and h11v05) of the MOD13A3 product covered the study area and were downloaded from the NASA Earth Data Search website (URL: https://search.earthdata.nasa.gov/; access date: 28 August 2017). The mosaicked NDVI images were converted into percent vegetation coverage [%] using a threshold greenness index value of 0.7 [27]: percent greenness (NDVI ≥ 0.7) was calculated for each 0.25◦ × 0.25◦ grid cell.
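One plausible reading of the percent greenness calculation is the share of 1 km NDVI pixels within each 0.25◦ × 0.25◦ cell that meet the threshold. The Python sketch below follows this reading; it is an assumption made for illustration, as are the function name `percent_green` and the use of `None` to mark missing pixels.

```python
def percent_green(ndvi_pixels, threshold=0.7):
    """Percent of valid NDVI pixels in one grid cell with NDVI >= threshold.

    ndvi_pixels: NDVI values of the 1 km pixels falling inside the cell;
    None marks a missing pixel (an assumption of this sketch).
    Returns None when the cell has no valid pixels.
    """
    valid = [v for v in ndvi_pixels if v is not None]
    if not valid:
        return None
    green = sum(1 for v in valid if v >= threshold)
    return 100.0 * green / len(valid)
```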

#### 2.2.5. Land Surface Temperature (LST)

Similarly, the MODIS MOD11C3 (MODIS/Terra Land Surface Temperature and Emissivity Monthly L3 Global 0.05Deg CMG) product was used to obtain monthly land surface temperatures. It has a 5.6 km spatial resolution (0.05◦) [47]. The earlier product versions (V4 and V41) of these data were used instead of the latest version (V5), because the algorithm used in V5 underestimates LST under heavy aerosol conditions [48,49]. The night and day LST layers of each monthly MOD11C3 product were averaged into a single image to represent the mean temperature of a month. The MOD11C3 product is available on the NASA Earth Data Search website (URL: https://search.earthdata.nasa.gov/; access date: 31 August 2017).

#### 2.2.6. Snow Water Equivalent (SWE), Soil Moisture (SM), and Plant Canopy Water (CW)

Data for Snow Water Equivalent, Soil Moisture, and Plant Canopy Water were obtained from the Noah land surface model (LSM) based on the North American Land Data Assimilation System (NLDAS) [50]. The NLDAS-2 monthly climatology Noah model provides information at a monthly time scale with a spatial resolution of 0.125◦ × 0.125◦ for the conterminous US. The products were downloaded from the NASA Giovanni (Geospatial Interactive Online Visualization and Analysis Infrastructure) service (URL: https://giovanni.gsfc.nasa.gov/giovanni/; access date: 6 October 2017).

#### 2.2.7. Drift Thickness (H) and Aquifer Characteristics (K)

Since aquifer heterogeneity is significant in the glacial deposit, integrating subsurface information about the aquifer material in the ML model is crucial. Gridded hydrogeologic properties for the glaciated United States were developed by Bayless, et al. [51] from water-well drillers' records. From this product, hydrogeologic information, such as hydraulic conductivity and thickness of the glacial deposits, was extracted and used in the model. The data can be obtained from the USGS ScienceBase website (URL: https://www.sciencebase.gov/catalog/item/58756c7ee4b0a829a3276352; access date: 8 August 2017). A sample gridded map showing all the input variables used in the methodology is shown in Figure 4.

#### 2.2.8. Groundwater Level Data (GWL)

In-situ groundwater level data were used as a target variable (predictand) in the ML model. Continuous groundwater level measurements are available from 32 stations in the study area and were collected from the Illinois State Water Survey [52]. Daily data were averaged into monthly groundwater level data. Furthermore, to mimic GRACE's long-term anomaly data, the Groundwater Level Anomaly (GWLA) was calculated by subtracting the long-term average (according to GRACE's long-term mean, 2004–2009) from individual GWL observations. The Shallow Groundwater Wells Network data are available on the Illinois State Water Survey website (URL: https://www.isws.illinois.edu/warm/groundwater/; access date: 21 June 2017).
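The anomaly calculation can be sketched in a few lines of Python. The function names `monthly_means` and `anomalies` and the tuple-based input format are hypothetical; the sketch also assumes that the baseline is the mean of all monthly values falling within 2004–2009, which is one reasonable reading of the procedure described above.

```python
from collections import defaultdict

def monthly_means(daily):
    """Average daily levels into monthly values.

    daily: iterable of (year, month, day, level) tuples.
    Returns {(year, month): mean level}, ordered by date.
    """
    sums = defaultdict(lambda: [0.0, 0])
    for year, month, _day, level in daily:
        sums[(year, month)][0] += level
        sums[(year, month)][1] += 1
    return {key: s / n for key, (s, n) in sorted(sums.items())}

def anomalies(monthly, base_start=2004, base_end=2009):
    """Subtract the baseline-period mean from every monthly value."""
    base = [v for (year, _m), v in monthly.items() if base_start <= year <= base_end]
    if not base:
        raise ValueError("no observations inside the baseline period")
    baseline = sum(base) / len(base)
    return {key: v - baseline for key, v in monthly.items()}
```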

**Figure 4.** Maps showing sample input variables used in the methodology for the month of January 2016. All the data are resampled to 0.25◦ × 0.25◦ grid resolution, except the Gravity Recovery and Climate Experiment (GRACE) Terrestrial Water Storage Anomaly (TWSA) data, which are on a 1◦ × 1◦ grid.

#### *2.3. Modeling Algorithms*

Downscaling GRACE to predict a high-resolution groundwater level anomaly and determining variable importance were accomplished using a Boosted Regression Tree (BRT), also known as a gradient boosting technique. BRT, which combines statistical and machine learning techniques, aims to improve on the performance and prediction accuracy of a single tree-based model by fitting and combining a large number of tree-based models [53,54]. First, a sequence of several simple trees (weak models) is fitted to the data, and an ensemble (boosting) technique is then applied to produce a robust model from these models [53]. A tree-based method is conceptually simple yet powerful [55] and has advantages over other predictive learning techniques: it is easy to interpret, handles missing values, is robust to outliers, does not need prior data transformation, and can deal with irrelevant inputs. For further details, refer to Friedman, Hastie and Tibshirani [55] and Elith, Leathwick and Hastie [53]. In addition, BRT works well with both continuous and categorical data. The simplified theory of the boosted regression method is briefly explained below.

Figure 5 shows how a tree-based regression divides a covariate space by partitioning the predictor variables [56] and creates a simple tree-based model. The data space describes a relationship between the predictors (X1 and X2) and the predictand (Y) such that the predictand value can be approximated over many rectangular subspaces (Figure 5a) according to the values of X1 and X2 and the splits that form the tree (Figure 5b). The method builds binary trees that partition the data into two samples at each split node. The splitting variable and its split value are selected to best fit the data. In the example in Figure 5b, the covariate space is first divided at X1 = 8, because this split (X1 ≤ 8 and X1 > 8) represents the covariate space most effectively in terms of residual error. Subsequently, the second split points are selected in both the X1 ≤ 8 and X1 > 8 regions. The number of split nodes determines the tree size; tree growth is controlled either by a predetermined tree size or by residual error.
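To make the split selection concrete, the following Python sketch searches all predictors and candidate thresholds for the single split that minimizes the residual sum of squares. It is a didactic sketch rather than the algorithm used in this study: the function names are hypothetical, candidate thresholds are taken as midpoints between consecutive distinct predictor values, and the toy data are invented so that the best first split falls at X1 = 8, as in the example above.

```python
def sum_squared_error(values):
    """Sum of squared deviations from the mean (0.0 for an empty list)."""
    if not values:
        return 0.0
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values)

def best_split(X, y):
    """Find the split (feature index, threshold) with the lowest residual error.

    X: list of rows (one list of predictor values per sample); y: targets.
    Returns (feature_index, threshold, error), or None if no split is possible.
    """
    best = None
    for f in range(len(X[0])):
        levels = sorted(set(row[f] for row in X))
        # Candidate thresholds: midpoints between consecutive distinct values.
        for lo, hi in zip(levels, levels[1:]):
            t = (lo + hi) / 2.0
            left = [yi for row, yi in zip(X, y) if row[f] <= t]
            right = [yi for row, yi in zip(X, y) if row[f] > t]
            err = sum_squared_error(left) + sum_squared_error(right)
            if best is None or err < best[2]:
                best = (f, t, err)
    return best

# Invented toy data: Y jumps between X1 = 6 and X1 = 10, X2 is uninformative.
X_toy = [[2, 1], [4, 5], [6, 2], [10, 4], [12, 1], [14, 5]]
y_toy = [1.0, 1.0, 1.0, 9.0, 9.0, 9.0]
first_split = best_split(X_toy, y_toy)  # feature 0 (X1) at threshold 8.0
```

Growing a full tree repeats the same search within each of the two resulting regions until the size or error criterion is met.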

**Figure 5.** A tree-based regression showing (**a**) a rectangular covariate space created based on predictors X1 and X2 and the predictand (target) Y variable, and (**b**) split nodes of the decision tree partitioned based on predictor variables and residual error.

During boosting, a series of simple trees is built, each based on the prediction residuals of the preceding tree. Figure 6 shows a schematic of successive regression trees based on residuals. As shown in Figure 6, Decision Tree (2) is a tree (partition) fitted to the residuals from Decision Tree (1), while Decision Tree (3) is a tree (partition) fitted to the residuals from Decision Tree (2). This helps to find a partition (Decision Tree (M)) that further reduces the residual (error) variance for the data (as shown in the bottom graph of Figure 6). As a result, boosting decision trees improves their accuracy, often dramatically [55]. In boosting, decision trees are fitted iteratively to the training data, and the algorithms vary in how they quantify lack of fit and select settings for the next iteration [53]. In this study, the LS Boost (least-squares regression) algorithm was used for BRT modeling (see Friedman [57]). Overall model design and simulation were conducted using MATLAB® 2017b. The hyperparameters (number of learners, maximum number of splits, and learning rate; see Elith, Leathwick and Hastie [53] for details) were optimized using 10-fold cross-validation.
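The residual-fitting loop can be illustrated with a compact Python sketch of least-squares boosting. This is not the MATLAB implementation used in the study: each weak learner here is a one-split tree (a stump), and the function names and default values for `n_learners` and `learning_rate` are assumptions chosen for illustration.

```python
def fit_stump(X, residuals):
    """Fit a one-split regression tree to the current residuals."""
    best = None
    for f in range(len(X[0])):
        levels = sorted(set(row[f] for row in X))
        for lo, hi in zip(levels, levels[1:]):
            t = (lo + hi) / 2.0
            left = [r for row, r in zip(X, residuals) if row[f] <= t]
            right = [r for row, r in zip(X, residuals) if row[f] > t]
            mean_l, mean_r = sum(left) / len(left), sum(right) / len(right)
            err = (sum((r - mean_l) ** 2 for r in left)
                   + sum((r - mean_r) ** 2 for r in right))
            if best is None or err < best[0]:
                best = (err, f, t, mean_l, mean_r)
    if best is None:
        raise ValueError("predictors must contain at least two distinct values")
    return best[1:]  # (feature, threshold, left value, right value)

def stump_predict(stump, row):
    f, t, mean_l, mean_r = stump
    return mean_l if row[f] <= t else mean_r

def ls_boost(X, y, n_learners=50, learning_rate=0.1):
    """Least-squares boosting: each stump is fitted to the previous residuals."""
    f0 = sum(y) / len(y)              # initial prediction: mean of the target
    pred = [f0] * len(y)
    stumps = []
    for _ in range(n_learners):
        residuals = [yi - pi for yi, pi in zip(y, pred)]
        stump = fit_stump(X, residuals)
        stumps.append(stump)
        pred = [pi + learning_rate * stump_predict(stump, row)
                for pi, row in zip(pred, X)]
    return f0, learning_rate, stumps

def boost_predict(model, row):
    f0, learning_rate, stumps = model
    return f0 + learning_rate * sum(stump_predict(s, row) for s in stumps)
```

On the training data, adding learners reduces the residual variance step by step, which is the behavior sketched in the bottom graph of Figure 6; the learning rate controls how much of each new tree is added.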

**Figure 6.** A schematic diagram showing the boosting processes in the Boosted Regression Tree (BRT) method.

#### *2.4. Downscaling Model Design*

A two-stage ML approach was used in this study to generate a downscaled GWLA grid from GRACE TWSA. The first stage involved building multiple temporal GWLA prediction ML models (M1, M2, ..., M21) for individual groundwater wells in the study area (top left section of Figure 7). The second stage consisted of integrating these multiple models (from the previous stage), along with additional spatial information about the glacial deposit (e.g., hydraulic conductivity), into a single spatial downscaling model. In the second stage, a single spatial GWLA prediction model (M') was produced for the entire study area (bottom left of Figure 7) and used to generate a downscaled GWLA at 0.25◦ × 0.25◦ resolution (the right section of Figure 7). As shown in the figure, the GWLA at a given month can be predicted (1) using GWLA predicted based on individual temporal models of training wells and water-budget variables (X) and (2) using two-dimensional GWLA pattern generation based on intermediate variables (Y') from the first-stage models and spatial characteristics (Z) for each grid cell.

The GWLA prediction models in the first stage use, in addition to GRACE TWSA, terrestrial water cycle variables (e.g., vegetation coverage, LST, P, Q, SWE, plant canopy water, and soil moisture) as predictors and in-situ-based GWLA as the predictand (the specific variables used in each stage are shown in Figure 2). In the second stage, by contrast, the predictor variables are restricted to GRACE TWSA and aquifer characteristics (e.g., glacial thickness, hydraulic conductivity). Consequently, the spatial pattern of GWLA can be explained by glacial deposit characteristics in the second stage model.

The model was tested using data independent of training (calibration). Eleven groundwater well stations, excluded from the training sets, were used to test the performance of the spatial downscaling model (M'). The strength of the relationship between individual predictor variables and the predictand was analyzed using variable importance (VI). The measure of variable importance is based on how many times the variable is selected for splitting and how much the model is improved as a result of the splitting (see Friedman and Meulman [58] for details). Relative VI is calculated for each input variable and converted to a 0 to 100 scale. In addition, the sensitivity of each predictor variable was assessed using partial dependence plots that show the dependence between the predictand and a given predictor variable, while averaging out the influence of all other predictors [58].
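The partial dependence calculation can be sketched in Python as follows. The function name `partial_dependence` and the generic `predict` callable are hypothetical stand-ins for a fitted model; the sketch uses the common empirical estimate in which the predictor of interest is fixed at each grid value and the model predictions are averaged over all samples.

```python
def partial_dependence(predict, X, feature, grid):
    """Empirical partial dependence of a model on one predictor.

    predict: callable mapping one row of predictor values to a prediction
             (stands in for any fitted model).
    X: list of rows from the training data.
    feature: index of the predictor of interest.
    grid: values of that predictor at which to evaluate the dependence.
    """
    curve = []
    for value in grid:
        total = 0.0
        for row in X:
            modified = list(row)       # copy so the input data stay unchanged
            modified[feature] = value  # fix the predictor of interest
            total += predict(modified)
        curve.append(total / len(X))   # average over all other predictors
    return curve
```

Plotting the returned values against `grid` gives a curve of the kind shown in partial dependence plots.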

**Figure 7.** The conceptual design of the downscaling model. The process begins at the top left side of the figure (represents models in the first stage), followed by the bottom left (the spatial downscaling ensemble model in the second stage), and the right side explains how the prediction of the downscaled and gridded GWLA product is produced.

During training and testing, error metrics such as Mean (residual) Error (ME), Mean Absolute Error (MAE), Root-Mean-Square Error (RMSE), correlation coefficient (R), and Nash-Sutcliffe Efficiency (NSE) were used to evaluate the performance of the models by comparing model-simulated results with the observed GWLA.
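For reference, the five metrics can be computed as in the Python sketch below, using their standard definitions. The function name `performance_metrics` is hypothetical, and the sign convention for ME (simulated minus observed) is an assumption, since the text does not state it.

```python
import math

def performance_metrics(obs, sim):
    """Return ME, MAE, RMSE, R, and NSE for paired observed/simulated values."""
    n = len(obs)
    if n == 0 or n != len(sim):
        raise ValueError("obs and sim must be non-empty and of equal length")
    resid = [s - o for o, s in zip(obs, sim)]  # assumed sign: simulated - observed
    me = sum(resid) / n
    mae = sum(abs(r) for r in resid) / n
    sse = sum(r * r for r in resid)
    rmse = math.sqrt(sse / n)
    mean_o, mean_s = sum(obs) / n, sum(sim) / n
    cov = sum((o - mean_o) * (s - mean_s) for o, s in zip(obs, sim))
    var_o = sum((o - mean_o) ** 2 for o in obs)
    var_s = sum((s - mean_s) ** 2 for s in sim)
    # Pearson correlation; undefined (nan) if either series is constant.
    r = cov / math.sqrt(var_o * var_s) if var_o > 0 and var_s > 0 else float("nan")
    # NSE: 1 is a perfect fit; 0 means no better than the observed mean.
    nse = 1.0 - sse / var_o if var_o > 0 else float("nan")
    return {"ME": me, "MAE": mae, "RMSE": rmse, "R": r, "NSE": nse}
```

A simulation that tracks the observations with a constant offset, e.g., observed [1, 2, 3, 4] and simulated [2, 3, 4, 5], gives R = 1 but NSE = 0.2. This illustrates why R and NSE are reported together: R reflects timing and direction, whereas NSE also penalizes errors in magnitude.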

#### **3. Results**

The results are presented according to the approaches applied in the downscaling procedure (Section 2.4). Various GRACE grid sizes and spatial randomization of the GRACE grids were tested to standardize the input from GRACE data and to test its effect on model accuracy. The results showed no clear improvement in the downscaling model, possibly because the time series of adjacent GRACE grids are highly correlated. As a result, the grid size and location have no impact on the accuracy of the downscaling model. Lastly, the limitations of the method are discussed with respect to the actual conditions in the study area, and implications for future studies are suggested.

#### *3.1. Model Simulation Results*

Of the 32 groundwater level measurement stations in the study area, individual ML models were constructed for 21, which were selected based on their spatial distribution. As the sample size, which is limited by the GRACE sample size, was small for training the models, a ten-fold cross-validation method was used. This method divided the entire dataset into ten subsets. Each subset was used once as a test set and was part of the training set in the other nine trials. In each trial, a model was fitted to the training set and evaluated on the test set. The error estimates were then averaged over all trials to obtain the overall effectiveness of the model. Table 1 shows the overall statistics for model training of individual models (the first stage ML models) and the ensemble downscaling model in the second stage. Generally, the results showed that the models exhibited excellent performance. The average ME, RMSE, R, and NSE values are 4.5 mm, 306 mm, 0.91, and 0.82, respectively. Similarly, the ensemble downscaling model shows excellent performance, with ME, RMSE, R, and NSE values of 0.0 mm, 192.0 mm, 0.98, and 0.96, respectively. Wells with good performance metrics (e.g., M7 and M10) imply that the predictor variables explained the observed GWLAs well. On the other hand, wells with low performance metrics (e.g., M6, M12, M15, and M19) indicate that the predictor variables fell short of fully explaining the observed GWLA. Note that the units are in mm for GWLA.
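The cross-validation procedure can be sketched in Python as follows. The function names and the generic `fit` and `error` callables are hypothetical, and the folds are formed by interleaving sample indices, which is only one of several possible assignments (a random assignment is also common); the text does not state which was used.

```python
def k_fold_indices(n, k=10):
    """Yield (train, test) index lists; each sample is tested exactly once.

    Folds are formed by interleaving indices (an assumption of this sketch).
    """
    if n < k:
        raise ValueError("need at least k samples")
    folds = [list(range(i, n, k)) for i in range(k)]
    for i in range(k):
        train = [j for m, fold in enumerate(folds) if m != i for j in fold]
        yield train, folds[i]

def cross_validated_error(fit, error, X, y, k=10):
    """Average a test error over k folds.

    fit(X_train, y_train) returns a model;
    error(model, X_test, y_test) returns a number (e.g., RMSE).
    """
    errors = []
    for train, test in k_fold_indices(len(y), k):
        model = fit([X[i] for i in train], [y[i] for i in train])
        errors.append(error(model, [X[i] for i in test], [y[i] for i in test]))
    return sum(errors) / len(errors)
```

Repeating this calculation for each candidate combination of hyperparameters and keeping the combination with the lowest averaged error is one common way to carry out the tuning described in Section 2.3.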

**Table 1.** Overall statistics of model performance, comparing model-predicted GWLA with the in-situ-derived GWLA data (2002 to 2016) for training (calibration). M1–M21 are the first-stage models, and M' is the ensemble downscaling model.


Figure 8 shows selected time series of model-simulated vs. in-situ-based GWLA. The models predicted the variation in GWLA very well for most of the groundwater measurement locations in the study area (e.g., Figure 8a). However, a few wells showed a slight underprediction (or overprediction) of the GWLA; specifically, underprediction at the peaks of positive anomalies and overprediction at the peaks of negative anomalies (Figure 8b). Other exceptions include the GWLA models for the Galena (Figure 8c) and Kilbourne (Figure 8d) groundwater wells. The model for Galena station overpredicted the peak GWLA, whereas the model for Kilbourne station captured the long-term GWLA variability fairly well but produced a time-varying seasonal signal that is absent from the observed GWLA data. This is attributed to the fact that the Kilbourne well is located in an irrigated agricultural region of Illinois. Due to the lack of data, human impacts (e.g., pumping) were not considered as an input in this study. As shown in the figure, the hydrograph from the in-situ groundwater level data exhibited the typical characteristics of a pumping water level, with little or no seasonal fluctuation.

**Figure 8.** Time series graphs showing model-simulated and in-situ-derived GWLA for selected groundwater measurements stations: (**a**) Bondville, (**b**) Fermi, (**c**) Galena, and (**d**) Kilbourne. The blue line shows model-simulated while red is observed GWLA.

#### *3.2. Variable Contributions and Sensitivity*

Several studies have indicated that the GRACE signal is dominated by sub-surface storage (e.g., groundwater storage) [59,60]. The variable importance analysis corroborated that TWSA from GRACE is an influential predictor of the predicted GWLA. All the models (in both the first and second stages) indicated TWSA as the primary predictor variable (Figure 9a,b) for the downscaled GWLA. Figure 9a shows the percentage contribution of each predictor variable across all models in the first stage, where individual models were built for each groundwater level measurement station. Figure 9b shows the percentage contribution of predictors in the second stage of the downscaling approach. In the first stage, stream discharge and soil moisture storage are the second and third most influential predictor variables, respectively. This can be explained by the glacial aquifer system in Illinois, which is mainly characterized by a shallow groundwater level that influences baseflow contributions to the streams. The remaining predictor variables (e.g., precipitation (P), land surface temperature (LST), snow water equivalent (SWE)) are less influential and contribute roughly equally. Although these variables directly or indirectly affect the amount of recharge to the groundwater, their relationship with GWLA may not be simple and direct. In the second stage (downscaling model), the GRACE TWSA is the most influential predictor variable, followed by a few predicted GWLA from the models in the first stage (Figure 9b).

**Figure 9.** Percentage contribution of each predictor variable. (**a**) Boxplots for each predictor across all the models in the first stage and (**b**) bars for each predictor in the second stage, downscaling model.

Figure 10 displays the Partial Dependence (PD) plots for selected models, which show the effect of a given variable on the response (predictand) after accounting for the average effect of all other predictor variables. Generally, there is a non-linear relationship between the predictor variables and the predictand (GWLA). The partial responses show a direct relationship between GWLA and GRACE TWSA, stream discharge, precipitation, and soil moisture. However, the magnitude of the effect of these predictor variables on the GWLA differs (see the y-axis of the graphs in Figure 10). The GRACE TWSA and soil moisture control a wider range of the GWLA. The GWLA has an inverse relationship with land surface temperature and plant canopy water: the GWLA decreases as these predictor variables increase. LST and CW favor evapotranspiration while reducing recharge. No clear relationship exists with vegetation coverage (greenness index) or snow water equivalent. The relationship between vegetation coverage and GWLA is not direct (Figure 10f); GWLA responds to periods of little to no vegetation and to periods of high vegetation coverage.

**Figure 10.** Partial dependence plots for predictor variables in selected GWLA prediction models (**a**–**h**). The cross marks indicate distribution of the data points.

#### *3.3. Testing and Spatial Accuracy*

This study presents a novel approach that allows simulating higher-resolution GWLA at different times and locations in space. The method was verified against groundwater level data that were independent of model construction. Figure 11 displays the distribution of test wells (red stars) and their statistical values (labeled next to the stations; the circles are also drawn to scale). Generally, the testing results showed that the statistical values are within acceptable ranges. Half of the test wells have NSE values from 0.5 to 0.6; the remaining wells have values between 0.23 and 0.48. One well, Big Bend station, has a very low NSE value of −1.37; however, it has a high correlation coefficient of 0.75. This well is located within the floodplain of the Rock River, where the groundwater interacts with the river via baseflow. The direction of seasonal fluctuation (indicated by the high correlation coefficient) is simulated well by the downscaling model, mainly under the influence of stream discharge. However, the model falls short of predicting the magnitude of the observed GWLA, which differs from the magnitude of discharge. Generally, the correlation coefficients for testing are within the range of 0.66 to 0.82. The test results show that the downscaling approach presented here satisfactorily simulates the GWLA data at a 0.25◦ × 0.25◦ spatial and monthly temporal scale from GRACE. The output scale is defined by the scale of the input data. TRMM has the lowest resolution of the input data at 0.25◦ × 0.25◦. Though not tested in this study, it is possible to downscale further to a higher resolution using higher-resolution input data.

The background image in Figure 11 shows a sample predicted GWLA (for the month of December 2007) for the study site at 0.25◦ × 0.25◦ resolution. We can see that there is significant GWLA spatial variation during that month. A negative groundwater level anomaly of more than 500 mm (areas in yellow and red) was observed in the southern and northwestern parts of the study site, whereas the north and northeastern parts gained a positive groundwater level anomaly of up to 500 mm (areas in blue). Overall, a similar degree of heterogeneity in groundwater behavior is observed in other months in the model output compared to that of the GRACE data.

Figure 12 shows the time series of model-predicted versus in-situ-derived GWLA for each test well in the study area. Model performance differs between test wells: good predictive performance is observed at, e.g., the Fairfield and Freeport stations (Figure 12c,d) and poor performance at the Big Bend station (Figure 12b). As seen in Figure 12, data availability at Big Bend station is low, and the small sample size results in poor NSE and R values for this station. In some of the test wells, the model underestimates the GWLA (e.g., Figure 12a,d,e). The reverse, an overestimate of the GWLA, is observed in the St. Peter test well data (Figure 12g). Generally, the model predicted the timing and seasonal variability (direction) of the GWLA. However, in some instances (as described above), the model less accurately predicted the magnitude of the high and low groundwater anomalies. This is because the skill of the models is mainly controlled by the input data, specifically the data for the predictor variables. The authors believe the variances in the predictor variables may not be capable of fully explaining the magnitudes of variability in the predictand variable (GWLA). Despite some limitations, as indicated by the statistical measures above (Figure 11), the model predicts the monthly variations in the groundwater level moderately well and better predicts the long-term (seasonal to interannual) variability in the groundwater level. For example, at stations such as Barry, Big Bend, Freeport, Good Hope, and SWS (Figure 12a,b,d,e,i), the model predicted the dry anomalies (e.g., periods from 40 to 60 and 120 to 140) and wet anomalies (e.g., periods from 80 to 120).

**Figure 11.** Map showing predicted GWLA at 0.25◦ × 0.25◦ scale, overlain circles (drawn to scale) with performance metrics (NSE: Nash-Sutcliffe Efficiency and R: Correlation coefficients) are the locations of in-situ groundwater level measurement stations. Performance metrics indicate the comparison between model-predicted and in-situ GWLA.

**Figure 12.** Monthly timeseries model-predicted (blue line) vs. in-situ-derived GWLA (red) for test wells in the study area: (**a**) Barry, (**b**) Big Bend, (**c**) Fairfield, (**d**) Freeport, (**e**) Good Hope, (**f**) St. Charles (**g**) St. Peter, (**h**) Stelle, and (**i**) SWS stations.

#### **4. Discussion**

Overall, the approach developed in this study demonstrates the integration of satellite- and model-based hydrological variables along with GRACE data to predict high-resolution groundwater level variation in a glacial aquifer system. The results clearly demonstrate that downscaled GWLA could be predicted at a resolution of 0.25◦ × 0.25◦ in new periods and/or space in the study area, a limiting factor in previous studies. Additional tests have been conducted to increase replicability of this study, such as testing the spatial scale of input variables (e.g., GRACE grid). Using the methodology developed in this study (a generic code is available from the authors upon request), end-users can implement it elsewhere to predict GWLA, given all the assumptions are valid. The two main assumptions are (1) GRACE TWSA is mainly controlled by the groundwater storage variation in the shallow aquifer system and (2) human influence (e.g., groundwater abstraction for irrigation from the aquifer) is not considered as an input variable in the models.

The purpose of using publicly available data (e.g., satellite data) is to establish the applicability of this study in data-scarce regions where hydrologic variables are limited, such as lack of groundwater recharge rate and poorly distributed well data in space and time. However, significant uncertainty is expected from these data sources. For example, GRACE data has its own uncertainty that arises from GRACE processing. Other input data, some model-based and satellite-based, have their own contribution to the total input data uncertainty and potential error propagation. These errors are mitigated by using ML methods, as they do a good job of handling errors associated with input data.

Only the shallow groundwater system of the glacial sediment was considered in this study, on the assumption that the GRACE signal is strongly related to shallow systems. As a result, shallow bedrock aquifers overlain by no or thin layers of glacial sediment, specifically in the northern part of the study area, were not included. The authors believe that the poor performance of the ML model in this part of the study region (e.g., at Crystal Lake and Fermi stations (see Table 1 and Figure 1)) could be due to the exclusion of the shallow bedrock. Further, the exclusion of human impact as a predictor variable has also affected the performance of wells located in the irrigated regions of Illinois (e.g., Kilbourne and Sincarte wells), where the main source of irrigation water is groundwater. As seen in the hydrograph of Kilbourne station (Figure 8d), the in-situ groundwater data show a relatively flat declining or rising groundwater level anomaly, whereas the model simulated natural seasonal fluctuations, probably under the influence of the seasonal variability of precipitation and other predictor variables. Further, the spatial pattern of groundwater level anomalies may not be fully captured due to the heterogeneity of the glacial system. The predictor variables used in this study may not sufficiently describe this heterogeneity to predict the GWLA accurately in these instances. Adding more variables (e.g., storage characteristics of the aquifer, including the bedrock), as well as improving the resolution of the input variables, could reduce the bias and improve the prediction capacity of the ML model.

The methodology developed in this study can be directly applied in shallow sedimentary aquifer systems with similar settings where (1) the GWLA response to climate variables is relatively quick and (2) anthropogenic influences (e.g., groundwater abstraction) are not extensive. For example, the downscaling model can be applied in the Chad Basin, the Congo Basin, and the Rift Valley Basin in Africa. The central part of the Chad Basin has a Quaternary unconfined sediment aquifer system made up of fluvio-lacustrine deposits and aeolian sands and covers an area of 500,000 km<sup>2</sup> [32]. This basin is one of the most poorly gauged basins in the world but provides fresh water resources for the people of several nations in the region. Similarly, the model can also be applied in the fluvio-lacustrine sediment aquifers covering the Rift Valley in East Africa and the Cenozoic sediment aquifers in the Congo Basin, among others.

To apply the methodology in a different setting, it is possible to omit or include predictor variables. For example, snow water equivalent and vegetation coverage are insignificant in arid regions, so researchers can omit these predictor variables from the downscaling model. Likewise, if additional variables are necessary to explain the predictand variable, they can easily be included as predictor variables. For example, if the geological setting is a fractured (karstic) bedrock aquifer system, it is possible to include additional predictor variables, such as fracture density and fracture spacing, to explain the predictand variable, the GWLA. In addition, as BRT works well with both categorical and numerical variables, it is possible to include categorical data (e.g., yes/no data or qualitative data) by assigning numeric values such as 0 (no) and 1 (yes).
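The 0/1 encoding of a yes/no predictor mentioned above can be sketched as follows. This is only an illustration: the attribute name `irrigated` and the sample records are hypothetical and are not taken from the study's data.

```python
# Hypothetical well records; "irrigated" is an illustrative yes/no attribute.
records = [
    {"well": "A", "irrigated": "yes"},
    {"well": "B", "irrigated": "no"},
    {"well": "C", "irrigated": "Yes"},
]

def encode_yes_no(value):
    """Map a yes/no string to 1/0 and reject any other category."""
    mapping = {"yes": 1, "no": 0}
    key = value.strip().lower()
    if key not in mapping:
        raise ValueError(f"unexpected category: {value!r}")
    return mapping[key]

# Numeric column that could be passed to a tree-based model as a predictor.
encoded = [encode_yes_no(r["irrigated"]) for r in records]
print(encoded)
```

Rejecting unknown categories explicitly, rather than silently mapping them to 0, avoids mixing missing data with a genuine "no".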

#### **5. Conclusions**

With continued threat from climate change and human impacts, high-resolution and continuous hydrologic data availability is crucial for predicting trends and water resource availability. GRACE has provided unprecedented opportunities to assess regional and global water resources, despite the limitations due to its coarse spatial scale. This study presented a two-stage machine learning approach to map high-resolution groundwater level anomaly (GWLA) from GRACE TWSA and other publicly available land surface and hydro-climate data (e.g., vegetation coverage, land surface temperature, precipitation, streamflow, etc.) in a glacial aquifer system in Illinois. In both stages, the models were validated and tested. The study finds GRACE to be the most influential predictor variable in the ML-based models, and the scale of GRACE or spatial location of GRACE grids have little to no impact on the performance of the ML-based downscaling model. This is due to the characteristics of GRACE grids, where they have a coarse resolution and where adjacent GRACE grids are spatially highly correlated. Stream flow and soil moisture storage are the second and third most influential predictor variables in the model.

Generally, the model training and testing results in the first stage showed excellent performance. The average ME, RMSE, R, and NSE values are ~4.5 mm, 306 mm, 0.91, and 0.82, respectively. Similarly, the ensemble downscaling model (in the second stage) shows excellent performance, with ME, RMSE, R, and NSE values of 0.0 mm, 192.0 mm, 0.98, and 0.96, respectively. Further, the downscaling model was tested satisfactorily using the monthly in-situ-based GWLA data excluded during model construction. A few test wells exhibit relatively poor statistical performance, which is attributed to the absence of anthropogenic influences (e.g., pumping) from the predictor variables in groundwater-irrigated regions and to the insufficient representation of aquifer heterogeneity in the predictor variables. Various sources of uncertainty in the model and the downscaled product are expected, such as uncertainty from the input satellite and model-based data and uncertainty from the model assumptions (e.g., exclusion of the shallow bedrock aquifer system located in the northern part of the study area). However, noise in the input data was often reduced (or minimized) during the machine learning process.
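For readers who want to reproduce such statistics, the following is a minimal sketch of the four measures using their commonly used definitions. The sign convention of ME (simulated minus observed) and the function names are assumptions made for illustration; they are not taken from the study's code.

```python
import math

def mean_error(obs, sim):
    """Mean error (bias); assumed convention: simulated minus observed."""
    return sum(s - o for o, s in zip(obs, sim)) / len(obs)

def rmse(obs, sim):
    """Root-mean-square error."""
    return math.sqrt(sum((s - o) ** 2 for o, s in zip(obs, sim)) / len(obs))

def pearson_r(obs, sim):
    """Pearson correlation coefficient."""
    n = len(obs)
    mean_o, mean_s = sum(obs) / n, sum(sim) / n
    cov = sum((o - mean_o) * (s - mean_s) for o, s in zip(obs, sim))
    var_o = sum((o - mean_o) ** 2 for o in obs)
    var_s = sum((s - mean_s) ** 2 for s in sim)
    return cov / math.sqrt(var_o * var_s)

def nse(obs, sim):
    """Nash-Sutcliffe efficiency: 1 is a perfect fit, 0 matches the observed mean."""
    mean_o = sum(obs) / len(obs)
    num = sum((o - s) ** 2 for o, s in zip(obs, sim))
    den = sum((o - mean_o) ** 2 for o in obs)
    return 1.0 - num / den

# Synthetic example (values in mm): the simulation is biased by +1 mm everywhere.
observed = [1.0, 2.0, 3.0, 4.0]
simulated = [2.0, 3.0, 4.0, 5.0]
print(mean_error(observed, simulated), rmse(observed, simulated),
      pearson_r(observed, simulated), nse(observed, simulated))
```

The synthetic example shows why several measures are reported together: a constant bias leaves R at 1 while lowering NSE.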

**Author Contributions:** W.M.S. was responsible for conceptualization and design, overseeing the project, analysis and interpretation of the results, and manuscript preparation. D.K. was involved in data processing, analysis and interpretation of the results, and manuscript preparation. A.M.M. aided in the conceptualization of the project, interpretation of the results, and review of the manuscript.

**Funding:** This research was funded by the College of Arts and Sciences, Illinois State University. The APC was funded by the Powell fund, Department of Geography, Geology, and the Environment and the Office of Research and Graduate Studies, Illinois State University.

**Conflicts of Interest:** The authors declare no conflict of interest.

#### **References**


© 2019 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).

## *Article* **Hydrologic Mass Changes and Their Implications in Mediterranean-Climate Turkey from GRACE Measurements**

**Gonca Okay Ahi 1,\* and Shuanggen Jin 2,3,\***


Received: 14 November 2018; Accepted: 7 January 2019; Published: 10 January 2019

**Abstract:** Water is arguably our most precious resource, and its availability is tied to the hydrological cycle, climate change, regional drought events, and water resource management. In Turkey, apart from traditional hydrological studies, Terrestrial Water Storage (TWS) is poorly investigated at a continental scale, with limited and sparse observations. Moreover, TWS is a key parameter for studying drought events through the analysis of its variation. This paper investigates TWS variation and the associated drought events, the spatial mass distribution, the long-term mass change, and the factors affecting TWS variation, from individual parameters (e.g., precipitation, rainfall rate, evapotranspiration, soil moisture) to the climatic change perspective. For this purpose, we use GRACE (Gravity Recovery and Climate Experiment) Level 3 (Release05-RL05) monthly land mass data of the Center for Space Research (CSR) processing center covering the period from April 2002 to January 2016, Global Land Data Assimilation System (GLDAS: Mosaic (MOS), NOAH, Variable Infiltration Capacity (VIC)) and Tropical Rainfall Measuring Mission (TRMM-3B43) models, and drought and climate indices such as the self-calibrating Palmer Drought Severity Index (SCPDSI), El Niño–Southern Oscillation (ENSO), and North Atlantic Oscillation (NAO). Turkey experienced serious drought events during the studied time period, which appear as significant decreases in the TWS signal. Through its decreasing trend, GRACE can indicate a possible drought about nine months earlier than previous studies, which do not take satellite gravity data into account. Moreover, the GRACE signal is more sensitive to agricultural and hydrological drought than to meteorological drought. Precipitation is an important parameter affecting the spatial pattern of the mass distribution and also the spatial change by inducing an acceleration signal from the eastern side to the western side. In Turkey, the La Niña effect probably has an important role in the meteorological drought turning into agricultural and hydrological drought.

**Keywords:** terrestrial water storage (TWS); GRACE; GLDAS; TRMM; drought; ENSO; NAO; Turkey

#### **1. Introduction**

Traditional methods of monitoring hydrological processes (e.g., in situ measurements of precipitation and soil moisture content) have generally been inadequate to characterize extreme hydrologic events [1,2]. Their temporal and spatial resolutions are not good enough to characterize water mass variations at a regional or global scale. In order to improve our ability to predict and monitor these water mass changes in the scope of drought analysis, flood potential assessment, groundwater changes, soil moisture analysis, etc., an increasing number of datasets are being produced, especially from remote sensing techniques. These techniques offer information
on vegetation, precipitation, surface water storage, evapotranspiration, soil moisture, groundwater, and snow components. Tropical Rainfall Measuring Mission (TRMM) [3], TRMM Multisatellite Precipitation Analysis (TMPA), Precipitation Estimation from Remotely Sensed Information using Artificial Neural Networks (PERSIANN) [4], the CPC MORPHing technique (CMORPH) [5], Climate Hazards Group InfraRed Precipitation with Station data (CHIRPS) [6], Global Satellite Mapping of Precipitation (GSMaP) [7–9], and Global Precipitation Measurement (GPM) [3] are some of the available satellite precipitation missions and products, several of which are offered in real time. While missions like Soil Moisture Active Passive (SMAP) provide satellite soil moisture values, the U.S. Department of Agriculture's (USDA) global reservoir and lake monitoring service provides information on surface water levels by using near-real-time radar altimeter data. Finally, for surface and subsurface water, the Gravity Recovery and Climate Experiment (GRACE) has, since 2002, provided entirely new observations suitable for quantifying and monitoring continental or regional TWS changes [10–13], groundwater changes [14,15], drought monitoring [16–19], and flood potential assessments [20] at a spatial resolution of a few hundred kilometers with uniform data coverage. The approaches used for estimating groundwater storage variations, together with the main applications of GRACE data for groundwater monitoring, can be found in [21]. Moreover, since 2018, the GRACE Follow-On (GRACE-FO) satellite gravity mission has continued tracking Earth's water movements at different spatial scales. The results retrieved from the satellite gravity measurements are independent of the in situ data and might be interpreted independently.
However, to increase the accuracy and make a better interpretation, the trend needs to be assessed in combination with additional data sets provided by other remote sensing techniques. Moreover, as observed in many studies, the comparison of global hydrological models' results with GRACE data supports the analyses. The Global Land Data Assimilation System (GLDAS) [22], which is developed jointly by NASA and NOAA, simulates Terrestrial Water Storage through four main land surface models: VIC [23], NOAH [24], Mosaic [25], and CLM [26]. The Climate Prediction Center (CPC) model [27], Land Dynamics model (LadWorld) [28], WaterGAP Global Hydrology Model (WGHM) [29–31], and Organizing Carbon and Hydrology in Dynamic Ecosystems (ORCHIDEE) land surface model [32] are further examples of such global hydrological models. A general overview of the use of global hydrological models' results (e.g., water storage change) as a reference to calibrate/validate GRACE data is provided in [33]. Additionally, [33] found inconsistencies in the previous studies between hydrological model simulation results and GRACE-based observations and provided possible explanations for these inconsistencies.

In Turkey, apart from traditional hydrological studies, TWS has been poorly investigated at a continental scale with the new satellite techniques (e.g., GRACE). Previous studies in Turkey revealed a TWS variation of about 0.7 cm/year [34]. Among them, a study of TWS variations in Turkey [35] with the GRACE and GLDAS (NOAH) data sets [22] for the period from 2003 to 2009 showed a significant decrease at a rate of up to 4 cm/year in both data sets in the southern part of the central Anatolian region. This decrease, present in both data sets, was explained by declining groundwater, as confirmed by the existing wells in the above-mentioned region. More recently, the effects of drought and water extraction on groundwater storage in central Turkey were described in [36], which also showed how groundwater storage can affect the TWS. In addition, long-term TWS changes in Turkey during the 2004–2014 period were studied in association with GLDAS/NOAH data, revealing a TWS variation between −17 and 16 cm in amplitude, with an important decrease in 2008 [37]. Other studies [38] have also emphasized the need for further work to isolate model errors and anthropogenic effects for Turkey in order to explain the GRACE signal, which points to a robust acceleration in the TWS anomaly (TWSA). These prior studies point out the need for a summarizing, extended, and up-to-date study which covers a longer time period and associates different auxiliary data and parameters to understand and interpret the TWS change mechanism and possible drought events at a national scale.

In this paper, the water storage variation is studied over Turkey at a seasonal time scale for the period from April 2002 to January 2016 using monthly GRACE land mass grids (Level 3-RL05) from CSR. To compare the results, monthly grids with a 1° × 1° resolution from three GLDAS land surface models (MOS, NOAH, and VIC) are also analyzed. To estimate precipitation over Turkey within the studied period, the TRMM-3B43 model is also considered, with a 0.25° × 0.25° resolution. In addition, ENSO, the self-calibrating Palmer Drought Severity Index, and NAO are compared with the satellite-derived GRACE TWS data.

#### **2. Data and Methods**

#### *2.1. Studied Area and Its Hydrological Characteristics*

Turkey is geographically located at approximately 36–42° N and 26–45° E, with an approximate area of 783,562 km<sup>2</sup>. A glance at a digital elevation model of Turkey reveals that mountains encircle the peninsula of Anatolia on all four sides (see Figure 1).

**Figure 1.** Digital Elevation Model of Turkey.

In Turkey, the yearly mean precipitation is approximately 643 mm (501 billion m<sup>3</sup> of water). A total of 274 billion m<sup>3</sup> of that water is transferred to the atmosphere by evaporation from soil, water, and vegetation surfaces. Additionally, 158 billion m<sup>3</sup> flows through streams to the sea and to lakes in closed basins. The remaining 69 billion m<sup>3</sup> feeds the groundwater, and 28 billion m<sup>3</sup> of it in turn contributes to the surface water. In addition, a water input of about 7 billion m<sup>3</sup> comes from neighboring countries. Thus, the surface water potential of the country is 193 billion m<sup>3</sup>, or 234 billion m<sup>3</sup> when the 41 billion m<sup>3</sup> of water contributing to the groundwater is added [37,39].
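The budget figures quoted above can be checked with a few lines of arithmetic. The sketch below uses only the numbers given in the text. The variable names and the grouping of terms are our reading of the paragraph rather than a formula stated in the cited sources.

```python
# Annual water budget of Turkey in billion m^3, using the figures quoted in the text.
precipitation = 501
evaporation = 274
stream_runoff = 158            # flow to the sea and to lakes in closed basins
groundwater_to_surface = 28    # part of the recharge that returns to surface water
inflow_from_neighbors = 7

# Water left after evaporation and runoff feeds the groundwater.
groundwater_recharge = precipitation - evaporation - stream_runoff

# Surface water potential: runoff + groundwater contribution + external inflow.
surface_potential = stream_runoff + groundwater_to_surface + inflow_from_neighbors

# Recharge that stays in the groundwater, and the resulting total potential.
net_groundwater = groundwater_recharge - groundwater_to_surface
total_potential = surface_potential + net_groundwater

print(groundwater_recharge, surface_potential, net_groundwater, total_potential)
# 69 193 41 234
```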

Important and severe drought events experienced in Turkey were recorded in 1971–1974, 1983–1984, 1989–1990, 1996, 2001, and 2007–2008 [40,41]. Other studies investigating droughts in Turkey reveal an agricultural and hydrological drought lasting from November/December 2006 to December 2008, so the drought period is recorded as 2007–2008 [42]. In 2008 in particular, there was no variation in snow and precipitation for nine to ten months [43].

#### *2.2. Terrestrial Water Storage (TWS) from GRACE*

Three processing centers, CSR (Center for Space Research, Texas), JPL (Jet Propulsion Laboratory, California), and GFZ (GeoForschungsZentrum, Potsdam), provide official releases of GRACE gravity data at three different levels (Level 1, 2, 3), depending on the expertise and needs of the users, for both the time-averaged and time-variable fields. In this study, Level 3 (RL05) land data of the CSR processing center have been used. These data are ready to use, as many necessary preprocessing steps have already been applied (removal of atmospheric pressure/mass changes, replacement of the C20 (degree 2 order 0) coefficients with the solutions from Satellite Laser Ranging [44], the estimation of the degree-1 coefficients (geocenter) from [45], the correction of glacial isostatic adjustment (GIA), destriping [46], Gaussian filtering). Level 3 data are GRACE-derived mass grids that express TWS as a function of the gravity field, with a spatial sampling of 1 degree in both latitude and longitude (approx. 111 km at the equator). Over land, they are estimated from the gravity coefficient anomalies for each month (Δ*C<sub>lm</sub>*, Δ*S<sub>lm</sub>*) [47], as below:

$$\Delta\eta_{\rm land}(\theta,\phi,t) = \frac{a\rho_{\rm ave}}{3\rho_{\rm w}}\sum_{l=0}^{\infty}\sum_{m=0}^{l} \tilde{P}_{lm}(\cos\theta)\frac{2l+1}{1+k_{l}}\left(\Delta C_{lm}\cos(m\phi) + \Delta S_{lm}\sin(m\phi)\right)\tag{1}$$

where *ρ<sub>ave</sub>* is the average density of the Earth, *ρ<sub>w</sub>* is the density of fresh water, *a* is the equatorial radius of the Earth, *P̃<sub>lm</sub>* is the fully normalized associated Legendre function of degree *l* and order *m*, *k<sub>l</sub>* is the Love number of degree *l* [48], *θ* is the spherical co-latitude (polar distance), and *φ* is the longitude. All grids are obtained from the following link: https://grace.jpl.nasa.gov/data/get-data/monthly-massgrids-land/.

The first additional processing step, also recommended by the processing center for the Level 3 land data grids, is the multiplication of each 1-degree land grid cell by the corresponding value of a provided set of scaling coefficients, as shown in Equation (2). This step counteracts the attenuation of the surface mass variations caused by the sampling and post-processing of GRACE observations (destriping, Gaussian filtering) and restores part of the information lost in prior data processing:

$$g'(x, y, t) = g(x, y, t) \times s(x, y) \tag{2}$$

where *x* is the longitude index, *y* is the latitude index, *t* is the time (month) index, *g*(*x*, *y*, *t*) is the grid node value, *s*(*x*, *y*) is the scaling grid, and *g*′(*x*, *y*, *t*) is the gain-corrected time series. As a second additional processing step, a leakage error correction (for residual errors after filtering and rescaling) has been performed, as shown in Equation (3), with the provided file obtained from the following link: ftp://podaac-ftp.jpl.nasa.gov/allData/tellus/L3/land\_mass/RL05/netcdf/, which contains the scaling coefficients (as mentioned previously) and the leakage error estimates.

$$g'_{\text{leak\_corr}}(x, y, t) = g'(x, y, t) + \text{leakage\_err}(x, y) \tag{3}$$

where *x* is the longitude index, *y* is the latitude index, *t* is the time (month) index, *g*′(*x*, *y*, *t*) is the gain-corrected time series, *leakage\_err*(*x*, *y*) is the leakage error estimate, and *g*′<sub>*leak\_corr*</sub>(*x*, *y*, *t*) is the scaled and leakage-error-corrected time series. In this study, monthly mass grids of GRACE land data (Level-3 RL05) from the CSR processing center for the period from April 2002 to January 2016 with a 1° × 1° spatial resolution have been used after applying these additional processing steps.
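The two additional processing steps of Equations (2) and (3) amount to element-wise array operations. The following is a minimal sketch, assuming the monthly grids are already loaded as a NumPy array of shape (time, latitude, longitude) and the scaling and leakage grids as arrays of shape (latitude, longitude). The function name and the synthetic values are hypothetical, and reading the NetCDF files is not shown.

```python
import numpy as np

def apply_gain_and_leakage(grids, scale, leakage_err):
    """Apply Equation (2), then Equation (3) as written in the text.

    grids:       array (time, lat, lon) of GRACE land mass grids, in cm
    scale:       array (lat, lon) of scaling coefficients s(x, y)
    leakage_err: array (lat, lon) of leakage error estimates, in cm
    """
    grids = np.asarray(grids, dtype=float)
    scale = np.asarray(scale, dtype=float)
    leakage_err = np.asarray(leakage_err, dtype=float)
    if scale.shape != grids.shape[1:] or leakage_err.shape != grids.shape[1:]:
        raise ValueError("scale and leakage_err must match the (lat, lon) grid shape")
    gain_corrected = grids * scale[np.newaxis, :, :]        # Equation (2)
    return gain_corrected + leakage_err[np.newaxis, :, :]   # Equation (3)

# Synthetic 2-month, 2 x 2 example (values are illustrative only).
monthly = np.array([[[1.0, 2.0], [3.0, 4.0]],
                    [[-1.0, 0.0], [0.5, 2.0]]])
scale = np.array([[1.2, 1.0], [0.8, 1.5]])
leak = np.array([[0.1, 0.0], [0.2, 0.3]])
corrected = apply_gain_and_leakage(monthly, scale, leak)
print(corrected)
```

Because the scaling and leakage grids do not depend on time, they are broadcast over the time axis instead of being copied for every month.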

#### *2.3. Global Land Data Assimilation System (GLDAS) Models Data*

GLDAS has been developed jointly by scientists at the National Aeronautics and Space Administration (NASA)-Goddard Space Flight Center (GSFC) and the National Oceanic and Atmospheric Administration (NOAA)-National Centers for Environmental Prediction (NCEP). GLDAS is a global, high-resolution, offline (uncoupled to the atmosphere) terrestrial modeling system that incorporates satellite- and ground-based observations in order to produce optimal fields of land surface states (e.g., soil moisture, snow water equivalent, and canopy water storage) and fluxes (e.g., rainfall, snowmelt, evapotranspiration) in near-real time [22]. Currently, GLDAS drives four land surface models: MOS, NOAH, the Community Land Model (CLM), and VIC. In this study, GLDAS version 1 (GLDAS-1) monthly data of the four land surface models are downloaded from https://disc.gsfc.nasa.gov/datasets?keywords=gldas&page=1, with a 1° × 1° spatial resolution, for the period from January 2002 to January 2016. In these data sets, GLDAS provides time series of land surface states and fluxes (25 variables), which can be used to study water storage. The major part of the TWS signal can be assumed to arise from the change in soil moisture (kg/m<sup>2</sup>), snow water equivalent (kg/m<sup>2</sup>), and canopy water storage (kg/m<sup>2</sup>). Hence, these land surface state variables are first extracted from the files for the area covering Turkey, and the TWS from the GLDAS models is then calculated, as shown by Equation (4):

$$TWS_{GLDAS} = \Delta SM + \Delta SWE + \Delta CWS \tag{4}$$

where TWS<sub>GLDAS</sub> is the change in terrestrial water storage from GLDAS, Δ*SM* is the change in soil moisture, Δ*SWE* is the change in the snow water equivalent, and Δ*CWS* is the change in canopy water storage. Soil moisture values are averaged before integrating them into the TWS calculation, according to the three-layer model for VIC and MOS, and the four-layer model for NOAH. In this study, the CLM model has not been used.
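Equation (4) can be illustrated with a short sketch for a single grid cell. Two assumptions are made here that the text does not state explicitly: the changes (Δ) are taken as deviations from the mean over the analyzed months, and the result is converted from kg/m<sup>2</sup> (equivalent to mm of water) to cm. The function name and the sample values are hypothetical.

```python
import numpy as np

def gldas_tws_anomaly_cm(sm_layers, swe, cws):
    """TWS anomaly per Equation (4) for one grid cell.

    sm_layers: array (months, layers) of soil moisture, kg/m^2
    swe:       array (months,) of snow water equivalent, kg/m^2
    cws:       array (months,) of canopy water storage, kg/m^2
    Returns the anomaly in cm of equivalent water height.
    """
    sm_layers = np.asarray(sm_layers, dtype=float)
    swe = np.asarray(swe, dtype=float)
    cws = np.asarray(cws, dtype=float)
    sm = sm_layers.mean(axis=1)             # layer average, as described in the text
    storage = sm + swe + cws                # kg/m^2, i.e. mm of water
    anomaly_mm = storage - storage.mean()   # assumed: deviation from the period mean
    return anomaly_mm / 10.0                # mm -> cm

# Three synthetic months with a three-layer soil column (illustrative values).
sm_layers = [[100.0, 200.0, 300.0], [130.0, 230.0, 330.0], [70.0, 170.0, 270.0]]
swe = [0.0, 10.0, 0.0]
cws = [1.0, 1.0, 1.0]
anomaly_cm = gldas_tws_anomaly_cm(sm_layers, swe, cws)
print(anomaly_cm)
```

Since subtracting the mean is a linear operation, the anomaly of the summed storage equals the sum of the component anomalies in Equation (4).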

#### *2.4. Tropical Rainfall Measuring Mission (TRMM) Data*

TRMM was a joint mission (1997–2015) between NASA and the Japan Aerospace Exploration Agency (JAXA) to study rainfall for weather and climate research. With the help of several space-borne instruments, TRMM satellite data allow precipitation to be measured from diurnal to interannual time scales. This has improved our understanding of tropical cyclone structure and evolution, of the important variability associated with the Madden-Julian Oscillation and with the El Niño–Southern Oscillation (ENSO), of convective system properties, lightning-storm relationships, and climate and weather modeling, and of human impacts on rainfall. The data also supported operational applications such as flood and drought monitoring and weather forecasting (https://trmm.gsfc.nasa.gov/).

In our study, we used the TRMM-3B43 Level 3 gridded monthly satellite-gauge (SG) combination data set with a 0.25° × 0.25° spatial resolution, downloaded from http://mirador.gsfc.nasa.gov/#, to estimate the precipitation variations over Turkey.

#### *2.5. In-Situ Precipitation Data*

The Turkish state meteorological service provides an annual cumulative rainfall distribution (1981–2010) map produced on a GIS platform by kriging the in-situ rainfall data of 255 meteorological stations. This map has been downloaded from the following website: https://mgm.gov.tr/eng/forecast-cities.aspx, and is used to compare the TRMM data with the rainfall from ground meteorological stations.

#### *2.6. Self-Calibrating Palmer Drought Severity Index (SCPDSI) Data*

The SCPDSI [49] can estimate the departure relative to normal conditions in the surface water balance by using a hydrological accounting system [50,51]. The PDSI is primarily considered a meteorological drought indicator, and sometimes, an agricultural drought indicator [52]. The drought index data were downloaded from the following website: https://crudata.uea.ac.uk/cru/data/drought/, as global land data covering the time period from 1901 to 2016 with a 0.5° latitude-longitude spatial resolution. Then, the grids corresponding to Turkey and the time period from 2002 to 2016 were extracted from the global data.
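Extracting the grids for Turkey from a global product can be sketched as a simple bounding-box selection. The sketch assumes cell-center coordinates on a regular 0.5° grid and uses the approximate extent of Turkey given in Section 2.1 (36–42° N, 26–45° E). A rectangular box is an assumption made for illustration (a country mask would be more precise), and the function name is hypothetical.

```python
import numpy as np

def subset_bbox(data, lats, lons, lat_range=(36.0, 42.0), lon_range=(26.0, 45.0)):
    """Cut a (time, lat, lon) cube to a latitude/longitude bounding box."""
    data = np.asarray(data)
    lats = np.asarray(lats, dtype=float)
    lons = np.asarray(lons, dtype=float)
    lat_mask = (lats >= lat_range[0]) & (lats <= lat_range[1])
    lon_mask = (lons >= lon_range[0]) & (lons <= lon_range[1])
    subset = data[..., lat_mask, :][..., lon_mask]   # select rows, then columns
    return subset, lats[lat_mask], lons[lon_mask]

# Synthetic global 0.5-degree cube with three time steps (all zeros).
lats = np.arange(-89.75, 90.0, 0.5)     # cell centers, 360 values
lons = np.arange(-179.75, 180.0, 0.5)   # cell centers, 720 values
cube = np.zeros((3, lats.size, lons.size))
turkey, turkey_lats, turkey_lons = subset_bbox(cube, lats, lons)
print(turkey.shape)
```

The same selection applies to the other global data sets after adjusting the grid spacing, and the time axis can be cut to 2002–2016 with an analogous mask on the dates.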

#### *2.7. El Niño–Southern Oscillation (ENSO) Index Data*

ENSO is described as warming on the ocean surface, or above-average sea surface temperatures (SST), in the central and eastern tropical Pacific Ocean. This is one of the most important climate phenomena on Earth due to its ability to change the global atmospheric circulation, which in turn, influences temperature and precipitation across the globe. The magnitude of the ENSO is often expressed by the Niño 3.4 SST index, derived from the normalized Sea Surface Temperature (SST). El Niño (warm phase) and La Niña (cold phase) are two contrary phases of ENSO [53]. In order to understand the magnitude of ENSO which influences precipitation and to improve our understanding of the occurrence of drought events at a national scale, SST data were downloaded from the following website: http://www.cpc.ncep.noaa.gov/data/indices/, where monthly ERSSTv5 (1981–2010 base period) indices are available for the Niño 1 + 2 (0–10° S, 90° W–80° W), Niño 3 (5° N–5° S, 150° W–90° W), Niño 4 (5° N–5° S, 160° E–150° W), and Niño 3.4 (5° N–5° S, 170° W–120° W) regions. The time period (from 2002 to 2016) was extracted from the global data.

#### *2.8. North Atlantic Oscillation (NAO) Index Data*

NAO is the variability in atmospheric mass circulation especially observed in the cold season months (November–April) over the middle and high latitudes of the Northern Hemisphere (from central North America to Europe and well into northern Asia). Understanding its effects on surface temperature, storms, precipitation, the ocean, and ecosystems contributes to the understanding of global climate change [54]. Strong positive phases (+) of the NAO tend to be associated with below-average precipitation over southern and central Europe. Conversely, above-average temperature and precipitation anomalies are typically observed during strong negative phases (−) of the NAO (https://www.cpc.ncep.noaa.gov/data/teledoc/nao.shtml). The monthly mean NAO index data from 2002 to 2017 were downloaded from the following website: https://www.cpc.ncep.noaa.gov/products/precip/CWlink/pna/nao.shtml.

To conclude the data and methods section and proceed with the results and analysis section, Table 1 below summarizes the studied research topics, used input data, methodology, and additional processing applied in this paper.



The time period of interest (~2002–2016) and the land grids of interest (Turkey) are extracted from all global data sets. Rainfall rate, evapotranspiration, soil moisture. P(t) = precipitation from TRMM. R(t) = runoff derived from the MOS, NOAH, and VIC models.

#### **3. Results and Analysis**

#### *3.1. Drought Analysis from Time Series*

In order to understand important drought periods, TWS time series derived from GRACE data and from GLDAS models were analyzed in the studied time period from 2002 to 2016. Figure 2 indicates residual (GRACE TWS-mean GRACE TWS) monthly TWS variations in Turkey (cm) according to GRACE and GLDAS models (MOS, NOAH, VIC).

**Figure 2.** Residual monthly TWS variations in Turkey (cm) according to GRACE (GRACE TWS-mean GRACE TWS) (in blue with circle), GLDAS-MOS TWS—mean (GLDAS MOS TWS) (in orange), GLDAS NOAH TWS—mean (GLDAS NOAH TWS) (in red), and GLDAS VIC TWS—mean (GLDAS VIC TWS) (in green) models.

Figure 2 indicates that the GRACE TWS time series and the GLDAS models are consistent. According to the GRACE TWS signal, the studied time period shows some sudden TWS decreases in September 2004 and 2008, which fall after dry summer periods, and in October 2014. However, the decrease in September 2004 should be interpreted with caution, as it may be an overestimation in the GRACE solutions. Additionally, there are some significant increasing and decreasing trends in TWS during some time intervals. Firstly, the GRACE TWS time series indicate an important decreasing trend from February 2006 to November 2008 [42]. This underlines the beginning of an agricultural and hydrological drought, which occurred due to the below-average precipitation [55] from November/December 2006 to December 2008. In this context, the GRACE time series provide an earlier warning (~9 months before) of the beginning of a decreasing trend in TWS. In short, GRACE indicates that 2008 is a remarkable year in terms of drought records. This finding is also supported by the absence of snow and precipitation variation in 2008 [43]. After this decreasing interval, TWS in Turkey shows an increasing trend from November 2008 to March 2010. According to Figure 2, the second decreasing period begins in March 2010 and lasts until October 2014. During this time interval, historical records indicate the beginning of a meteorological drought in 2012, intensified by dry summers, as is usual for the Mediterranean climate. Even the above-average precipitation observed in 2013 (32 cm, where the average is 27 cm) could not stop the decreasing trend in TWS, because it was followed by below-average precipitation in 2014 (17 cm) [56].
To summarize, the analysis of the GRACE TWS time series and the results of previous works reveal that, for the studied region, the GRACE signal is more sensitive to agricultural drought (insufficient soil moisture content) and hydrological drought (significant reduction in winter precipitation) (2006–2008) than to meteorological drought (little rain combined with increased temperature and lower humidity) (2012–2014). Having identified the important increasing and decreasing trends in TWS, which led to important drought events in the time period from 2002 to 2016, we now focus (Section 3.2) on the spatial mass distribution of the TWS and its causes at the national scale.

#### *3.2. Spatial Mass Distribution and Its Causes*

The TWS variations have significant seasonal signals. As we are interested in seasonal variations with significant annual and semiannual periods, a mathematical function/model (Equation (5)), which includes the annual and semi-annual variations with linear trend terms, has to be used to fit the TWS time series [57]:

$$M(t) = a + bt + \sum_{k=1}^{2} c_k \sin(\omega_k (t - t_0) + \phi_k) + \varepsilon(t) \tag{5}$$

where *M*(*t*) is the time series; *t* is the time; *t*<sub>0</sub> is a reference time; *a* is the constant; *b* is the trend; *c<sub>k</sub>*, *φ<sub>k</sub>*, and *ω<sub>k</sub>* are the amplitude, phase, and frequency of the *k*-th periodic term, respectively; *k* = 1 is for the annual variation and *k* = 2 is for the semi-annual variation; and *ε*(*t*) is the un-modeled residual term. During the analysis of the time series, strong annual signals are found, so the annual term (*k* = 1) is used here. Using the least-squares method to fit the time series of GRACE data at each point, the annual amplitude, annual phase, and trend terms of TWS are estimated. Figures 3 and 4 show the annual amplitude (cm) and annual phase (degree) maps of the TWS variation of Turkey from GRACE data and GLDAS models (MOS, NOAH, VIC) for the 2002–2016 period, respectively. The TWS variation estimates from both data sources show a good agreement in annual amplitudes and phases. The mean annual amplitude is 11.06 cm and 11.19 cm from GRACE and GLDAS models (MOS), respectively. Additionally, the mean annual phase is 21.90° and 22.98° from GRACE and GLDAS models (MOS), respectively. The larger annual amplitude values >20 cm (in red, Figure 3a,d) are observed in the eastern parts of Turkey. However, significant amplitude values also appear at the shorelines (Black Sea, Aegean, Mediterranean). The smallest annual amplitude of TWS variations, nearly 2.89 cm, is seen in the middle of Turkey (central Anatolia, Figure 3a–d). According to the phase plots, there is lateral variation (especially in the GRACE phase plot, Figure 4a) from the eastern part to the western part of Turkey, with a small increase toward the west, contrary to the amplitude plots. This means that lower phase values are observed in the eastern parts of Turkey, where larger amplitude values have been previously estimated. Table 2 indicates the mean annual amplitude, phase, and trend variations in Turkey according to GRACE and GLDAS models.
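A least-squares fit of Equation (5) can be sketched as below. Each sine term is rewritten as a linear combination of sin(ω<sub>k</sub>τ) and cos(ω<sub>k</sub>τ), with τ = t − t<sub>0</sub>, so that the problem becomes linear. This is a minimal illustration on synthetic data and not the authors' implementation: the function name, the time unit (decimal years), and the test signal are assumptions.

```python
import numpy as np

def fit_trend_and_harmonics(t_years, series, t0=0.0):
    """Fit M(t) = a + b*t + sum_k c_k*sin(w_k*(t - t0) + phi_k) by least squares.

    Uses c*sin(w*tau + phi) = (c*cos(phi))*sin(w*tau) + (c*sin(phi))*cos(w*tau).
    """
    t = np.asarray(t_years, dtype=float)
    y = np.asarray(series, dtype=float)
    tau = t - t0
    w1, w2 = 2.0 * np.pi, 4.0 * np.pi   # annual and semi-annual, rad/year
    design = np.column_stack([
        np.ones_like(t), t,
        np.sin(w1 * tau), np.cos(w1 * tau),
        np.sin(w2 * tau), np.cos(w2 * tau),
    ])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    a, b, s1, c1, s2, c2 = coef
    return {
        "offset": a,
        "trend": b,                                         # units per year
        "annual_amplitude": np.hypot(s1, c1),
        "annual_phase_deg": np.degrees(np.arctan2(c1, s1)),
        "semiannual_amplitude": np.hypot(s2, c2),
        "semiannual_phase_deg": np.degrees(np.arctan2(c2, s2)),
    }

# Synthetic monthly series over 14 years (cm), with known parameters.
t = np.arange(0.0, 14.0, 1.0 / 12.0)
signal = (2.0 - 0.5 * t
          + 11.0 * np.sin(2.0 * np.pi * t + np.radians(22.0))
          + 1.5 * np.sin(4.0 * np.pi * t + np.radians(60.0)))
result = fit_trend_and_harmonics(t, signal)
print(result)
```

Applying the function to the time series of each grid cell yields the amplitude, phase, and trend maps; the phase value depends on the chosen reference time `t0`.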

**Figure 3.** Annual amplitude (cm) of TWS variation of Turkey from (**a**) GRACE data and the GLDAS models (**b**) MOS, (**c**) NOAH, and (**d**) VIC for the 2002–2016 period.

**Figure 4.** Annual phase (degree) of TWS estimated from (**a**) GRACE data and the GLDAS models (**b**) MOS, (**c**) NOAH, and (**d**) VIC for the 2002–2016 period. In order to show small variations, the range of the color bar is limited.



Finally, Figure 5 shows the trend (cm/yr) of TWS of Turkey according to GRACE and GLDAS models (MOS, NOAH, VIC) concerning the 2002–2016 period.

**Figure 5.** Linear trend (cm/yr) of TWS variation of Turkey according to (**a**) GRACE data and the GLDAS models (**b**) MOS, (**c**) NOAH, and (**d**) VIC for the 2002–2016 period.

Trend plots of Turkey according to GRACE data (Figure 5a) also show lateral increasing variation from the eastern part to western part, as observed in phase plots (Figure 4a), while GLDAS models plots also show lateral variation, but in a mostly heterogeneous way (Figure 5b). Distinctive trend values (≥1 cm/yr) appear in the eastern part of Turkey (Figure 5a,b), in accordance with amplitude values (Figure 3a,d).

Although the GRACE and GLDAS 2D plots agree with each other and show the spatial distribution of TWS, they do not explain the reason for this distribution. To study the possible reasons, the GRACE 2D plots are compared with the TRMM-derived precipitation data, shown here as 2D sections. Figure 6 shows the (a1) annual amplitude (cm) of TWS variation from GRACE data and (b1) annual amplitude of precipitation estimated from TRMM model, (a2) annual phase of TWS (degree) estimated from GRACE data and (b2) annual phase of precipitation (degree) estimated from TRMM model, (a3) annual trend of TWS variation from GRACE data (cm/yr), (b3) annual trend of precipitation (cm/yr) estimated from the TRMM model between 2002–2016 in Turkey, and (c1) in-situ meteorological rainfall data.

**Figure 6.** (**a1**) Annual amplitude (cm) of TWS variation from GRACE data and (**b1**) annual amplitude of precipitation estimated from TRMM model (scales are different), (**a2**) annual phase of TWS (degree) estimated from GRACE data, (**b2**) annual phase of precipitation (degree) estimated from TRMM model, (**a3**) annual trend of TWS variation from GRACE data (cm/yr), and (**b3**) annual trend of precipitation (cm/yr) estimated from TRMM model between 2002–2016 in Turkey. (**c1**) Turkish state meteorological service's annual cumulative rainfall distribution (1981–2010) map produced by kriging the in-situ rainfall data of 255 meteorological stations on a GIS platform. Scales are different.

Figure 6a1,b1 show larger GRACE-TWS (in red, ≥20 cm) and TRMM anomalies (~7 cm) in the eastern part of Turkey (see also Figure 3b–d). The smallest TRMM-precipitation anomalies are observed in central Anatolia (in blue, ≤3 cm), in agreement with GRACE TWS (Figure 6a1). In addition, the TRMM 2D amplitude (Figure 6b1) and phase plot (Figure 6b2) reveal the input of water from the precipitation observed on the shorelines of Turkey (Black Sea, Aegean, Mediterranean). This result agrees with the GLDAS models' TWS annual amplitude plots (Figure 3b–d) and also with the in-situ annual cumulative rainfall distribution map of the Turkish state meteorological service (Figure 6c1). Trend plots in Figure 6a3,b3 indicate a positive trend toward the western part, especially toward the Aegean and Mediterranean shorelines of Turkey. Previous studies focused on groundwater loss and reservoir/lake storage change [58,59] also support an acceleration signal in GRACE analyses over the western part of Turkey [38], which has been spatially mapped in this paper. This acceleration seems to be related to the precipitation patterns or flow (from the precipitation-rich eastern part to precipitation-lacking Central Anatolia). However, this might not be the only reason and has to be studied in more depth (e.g., human-induced groundwater withdrawal, as mentioned in [36]). Having studied the spatial mass distribution and its causes, we now turn to the nature of the long-term mass change in TWS in Turkey (Section 3.3).

#### *3.3. Long-Term Mass Change*

The long-term variations of TWS are estimated and investigated according to GRACE and GLDAS models (MOS, NOAH, VIC) in Turkey. Figure 7 shows the residual mean monthly TWS variations in Turkey (cm) from GRACE and GLDAS models between 2002 and 2016. The mean TWS value of each month is first calculated by averaging all grid cells (77 cells) covering Turkey for that month of a specific year (e.g., 2002/04, 2002/05); the values of the same calendar month are then averaged over all years (January 2002, January 2003, ...). According to GRACE data, Figure 7 shows TWS increasing from January to April. The monthly maximum of mean TWS, about 10 cm, is reached in April. TWS then decreases from April to September, with a minimum of −12.88 cm in September. This corresponds to dry summer periods, as is normal for the Mediterranean climate. The decrease is followed by a gradual increase from September to December, when more rain or snow is expected in the central and eastern parts of the country. The seasonal variations of TWS estimated from GLDAS models are in good agreement with the results from GRACE.
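The two-step averaging described above (first over the grid cells of each month, then over the same calendar month of all years) can be sketched as follows. This is only an illustration: the function names and the input layout are hypothetical, and the spatial mean is taken as a simple unweighted average, which is an assumption, since any latitude-dependent cell-area weighting used in the study is not stated here.

```python
from collections import defaultdict

def area_mean(grid_values):
    """Unweighted mean over all grid cells of one month (assumption: no area weights)."""
    return sum(grid_values) / len(grid_values)

def monthly_climatology(records):
    """Average the regional mean of each calendar month over all years.

    records: iterable of (year, month, [value of each grid cell]).
    Returns a dict {month: mean over all years of the regional mean}.
    """
    by_month = defaultdict(list)
    for _year, month, grid_values in records:
        by_month[month].append(area_mean(grid_values))
    return {m: sum(v) / len(v) for m, v in sorted(by_month.items())}
```

For example, passing one record per GRACE monthly solution, each holding the 77 grid values over Turkey, would yield twelve values comparable to the curve in Figure 7.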

**Figure 7.** Residual mean monthly TWS changes over the land area of Turkey calculated from GLDAS-MOS (in orange), GLDAS-NOAH (in red), and GLDAS-VIC (in green) models, and from GRACE data (blue circled) for 2002–2016.

In Figure 8, the long-term variation of mean monthly TWS in the spring, summer, autumn, and winter periods from 2002 to 2016 is studied from GRACE-derived TWS data (cm/month). The largest amplitudes of the long-term seasonal TWS signal are observed in spring (max: 20.64 cm, mean: 12 cm), corresponding to April (Figure 7), followed by the winter period and then the summer period. Surprisingly, the weakest TWS values (max: 3 cm, mean: −7 cm) are observed in autumn, corresponding to September (Figure 7), rather than in summer, as one might expect. This can be explained by the pronounced drought conditions arising after dry summers (see Section 3.4.1), which are systematically observed in September almost every four years during the studied time period (September 2004, September 2008, October 2014; see also Figure 2). For all seasons, a marked decrease is observed in August 2008. This corresponds to the severe drought events experienced in Turkey, especially in 2008 (see also Figure 2). To conclude, there is a major decreasing trend close to −1 cm/yr in Turkey for the studied period, which has to be taken into account. The negative trend in the GRACE time series can be attributed to drought conditions and groundwater withdrawal [35] for the years 2003–2008. A proper study of the long-term change of the signal, including the magnitude of the decreasing trend (cm/yr) shown in our results, can support drought prediction and drought preparedness. In the next section, we study the parameters (e.g., precipitation, rainfall rate, evapotranspiration, soil moisture) which might have contributed to the above-presented spatial distribution (Section 3.2) and long-term mass change (Section 3.3).

**Figure 8.** Long-term variation of mean monthly TWS of Turkey concerning (**a**) spring, (**b**) summer, (**c**) autumn, and (**d**) winter period from 2002 to 2016.

#### *3.4. Impacts on TWS Variations*

In this section, we performed statistical analysis with IBM SPSS25 software in order to understand descriptive measures and correlations between studied variables and GRACE TWS.

#### 3.4.1. Impact of Precipitation

In order to understand the impact of precipitation on TWS variations, we used the precipitation estimated from TRMM (cm/month), which is an average rate over a month, and compared it to the GRACE data. The comparison is shown in Figure 9, which presents the mean monthly precipitation from the TRMM model and the mean monthly TWS derived from GRACE. The TRMM-derived precipitation ranges from 0.71 (July 2015) to 15.36 cm/month (December 2012), while the GRACE-derived TWS ranges from −17.48 (August 2008) to 20.64 cm (April 2006). The amplitude of the TRMM signal is smaller than that of the GRACE data. This can be explained by the fact that the GRACE data, which display more pronounced increasing and decreasing trends, are affected not only by precipitation but also by other parameters, as we further investigate in this paper.

**Figure 9.** Mean monthly precipitation (cm/month) from TRMM model (in red) and mean monthly GRACE TWS (cm, in blue with circle) of Turkey during 2002–2016.

To quantify the correlation between the TRMM and GRACE TWS time series, the data are first checked for normality with the Kolmogorov-Smirnov test. The results show that the TRMM distribution is not normal (p = 0.036 < 0.05). Hence, Spearman's rho is preferred for studying the correlation between the variables. As seen in Table 3, there is a significant positive correlation of 0.34 between TRMM and GRACE TWS.
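For readers who wish to reproduce a rank correlation of this kind without SPSS, a minimal sketch of Spearman's rho is given below. It follows the textbook definition (the Pearson correlation of the ranks, with tied values sharing their mean rank) and is not the SPSS implementation used in this study. The function names are hypothetical, and the normality check and the significance test are not included.

```python
import math

def average_ranks(values):
    """Ranks starting at 1; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks

def pearson(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def spearman_rho(x, y):
    """Spearman's rho: Pearson correlation applied to the ranks of x and y."""
    return pearson(average_ranks(x), average_ranks(y))
```

With the monthly TRMM precipitation and GRACE TWS series as `x` and `y`, `spearman_rho(x, y)` would give a coefficient comparable to the one in Table 3. Libraries such as SciPy (`scipy.stats.spearmanr`) provide the same statistic together with a p-value.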


**Table 3.** Correlation between TRMM precipitation and GRACE TWS according to Spearman's rho method.

\*\* Correlation is significant at the 0.01 level (2-tailed).

#### 3.4.2. Impact of Rainfall Rate

Rainfall rate values (kg/m<sup>2</sup>) are available in the GLDAS model data files. We extracted the corresponding values in the studied region, converted them to cm/month, averaged them, and calculated the residual. Figure 10 shows the residual mean monthly rainfall rate (cm/month) extracted from GLDAS models (MOS, NOAH, VIC) and from TRMM, together with the residual mean monthly TWS variation from GRACE data, from 2002 to 2016 for Turkey. The rainfall data vary between −3.5 and 6.5 cm/month (MOS) and between −5 and 9.5 cm/month (TRMM). In Figure 10, a good agreement between all GLDAS models (MOS, NOAH, VIC) and TRMM data is observed. The correlation between GRACE TWS, TRMM precipitation, and the GLDAS models' rainfall rate is given in Table 4.

**Figure 10.** Residual mean monthly rainfall rate (cm/month) from GLDAS-MOS (in orange), GLDAS-NOAH (in red), and GLDAS-VIC (in green) models, and from TRMM (in red striped) and residual mean monthly TWS variation from GRACE data (in blue with circle) from 2002 to 2016 for Turkey.

**Table 4.** Correlation between GRACE TWS, TRMM precipitation, and GLDAS models (MOS, NOAH, VIC) rainfall rate according to Spearman's rho method.


<sup>\*\*</sup> Correlation is significant at the 0.01 level (2-tailed).

There is a positive correlation between GRACE TWS and the rainfall rate derived from GLDAS models (r = ~0.24) and a strong positive correlation between TRMM precipitation and the rainfall rate derived from GLDAS models (r = ~0.82).

#### 3.4.3. Impact of Evapotranspiration

The estimation of evapotranspiration (ET) has been performed by using the water balance equation [60,61] as is traditional for a closed basin area, in the case of available observed streamflow data (see Equation (6)):

$$ET(t) = P(t) - R(t) - TWS\_{GRACE} \tag{6}$$

where *t* is the time; *TWS<sub>GRACE</sub>* is the water storage derived from GRACE data; and *P*(*t*), *R*(*t*), and *ET*(*t*) are the precipitation provided by TRMM, the runoff extracted from the available GLDAS files of the different models, and the evapotranspiration, respectively. Figure 11 compares the residual mean monthly evapotranspiration values calculated from Equation (6) with the GRACE TWS data in order to understand the role of evapotranspiration in the TWS signal. Evapotranspiration values range from −3.8 to 8.6 cm/month (VIC). This finding shows that both data sets (GLDAS evapotranspiration and GRACE TWS) are in phase. The amplitudes of evapotranspiration seem to impact the rainfall rate equally to the TWS amplitudes.
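The water balance of Equation (6) can be evaluated month by month as in the short sketch below. It is only an illustration under stated assumptions: all three inputs are taken to be regional means already expressed in the same unit (cm/month), the storage term is used exactly as it appears in Equation (6), and the "residual" is assumed to mean the series with its temporal mean removed. The function names are hypothetical.

```python
def evapotranspiration(precip, runoff, tws):
    """Element-wise ET(t) = P(t) - R(t) - TWS_GRACE(t), as written in Equation (6)."""
    if not (len(precip) == len(runoff) == len(tws)):
        raise ValueError("series must have the same length")
    return [p - r - s for p, r, s in zip(precip, runoff, tws)]

def residual(series):
    """Series minus its temporal mean (assumed definition of the residual)."""
    mean = sum(series) / len(series)
    return [v - mean for v in series]
```

A call such as `residual(evapotranspiration(p_trmm, r_gldas, tws_grace))` would then give a series of the kind plotted in Figure 11, where the three argument names stand for hypothetical monthly input lists.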

**Figure 11.** Mean monthly evapotranspiration (cm/month) calculated from precipitation (TRMM) minus runoff (from model (MOS, NOAH, VIC)) minus GRACE TWS variations between 2002 and 2016 for Turkey.

#### 3.4.4. Impact of Soil Moisture

First, for each GLDAS land model, soil moisture values (kg/m<sup>2</sup>) available in the GLDAS data files are extracted from the global grids and then reduced to the studied region. The soil moisture values are then summed over the model layers (three layers for MOS and VIC, four layers for NOAH), converted to cm, and averaged, and the residual mean monthly variations are finally calculated. Figure 12 shows the comparison between the residual mean monthly soil moisture variation (cm) obtained from GLDAS models and the residual mean monthly GRACE TWS of Turkey from 2002 to 2016. Soil moisture values range from −15.4 to 18.9 cm (MOS), while GRACE TWS data range from −16 to 20 cm. According to Figure 12, it can be concluded that for Turkey, soil moisture is the parameter that contributes most to the GRACE TWS signal. This statement is supported not only by the MOS model, but also by the other GLDAS models (NOAH, VIC). Table 5 shows the correlation coefficients between GRACE TWS and soil moisture obtained from GLDAS models (MOS, NOAH, VIC).
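The layer summation and unit conversion described above can be sketched as follows. The conversion uses the fact that 1 kg/m<sup>2</sup> of water corresponds to a 1 mm layer, i.e., 0.1 cm. The function names and the nested-list input layout are hypothetical, the regional mean is unweighted, and the residual is assumed to be the deviation from the temporal mean.

```python
KG_M2_TO_CM = 0.1  # 1 kg/m^2 of water is a 1 mm layer, i.e. 0.1 cm

def total_soil_moisture_cm(layers_kg_m2):
    """Sum the soil layers of one grid cell and convert kg/m^2 to cm of water."""
    return sum(layers_kg_m2) * KG_M2_TO_CM

def regional_residuals(monthly_layers):
    """Residual regional-mean soil moisture (cm) for each month.

    monthly_layers: for each month, a list of grid cells; each cell is a list
    of per-layer values in kg/m^2 (3 layers for MOS/VIC, 4 for NOAH).
    """
    means = []
    for cells in monthly_layers:
        totals = [total_soil_moisture_cm(layers) for layers in cells]
        means.append(sum(totals) / len(totals))  # unweighted regional mean
    overall = sum(means) / len(means)            # temporal mean (assumed reference)
    return [m - overall for m in means]
```

Because the number of layers is taken from the length of each cell's list, the same function can be applied to the three-layer and the four-layer models without modification.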

**Figure 12.** Residual mean monthly soil moisture variation (cm) of Turkey from GLDAS-MOS (in orange), GLDAS-NOAH (in red), and GLDAS-VIC (in green) models, and residual mean monthly TWS variation from GRACE data (in blue with circle) from 2002 to 2016.


**Table 5.** Correlation between GRACE TWS and soil moisture derived from GLDAS models (MOS, NOAH, VIC) according to Spearman's rho method.

\*\* Correlation is significant at the 0.01 level (2-tailed).

The correlation coefficient between GRACE TWS and soil moisture derived from the MOS model (r = 0.84) indicates a strong positive relation between these two time series and supports the view that the GRACE TWS signal is mostly dependent on soil moisture content in the studied region. To sum up, soil moisture is a key parameter in terms of drought monitoring because, as mentioned previously in Section 3.1, insufficient soil moisture marks the turning point from classical drought, caused by a lack of precipitation, to agricultural drought. GRACE TWS time series are very sensitive to agricultural drought (2006–2008).

#### *3.5. Understanding the Drought and Its Relation with Climatic Change*

In Section 3.4, we investigated the impact of different parameters (e.g., precipitation, soil moisture, etc.) on the TWS time series and their correlation with it. We found that precipitation is an important parameter which governs the pattern of the spatial mass distribution (Section 3.2) and that soil moisture produces the most important part of the GRACE TWS signal (Section 3.4.4). In this part, we combine the available data and our findings with the self-calibrating Palmer Drought Severity Index and the ENSO and NAO indices, which also use precipitation, temperature, soil moisture, and so on [51,52], to understand the type, variability, and severity of a drought event.

#### 3.5.1. Self-Calibrating Palmer Drought Severity Index (SCPDSI)

In Figure 13, non-seasonal GRACE TWS data are compared with the SCPDSI drought severity index.

The SCPDSI time series shows good agreement with the GRACE-derived residual TWS in terms of the increasing and decreasing trends of the signal. From a general point of view, in Figure 13, except for some small variations, the progressively decreasing periods (2002–2008, 2010–2014) and increasing periods (2008–2010, 2014–2016) of the SCPDSI time series follow those of the GRACE TWS. Nevertheless, the SCPDSI index is not as sensitive as the GRACE data to small variations, especially during 2002–2006 (it misses the increase in 02/2003–02/2004). September 2004 (which may also be an overestimation of the GRACE solution), 2008, and October 2014 appear as dramatic drought cases in both time series.

Statistically speaking, the Kolmogorov-Smirnov test gives p = 0.2 (> 0.05), so normality is not rejected and the Pearson correlation can be used here to study the correlation between GRACE TWS and the self-calibrating Palmer Drought Severity Index (SCPDSI), as shown in Table 6.

A positive correlation of r = 0.25 is found. To sum up, according to GRACE data, the SCPDSI index, and historical data, the drought can be primarily categorized as meteorological. At this step, we extend our results and study the underlying processes of the drought event more deeply, i.e., the causes of the lack of precipitation from the point of view of climatic change. For this reason, in the next section (Section 3.5.2), GRACE TWS anomalies are compared with the ENSO and NAO indices to improve our understanding of the occurrence of drought events.

**Figure 13.** Mean monthly residual TWS variation from GRACE data (blue circled line, seasonal signal removed) and mean monthly self-calibrating Palmer Drought Severity Index (orange dotted line) from ~2002 to 2016 of Turkey.



\*\* Correlation is significant at the 0.01 level (2-tailed).

#### 3.5.2. El Niño Southern Oscillation (ENSO) and North Atlantic Oscillation (NAO) Indices

Figure 14 compares the SST3.4 and NAO time series with the non-seasonal GRACE TWS anomaly time series.

**Figure 14.** Mean monthly residual TWS variation from GRACE data (blue circled line, seasonal signal removed) compared with the ENSO index Niño SST3.4 (El: El Niño and La: La Niña; Warm: orange shading and Cool: cyan shading; prepared as in [62]) and with the monthly mean NAO index (in black).

According to [63], ENSO also affects the Mediterranean winter climate. During El Niño events, the Mediterranean cyclone track is shifted northward, which affects precipitation. Moreover, less precipitation during cold events, but more precipitation during warm events, has been found in southwestern Europe as well as the Black Sea area [64]. Figure 14 reveals that cold events (La Niña: cyan shading) occur between 2006–2009 and 2010–2014. This can be interpreted as a climatic impact creating a lack of precipitation in Turkey. This interpretation is also supported by the decreasing trend of the GRACE TWS and SCPDSI (Figure 13) time series. This lack of precipitation, possibly due to the La Niña phase, first results in a meteorological drought (Figure 13), which then turns into agricultural and hydrological drought, as mentioned in [55]. Spatially speaking, as possible evidence of the impact of ENSO, as mentioned in [64], the Black Sea coasts in Figure 6b1 show a low (≤3 cm) annual amplitude of precipitation estimated from the TRMM model, unlike the other coasts.

According to Figure 14, the warm phase (El Niño, orange shading) is observed between 2002 and the beginning of 2007 (with some interaction of the cold phase creating instantaneous drops) and in 2014–2016. The climatic impact of the warm phase, resulting in an increase in precipitation, seems to appear as the increase of GRACE TWS and ENSO, while SCPDSI is more sensitive to the decrease of signal amplitude in this period due to the interaction of the cold phase (Figure 13). GRACE TWS anomalies (Figure 14) and rainfall rate (Figure 10) are important for the time intervals cited above. Spatially speaking, the larger annual amplitudes of precipitation (≥7 cm) estimated from the TRMM model (Figure 6b1) are observed in the southeastern, Mediterranean, and Aegean parts of Turkey.

Concerning the NAO index, the time series does not show any strong positive or negative anomalies or trends. As a reminder, strong positive phases (+) tend to be associated with below-average precipitation, while strong negative phases (−) are related to above-average temperature and precipitation anomalies. In this case, the amplitude of the NAO time series is small. It can be concluded that NAO does not have any significant impact on the studied area. Table 7 shows the correlation of GRACE TWS with the ENSO and NAO indices. There is a positive correlation of about r = 0.3 between GRACE TWS and the ENSO index. Additionally, GRACE TWS and the NAO index show a small negative correlation (r = −0.05).


**Table 7.** Correlation between residual GRACE TWS and ENSO Index.

\*\* Correlation is significant at the 0.01 level (2-tailed).

#### **4. Discussion**

Our results provide a broad context for the current hydrological status in Turkey by combining various external data sets (e.g., hydrological models, remote sensing techniques, drought indices) and reveal new drought events, the spatial extent of the mass change, long-term variation, impacts on TWS, and the effect of climatic change, in addition to the previous studies within the GRACE mission operation time. The limitations of the work are related to the spatial resolution of the GRACE mission, which does not allow monitoring of very high resolution surface mass change (e.g., dams, reservoirs); the lack or inaccessibility of the in-situ rainfall data, which could provide information about groundwater withdrawal; and, more specifically, the limited satellite mission lifetime, which ended in 2017. Even though there is a follow-on mission, "GRACE-FO", which started operating on 22 May 2018, there is a data gap between the two missions. For this reason, as future research to bring a new perspective to drought analysis, we aim to develop statistical modelling from the GRACE time series. The connection between GRACE and GRACE-FO provided by statistical modelling will be valuable for the continuity of drought monitoring and prediction, especially in the case of missing data within the GRACE-FO operation period. We also demonstrated that GRACE is more sensitive to agricultural and hydrological drought and less sensitive to meteorological drought, which occurs in the case of a lack of precipitation, increase of temperature, and decrease in humidity. GRACE data might be combined with data sets derived from remote sensing techniques that measure the above-cited external variables to conduct a more sensitive analysis and to predict meteorological droughts [65]. The combined solutions can be assessed in different regions to test the sensitivity of the GRACE data in differentiating types/states of drought.

#### **5. Conclusions**

According to the drought analysis in this paper based on GRACE-derived TWS time series, Turkey experienced dramatic drought events in 09/2004 (which may also be an overestimation of the GRACE solution), 09/2008, and 10/2014. Moreover, TWS decreasing periods are recorded as follows: 04/2002–09/2004; 02/2006–09/2008; 03/2010–10/2014. In terms of drought assessment, the decreasing trend observed in the GRACE TWS time series can help to predict a possible drought nine months earlier (starting from 02/2006) than previous studies which do not take satellite gravity data into account (and see it only since 11/2006). Moreover, the GRACE signal is more sensitive to agricultural and hydrological drought than to meteorological drought. Spatially, mass amplitudes are larger (20 cm) in the eastern part than in the central (2 cm) or Aegean parts, related to the received precipitation levels. Shorelines also show distinctive values compared to the central part. There is an acceleration signal from the eastern side to the western side, which is related to the precipitation. Concerning long-term mass change, Turkey experiences a decreasing trend on the order of 1 cm/yr. Rainfall rate, evapotranspiration, and precipitation constitute a small part of the signal, while soil moisture is the parameter most affecting the GRACE signal in the studied region, as the soil moisture values derived from GLDAS models and the GRACE TWS results are strongly correlated (r = 0.84). Precipitation has a specific impact on the pattern of the spatial mass distribution. In Turkey, we observed a meteorological drought turning into agricultural and hydrological drought due to the climatic impact of La Niña (cold phase), which results in a lack of precipitation. The GRACE signal is very sensitive to this climatic change. It is worth mentioning that the NAO index does not show any meaningful anomalies or correlation with GRACE TWS (r = −0.05).
Finally, in order to monitor and estimate possible drought conditions in real time in the future, either in Turkey or in another region, we propose combining the new and up-to-date satellite gravity mission data (GRACE-FO), which offer more accurate measurements and provide information about mass decreasing and increasing trends; precipitation data, to understand the spatial mass distribution patterns; soil moisture data (from models and in situ), to monitor the occurrence of a possible agricultural drought; the ENSO index, to predict possible excess or deficiency in precipitation; and the drought indices, which provide information about the type and variability of the drought.

**Author Contributions:** Data curation, G.O.A.; Funding acquisition, S.J.; Methodology, G.O.A. and S.J.; Resources, G.O.A.; Software, G.O.A.; Supervision, S.J.; Writing – original draft, G.O.A.; Writing – review & editing, G.O.A. and S.J.

**Funding:** The work is supported by the Strategic Priority Research Program of Chinese Academy of Sciences (Grant No. XDA23040102) and Startup Foundation for Introducing Talent of NUIST (Grant No. 2243141801036).

**Acknowledgments:** We thank the following organizations for providing the data used in this work: the GRACE Project and CSR (Center for Space Research, Univ. Texas), the TRMM projects, and a Global Land Data Assimilation System (GLDAS) developed jointly by scientists at the National Aeronautics and Space Administration (NASA) Goddard Space Flight Center (GSFC) and the National Oceanic and Atmospheric Administration (NOAA) National Centers for Environmental Prediction (NCEP). We are very thankful to three anonymous reviewers who helped us in improving the quality of our manuscript and also to Assist. Kamil Teke, Assoc. Prof. Semra Türkan from Hacettepe University, and Mustafa Serkan Işık from Istanbul Technical University for their help.

**Conflicts of Interest:** The authors declare no conflict of interest.

#### **References**


© 2019 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).

## *Article* **GOCE-Derived Coseismic Gravity Gradient Changes Caused by the 2011 Tohoku-Oki Earthquake**

#### **Xinyu Xu 1,2, Hao Ding 1,\*, Yongqi Zhao 1, Jin Li <sup>3</sup> and Minzhang Hu <sup>4</sup>**


Received: 11 April 2019; Accepted: 25 May 2019; Published: 30 May 2019

**Abstract:** In contrast to most of the coseismic gravity change studies, which are generally based on data from the Gravity field Recovery and Climate Experiment (GRACE) satellite mission, we use observations from the Gravity field and steady-state Ocean Circulation Explorer (GOCE) Satellite Gravity Gradient (SGG) mission to estimate the coseismic gravity and gravity gradient changes caused by the 2011 Tohoku-Oki Mw 9.0 earthquake. We first construct two global gravity field models up to degree and order 220, before and after the earthquake, based on the least-squares method, with a bandpass Auto Regression Moving Average (ARMA) filter applied to the SGG data along the orbit. In addition, to reduce the influences of colored noise in the SGG data and the polar gap problem on the recovered model, we propose a tailored spherical harmonic (TSH) approach, which only uses the spherical harmonic (SH) coefficients with the degree range 30–95 to compute the coseismic gravity changes in the spatial domain. Then, both the results from the GOCE observations and the GRACE temporal gravity field models (with the same TSH degrees and orders) are simultaneously compared with the forward-modeled signals that are estimated based on the fault slip model of the earthquake event. Although there are considerable misfits between GOCE-derived and modeled gravity gradient changes (Δ*Vxx*, Δ*Vyy*, Δ*Vzz*, and Δ*Vxz*), we find analogous spatial patterns and a significant change (greater than 3σ) in gravity gradients before and after the earthquake. Moreover, we estimate the radial gravity gradient changes from the GOCE-derived monthly time-variable gravity field models before and after the earthquake, whose amplitudes are at a level over three times that of their corresponding uncertainties, and are thus significant. 
Additionally, the results show that the recovered coseismic gravity signals in the west-to-east direction from GOCE are closer to the modeled signals than those from GRACE in the TSH degree range 30–95. This indicates that the GOCE-derived gravity models might be used as additional observations to infer/explain some time-variable geophysical signals of interest.

**Keywords:** coseismic gravity gradient changes; gravity field model; GOCE; GRACE

#### **1. Introduction**

Because satellite gravity observations are not limited to Earth's surface conditions, and can cover the whole Earth quickly, they provide an independent way to detect the coseismic effects of large earthquakes, which is a good complement for other earthquake measurements (e.g., surface deformations), and are of great scientific significance.

A new generation of satellite gravimetry missions, including CHAMP (CHAllenging Minisatellite Payload) [1], GRACE (Gravity field Recovery and Climate Experiment) [2] and GOCE (Gravity field and steady state Ocean Circulation Explorer) [3], have been successfully implemented to detect global static and time-variable gravity signals. A large number of real observations and derived products from the CHAMP, GRACE and GOCE missions have been applied to Earth-science research, especially temporal gravity signals derived from the GRACE mission, which are used to monitor mass transport in the Earth system. The detection of coseismic gravity change signals using GRACE data has been widely studied. Generally, monthly gravity field models from GRACE before and after the earthquake were used to derive coseismic gravity or gravity gradient changes. Long-to-medium-wavelength coseismic and post-seismic gravity changes from large-scale earthquakes (e.g., the 2004 Sumatra-Andaman Mw = 9.1, 2010 Maule Mw = 8.8, 2011 Tohoku-Oki Mw = 9.0) have been adequately detected by the GRACE mission [4–17]. Wang et al. [13] and Li and Shen [17] also determined the coseismic gravity gradient changes caused by the 2011 Japan Tohoku-Oki earthquake. The GOCE mission has been proven successful for constructing regional geoid models in combination with EGM2008 and terrestrial gravity datasets [18,19], and can be used to study lithospheric modeling, dynamic topography, and glacial isostatic adjustment [20–23]. However, only a few studies have focused on inferring the coseismic gravity change signals from simulated data [24–26] or real GOCE observations [27–29].

The main scientific objective of the GOCE mission is to recover static gravity field models up to degree and order (d/o) 200 by using SGG (Satellite Gravity Gradient) observations [3]. Because of the lower sensitivity of the gradiometer on board of the GOCE satellite at lower frequencies, the gravity gradient observations are not sensitive to temporal signals, the main power of which is at long wavelengths (commonly, at spatial resolutions higher than 1500 km) [30,31]. However, the coseismic gravity change signals from the 2011 Japan Tohoku-Oki earthquake were still detected by the GOCE mission [27,28]. Garcia et al. [27] showed that GOCE's gradiometer, acting similarly to a seismometer in orbit, recorded the sound waves from the 2011 Tohoku earthquake. Fuchs et al. [28] combined a new vertical gravity gradient *V zz* from the diagonal components (*Vxx*, *Vyy*, and *Vzz*) along the orbit. The coseismic radial gravity gradient changes Δ*Vzz* in Fuchs et al. [28] were derived by subtracting the vertical gravity gradients computed by a reference model GOCO03s [32] from *V zz* with an along track and spatial filter applied to deal with the colored noise. Fuchs et al. [28] mentioned that they did not use the spatiospectral localization method like Han and Ditmar [26] to process geopotential coefficients, because there were no released global gravity field models (GGMs) from the post-earthquake GOCE data that could be used to detect coseismic gravity gradient changes.

In this paper, we choose a different approach from that proposed in Fuchs et al. [28]. First, we recover the GGMs (SH coefficients) before and after the earthquake from GOCE gravity gradient observations (*Vxx*, *Vyy*, and *Vzz*) along the orbit, based on the least-squares method. Note that Fuchs et al. [28] did not estimate the SH coefficients of the pre- and post-earthquake global time-variable gravity field models. Then, we estimate the gravity changes (Δ*g*) and the gravity gradient changes (Δ*Vxx*, Δ*Vyy*, Δ*Vzz*, and Δ*Vxz*) from the differences between the two recovered global gravity models. To weaken the influence of the colored noise and the polar gap problem of the GOCE SGG observations on the recovered model, we use tailored spherical harmonic (TSH) coefficients to recover the coseismic gravity change. Finally, we compare the results with those obtained from forward-modeled signals and from the GRACE monthly gravity field models. Compared to Fuchs et al. [28], our approach can determine the coseismic gravity changes and the coseismic gravity gradient changes of other gravity gradient tensor (GGT) components in addition to Δ*Vzz*. The content of this manuscript is organized into four sections, beginning with a description of the methodology of GOCE data processing, forward modeling of coseismic gravity gradient changes, and the computation of TSH coefficients in Section 2. The results and analysis are presented in Section 3. A summary and concluding remarks are provided in Section 4.

#### **2. Methodology**

#### *2.1. Recovering Gravity Field Model from GOCE SGG Observations*

In this paper, we derive the coseismic gravity and gravity gradient changes by evaluating the differences between a post-earthquake gravity field model and a pre-earthquake one from GOCE data; namely, we first recover gravity field models from the Gravity field and steady state Ocean Circulation Explorer Satellite Gravity Gradient (GOCE SGG) observations before and after the 2011 Tohoku-Oki earthquake. Here, we provide a brief description of the data processing strategies of recovering a gravity field model from GOCE SGG observations (more details can be found in Xu et al. [33]).

The component *Vij* of the second-order gravity gradient tensor (GGT) is normally defined in the Local North-Oriented Frame (LNOF) with the *x*-axis pointing north, the *y*-axis pointing west and the *z*-axis up, as follows [33]:

$$V_{ij}(r,\theta,\lambda) = \sum_{m=0}^{\infty} \left[ A_m^{ij}(r,\theta) \cos m\lambda + B_m^{ij}(r,\theta) \sin m\lambda \right], \quad \left\{ \begin{matrix} A_m^{ij}(r,\theta) \\ B_m^{ij}(r,\theta) \end{matrix} \right\} = \sum_{n=m}^{\infty} H_{nm}^{ij}(r,\theta) \left\{ \begin{matrix} \overline{C}_{nm} \\ \overline{S}_{nm} \end{matrix} \right\} \tag{1}$$

where the indices *i* and *j* define the gravitational gradient components (*xx*, *yy*, *zz*, *xy*, ... ) with respect to the LNOF axes (*x*, *y*, *z*); the indices *n* and *m* are the degree and order, respectively, of the spherical harmonic expansion; and (*r*, θ, λ) are the spherical coordinates in the Earth's Fixed Reference Frame (EFRF), where *r* is the geocentric radius, and θ and λ are the spherical colatitude and longitude, respectively. *Cnm* and *Snm* are the (fully normalized) geopotential cosine and sine coefficients. *A*<sub>*m*</sub><sup>*ij*</sup>(*r*, θ) and *B*<sub>*m*</sub><sup>*ij*</sup>(*r*, θ) are the Fourier coefficients, and *H*<sub>*nm*</sub><sup>*ij*</sup>(*r*, θ) are the transform coefficients; for more details, please refer to Koop [34].

Based on Equation (1), the functional and statistical models of the gravitational field recovered from the Satellite Gravity Gradient (SGG) data are defined as a standard Gauss-Markov model. Then, we can determine the geopotential coefficients *Cnm* and *Snm* by the least-squares (LS) method. However, the observed GGT of the GOCE mission is given in the Gradiometer Reference Frame (GRF), so the observations would normally be transformed from the GRF to the LNOF by using the rotation matrix **R**<sub>L</sub><sup>G</sup> [33]. Because the accuracies of the *Vxy* and *Vyz* components are lower than those of the other components, they would contaminate the high-precision components (*Vxx*, *Vyy*, *Vzz*, and *Vxz*) in this transformation. To avoid this, we transform the base functions (*H*<sub>*nm*</sub><sup>*ij*</sup>(*r*, θ) cos *m*λ and *H*<sub>*nm*</sub><sup>*ij*</sup>(*r*, θ) sin *m*λ) in Equation (1) instead of transforming the GGT observations in the GRF. In particular, we multiply the base functions in Equation (1) by the matrix **R**<sub>L</sub><sup>G</sup> from the left and by its transpose **R**<sub>G</sub><sup>L</sup> from the right at every epoch. Hence, we have

$$\mathbf{V}^{\rm GRF} = \mathbf{R}_{\rm L}^{\rm G} \mathbf{V}^{\rm LNOF} \mathbf{R}_{\rm G}^{\rm L} \tag{2}$$

where **V**<sup>GRF</sup> and **V**<sup>LNOF</sup> represent the gravitational gradient tensor in the GRF and LNOF, respectively.
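The similarity transformation in Equation (2) can be illustrated with a minimal sketch. The rotation used here (a small rotation about the *z*-axis) and the tensor values are purely illustrative assumptions, not the actual GOCE attitude or measurements; the function names are hypothetical. The sketch only shows that the transformation preserves the trace and symmetry of the tensor and can be undone with the transposed matrix.

```python
import math

def matmul(a, b):
    """Multiply two 3x3 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def transpose(a):
    """Transpose a 3x3 matrix."""
    return [[a[j][i] for j in range(3)] for i in range(3)]

def rotation_z(angle_rad):
    """Rotation about the z-axis; a hypothetical stand-in for R_L^G."""
    c, s = math.cos(angle_rad), math.sin(angle_rad)
    return [[c, s, 0.0], [-s, c, 0.0], [0.0, 0.0, 1.0]]

def rotate_tensor(r, v):
    """Apply the form of Equation (2): V_GRF = R V_LNOF R^T."""
    return matmul(matmul(r, v), transpose(r))

def trace(m):
    """Sum of the diagonal components."""
    return m[0][0] + m[1][1] + m[2][2]

# Illustrative symmetric, trace-free tensor in the LNOF (made-up values, in E).
v_lnof = [[-1370.0, 2.0, 5.0],
          [2.0, -1370.0, -3.0],
          [5.0, -3.0, 2740.0]]

r_lg = rotation_z(math.radians(3.0))
v_grf = rotate_tensor(r_lg, v_lnof)
v_back = rotate_tensor(transpose(r_lg), v_grf)  # undo the rotation

print("trace before:", trace(v_lnof), "trace after:", round(trace(v_grf), 9))
```

Because every rotated component is a linear combination of all components in the other frame, a noisy off-diagonal element would leak into the accurate ones, which is the contamination the text seeks to avoid by rotating the base functions instead of the observations.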

The power spectral density (PSD) of the trace of the GGT represents the total error of the sum of the diagonal components (*Vxx*, *Vyy*, *Vzz*). According to Figure 1, the SGG observations are high-precision only within the designed measurement bandwidth (MBW) from 0.005 to 0.1 Hz [35]. Outside the MBW, the noise increases with decreasing frequency below 0.005 Hz and with increasing frequency above 0.1 Hz; the 1/*f* behavior at low frequencies in particular shows the colored-noise character of the gradiometer [35]. To handle this colored noise in the SGG data, we apply a band-pass autoregressive moving-average (ARMA) filter with a pass-band of 0.005–0.041 Hz to both sides of the linear observation equation formed from Equations (1) and (2) [36]. The maximum frequency (0.041 Hz) of the pass-band approximately corresponds to the maximum degree (220) of the recovered gravitational potential model, based on the following formula:

$$f_{\text{max}} = N_{\text{max}} / T_r \tag{3}$$

where *Tr* = 5383 s is the period of one satellite orbital revolution (cf. [37]).
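As a quick numerical check of Equation (3), the following sketch converts between SH degree and along-track frequency using the revolution period quoted above. The function names are illustrative only.

```python
T_R = 5383.0  # period of one orbital revolution in seconds (value from the text)

def degree_to_frequency(n_max, t_r=T_R):
    """Equation (3): frequency (Hz) associated with SH degree n_max."""
    return n_max / t_r

def frequency_to_degree(f_hz, t_r=T_R):
    """Inverse of Equation (3): SH degree associated with a frequency (Hz)."""
    return f_hz * t_r

print(round(degree_to_frequency(220), 4))     # 0.0409 Hz, upper pass-band edge
print(round(frequency_to_degree(0.005), 1))   # 26.9, i.e. about degree 27
print(round(frequency_to_degree(0.0175), 1))  # 94.2, close to degree 95
```

The lower pass-band edge of 0.005 Hz therefore maps to roughly degree 27, and the upper edge of 0.041 Hz to roughly degree 220.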

According to Figure 1, the noise level of the components *Vxx* and *Vyy* is about 10 mE/√Hz (1 mE = 10<sup>−12</sup> s<sup>−2</sup>), and that of *Vzz* is about 20 mE/√Hz, which is consistent with Rummel et al. [38]. Thus, to combine the diagonal components (*Vxx*, *Vyy*, and *Vzz*) in the LS adjustment, we set the ratio of the standard deviation factors of *Vxx*, *Vyy*, and *Vzz* to 1:1:2.
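A standard-deviation ratio of 1:1:2 translates into relative least-squares weights proportional to the inverse variances. The sketch below assumes this conventional 1/σ² weighting; how the weights enter the authors' actual normal equations is not specified here.

```python
def relative_weights(sigma_ratios):
    """Weights proportional to 1/sigma^2, scaled so that the first weight is 1."""
    raw = [1.0 / s ** 2 for s in sigma_ratios]
    return [w / raw[0] for w in raw]

# Standard deviation factors of Vxx : Vyy : Vzz from the text.
weights = relative_weights([1.0, 1.0, 2.0])
print(weights)  # [1.0, 1.0, 0.25]
```

Under this assumption, a *Vzz* observation contributes one quarter of the weight of a *Vxx* or *Vyy* observation.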

**Figure 1.** Power spectral densities of the diagonal components (*Vxx*, *Vyy*, and *Vzz*) and trace of the Gravity field and steady state Ocean Circulation Explorer (GOCE) gravity gradient tensor.

#### *2.2. Tailored Spherical Harmonic Coefficients*

Based on the data processing strategies in Section 2.1, the post-earthquake and pre-earthquake gravity field models can be recovered from GOCE SGG data. However, the lower and higher frequency bands of the recovered models are heavily influenced by the 0.005–0.041 Hz band-pass ARMA filter. Moreover, the polar gap problem of GOCE's orbit mainly affects the low-order spherical harmonic (SH) coefficients. We perform a numerical simulation to show how the recovered model's SH coefficients are affected by the band-pass filter and the polar gap problem. First, 61 days of GOCE satellite orbits with 1 s sampling are produced by polynomial interpolation from the released reduced-dynamic orbit data (SST\_PRD\_2), which have 10 s sampling. Then, we simulate the GGT observations along the orbit by using the EGM2008 model up to d/o 220. We add simulated colored noise, generated according to the prior PSD of the GGT's trace as given by Cesare [35], to the simulated GGT data. Finally, by using these simulated observations, we recover the gravity field model based on the LS approach with the 0.005–0.041 Hz band-pass filter applied. The absolute errors of the recovered model's SH coefficients are obtained by calculating the difference between the recovered model and the EGM2008 model. Figure 2 displays the error spectra of the SH coefficients in log10 scale. According to this figure, the SH coefficients up to d/o 160 with large errors are mainly located in the arrow-shaped region, which is framed by the red lines. Referring to the conclusions of Sneeuw [39], the errors of the near-zonal coefficients (*m* ≤ 10) are mainly caused by the ill-posed problem resulting from the 96.7° inclination of the GOCE satellite orbit. The other coefficients with large errors in the arrow-shaped region are due to the lower frequency limit of the band-pass filter. The coefficients outside the arrow region, which correspond to two symmetrical quadrilaterals, are called the tailored spherical harmonic (TSH) coefficients here. These TSH coefficients will be used to show the coseismic gravity and gravity gradient changes.

As shown in Figure 2, the degree of the middle corner coefficients in the arrow-shaped region is approximately 30, which is very close to the degree 27 estimated by Equation (3), according to the minimum frequency of the 0.005–0.041 Hz band-pass filter. Thus, the minimum degree of the TSH coefficients is set to 30 herein. The vertices of the lower-left and lower-right corners of the arrow region approximately correspond to a degree of 80. Fuchs et al. [28] presented their results according to three bandwidths (0.005–0.05, 0.0039–0.03, and 0.00475–0.0175 Hz), and the bandwidth of 0.00475–0.0175 Hz was determined by the matched filter approach, which maximizes the expected signal. The degree range 30–95 approximately corresponds to the bandwidth of 0.005–0.0175 Hz [28]. So, we also choose 95 (see the red dashed line in Figure 2) for the maximum degree of the TSH coefficients. Moreover, the maximum degree of 95 is very close to the maximum d/o of the SH coefficients of Gravity field Recovery and Climate Experiment (GRACE)'s time-variable gravity field model, so we can compare the results from GRACE with those from GOCE at the same frequency band.

**Figure 2.** Error spectra of the recovered spherical harmonic (SH) coefficients in log10 scale compared to EGM2008.

#### *2.3. Post-earthquake Gravity Changes from GOCE*

Based on the theory and data processing strategies described above, we estimated two GGMs (the pre-earthquake model and the post-earthquake model) up to d/o 220 from the released GOCE calibrated and corrected gravity gradients in the GRF (EGG\_NOM\_2 products) (GGT and IAQ) and the precise science orbits with quality report (SST\_PSO\_2 products) (PRD and PRM) [40] before and after the 2011 Japan Tohoku-Oki earthquake. The IAQ products are the GRF to IRF attitude quaternions, and the PRM products are the Earth Fixed Reference Frame (EFRF) to Inertial Reference Frame (IRF) quaternions. We selected the data period from 1 November 2009 to 28 February 2011 for the pre-earthquake time span, and the period from 15 March 2011 to 31 May 2012 for the post-earthquake time span. Only the high-precision diagonal components (*Vxx*, *Vyy*, and *Vzz*) of the GGT are selected for modeling. Before forming the observation equation, some data preprocessing tasks were performed, such as data interpolation and outlier detection; for a more detailed description of the GGT data processing, please refer to Xu et al. [33]. To reduce the influence of high-frequency errors, a Gaussian filter with a smoothing radius of 210 km is also applied to the TSH coefficients. The radius of 210 km is approximately determined with the formula 20000 km/*Nmax*, corresponding to a maximum degree *Nmax* of 95.
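The following sketch shows how a smoothing radius can be turned into degree-dependent weights. It assumes the widely used Gaussian averaging kernel with its standard three-term recursion and a mean Earth radius of 6371 km; whether the authors use exactly this kernel and normalization is an assumption, and the function names are illustrative.

```python
import math

EARTH_RADIUS_KM = 6371.0  # assumed mean Earth radius

def smoothing_radius_km(n_max):
    """Rule of thumb quoted in the text: 20000 km / N_max."""
    return 20000.0 / n_max

def gaussian_weights(radius_km, n_max, a_km=EARTH_RADIUS_KM):
    """Degree weights W_0..W_n_max of a Gaussian averaging kernel, with W_0 = 1.

    radius_km is the distance at which the spatial kernel drops to one half.
    """
    b = math.log(2.0) / (1.0 - math.cos(radius_km / a_km))
    w = [1.0,
         (1.0 + math.exp(-2.0 * b)) / (1.0 - math.exp(-2.0 * b)) - 1.0 / b]
    for n in range(1, n_max):
        # Three-term recursion: W_{n+1} = -(2n + 1) / b * W_n + W_{n-1}
        w.append(-(2 * n + 1) / b * w[n] + w[n - 1])
    return w[: n_max + 1]

print(round(smoothing_radius_km(95), 1))  # 210.5
weights = gaussian_weights(210.0, 95)
print(len(weights))  # 96 weights, for degrees 0 to 95
```

Each coefficient of degree *n* would then be multiplied by the weight of that degree before synthesis. The recursion is known to become numerically unstable for small radii at very high degrees, so it is used here only within the degree range discussed in the text.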

#### *2.4. Post-earthquake Gravity Changes from GRACE*

For comparison with GOCE, we also derived the coseismic gravity changes and gravity gradient changes from the Release05 (RL05) GRACE time-variable monthly gravity field models from the Center for Space Research (CSR), which were downloaded from the website http://icgem.gfz-potsdam.de/ICGEM/. The maximum d/o was 96.

The TSH coefficients that corresponded to the SH degree range 30–95 were chosen to maintain consistency with the processing of the GOCE models, and a Gaussian filter with the smoothing radius of 210 km was also applied to the TSH coefficients. The differences in the coefficients between the averages of the monthly models for one year before and after the earthquake were used to calculate the coseismic gravity changes and gravity gradient changes on a sphere with a height of 260 km.
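The differencing of averaged monthly models can be sketched as follows. The data layout (one dictionary per month, keyed by degree and order) and the toy values are assumptions made for illustration; the sketch applies only the degree range 30–95 and does not reproduce the arrow-shaped exclusion region of Figure 2.

```python
def mean_coefficients(monthly_models):
    """Average monthly models, each a dict {(n, m): (C_nm, S_nm)}."""
    count = len(monthly_models)
    return {key: (sum(model[key][0] for model in monthly_models) / count,
                  sum(model[key][1] for model in monthly_models) / count)
            for key in monthly_models[0]}

def coseismic_difference(pre_models, post_models, n_min=30, n_max=95):
    """Post-minus-pre mean coefficients, restricted to degrees n_min..n_max."""
    pre = mean_coefficients(pre_models)
    post = mean_coefficients(post_models)
    return {(n, m): (post[(n, m)][0] - pre[(n, m)][0],
                     post[(n, m)][1] - pre[(n, m)][1])
            for (n, m) in pre if n_min <= n <= n_max}

# Toy example with made-up coefficient values (not GRACE data).
pre_models = [{(30, 0): (1.0, 0.0), (20, 0): (5.0, 0.0)},
              {(30, 0): (3.0, 0.0), (20, 0): (5.0, 0.0)}]
post_models = [{(30, 0): (4.0, 1.0), (20, 0): (9.0, 0.0)}]

delta = coseismic_difference(pre_models, post_models)
print(delta)  # {(30, 0): (2.0, 1.0)}
```

The degree-20 coefficient is dropped because it lies below the minimum TSH degree, so only the degree-30 difference remains.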

#### *2.5. Forward Modeling Coseismic Gravity and Gravity Gradient Changes*

The PSGRN/PSCMP code of Wang et al. [41] for modeling the co- and post-seismic response of the Earth's crust to earthquakes is used to compute the coseismic gravity changes of the 2011 Tohoku-Oki earthquake, based on a five-layer half-space Earth model and the fault slip model from Caltech provided by Wei and Sladen [42]. The layer depths, densities, and seismic velocities are extracted from the CRUST2.0 global tomography model [43] for the grid cell containing the epicenter (38.1°N, 142.8°E), and are shown in Table 1. The upper four crustal layers in the model are treated as elastic materials. The bottom half-space mantle layer is treated as a biviscous material, i.e., a Burgers body with a transient (Kelvin) viscosity and a steady state (Maxwell) viscosity (details about the Burgers viscosities are given in Section 2.6). The free-air corrections, calculated from the vertical surface displacements, are added to the coseismic gravity changes calculated by the PSGRN/PSCMP program [9]. A Bouguer-layer seawater compensation effect is also corrected according to the vertical displacements, taking the land-ocean differentiation into account [44]. Since Broerse et al. [44] pointed out that simply assuming a global ocean for the seawater correction would lead to non-negligible errors (up to a few tens of percent for the 2011 Tohoku-Oki earthquake), using a realistic land-ocean mask ensures a more reliable seawater correction for the model prediction in our study.

**Table 1.** The 5-layer half-space Earth model used in the prediction of co- and post-seismic gravity changes.


The computation region is from 28°N to 48°N latitude and from 132°E to 152°E longitude. The epicenter (38.1°N, 142.8°E) is located at the center of this region. The cell size of the grid is set to 0.1° × 0.1°. The global gridded coseismic gravity changes on a sphere are formed from the forward-modeled coseismic gravity changes, with zero values filled in outside the computational region. Based on the global gridded coseismic gravity changes, we estimate the gravitational potential spherical harmonic (SH) coefficients up to d/o 250 by using the classical spherical harmonic analysis approach [45]. Then, based on the estimated gravitational potential SH coefficients, we calculate the changes of both the coseismic gravity and the radial gravity gradient on the Earth's surface and on a sphere with a height of 260 km above the WGS84 reference ellipsoid, which are shown in Figures 3 and 4. According to the maximum d/o 250 of the SH coefficients, we use a Gaussian filter with a radius of 110 km here. The epicenter is shown as a black star in these figures. The units of the gravity and the gravity gradient are mGal (1 mGal = 10<sup>−5</sup> m s<sup>−2</sup>) and mE, respectively. According to Figures 3 and 4, the spatial patterns of the coseismic gravity changes and gravity gradient changes show classic dipole characteristics in the nearly west-to-east direction. Additionally, the attenuation of the coseismic gravity change signal at the satellite height is very clear: the amplitudes of the coseismic gravity changes and coseismic radial gravity gradient changes are reduced by more than a factor of 10 and 20, respectively.
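The attenuation with height can be illustrated per SH degree. The sketch below assumes the standard upward-continuation factors for a spherical Earth, (*R*/(*R*+*h*))<sup>*n*+2</sup> for gravity and (*R*/(*R*+*h*))<sup>*n*+3</sup> for the radial gravity gradient, with an assumed mean radius of 6371 km. It gives the damping of single degrees, not the total amplitude reduction quoted above, which depends on the full signal spectrum and the filtering.

```python
EARTH_RADIUS_KM = 6371.0  # assumed mean Earth radius

def attenuation(degree, height_km, power_offset, radius_km=EARTH_RADIUS_KM):
    """Ratio of a degree-n signal amplitude at height h to that at the surface."""
    return (radius_km / (radius_km + height_km)) ** (degree + power_offset)

def gravity_attenuation(degree, height_km=260.0):
    """Damping of a degree-n gravity signal (power n + 2)."""
    return attenuation(degree, height_km, 2)

def gradient_attenuation(degree, height_km=260.0):
    """Damping of a degree-n radial gravity gradient signal (power n + 3)."""
    return attenuation(degree, height_km, 3)

for n in (30, 95, 250):
    print(n, round(gravity_attenuation(n), 4), round(gradient_attenuation(n), 4))
```

The factors fall off quickly with degree, which is why the short-wavelength part of the coseismic signal is strongly damped at 260 km.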

**Figure 3.** Coseismic gravity changes (**a**) on the Earth's surface and (**b**) on a sphere with a height of 260 km, when using forward-modeled spherical harmonic coefficients up to d/o 250 with a Gaussian filter applied (radius of 110 km). The units are μGal.

**Figure 4.** Coseismic radial gravity gradient change (**a**) on the Earth's surface and (**b**) on a sphere with a height of 260 km when using spherical harmonic coefficients up to d/o 250 with a Gaussian filter applied (radius of 110 km). The units are mE.

#### *2.6. Modeling of Post-seismic Gravity Changes*

For the GOCE and GRACE observations, the coseismic gravity field changes are estimated by using the differences in coefficients between the pre-earthquake model and the post-earthquake model. Here we note that taking the one-year mean field differences as the coseismic signals will inevitably include post-seismic effects, which are likely to affect the peak-to-peak range, as well as the spatial pattern, of the extracted signals [46]. Han et al. [46] revealed that a dominant positive post-seismic gravity change signal is visible around the epicenter within a couple of years after the 2011 Tohoku-Oki earthquake, with an amplitude of up to around +6 μGal at a 500 km spatial scale, comparable to the GRACE observations. Moreover, they gave the results from a viscoelastic relaxation model and an afterslip model, and showed that the observed vertical deformation at the coast and offshore agrees better with the viscoelastic relaxation model than with the afterslip model. Therefore, we choose the viscoelastic relaxation model to estimate the post-seismic gravity change effect here. Based on our 5-layer half-space model in Table 1, we use Burgers viscosities, with a transient viscosity of 10<sup>17</sup> Pa s and a steady state viscosity of 10<sup>18</sup> Pa s, for the biviscous mantle (with a ratio of 1 between the Kelvin and Maxwell rigidities). Monthly gravity field changes after the Tohoku-Oki earthquake were then calculated according to our biviscous post-seismic model. The model-predicted post-seismic gravity changes are comparable with those reported in Han et al. [46] in both amplitude and spatial pattern at a 500 km spatial scale, even though we use different viscosities for the biviscous mantle. The Burgers viscosities used in our post-seismic model are smaller than those from Han et al. [46] (a transient viscosity of 10<sup>18</sup> Pa s and a steady state viscosity of 10<sup>19</sup> Pa s), which might be due to the distinct differences in layer depths between our model and the one used in Han et al. [46].

We converted the gridded gravity changes into geopotential coefficients up to d/o 96, and derived the 1-year mean field after the earthquake to estimate the impact from post-seismic changes.

#### *2.7. Computation of Coseismic Gravity Changes from the Hydrological and Oceanic Mass Redistributions*

According to [40,47], the temporal corrections (direct tide, solid Earth tide, ocean tide, pole tide and non-tidal correction) have already been removed from the released GGT observations at each epoch. However, only SH coefficients up to d/o 20, estimated from atmospheric and oceanic mass variations and from seasonal variations of hydrology, are used to model the non-tidal signals [47]. Since coefficients with d/o lower than 30 are not included in the TSH coefficients, the non-tidal GGT corrections do not contribute to the gravity gradients derived in this manuscript. In order to evaluate the influences of the hydrological and oceanic mass redistributions on the coseismic gravity changes, we calculated the pre-earthquake models (data from November 2009 to February 2011) and post-earthquake models (data from March 2011 to May 2012) up to d/o 96 from the ECCO-OBP (Ocean Bottom Pressure from Estimating the Circulation and Climate of the Ocean) [48,49] and GLDAS (Global Land Data Assimilation Systems) [50] models, respectively.

#### **3. Results**

#### *3.1. Coseismic Gravity and Gravity Gradient Changes from the Forward-modeled TSH Coefficients*

Based on the forward-modeled TSH coefficients, the coseismic gravity changes Δ*g* and the gravity gradient changes of the components (Δ*Vxx*, Δ*Vyy*, Δ*Vzz*, and Δ*Vxz*) in the LNOF on a sphere with a height of 260 km above the WGS84 reference ellipsoid were computed, which are shown in Figures 5 and 6. Figure 5 shows that the magnitude of the forward-modeled coseismic gravity changes is approximately 1.2 μGal. The maximum magnitude of the components (Δ*Vxx*, Δ*Vyy*, Δ*Vzz*, and Δ*Vxz*) is approximately 0.1 mE, which corresponds to the radial component. Compared to Figures 3 and 4, the spatial patterns of Figures 5 and 6 are no longer dipole patterns, and instead show multiple extrema. Additionally, the gravity changes Δ*g* and the changes in the gravity gradient components Δ*Vyy* and Δ*Vzz* show multiple rings. The spatial patterns of Δ*g* and Δ*Vzz* are very similar and nearly isotropic. According to Figure 6, the components (Δ*Vxx*, Δ*Vyy*, Δ*Vzz*, and Δ*Vxz*) have different spatial patterns. The components Δ*Vxx* and Δ*Vxz* have similar spatial patterns, namely, nearly west-to-east stripes and multiple poles in the North-South direction. The Δ*Vyy* component has a spatial pattern with nearly North-South stripes and multiple poles in the west-to-east direction. Moreover, we also use two other fault slip models, provided by the GSI (Geospatial Information Authority of Japan) and the USGS (United States Geological Survey), to compute the coseismic gravity gradient changes from the TSH coefficients (see the supplementary materials). Compared with the Caltech fault slip model (as mentioned in Section 2.5), the GSI and USGS models have different depth extents. The slips extend to 70 km and 58 km in depth for the GSI and USGS models, respectively, while for the Caltech model, the maximum depth of slip is 47 km. The dip angles of the three slip models are very close, all around 10°. Although calculated with different fault slip models, the spatial patterns of the gravity and gravity gradient changes are similar to the signals plotted in Figures 5 and 6.

**Figure 5.** Gravity changes on a sphere with a height of 260 km computed by the tailored spherical harmonic coefficients from the forward model with a Gaussian filter applied. The SH degree range is 30–95 and the radius of the filter is 210 km. The unit is μGal.

**Figure 6.** Gravity gradient changes (Δ*Vxx*, Δ*Vyy*, Δ*Vzz*, and Δ*Vxz*) in Local North-Oriented Frame on a sphere with a height of 260 km computed by the tailored spherical harmonic coefficients from the forward-modeled coseismic signals with a Gaussian filter applied. The SH degree range is 30–95 and the radius of the filter is 210 km. The units are mE.

#### *3.2. Post-seismic Gravity Changes from the Viscoelastic Model*

We compute the 1-year mean post-seismic radial gravity gradient changes (Δ*Vzz*) with the TSH coefficients method from the viscoelastic relaxation model described in Section 2.6; the results are shown in Figure 7. A Gaussian filter with a smoothing radius of 210 km is applied to the TSH coefficients. The maximum and minimum values of the radial gravity gradient changes from the viscoelastic relaxation model are 0.0095 mE and −0.0047 mE, respectively. This 1-year mean field will be used to remove the post-seismic effect when deriving the coseismic gravity signals from both the GOCE and GRACE data (in Sections 3.4 and 3.5). A comparison between our viscoelastic relaxation results and those from Han et al. [46], truncated at SH degree 60, is also provided in the supplementary materials.

**Figure 7.** Radial gravity gradient changes (Δ*Vzz*) in Local North-Oriented Frame on a sphere with a height of 260 km computed by the tailored spherical harmonic coefficients from the viscoelastic relaxation model with a Gaussian filter applied. The SH degree range is 30–95 and the radius of the filter is 210 km. The units are mE.

#### *3.3. Coseismic Gravity Changes from the Hydrological and Oceanic Mass Redistributions*

Based on the TSH coefficients method, we compute the radial gravity gradient changes (Δ*Vzz*) caused by the hydrological and oceanic mass redistributions, which are shown in Figure 8. According to Figure 8, these radial gravity gradient changes are very small. The maximum and minimum values of the radial gravity gradient changes from GLDAS and ECCO-OBP combined are 0.0022 mE and −0.0027 mE, respectively. Nevertheless, the contribution from the hydrological and oceanic mass redistributions will be removed when deriving the coseismic gravity signals from both the GOCE and GRACE data.

**Figure 8.** Radial gravity gradient changes (Δ*Vzz*) in Local North-Oriented Frame on a sphere with a height of 260 km computed by the tailored spherical harmonic coefficients from GLDAS (**a**), ECCO-OBP (**b**) and both of GLDAS and ECCO-OBP (**c**) with a Gaussian filter applied. The SH degree range is 30–95 and the radius of the filter is 210 km. The units are mE.

#### *3.4. GRACE-derived Coseismic Gravity Changes and Gravity Gradient Changes*

The coseismic gravity changes and gravity gradient changes from GRACE are shown in Figures 9 and 10. It should be noted that before computing the coseismic gravity changes, we removed the post-seismic effects, using the one-year mean post-seismic model, and the contribution from the hydrological and oceanic mass redistributions. According to Figures 9 and 10, the coseismic gravity and gravity gradient changes from GRACE have nearly the same spatial pattern, showing only clear multi-pole characteristics in the North-South direction alongside west-to-east stripes. Only the gravity gradient changes Δ*Vxx* and Δ*Vxz* have spatial patterns similar to those of the forward-modeled coseismic signals plotted in Figure 6. The gravity changes and the gravity gradient components Δ*Vyy* and Δ*Vzz* have very different spatial patterns compared to those from the forward-modeled coseismic signals. The spatial stripes for the component Δ*Vyy* are oriented west-to-east, which is completely different from the North-South stripes of the forward-modeled coseismic signal. The reason is probably that the GRACE satellites are in a near-polar orbit with an inclination of 89.5°, so the inter-satellite range-rate observations along the orbit are almost in the North-South direction. Thus, these inter-satellite range-rate observations are not sensitive to signals in the west-to-east direction.

**Figure 9.** Gravity changes on a sphere with a height of 260 km computed by the tailored spherical harmonic coefficients from GRACE (the Gravity field Recovery and Climate Experiment). The SH degree range is 30–95. The unit is μGal.

**Figure 10.** Gravity gradient changes (Δ*Vxx*, Δ*Vyy*, Δ*Vzz*, and Δ*Vxz*) in Local North-Oriented Frame on a sphere with a height of 260 km computed by the tailored spherical harmonic coefficients from GRACE. The SH degree range is 30–95. The units are mE.

Furthermore, when comparing the forward-modeled coseismic results in Figures 5 and 6 with those from the GRACE data in Figures 9 and 10, the magnitudes of the coseismic gravity changes from GRACE are smaller than those from the forward model.

#### *3.5. GOCE-derived Coseismic Gravity Changes and Gravity Gradient Changes*

We use the same method as for the forward-modeled coseismic signals to compute the coseismic gravity and gravity gradient changes from GOCE, which are shown in Figures 11 and 12. The post-seismic effects and the contribution from the hydrological and oceanic mass redistributions are also removed, as for the GRACE-derived coseismic gravity signals. A Gaussian filter with a radius of 210 km was also applied. Figure 11 shows that the magnitude of the GOCE-derived coseismic gravity changes is approximately 2.4 μGal. The maximum magnitude of the components (Δ*Vxx*, Δ*Vyy*, Δ*Vzz*, and Δ*Vxz*) is approximately 0.15 mE, which corresponds to the radial component.

**Figure 11.** Gravity changes on a sphere with a height of 260 km computed by the tailored spherical harmonic coefficients from the GOCE observations with a Gaussian filter applied. The SH degree range is 30–95, and the radius of the filter is 210 km. The unit is μGal.

**Figure 12.** Gravity gradient changes (Δ*Vxx*, Δ*Vyy*, Δ*Vzz*, and Δ*Vxz*) in Local North-Oriented Frame on a sphere with a height of 260 km computed by the tailored spherical harmonic coefficients from the GOCE observations with a Gaussian filter applied. The SH degree range is 30–95, and the radius of the filter is 210 km. The units are mE.

#### **4. Discussion**

When compared to the forward-modeled coseismic signals in Figures 5 and 6, the spatial patterns of the coseismic signals in Figures 11 and 12 from the GOCE observations, although noisy, are somewhat similar, exhibiting multiple extrema. The spatial patterns of the gravity changes Δ*g* and of the gravity gradient changes of the radial component Δ*Vzz* exhibit multiple rings. According to Figure 12, the components Δ*Vxx* and Δ*Vxz* show similar spatial patterns, i.e., west-to-east stripes and multiple poles in the North-South direction. Note that there are more gravity gradient change peaks than in the modeled gravity gradient changes in the area of interest, as is also the case in Figure 8 of Fuchs et al. [28]. Following Fuchs et al. [28], we use the averaged accuracy of the gravity gradients before the 2011 earthquake to represent the accuracy of the derived coseismic gravity gradients. The deviations of the mean (1σ) in 0.5° grid cells for the differences of the gravity gradients (Δ*Vxx*, Δ*Vyy*, Δ*Vzz* and Δ*Vxz*) between the GOCE-derived pre-earthquake model and GOCO03S [32] are 0.018, 0.022, 0.036 and 0.024 mE, respectively. According to Figure 12, there is a significant change (greater than 3σ) between the pre- and post-earthquake gravity gradients.

We note that the uncertainties (σ) might be underestimated because the two models are not completely independent and the errors are not stationary, but we do not carry out a further assessment here. Additionally, according to Figures 5, 6, 11 and 12, the amplitudes of the GOCE-derived coseismic gravity gradient changes are larger than the forward-modeled coseismic signals, and the geographical positions of the multi-poles are different, which is similar to what is seen in Fuchs et al. [28]. The multi-poles from the GOCE data are located approximately 200 km to the northeast of the modeled results. The reasons for these discrepancies are still not very clear. According to Fuchs et al. [28], Heki et al. [51] and Feng et al. [52], the differences between the results from the forward model and the GOCE data might be caused by systematic errors in the SGG observations and the weak sensitivity of the SGG observations to the location of the earthquake. In addition, the large uncertainties of the fault slip models also contribute to the discrepancies. Dai et al. [16] show that a fault slip model, even when inverted from multiple data sources, still leads to around 40% relative difference in the gravity changes compared to the GRACE observations, probably because the dislocation models in common use are not sophisticated enough. The noticeable differences between model prediction and observation need to be investigated in further studies. According to Figures 7, 10 and 12, the maximum and minimum values from the viscoelastic relaxation model are nearly 15–21% of those (max: 0.031 mE, min: −0.046 mE) derived from the GRACE one-year mean field differences, and nearly 3–6% of those (max: 0.161 mE, min: −0.167 mE) derived from the GOCE one-year mean field differences. According to Figures 8 and 12, the maximum and minimum values of the radial gravity gradient changes from GLDAS and ECCO-OBP combined are nearly two orders of magnitude smaller than the GOCE-derived coseismic gravity gradient signals.

In order to show the gravity change time series, we processed the GOCE SGG data to obtain monthly gravity field models for the period from November 2009 to May 2012. The data processing strategies are the same as those used to determine the pre-earthquake and post-earthquake models above. According to the GOCE daily EGG reports [53], there were many special events, several data gaps and frequent calibration operations (about once a month) from November 2009 to May 2012. Therefore, given the actual data quality, we are unlikely to be able to derive continuous monthly time-variable gravity field models. In addition, the solutions after the earthquake have larger oscillations, which agrees with the fact that, for most months, there are more special events in the SGG data after the earthquake than before it [53]. Figure 13a,b show the radial gravity gradient changes at points A (40.6°, 141.2°) and B (40.2°, 147.2°) in Figure 12, derived from the monthly time-variable gravity field models before and after the earthquake, while Figure 13c,d show the average radial gravity gradient changes over a negative area [(40°~41°), (140.5°~142.5°)] around A and a positive area [(39°~40.5°), (147°~148.5°)] around B.

According to Figure 13, although the radial gravity gradient change time series contain some significant oscillations, they are still comparable with the coseismic gravity change time series from GRACE for some smaller earthquakes (Mw ≈ 8.5) given by previous studies (see Figure 2 of Han et al. [54], and Figures 6 and 7 of Chao and Liau [55]). In order to show the changes more clearly, we use a 1D median filter (with 5th order) to process those four time series; the filtered results are denoted by the red curves in Figure 13. Those filtered curves show obvious steps between the periods before and after the earthquake (gray areas in Figure 13). We further fit step functions to the original observations (the fitted curves are also plotted in Figure 13 as blue curves), which yields the step values at the same time (see Figure 13). Here we use a bootstrap procedure (a Monte Carlo process, see Efron and Tibshirani [56]) to estimate the uncertainties of those step values. This method has been widely used for uncertainty estimation in different areas of geophysical research [57–62]. Here we give a simple example to explain the error estimation process (more details can be found in Shen and Ding [60]):


From Figure 13, we can see that the step values are clearly more than three times their corresponding uncertainties (the blue areas in Figure 13 mark twice the uncertainties, i.e., two standard deviations). Statistically, we may suggest that those steps represent the coseismic gravity gradient change signals caused by the 2011 Tohoku-Oki earthquake.
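The filtering, step fitting and uncertainty estimation described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the monthly series is synthetic, "5th order" is read as a 5-sample window, the step epoch is fixed at the earthquake month, and the uncertainty comes from a plain residual bootstrap, which treats the residuals as independent and may differ from the procedure of Shen and Ding [60]. All function names are hypothetical.

```python
import numpy as np

def median_filter_1d(y, width=5):
    """Running median with an odd window; the ends are padded by repeating edge values."""
    half = width // 2
    padded = np.pad(y, half, mode="edge")
    return np.array([np.median(padded[i:i + width]) for i in range(len(y))])

def fit_step(t, y, t0):
    """Least-squares fit of y = a + s * H(t - t0); returns (offset a, step s)."""
    heaviside = (t >= t0).astype(float)
    design = np.column_stack([np.ones_like(heaviside), heaviside])
    coeffs, *_ = np.linalg.lstsq(design, y, rcond=None)
    return coeffs[0], coeffs[1]

def bootstrap_step_sigma(t, y, t0, n_boot=1000, seed=0):
    """Step value and its standard deviation from resampled residuals."""
    rng = np.random.default_rng(seed)
    a, s = fit_step(t, y, t0)
    model = a + s * (t >= t0)
    resid = y - model
    steps = np.empty(n_boot)
    for k in range(n_boot):
        # Rebuild a synthetic series from the fit plus residuals drawn with replacement.
        y_k = model + rng.choice(resid, size=resid.size, replace=True)
        steps[k] = fit_step(t, y_k, t0)[1]
    return s, steps.std(ddof=1)

# Synthetic monthly series: 31 months (November 2009 to May 2012),
# with a hypothetical step of -0.1 mE at month index 16 (March 2011).
t = np.arange(31)
noise = np.random.default_rng(42).normal(0.0, 0.03, t.size)
y = -0.1 * (t >= 16) + noise

smooth = median_filter_1d(y, width=5)          # analogous to the red curves
step, sigma = bootstrap_step_sigma(t, y, t0=16)
significant = abs(step) > 3 * sigma            # the 3-sigma criterion used in the text
```

If the residuals are temporally correlated, as the oscillations in the series suggest they may be, a residual bootstrap of this kind tends to give optimistic uncertainties; resampling in blocks or generating noise with a matching spectrum would be more conservative alternatives.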

**Figure 13.** Radial gravity gradient changes at points A (**a**) and B (**b**) and over the two selected areas around A and B (**c**,**d**) before and after the earthquake. The red curves denote the results of applying a 1D median filter to the original time series; the blue curves are the fitted step functions. The blue areas denote twice the uncertainties.

To reveal how well the spatial patterns of the GOCE and modeled gravity gradient changes agree with each other, we computed the correlation coefficients (see Table 2) of the gravity changes and gravity gradient changes (Δ*Vxx*, Δ*Vyy*, Δ*Vzz*, Δ*Vxz* and Δg) between the observed models (GOCE and GRACE) and the forward model for the TSH coefficients (see Figure 2). The root mean square (RMS) of the observed signals (Δ*Vxx*, Δ*Vyy*, Δ*Vzz*, Δ*Vxz* and Δg) from GOCE/GRACE, and of the differences between the observed and forward-modeled signals, are also computed and shown in Table 2. According to Table 2, neither GOCE nor GRACE performs well, because all the correlation coefficients are less than or equal to 0.85. Although most correlation coefficients corresponding to GOCE are lower than those of the GRACE time-variable gravity field models, the correlation coefficient corresponding to the Δ*Vyy* component from GOCE is slightly larger than that of GRACE. Thus, for the SH degree range 30–95, the GOCE mission could reveal a greater signal of coseismic gravity gradient changes in the west-to-east direction than GRACE, because GRACE observations are sensitive mainly along the north-south direction. The coefficient for Δ*Vzz* is very close to the value in Fuchs et al. [28]. All RMS values of the observed signals are reduced after subtracting the forward-modeled coseismic signals from the GOCE-derived results. However, because the amplitude of the results from GRACE is only about half of that from the forward model, the RMS of GRACE's Δ*Vyy* increases when the modeled signal is subtracted from the observed signal and the RMS of Δ*Vyy* does not change. Of course, compared to GOCE, GRACE performs relatively well in the SH degree range 30–95, especially for the Δ*Vxx* component, which has the maximum correlation coefficient because only inter-satellite range-rates are observed along its orbit.
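The statistics discussed here can be sketched as follows. The example is a hedged illustration on a synthetic grid: the function name `pattern_stats`, the Gaussian-shaped "model" field and the unweighted averaging over grid nodes are assumptions, not details taken from the paper. It also shows why subtracting a well-correlated model need not reduce the RMS: if the observed field is k times the modeled one, the residual RMS is |1 − k| times the model RMS, which is smaller than the observed RMS only when k > 0.5.

```python
import numpy as np

def pattern_stats(observed, modeled):
    """Return (correlation, RMS of observed, RMS of observed minus modeled).

    All grid nodes are weighted equally, which is an assumption of this sketch.
    """
    o = np.ravel(observed)
    m = np.ravel(modeled)
    corr = np.corrcoef(o, m)[0, 1]
    rms_obs = np.sqrt(np.mean(o ** 2))
    rms_diff = np.sqrt(np.mean((o - m) ** 2))
    return corr, rms_obs, rms_diff

# Synthetic "forward model" on the region 28N-48N, 132E-152E (values in mE).
lat, lon = np.meshgrid(np.arange(28.0, 48.5, 0.5),
                       np.arange(132.0, 152.5, 0.5), indexing="ij")
model = 0.2 * np.exp(-((lat - 38.0) ** 2 + (lon - 142.0) ** 2) / 8.0)

# Observed amplitude exactly half of the model: RMS is unchanged by the subtraction.
corr_half, rms_obs_half, rms_diff_half = pattern_stats(0.5 * model, model)

# Observed amplitude below half of the model: RMS increases after the subtraction.
corr_low, rms_obs_low, rms_diff_low = pattern_stats(0.4 * model, model)
```

In both synthetic cases the correlation coefficient equals one, so a high correlation and an RMS reduction are independent diagnostics: the former measures pattern agreement, the latter is also sensitive to the amplitude ratio.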

**Table 2.** Correlation coefficients, RMS of the observed gravity and gravity gradients (Δ*Vxx*, Δ*Vyy*, Δ*Vzz*, Δ*Vxz* and Δg), and RMS of the differences between the observed and forward-modeled signals. The observed signals are from GRACE and GOCE. The positions for the computation are located in the region of 28°N–48°N, 132°E–152°E. The unit of Δ*Vxx*, Δ*Vyy*, Δ*Vzz*, and Δ*Vxz* is mE and the unit of Δg is μGal.


#### **5. Conclusions**

We employed the least squares method with a band-pass autoregressive moving-average filter to recover two global gravity field models up to degree and order 220 from GOCE satellite gravity gradient data from 15 March 2011 to 31 May 2012, after the 2011 Tohoku-Oki earthquake in Japan. Then, we used the recovered models to estimate the coseismic gravity and gravity gradient changes caused by the earthquake by subtracting the pre-earthquake model from the post-earthquake model. This approach is different from that in Fuchs et al. [28], who used the diagonal components (*Vxx*, *Vyy*, and *Vzz*) along the orbit to construct a new vertical gravity gradient *Vzz*, which was used to present the coseismic gravity gradient changes. To extract the coseismic gravity signals from the recovered global gravity field models, we proposed TSH coefficients, taking into account the influences of colored noise in the SGG data and of the polar gap problem on the recovered models.

The gravity changes Δ*g* and the gravity gradient changes (Δ*Vxx*, Δ*Vyy*, Δ*Vzz*, and Δ*Vxz*) were computed from the GOCE-derived tailored spherical harmonic (TSH) coefficients, the GRACE-derived TSH coefficients and the forward-modeled TSH coefficients. In the data processing, the degree range 30–95 was used to obtain the TSH coefficients. A Gaussian filter with a radius of 210 km, chosen according to the maximum degree of 95, was also applied to the TSH coefficients. When comparing the coseismic gravity signals from GOCE, the forward model, and GRACE, the spatial patterns of the coseismic gravity and gravity gradient changes from GOCE were analogous to those from the forward model. There is a significant change (greater than 3σ) between the pre- and post-earthquake gravity gradients, which is associated with the earthquake. Moreover, the radial gravity gradient changes from the derived monthly time-variable gravity field models show obvious steps at the time of the earthquake, whose amplitudes are more than three times their corresponding uncertainties, and thus significant. Statistically, it is reasonable to infer that these steps represent the coseismic gravity gradient change signals caused by the 2011 Tohoku-Oki earthquake. For the GGT components (Δ*Vxx*, Δ*Vzz*, and Δ*Vxz*), the correlation between the GOCE-derived coseismic gravity signals and the forward-modeled coseismic results was weaker than that from GRACE. However, the correlation coefficient that corresponded to the Δ*Vyy* component from GOCE is larger than that from GRACE. The component Δ*Vyy* had spatial patterns that included north-south stripes and multiple poles in the west-to-east direction. This means that the GOCE mission might reveal more coseismic gravity signals in the west-east direction than GRACE in the SH degree range 30–95. The estimated time series of radial gravity gradient changes (see Figure 13) suggest that the coseismic gravity changes can be reproduced from GOCE observations through the proposed method.

**Supplementary Materials:** The following are available online at http://www.mdpi.com/2072-4292/11/11/1295/s1, Figure S1: Gravity gradient changes (Δ*Vxx*, Δ*Vyy*, Δ*Vzz*, and Δ*Vxz*) in LNOF on a sphere with a height of 260 km computed by the TSH coefficients from the forward-modeled signals of the GSI fault slip model with a Gaussian filter applied, Figure S2: Gravity gradient changes (Δ*Vxx*, Δ*Vyy*, Δ*Vzz*, and Δ*Vxz*) in LNOF on a sphere with a height of 260 km computed by the TSH coefficients from the forward-modeled signals of the USGS fault slip model with a Gaussian filter applied, Figure S3: The observation (the fitted step has been removed) and the inputted noise (a), and their Fourier power spectra (b) (logarithmic scale in dB), Figure S4: Postseismic gravity changes (Δ*g*) on the ground from the spherical harmonic coefficients up to d/o 60 of the viscoelastic relaxation model computed in the paper (left) and provided by Han (right).

**Author Contributions:** Conceptualization X.X.; methodology, X.X. and H.D.; investigation, X.X. and H.D.; resources, J.L. and M.H.; data curation, Y.Z.; writing—original draft preparation, X.X.; writing—review and editing, X.X., H.D., and J.L.

**Funding:** This research was financially supported by the National Natural Science Foundation of China (Grant Nos. 41574019, 41774020, 11873075), the DAAD Thematic Network Project (57421148), the Major Project of High Resolution Earth Observation System, and the Natural Science Foundation of Shanghai (17ZR1435600).

**Acknowledgments:** The authors thank Shin-Chan Han for his help with our post-seismic effect evaluation and for providing the viscoelastic and afterslip models of the Tohoku-Oki earthquake for the verification of our post-seismic models. We thank Wenbin Shen for useful discussion. We also acknowledge the European Space Agency for providing the GOCE data. We are also grateful to the three anonymous reviewers who provided constructive comments and suggestions to improve our work.

**Conflicts of Interest:** The authors declare no conflict of interest.

#### **References**


© 2019 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).

MDPI St. Alban-Anlage 66 4052 Basel Switzerland Tel. +41 61 683 77 34 Fax +41 61 302 89 18 www.mdpi.com

*Remote Sensing* Editorial Office E-mail: remotesensing@mdpi.com www.mdpi.com/journal/remotesensing
