**1. Introduction**

The ensemble four-dimensional variational (En4DVar) hybrid data assimilation (DA) approach incorporates the advantage of flow-dependent characteristic of ensemble Kalman filter (EnKF) into the 4DVar DA approach. It has become popular in major operational centers of the world and shown great potential for further improving numerical weather prediction (NWP) skills [1–5]. The En4DVar approach typically uses flow-dependent information extracted from the ensemble forecasts to help estimate the background error covariance (BEC) for 4DVar. When applying this approach to global NWPs, the ensemble size is much smaller than the dimensionality of the state variables of the prediction model

**Citation:** Zhu, S.; Wang, B.; Zhang, L.; Liu, J.; Liu, Y.; Gong, J.; Xu, S.; Wang, Y.; Huang, W.; Liu, L.; et al. Assimilating AMSU-A Radiance Observations with an Ensemble Four-Dimensional Variational (En4DVar) Hybrid Data Assimilation System. *Remote Sens.* **2023**, *15*, 3476. https://doi.org/10.3390/rs15143476

Academic Editors: Xiaolei Zou, Guoxiong Wu and Zhemin Tan

Received: 30 May 2023 Revised: 2 July 2023 Accepted: 3 July 2023 Published: 10 July 2023

**Copyright:** © 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https:// creativecommons.org/licenses/by/ 4.0/).

due to the limitations of computational resource. The limited ensemble size may easily result in spurious correlations between two grids that are far apart in the BEC matrix (Bmatrix). Localization techniques [6–10] can effectively mitigate such spurious correlations and thus improve the analysis quality and forecast skill.

The main idea of localization is to restrict the analysis at a specific grid point to be influenced only by observations within its surrounding local region. Houtekamer and Mitchell [6] performed observation selection by setting a cutoff radius, thus excluding the influence of observations outside the cutoff radius on the analysis at the specific grid point. Houtekamer et al. [11] further proposed a localization scheme using a compactly supported GC function [12] that decreases monotonically with distance, which is realized as a Schür product between the ensemble B-matrix and the localization correlation matrix. Localization can be achieved by assimilating observations one by one to update the analyses within the local region, or implicitly by simultaneously assimilating all observations within the local region around the analysis at a specific grid point [13]. However, for ensemblebased non-sequential DA approaches, using the localized covariance directly in model space can easily lead to large computational costs if one wants to solve directly in the global space. Implementing localization in observation space may be a more economical choice. Approaches that use orthogonal functions (e.g., empirical orthogonal function and sine function) to decompose the localization correlation matrix can also further reduce the localization cost in observation space [8,10,14–17].

The elements of the localization correlation matrix in observation space are dependent on the observation coordinates, which are easy to be calculated for conventional observations with well-defined positions. Following the development of satellite technology, the rapidly increasing radiance observations have significantly improved the medium-range forecasts and have greatly reduced the gap of forecast skills between the northern and southern hemispheres [18]. It is noted that radiance observations are non-local due to the sampling of multiple atmospheric layers, and different satellite channels are sensitive to different atmospheric layers. Therefore, defining the vertical coordinates of radiance observations is an unavoidable challenge in the use of observation space localization. Houtekamer et al. [11] used the pressure at the peak of the weighting function to define the vertical coordinates of radiance observations from the Advanced Microwave Sounding Unit-A (AMSU-A) instruments in the EnKF system. Fertig et al. [19] selected radiance observations within the local region to be assimilated into the LETKF system if a weight above the cutoff value is signed to any model state in the local region. This cutoff-based selection method is significantly different from the abovementioned peak-based selection method of Houtekamer et al. [11] when the cutoff value is small and the weighting function is broad. In addition, it allows a wider range of influence for non-local observations than for local observations. Miyoshi et al. [20] used the normalized weighting functions of satellite channels to provide weights for error covariance localization in the LETKF system.

Based on these studies, the EnKF class approaches have significantly benefited from assimilating radiance observations. In particular, the effective assimilation of AMSU-A radiance observations in the DA system plays important role in improving forecast skills. The 4DVar and En4DVar approaches usually use model space localization method for radiance observations and their DA schemes are globally solved. In contrast, the EnKF class approaches generally adopt the observation space localization method that uses vertical coordinates defined by the pressure at peak weight to achieve the effective assimilation of radiance observations and is solved in local regions. Moreover, Campbell et al. [21] pointed out that when the number of satellite channels is large enough and the observation error is very small, the observation space localization is difficult to recover the true state. Thus, investigating the differences between the effects of model space and observation space localization techniques on the assimilation and forecast performances when incorporating radiance observations is beneficial for efficiently using radiance observations in ensemblebased data assimilations.

The purpose of this study is to use the weighted average pressure to define the vertical coordinates of AMSU-A radiance observations in the En4DVar system so as to investigate the contributions of these observations to the improvements of analysis quality and forecast skill. In Section 2, a brief description of the DA methods used in this study, including En4DVar and its 4DVar and 4DEnVar components, the observation space localization scheme and the vertical positioning method for radiance observations, are presented. Section 2 also displays the DA configurations, experimental details and observations, followed by the analysis and forecast results in Section 3. Finally, the conclusions and discussions are provided in the last section.

#### **2. Materials and Methods**

### *2.1. A Brief Description of DA Methods*

The 4DVar system used here is based on the incremental 4DVar scheme [22], which obtains the optimal analysis of atmospheric state on a low-resolution grid by combining forecast and observation information. It adopts a highly parameterized climatological B-matrix: **B***<sup>c</sup>* = **UU**<sup>T</sup> [4,23], and relies on the adjoint model (ADM) in the minimization process of solving the optimal analysis.

The ensemble covariance for the En4DVar system is provided by the 4DEnVar system [24]. The 4DEnVar system is established using the dimension-reduced projection four-dimensional variational (DRP-4DVar) method [25]. This method uses the ensemble samples to project the minimization problem of 4DVar in the original model space onto the reduced-dimensional subspace, and to avoid using the ADM in the minimization process.

The En4DVar system used in this study consists of two components including the abovementioned 4DVar and 4DEnVar systems [26]. It uses a hybrid BEC (**B** = *γc***B***<sup>c</sup>* + *γe***B***e*) achieved through the extended control variable approach [27]. Here, **B***<sup>e</sup>* is the ensemble covariance produced by the 4DEnVar component and **B***<sup>c</sup>* is the climatological covariance in the 4DVar component. The variables *γ<sup>c</sup>* and *γ<sup>e</sup>* represent the scalar weights of the climatological and ensemble covariances, respectively. Unlike other variants of 4DEnVar, the hybrid BEC used here consists of a three-dimensional (3D) climatological covariance from 4DVar and a 4D ensemble covariance from 4DEnVar. In addition, calculating the gradient of cost function in the ensemble component does not contain the ADM but uses the same statistical relationship as in the 4DEnVar system. More details about the En4DVar system can be found in Zhu et al. [26].

#### *2.2. Localization*

#### 2.2.1. Observation Space Localization

In the En4DVar system, the localization for the climatological covariance is contained in its square root **U**, while the localization for the ensemble covariance is the same as in the 4DEnVar system. The traditional localization scheme based on the Schür product between the high-dimensionality ensemble B-matrix **B***<sup>e</sup>* = **pxp**<sup>T</sup> **<sup>x</sup>** and the high-dimensionality correlation matrix **C** may lead to large computational costs. Approximately decomposing the correlation matrix [8,10] and ignoring the time-variation of localization, the localization can be economically achieved in observation space by the Schür products between a finite number of observational perturbation samples and localization leading eigenvectors:

$$\begin{cases} \mathbf{E}\mathbf{p}\_{\mathbf{x}} = \left[ \left( \mathbf{p}\_{\mathbf{x},1} \circ \mathbf{p}\_{\mathbf{x},1} \circ \dots \circ \mathbf{p}\_{\mathbf{x},1} \circ \mathbf{p}\_{\mathbf{x},L} \right) \circ \dots \circ \left( \mathbf{p}\_{\mathbf{x},N} \circ \mathbf{p}\_{\mathbf{x},1} \circ \dots \circ \mathbf{p}\_{\mathbf{x},N} \circ \mathbf{p}\_{\mathbf{x},L} \right) \right] \\ \mathbf{E}\mathbf{p}\_{\mathbf{y}} = \left[ \left( \mathbf{p}\_{\mathbf{y},1} \circ \mathbf{p}\_{\mathbf{y},1} \circ \dots \circ \mathbf{p}\_{\mathbf{y},1} \circ \mathbf{p}\_{\mathbf{y},L} \right) \circ \dots \circ \left( \mathbf{p}\_{\mathbf{y},N} \circ \mathbf{p}\_{\mathbf{y},1} \circ \dots \circ \mathbf{p}\_{\mathbf{y},N} \circ \mathbf{p}\_{\mathbf{y},L} \right) \right] \end{cases} \tag{1}$$

Here, **px**(*i* = 1, 2, ··· , *N*) and ρ**x**,*j*(*j* = 1, 2, ··· , *L*) denote the initial perturbation samples and the leading eigenvectors in model space, and **py**(*i* = 1, 2, ··· , *N*) and ρ**y**,*j*(*j* = 1, 2, ··· , *L*) denote the corresponding observational perturbation samples and leading eigenvectors. To further reduce the cost, the leading eigenvectors are selected based on the cumulative contribution of variance, and each leading eigenvector has three components in zonal, meridional and vertical directions, respectively. The empirical orthogonal function [8] and

the sine function [10] are used for the zonal and vertical components, and the meridional component, respectively. Thus, the ensemble component of the En4DVar system is solved in the subspace consisting of the extended perturbation samples that are generated by the abovementioned Schür products. For more details refer to Zhu et al. [24].

For local observations, the approximation in the extended perturbation samples introduces little error. In contrast, it is more complicated for non-local observations, such as radiance observations, which does not have explicit vertical coordinates. In order to implement the vertical localization, we need to properly define the vertical coordinates of radiance observations.

#### 2.2.2. Vertical Positioning of AMSU-A Radiance Observation

As mentioned earlier, each radiance observation depends on the atmospheric states at multiple vertical layers. Therefore, its vertical coordinate cannot be given explicitly as conventional observations. The weighting function of the radiance observation at a specific horizontal position reflects the contribution of the observation to the atmospheric state at different vertical layer [20]. The weighting function is typically calculated by the vertical difference of the transmittance of the satellite channel, which is dependent not only on the satellite channel but also on the atmospheric profile. In this study, we proposed the weighted average hypsometry to define the vertical coordinates of AMSU-A radiance observations:

$$P\_{jpr,lch} = \frac{\sum\_{k=1}^{K} w\_{jpr,lch,k} \times p\_{jpr,k}}{\sum\_{k=1}^{K} w\_{jpr,lch,k}} \tag{2}$$

Here, the subscripts *jpr*, *lch* and *k* denote the atmospheric profile, the satellite channel and the atmospheric vertical layer; *K* is the number of atmospheric vertical layers; *P* denotes the vertical coordinate of radiance; *w* and *p* represent the weight of the satellite channel and the pressure of the atmospheric profile.

In practice, when the radiance observations are assimilated, the Radiative Transfer for TOVS (RTTOV) model calculates the transmittance for each satellite channel. Therefore, the transmittance can be obtained directly from the RTTOV model for the calculation of weighting functions.
