*Article* **Improved Estimation of O-B Bias and Standard Deviation by an RFI Restoration Method for AMSR-2 C-Band Observations over North America**

**Wangbin Shen 1, Zhaohui Lin 2,3, Zhengkun Qin 1,3,\* and Xuesong Bai <sup>1</sup>**


**Abstract:** Spaceborne microwave radiometer observations play vital roles in surface parameter retrievals and data assimilation, but widespread radio-frequency interference (RFI) signals in the C-band channel result in a lack of valuable data over large areas. Establishing repaired data based on existing observation information is crucial. In this study, Advanced Microwave Scanning Radiometer 2 (AMSR-2) C-band data affected by RFI over the U.S. land area in 2016 were accurately repaired using the iterative principal component analysis (PCA) method. The standard deviation (STD) and bias characteristics of the brightness temperature in the C-band vertical polarization channel were compared before and after the restoration to assess the prospects of the repaired data for assimilation. Not only was the spatial continuity of the microwave imager observations significantly improved following restoration, but the STD and bias of the observation minus background (OMB) of the restored data were also largely consistent with those of the RFI-free data. The STD of OMB exhibited obvious seasonal variations, with values of approximately 4.0 K from January to May and 3.0 K from June to December, whereas the biases were near zero in winter but negative (approximately −2.0 K) in summer. The surface type and terrain height also critically affected the STD and bias. The STD decreased with increasing terrain height, whereas the bias exhibited the opposite trend. The STD was largest in low-vegetation areas (4.0 K) but only approximately 2.0–3.0 K in pine forest and brush areas. These results show that the restored data have good prospects for retrieval and assimilation applications, and the STD and bias estimates also provide a reference for land-based AMSR-2 data assimilation.

**Keywords:** AMSR-2; radio frequency interference; PCA iterative restoration; community radiative transfer model; bias correction

### **1. Introduction**

A number of low-frequency microwave radiometers have been put into use (e.g., the Advanced Microwave Scanning Radiometer 2, AMSR-2), which have offered opportunities for the derivation of more direct surface parameter estimations [1–5]. Modern numerical weather prediction (NWP) systems rely on assimilating these satellite observations and retrievals to initialize the current state of the land surface accurately [6–11].

The continuous improvement of the assimilation effect has always been the goal of AMSR-2 data assimilation research [12,13]. During the data assimilation process, appropriate adjustment of the background field is determined by the observation error characteristics of the observation data and the background field, as well as some physical mechanisms. Due to the lack of true values, the STD of OMB is often used to characterize the observation error in data assimilation studies. Therefore, accurate STD estimations of OMB have an essential impact on the effect of data assimilation [14–16].

**Citation:** Shen, W.; Lin, Z.; Qin, Z.; Bai, X. Improved Estimation of O-B Bias and Standard Deviation by an RFI Restoration Method for AMSR-2 C-Band Observations over North America. *Remote Sens.* **2022**, *14*, 5558. https://doi.org/10.3390/rs14215558

Academic Editor: Steven C. Reising

Received: 2 September 2022 Accepted: 31 October 2022 Published: 4 November 2022

**Copyright:** © 2022 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

Bias estimation also plays a crucial role in satellite data assimilation (DA), since it is assumed that the differences between the background and observations satisfy an unbiased Gaussian distribution. In DA theory, systematic bias between satellite-observed and model-simulated radiances should be removed as a necessary condition for meeting this requirement [17,18]. Furthermore, the corrected brightness temperatures are also essential for other steps within DA, for example, cloud detection [19], which depends on the observation-minus-background (OMB) departures [18]. The proper treatment of such systematic biases is critical for the success of data assimilation systems [9,20–26].

Many studies have shown that both effective bias correction and STD estimation are significant prerequisites for successful data assimilation [9,25], but the current estimation methods mostly provide a uniform estimate over the ocean in consideration of the high spatial consistency of the ocean surface. However, the biggest difference between land and sea is the complex underlying surface characteristics of land.

In addition to large STDs caused by the artificial RFI, the variable underlying surface types over land cause considerable error in the surface emissivity. Moreover, a change in surface elevation will further complicate the simulation errors of brightness temperature caused by the surface temperatures and surface emissivity. Therefore, the assimilation of AMSR-2 data over land requires the targeted estimation of OMB standard deviations for different vegetation types and terrain heights on the basis of the current accuracy of the surface emissivity and surface temperature. Thus, the observation weight can be adjusted adaptively in the actual assimilation process and the effective assimilation of the AMSR-2 data over land can be realized.

However, the research on bias correction and STD estimation for AMSR-2 data has been restricted by RFI. AMSR2, which contains a low-frequency C-band (6.9-GHz and 7.3-GHz channels) and an X-band (10.7-GHz channel), is suitable for soil moisture monitoring [27–29]. The optimal low-frequency channel for data assimilation and retrieval using AMSR-2 is the 6.9-GHz channel, as this relatively low frequency responds to a deeper soil layer and is less attenuated by the atmosphere and vegetation than other channels [30]. However, the 6.9-GHz channel is also prone to interference by RFI signals, and the strong signal interference of RFI makes it impossible to effectively estimate the STD and bias of data from this channel, which makes the application of the channel data very difficult. Japan Aerospace Exploration Agency (JAXA) soil moisture products are mainly constructed based on the results retrieved from the 10.7-GHz channel due to the wide range of radio frequency interference (RFI) that occurs globally [28].

RFI refers to the contamination of the natural radiation signal received by a satellite microwave radiometer by signals from active sources used in human activities that operate in similar frequency bands [31]. The strong signals emitted from these interfering sources conceal the relatively weak thermal radiation signals from the Earth–atmosphere system, thus distorting the observations and causing significant increases in the brightness temperatures observed in the low-frequency bands [31,32]. Numerous studies have shown that RFI is a significant and non-negligible factor in low-frequency bands (such as the C-band and the X-band), causing an anomalous bias which affects the application of microwave radiometer data [7,33,34].

An RFI filter has been applied before data assimilation in a number of studies [7,9,35]. However, eliminating the RFI-affected observations from the low-frequency channel inevitably wastes a large amount of data and may leave large spatial gaps in the observations.

To compensate for the loss of a large amount of observation data caused by RFI, Shen et al. (2019) [36] proposed an RFI data restoration method based on principal component analysis (PCA), making full use of the channel correlation and the spatial continuity of observations.

Most studies on AMSR-2 assimilation directly discard the data affected by RFI. Although the restored data can fill large observational data gaps, their applicability in the assimilation process still requires further evaluation. Specifically, whether this restoration method retains the STD and bias characteristics of the observational data is crucial for follow-up research on targeted bias correction and observation weight settings in the assimilation process. Therefore, in this paper, we used the established PCA iterative restoration method to repair RFI-affected data and then evaluated the bias and STD characteristics before and after the restoration process for different vegetation types and terrain heights. We aim to provide more accurate bias and STD estimates for AMSR-2 data assimilation over land.

The paper is structured as follows. In Section 2, we briefly describe the AMSR-2 radiance data and the community radiative transfer model (CRTM), and give a brief introduction to the RFI detection and restoration methods. In Section 3, we present the validation of the restoration method and then compare and analyze the bias characteristics of the data before and after RFI restoration. Conclusions and discussions are summarized in Section 4.

### **2. Materials and Methods**

#### *2.1. AMSR-2 Brightness Temperature Observations*

AMSR-2, an instrument carried on GCOM-W1, is a 14-channel, dual-polarization, conically scanning passive microwave radiometer with 7 frequencies ranging from 6.9 to 89.0 GHz. This radiometer detects faint microwave emissions from the surface and atmosphere of Earth. The AMSR-2 observation frequencies are 6.9, 7.3, 10.65, 18.7, 23.8, 36.5, and 89.0 GHz, as listed in Table 1 [37]. The low-frequency channels at and below 10.65 GHz are usually used to retrieve various surface parameters, such as the soil moisture, vegetation water content, and snow thickness, as they are window channels with strong vegetation- and soil-penetrating abilities [2,3,5]. The surface incidence angle of AMSR-2 is maintained at 55 degrees, as this angle is less affected by sea surface winds and produces a large difference between the horizontal and vertical polarization results. The interval between two conical scans is 1.5 s. The satellite advances approximately 10 km along its track during this interval, and the width of one scan line is approximately 1450 km. This scanning process can cover 99% of the world in two days.

**Table 1.** AMSR2 characteristics and performance.

| Channel | Frequency (GHz) | Polarization | Bandwidth (MHz) | Resolution (km) | Sensitivity (K) |
|---|---|---|---|---|---|
| 1/2 | 6.925 | H/V | 350 | 35 × 62 | 0.34 |
| 3/4 | 7.3 | H/V | 350 | 34 × 58 | 0.43 |
| 5/6 | 10.65 | H/V | 100 | 24 × 42 | 0.7 |
| 7/8 | 18.7 | H/V | 200 | 14 × 22 | 0.7 |
| 9/10 | 23.8 | H/V | 400 | 15 × 26 | 0.6 |
| 11/12 | 36.5 | H/V | 1000 | 7 × 12 | 0.7 |
| 13/14 | 89.0 | H/V | 3000 | 3 × 5 | 1.2 |

The study domain is the central and southeastern United States (30°–40°N, 260°–285°W) where C-band AMSR-2 radiance data are seriously affected by RFI. This domain also includes a variety of temperate land cover types with complex topography [38]. Performing the experiments in this domain allowed us to test the impact of the PCA iterative restoration method on changeable surface types and terrain.

To certify that this restoration method had good stability and prospects for data assimilation, it was necessary to obtain a sufficiently vast data sample to conduct RFI identification and restoration. Therefore, in this study we selected the AMSR-2 L1R-class observed brightness temperature data covering the study domain for the one-year period of 2016 (1 January to 31 December).

#### *2.2. Background—CRTM Simulations*

Three fast radiative transfer models have been applied worldwide: the radiative transfer for TOVS (RTTOV) [39], the community radiative transfer model (CRTM), and the advanced radiative transfer model system (ARMS) [40]. In particular, the newly developed ARMS model can be applied to the assimilation of data from the Fengyun satellites and those sensors not included in existing radiative transfer models [40,41]. The CRTM was developed by the U.S. Joint Center for Satellite Data Assimilation (JCSDA) to provide fast and accurate satellite radiance simulations and Jacobian calculations at the top of the atmosphere under all weather and surface conditions [42]. Only the CRTM model was used in this study. It can be shown that the measured radiance in this case is a weighted average of the atmospheric temperature profile.

Figure 1 shows the weighting functions calculated from the atmospheric profiles over ocean (a) and at altitudes of 1000 m (b), 2000 m (c), and 3000 m (d) over land.

**Figure 1.** Weighting functions of AMSR-2 channels 1–14 calculated using the CRTM based on the atmospheric profile over ocean (**a**) and for terrain heights of 1000 m (**b**), 2000 m (**c**) and 3000 m (**d**) over land.

The weighting function *K*(*p*) can be calculated as follows:

$$K(p) = \frac{d\tau}{d\ln p} \tag{1}$$

where *τ* is the atmospheric transmittance and *p* is the pressure [43].
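Equation (1) can be evaluated numerically once a transmittance profile is available. The following is a minimal sketch, assuming the transmittance is given on discrete pressure levels and approximating the derivative by finite differences; the function name `weighting_function` and the exponential test profile are hypothetical and are not taken from the CRTM. The sign of the result follows Equation (1) as written and depends on the direction in which the transmittance is defined.

```python
import numpy as np

def weighting_function(pressure_hpa, transmittance):
    """Approximate K(p) = d(tau)/d(ln p) by finite differences.

    pressure_hpa  : 1-D array of pressure levels (any consistent unit)
    transmittance : 1-D array of transmittance values on those levels
    """
    p = np.asarray(pressure_hpa, dtype=float)
    tau = np.asarray(transmittance, dtype=float)
    # np.gradient accepts non-uniform coordinates; ln(p) is the coordinate here.
    return np.gradient(tau, np.log(p), edge_order=2)

# Hypothetical, nearly transparent profile (illustration only, not CRTM output):
# transmittance from each level to the top of the atmosphere.
p_levels = np.geomspace(100.0, 1000.0, 200)
tau_profile = np.exp(-0.05 * p_levels / 1000.0)
K = weighting_function(p_levels, tau_profile)
```

For this hypothetical profile the magnitude of `K` grows toward the highest pressure, which mirrors the near-surface peak described for the window channels.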

The weighting functions were calculated based on the atmospheric profiles using the CRTM. The profile information includes temperature, specific humidity and pressure profiles, as well as surface temperature and surface wind field information. It can be seen that the weighting functions change little for channels with frequencies less than 10.7 GHz, but for the other channels, the differences between the weighting functions at the ground and in the atmosphere gradually decrease with increasing terrain height. The weighting functions of the channels with different polarization modes at the same frequency were consistent [44]. The peak of the weighting function for each channel was located near the surface, as the microwave imager was mainly designed to improve our ability to detect surface parameters through remote sensing.

The amount of radiation detected by the microwave imager is represented by a weighted sum of surface radiation and atmospheric upward microwave radiation in different vertical layers near the ground; this value is mostly sensitive to the atmospheric temperature at the height of the maximum weighting function. The horizontal polarization channel and the vertical polarization channel with the same frequency have the same weighting function.

On the lowest-frequency channel (i.e., 6.9 GHz), the atmosphere contributes the least to the amount of observed radiation. The higher the frequency of the channel is, the wider the weighting function is. The weighting functions of the low-frequency channels are generally located inside those of the high-frequency channels, except for the 23.8- and 36.5-GHz channels. Thus, the brightness temperatures observed between different channels are highly correlated if the atmospheric contribution is significant [44].

#### *2.3. Model Input—ECMWF Reanalysis Data*

European Centre for Medium-Range Weather Forecasts (ECMWF) hourly reanalysis data, with a horizontal resolution of 0.25° × 0.25° and 37 vertical model levels, were used as the input for the CRTM. The input variables for the CRTM include the three-dimensional atmospheric temperature, water vapor mixing ratio, and air pressure, as well as the two-dimensional surface variables of soil moisture, surface skin temperature, wind speed, and wind direction.

Hourly ECMWF liquid water path (LWP) reanalysis data with a horizontal resolution of 0.25° × 0.25° were used to identify data collected under clear-sky conditions.

#### *2.4. OMB Calculation Method*

In this study, we used the International Geosphere-Biosphere Programme (IGBP) surface type dataset to identify the continental brightness temperature data. AMSR-2 pixels labeled as "water" in terms of their surface type were excluded, and pixels within 50 km of coastlines were further eliminated to remove mixed pixels containing water.

Although microwave radiation is able to penetrate some non-precipitating clouds, it is basically unable to penetrate deep precipitation clouds. Even in penetrable clouds, various particles affect microwave radiation through absorption, emission and scattering effects [45,46]. To prevent effects associated with brightness temperature simulation uncertainties in cloudy areas on the bias and STD estimation, in this study we only used data obtained over continental areas under clear-sky conditions.

In order to acquire the simulated brightness temperature at the locations and times of the AMSR-2 observed pixels, polynomial interpolation and linear interpolation were performed on the ECMWF reanalysis dataset in the horizontal and temporal dimensions, respectively. We processed the hourly ECMWF liquid water path (LWP) data in the same way. The brightness temperature data were considered "cloudy" when the cloud water path value was greater than 0.01 g/kg, thus allowing us to identify data collected under clear-sky conditions. This threshold follows Zou et al. (2017) [47], who used a total water and ice cloud content close to 0.01 kg m<sup>−2</sup> as the threshold for cloud detection.
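The temporal interpolation and clear-sky screening described above can be sketched as follows. This is only an illustration under stated assumptions: the function names `interp_in_time` and `clear_sky_mask` are hypothetical, the horizontal (polynomial) interpolation is assumed to have been done already, and the threshold value is the one quoted in the text with its units left as in the source.

```python
import numpy as np

# Threshold value quoted in the text; units are left as in the source.
LWP_THRESHOLD = 0.01

def interp_in_time(field_t0, field_t1, t0, t1, t_obs):
    """Linearly interpolate two hourly fields to the observation time.

    field_t0, field_t1 : values at the bracketing analysis times t0 and t1
    t_obs              : observation time, with t0 <= t_obs <= t1
    """
    w = (t_obs - t0) / (t1 - t0)
    return (1.0 - w) * np.asarray(field_t0, dtype=float) \
        + w * np.asarray(field_t1, dtype=float)

def clear_sky_mask(lwp_at_obs, threshold=LWP_THRESHOLD):
    """True where the interpolated cloud water path does not exceed the threshold."""
    return np.asarray(lwp_at_obs, dtype=float) <= threshold

# Example with made-up values: two pixels, observation halfway between two hours.
lwp_obs = interp_in_time([0.0, 0.02], [0.0, 0.04], t0=0.0, t1=60.0, t_obs=30.0)
is_clear = clear_sky_mask(lwp_obs)
```

Only pixels where `is_clear` is true would enter the bias and STD statistics.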

Due to the lack of true observed values, the observation errors in the brightness temperature data are mostly estimated by obtaining the standard deviations of the OMB (observation-minus-background) values [48–51]. In satellite data assimilation, both the observations (*O*) and model simulations (*B*) are assumed to be unbiased. Therefore, STDs can be expressed as:

$$
\Delta D_i = O_i - B_i
$$

$$
\sigma = \sqrt{\frac{\sum_{i=1}^{N} \left(\Delta D_i - \overline{\Delta D}\right)^2}{N - 1}}\tag{2}
$$

where $O_i$ and $B_i$ are the observed and simulated brightness temperature values at the same pixel, respectively, and $\Delta D_i$ is the OMB value of that pixel. $\overline{\Delta D}$ and $\sigma$ represent the mean and the standard deviation of the OMB values, respectively. $N$ is the number of continental pixels under clear-sky conditions.
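As a concrete illustration of Equation (2), the bias and STD of the OMB can be computed from paired observed and simulated brightness temperatures. This is a minimal sketch; the function name `omb_statistics` and the sample values are hypothetical.

```python
import numpy as np

def omb_statistics(obs_tb, bkg_tb, clear_land_mask):
    """Return (bias, std, n) of O - B over the selected pixels.

    The STD uses the N - 1 denominator of Equation (2).
    """
    obs = np.asarray(obs_tb, dtype=float)
    bkg = np.asarray(bkg_tb, dtype=float)
    mask = np.asarray(clear_land_mask, dtype=bool)
    d = (obs - bkg)[mask]          # OMB departures of the retained pixels
    bias = d.mean()                # mean departure
    std = d.std(ddof=1)            # ddof=1 gives the N - 1 denominator
    return bias, std, d.size

# Made-up brightness temperatures (K); the last pixel is screened out.
bias, std, n = omb_statistics(
    [280.0, 282.0, 279.0, 300.0],
    [281.0, 280.0, 280.0, 281.0],
    [True, True, True, False],
)
```

Applying the same function separately to subsets grouped by month, vegetation type, or terrain height would give stratified statistics of the kind discussed in the text.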

#### *2.5. RFI Detection Method—Normalized Principal Component Analysis (NPCA)*

The spatial correlations of natural-radiation-generated microwaves among different AMSR-2 instrument observation channels are often very high, as natural surfaces usually produce ultrawideband and smooth microwave radiation.

However, the brightness temperature of the low-frequency AMSR-2 channel increases significantly and abnormally in cases where RFI signals exist, resulting in weakened correlations between these RFI-affected channels and the other channels. The NPCA method, which takes advantage of this feature, can effectively identify RFI signals through a PCA decomposition of the constructed interference coefficient matrix, using the brightness temperature difference calculated between the low-frequency channel and the high-frequency channel (low minus high). On the other hand, the brightness temperature of the high-frequency channel can be strikingly reduced by the scattering effect of some natural targets (such as ice and snow), thus resulting in an inverse spectral difference gradient in continental regions covered with ice and snow. Therefore, Zou et al. (2013) [52] proposed an NPCA-based RFI detection method that has been shown to be effective for identifying RFI in data collected over snow- and ice-covered surfaces; this method is suitable for identifying RFI over complex continental areas with mixed winter snow and RFI signals or over non-scattering surfaces in summer.
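The idea behind the detection step can be illustrated with a simplified sketch: spectral-difference indices (low minus high) are normalized, decomposed by PCA, and reconstructed from the leading mode, and large reconstructed values of the target index point to likely RFI. The choice of indices, the normalization, the function name `npca_rfi_score`, and the threshold of 1.5 are all assumptions made for illustration; they are not the configuration of Zou et al. (2013) [52].

```python
import numpy as np

def npca_rfi_score(tb_target, tb_refs):
    """Leading-mode reconstruction of the first normalized spectral difference.

    tb_target : brightness temperatures of the channel being checked (n pixels)
    tb_refs   : list of higher-frequency channel arrays (each n pixels)
    Larger scores indicate a higher likelihood of RFI.
    """
    target = np.asarray(tb_target, dtype=float)
    # One index per reference channel: target (low) minus reference (high).
    idx = np.stack([target - np.asarray(tb, dtype=float) for tb in tb_refs])
    # Normalize each index to zero mean and unit variance across pixels.
    z = (idx - idx.mean(axis=1, keepdims=True)) / idx.std(axis=1, keepdims=True)
    cov = z @ z.T / z.shape[1]
    _, eigvecs = np.linalg.eigh(cov)   # eigenvalues in ascending order
    u1 = eigvecs[:, -1]                # leading PCA mode
    recon = np.outer(u1, u1 @ z)       # sign-invariant rank-1 reconstruction
    return recon[0]

# Synthetic example (all numbers hypothetical): RFI added to the first 50 pixels.
rng = np.random.default_rng(0)
n = 1000
surface = 270.0 + rng.normal(0.0, 3.0, n)
tb69 = surface - 3.0 + rng.normal(0.0, 0.5, n)
tb69[:50] += rng.uniform(30.0, 50.0, 50)
refs = [surface + off + rng.normal(0.0, 0.5, n) for off in (0.0, 2.0, 4.0)]
rfi_flag = npca_rfi_score(tb69, refs) > 1.5   # hypothetical threshold
```

Because the reconstruction uses the outer product of the leading eigenvector with itself, the score does not depend on the arbitrary sign returned by the eigen-solver.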

#### *2.6. RFI Restoration Method—Iterative PCA Method*

To compensate for the loss of a large amount of observation data caused by RFI, Shen et al. (2019) [36] proposed an RFI data restoration method based on principal component analysis (PCA). PCA can be used to extract observation information at different spatial scales into some independent PCA modes. The iterative PCA restoration method was established to obtain the correct brightness temperature of the RFI-affected point according to the correct observations around it.

For any observation that the NPCA method identifies as RFI-affected, a repair matrix is formed from the multichannel observations of the RFI-free points located on the same satellite orbit within an empirical range of 350 km around the target point. The matrix also contains the target point, whose brightness temperature is set to an initial value of 0.

PCA modes representing spatial features with different scales can be obtained through PCA decomposition of the matrix. For any data matrix B, the PCA modes correspond mathematically to the eigenvectors of the covariance matrix of B. The order of the PCA modes is determined based on the eigenvalues of the matrix corresponding to the eigenvectors. The higher-ranked modes correspond to larger eigenvalues, and larger eigenvalues correspond to spatial features with larger values of covariance. In relation to atmospheric variables, a large value of covariance often corresponds to more energy, and the energy of a large-scale weather system is generally much larger than that of a small-scale weather system. Thus, the PCA modes of meteorological variables often correspond to the weather variability features at different scales. More details can be found in Demšar et al. (2013) [53].
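The relation between the PCA modes and the covariance matrix can be written out explicitly. The notation below is introduced here for illustration only: $\mathbf{B}'$ is assumed to be the $N \times M$ data matrix ($N$ points, $M$ channels) after removal of the column means, $\mathbf{C}$ its covariance matrix, and $\mathbf{e}_k$, $\lambda_k$ the eigenvectors and eigenvalues ordered so that $\lambda_1 \ge \lambda_2 \ge \dots \ge \lambda_M$.

$$
\mathbf{C} = \frac{1}{N-1}\,\mathbf{B}'^{\mathsf{T}}\mathbf{B}', \qquad \mathbf{C}\,\mathbf{e}_k = \lambda_k\,\mathbf{e}_k
$$

$$
\mathbf{B}'_{(K)} = \sum_{k=1}^{K} \left(\mathbf{B}'\mathbf{e}_k\right)\mathbf{e}_k^{\mathsf{T}}, \qquad \frac{\lambda_k}{\sum_{j=1}^{M}\lambda_j} = \text{fraction of variance explained by mode } k
$$

Here $\mathbf{B}'_{(K)}$ is the reconstruction from the first $K$ modes. With $K = 1$ it retains only the largest-variance spatial structure, and with $K = M$ it reproduces $\mathbf{B}'$ exactly.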

The brightness temperature of the target point, determined by means of a large-scale spatial structure, can be obtained by iteratively repeating the reconstruction process of the first mode. The same iterative restoration process can be performed for the rest of the PCA modes, and when all PCA modes are included, the final iterative repair results are obtained.
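The iterative procedure can be sketched as a mode-by-mode reconstruction in which only the RFI-affected entries are updated. This is a simplified illustration rather than the implementation of Shen et al. (2019) [36]: the function name `iterative_pca_restore`, the convergence tolerance, and the number of modes are hypothetical, and the sketch works with anomalies about the mean of the RFI-free points, so that the initial value of 0 is applied to the anomaly (an assumption of this sketch).

```python
import numpy as np

def iterative_pca_restore(tb, missing, n_modes=2, max_iter=200, tol=1e-8):
    """Fill RFI-affected entries of a (points x channels) repair matrix.

    tb      : brightness temperatures, shape (n_points, n_channels)
    missing : boolean array of the same shape, True where RFI was detected
    """
    tb = np.array(tb, dtype=float)                  # work on a copy
    missing = np.asarray(missing, dtype=bool)
    if not missing.any():
        return tb
    # Channel means from points that are RFI-free in every channel.
    clean_rows = ~missing.any(axis=1)
    mean = tb[clean_rows].mean(axis=0)
    anom = tb - mean
    anom[missing] = 0.0                             # initial value of the target
    for k in range(1, n_modes + 1):                 # add one PCA mode at a time
        for _ in range(max_iter):
            u, s, vt = np.linalg.svd(anom, full_matrices=False)
            recon = (u[:, :k] * s[:k]) @ vt[:k]     # reconstruction from k modes
            change = np.max(np.abs(recon[missing] - anom[missing]))
            anom[missing] = recon[missing]          # update only the RFI entries
            if change < tol:
                break
    return anom + mean

# Made-up example: 30 points, 5 channels, one contaminated entry.
temps = np.linspace(270.0, 290.0, 30)
emis = np.array([0.90, 0.91, 0.92, 0.93, 0.95])
truth = np.outer(temps, emis)
observed = truth.copy()
observed[10, 0] = 320.0                             # simulated RFI spike
mask = np.zeros_like(observed, dtype=bool)
mask[10, 0] = True
restored = iterative_pca_restore(observed, mask)
```

In this idealized example the channels are perfectly correlated, so the restored value should approach the uncontaminated one; with real data the result depends on how many modes are retained and on how well the neighbouring points constrain the target.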

The proposed restoration method was used to recover observations affected by RFI with high precision [36]. The results of theoretical experiments and real data restoration experiments showed that the accuracy and effectiveness of the new method were much better than those of the Cressman method. Furthermore, the spatial continuity of observations in the recovered data was very well preserved by the new method.

### **3. Results**

#### *3.1. C-Band Continental RFI Characteristics*

The NPCA method, described in Section 2.5, was used for RFI detection on C-band AMSR-2 data in this study. Figure 2 shows the brightness temperatures obtained by the AMSR-2 instrument in the 6.9-GHz and 10.7-GHz vertical polarization channels (hereinafter referred to as 6.9-GHz-V and 10.7-GHz-V) over the U.S. in the autumn of 2016, as well as the spatial distribution of the RFI signals identified through NPCA (Figure 2a–c). The brightness temperature of the 6.9-GHz channel was generally lower than that of the 10.7-GHz channel for most of the continent, because the dielectric constant of water in soil and vegetation depends on frequency, resulting in a surface emissivity that increases with increasing frequency [30]. However, the presence of an RFI signal at 6.9 GHz caused the brightness temperature at this frequency to increase abnormally, thus resulting in a spectral difference with a sign opposite to that expected. The brightness temperatures of the 6.9-GHz channel in the concentrated areas of Virginia, North Carolina, Texas, and other states were far above 300 K and significantly higher than those of the higher-frequency 10.7-GHz channel, with notable discontinuities in their horizontal spatial distribution. In the identification results obtained using the NPCA method, the larger the value was, the higher the likelihood of RFI. As shown in Figure 2c, regions with abnormally high brightness temperatures (shown in Figure 2a) were detected as having significant RFI signals.

**Figure 2.** Spatial distributions of brightness temperatures of the 6.9-GHz-V channel (**a**) and the 10.7-GHz-V channel (**b**) over the U.S. continental area in the autumn of 2016; RFI signals identified by the NPCA for the 6.9-GHz-V channel are shown in (**c**).

The NPCA method was used to detect RFI signals in the horizontal and vertical AMSR-2 6.9-GHz channels over the study domain in 2016, and daily variation curves of the proportion of land scanning points affected by RFI in the 6.9-GHz-V and 6.9-GHz-H channels were obtained for the study domain (Figure 3). In Figure 3, the red line represents the vertical polarization channel and the blue line represents the horizontal polarization channel. The figure shows that both the horizontal and vertical channels in the study region encountered continuous RFI signals throughout the year. In particular, the degree of interference in the vertical channel was obviously greater than that in the horizontal channel. Thirty to forty percent of the data were not available for data assimilation or retrieval applications because of RFI.

**Figure 3.** Daily variation curves of the proportion of pixels affected by RFI in the study domain for the 6.9-GHz-H (blue) and 6.9-GHz-V (red) channel in 2016.

#### *3.2. RFI Restoration and Validation*

Figure 4 shows the spatial distributions of the mean observed (a) and restored (b) brightness temperatures of the 6.9-GHz-V channel and the mean observed brightness temperatures of the 7.3-GHz-V (c) and 10.7-GHz-V (d) channels in autumn 2016. Comparing Figure 4a,b, it can be seen that the abnormally high brightness temperatures caused by RFI were well repaired. The overall geographic distribution of the brightness temperature showed good spatial continuity after this restoration, and the spatial distribution was consistent with the natural surface emission characteristics; in addition, the small-scale brightness temperature characteristics were restored as well.

Compared with AMSR-E, AMSR-2 carries two additional channels (horizontal and vertical polarization) at 7.3 GHz, close to the existing 6.9-GHz channels. Anne et al. (2015) showed that RFI in the 7.3-GHz channel was significantly reduced in the U.S., Japan, and India, where the 6.9-GHz channel was severely contaminated. As can be seen from Figure 4c, only a few regions showed abnormally high brightness temperatures over 300 K in the 7.3-GHz-V channel, such as northern West Virginia, central and eastern Alabama, and southern Kansas. However, in the corresponding regions of the 6.9-GHz-V channel, there were no abnormally high brightness temperatures. The brightness temperatures of 6.9-GHz-V were generally lower than those of 10.7-GHz-V, except in the RFI-affected regions. The frequencies of the 6.9-GHz channel and the 7.3-GHz channel are very close, so the brightness temperatures of the 7.3-GHz channel could be used to qualitatively verify the correctness of the repaired brightness temperatures. It can be seen that the spatial structure of the restored brightness temperature was similar to that of the 7.3-GHz channel. The low-value center in the middle of the region was well reproduced, and the spatial structures of three brightness temperature centers in the northeast of the United States, which were severely impacted by RFI, were also well restored.

**Figure 4.** Spatial distributions of mean observed (**a**) and restored (**b**) brightness temperatures of the 6.9-GHz-V channel and the observed 7.3-GHz-V (**c**) and 10.7-GHz-V (**d**) channels in autumn 2016.

Figure 5 shows the distribution of the brightness temperature difference between the 6.9-GHz-V channel and the two higher-frequency channels, 7.3-GHz-V (a) and 10.7-GHz-V (c). Figure 5b,d are the same as Figure 5a,c except that they use the restored brightness temperatures of the 6.9-GHz-V channel. RFI led to an abnormal increase in the brightness temperature values, reversing the sign of the spectral differences. Therefore, the larger the positive value of the spectral difference, the more strongly the 6.9-GHz-V channel was affected by RFI. As can be seen in Figure 5a,c, a large area of this region was affected by RFI, and the differences were even greater than 10 K. As can be seen in Figure 5b,d, this difference was largely within 5 K after the repair process. This indicates that the abnormal brightness temperatures were well corrected and supports the effectiveness of the restoration method.

In consideration of the relatively high percentage of RFI signals in the 6.9-GHz-V channel (the red curve in Figure 3), in this study, we focused on the observation bias and STDs of the 6.9-GHz-V channel in the subsequent analysis.
