#### *3.3. Comparison of Simulated Brightness Temperature under Clear- and Cloudy-Sky Conditions over Ocean*

The hourly cloud liquid water paths from ERA5 reanalysis data were used to separate clear-sky data from cloudy data. To verify the accuracy of the cloud detection process and the reliability of the CRTM simulation, the ocean surface within the study area was selected for a comparison of OMB characteristics between clear-sky and cloudy areas.

Here, the monthly OMB standard deviations in clear-sky (blue line) and cloudy-sky (red line) areas were calculated separately (Figure 6). The OMB standard deviation in the cloudy area was approximately 6.0 K, much larger than that obtained for the clear-sky area, and it varied noticeably from month to month. The largest standard deviation, 7.26 K, was observed in June, whereas the smallest value was obtained in December. This may be due to the prevailing convective weather in summer, which produces more deep clouds. In contrast, the simulation errors for the clear-sky area were much smaller: the standard deviation remained at around 0.9 K, with a minimum of approximately 0.6 K from June to July. The stable standard deviation in the clear-sky areas also supports the effectiveness of the cloud detection method.

**Figure 5.** Spectral difference between the observed 6.9-GHz-V channel and the 7.3-GHz-V (**a**) and 10.7-GHz-V (**c**) channels, respectively, and the restored 6.9-GHz-V channel and observed 7.3-GHz-V (**b**) and 10.7-GHz-V (**d**) channels in autumn 2016.

**Figure 6.** Monthly variations of OMB standard deviations (solid line) and bias (dotted line) for data in clear-sky (blue line) and cloudy-sky (red line) conditions over ocean from the 6.9-GHz-V channel in 2016.

There was also a large discrepancy between the monthly OMB bias in clear-sky (blue dotted line) and cloudy-sky (red dotted line) areas over ocean. The simulation was relatively accurate in clear-sky conditions: the bias was generally below 1 K, with a minimum bias of zero in summer. The bias under cloudy conditions was significantly larger overall, generally around 3 K, with a maximum value of 3.8 K in September and October. From January to June, the bias changed only slightly and was stable around 3.7 K.
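The clear-sky/cloudy-sky statistics described above can be sketched in a few lines of NumPy. This is a minimal illustration, not the processing chain used in the study: the function name `monthly_omb_stats`, the array layout, and in particular the cloud liquid water path threshold `clwp_threshold` are assumptions, since the text does not state the threshold used for cloud detection.

```python
import numpy as np

def monthly_omb_stats(obs_tb, sim_tb, month, clwp, clwp_threshold=0.01):
    """Monthly bias and standard deviation of O-B, split into clear and cloudy.

    obs_tb, sim_tb : observed and simulated brightness temperatures (K)
    month          : month index (1-12) of each observation
    clwp           : collocated cloud liquid water path (kg m^-2)
    clwp_threshold : HYPOTHETICAL threshold; the source does not give one
    """
    obs_tb = np.asarray(obs_tb, dtype=float)
    sim_tb = np.asarray(sim_tb, dtype=float)
    month = np.asarray(month)
    clwp = np.asarray(clwp, dtype=float)

    omb = obs_tb - sim_tb              # observation minus background
    cloudy = clwp > clwp_threshold     # simple threshold-based cloud flag

    stats = {}
    for m in np.unique(month):
        for label, mask in (("clear", ~cloudy), ("cloudy", cloudy)):
            sel = (month == m) & mask
            n = int(sel.sum())
            if n < 2:                  # a sample std needs at least two values
                continue
            stats[(int(m), label)] = (
                float(omb[sel].mean()),        # bias
                float(omb[sel].std(ddof=1)),   # sample standard deviation
                n,
            )
    return stats
```

Each entry of the returned dictionary holds the bias, standard deviation, and sample count for one month and one sky condition, which is the information plotted as the dotted and solid lines in Figure 6.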

#### *3.4. Standard Deviation over Land*

Figure 7 depicts the averaged OMB values before and after the restoration for the 6.9-GHz-V channel within the selected domain in autumn 2016. Without restoration, the RFI area exhibited a large bias (Figure 7a), generally exceeding 15.0 K and exceeding 100.0 K at the maximum point. The simulation errors in the RFI-affected area were significantly reduced following the repair process (Figure 7b), with errors generally within 5.0 K, apart from some systematic deviations in high-terrain areas. Using the training data sets obtained under RFI-free conditions from AMSR-E, Wu et al. (2011) [54] developed a linear relationship between the measurements obtained at 10.7 GHz and those at 18.7 or 6.9 GHz. The RFI-affected brightness temperatures were then corrected based on the RFI-free measurements at 18.7 or 10.7 GHz via this linear relationship. That RFI-correction algorithm was able to produce brightness temperatures at AMSR-E frequencies with a root mean square (RMS) error of no more than 1.5 K. In this study, we focused on the 6.9-GHz-V channel of AMSR-2. The standard deviation of the OMB of this channel was 6.7 K, and it decreased to 4.0 K after restoration using the PCA iterative method.
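For orientation, the linear cross-channel correction attributed to Wu et al. (2011) above can be sketched as a least-squares fit on RFI-free pairs followed by a prediction for flagged pixels. This is only an illustration of that general idea, not the iterative PCA restoration used in this study and not the original authors' code; the function names and the single-predictor form are assumptions.

```python
import numpy as np

def fit_linear_relation(tb_ref_clean, tb_target_clean):
    """Fit tb_target ~ slope * tb_ref + intercept on RFI-free training pairs."""
    # np.polyfit returns coefficients from highest degree to lowest
    slope, intercept = np.polyfit(tb_ref_clean, tb_target_clean, deg=1)
    return float(slope), float(intercept)

def correct_rfi(tb_target, tb_ref, rfi_mask, slope, intercept):
    """Replace RFI-flagged target-channel values with the linear prediction."""
    tb_target = np.asarray(tb_target, dtype=float)
    tb_ref = np.asarray(tb_ref, dtype=float)
    rfi_mask = np.asarray(rfi_mask, dtype=bool)

    corrected = tb_target.copy()       # unflagged pixels are left untouched
    corrected[rfi_mask] = slope * tb_ref[rfi_mask] + intercept
    return corrected
```

In this sketch, `tb_ref` would be an RFI-free reference channel (for example 10.7 GHz) and `tb_target` the contaminated channel (for example 6.9 GHz); how the RFI mask is obtained is outside the scope of the example.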

**Figure 7.** Averaged OMB before (**a**) and after (**b**) the restoration for the 6.9-GHz-V channel in autumn 2016.

Although the spatial continuity of the brightness temperature data was clearly improved through the restoration process, the impact of this restoration method on the standard deviations needs to be clarified before the method can be applied in data assimilation. Figure 8 shows the standard deviation and mean values of the OMB before (magenta line) and after the restoration (red line) for RFI-affected data of the 6.9-GHz-V channel. For comparison, the undisturbed data (blue line) are also shown. The pink and gray bars in Figure 8a represent the numbers of RFI-affected and RFI-free pixels, respectively. The OMB standard deviation of the RFI-free data was approximately 4.0 K from January to May and remained at approximately 3.0 K from June to December. The standard deviation for RFI-affected data was significantly higher than that of the RFI-free data, at approximately 8.0 K, with a minimum of 6.4 K obtained in June. This comparison shows that the OMB STD values of RFI-affected data were significantly reduced after the restoration. The OMB STD of the restored data in each month was similar to that obtained from the RFI-free data, and even the monthly variation characteristics were effectively reproduced.

As seen from the bias variation shown in Figure 8b, the bias of the RFI-free data was within the range of ±3.0 K. It varied clearly with the season, from about 2 K in winter to about −2 K in summer, decreasing gradually from winter to summer. The bias of RFI-affected data was significantly higher than that of the RFI-free data, with high values reaching 9 K and low values above 3 K. It showed the same seasonal variation characteristics as the RFI-free data. After the restoration, however, the bias derived for each month was very close to that obtained from the RFI-free data, and the seasonal variation characteristics were effectively reproduced, further supporting the rationality of the restoration method. The land surface temperature had a strong impact on the simulated brightness temperatures. Some previous studies have pointed out that there are obvious seasonal biases in ERA5 LST, attributed to uncertainty in land surface variables such as the leaf area index and land cover type [55]. This is a possible reason for the seasonal differences in OMB biases.

**Figure 8.** Monthly variations of the OMB standard deviation (**a**) and bias (**b**) obtained from the RFI-affected data before (magenta) and after (red) the restoration process and from the RFI-free data (blue) in 2016. The column bars represent the counts of considered data.

#### *3.5. Variation Characteristics of STDs with Terrain Height and Surface Type*

In contrast with the marine domain, which has uniform underlying surface properties, the underlying surfaces in land areas have two important characteristics: significant discrepancies in topographic height and changeable surface types. The STDs and bias estimation results obtained in land areas are thus inevitably affected by these two characteristics.

The biggest discrepancy between the assimilation of microwave imaging data over the land surface and the ocean is the complexity of the land surface's emissivity. In the microwave range, the land emissivity model is complicated as the land emissivity of each surface type depends on different parameters, such as soil moisture, topography, and the presence and physical properties of vegetation or snow [56]. The surface emissivity error may be significantly different for different land surface types, which will inevitably lead to inconsistency in the brightness temperature simulation bias observed over different land surface types. Therefore, it is necessary to estimate the STDs according to different surface types for the assimilation of AMSR-2 data over land. In addition, the errors of the surface temperature and wind field are much larger than those of variables in the upper atmosphere, so it is particularly important to estimate the OMB bias and STD according to the land cover type. After that, the effective bias correction and observation error specification can be achieved in the assimilation, to effectively account for the observation information of different vegetation types.

To increase the representativeness of the statistical results, the OMB values of the AMSR-2 6.9-GHz-V channel in the study domain were converted into gridded data with a horizontal resolution of 0.25° × 0.25°. The spatial distributions of the standard deviations and bias before and after the restoration for 2016 within the analyzed land area are shown in Figure 9. For comparison, the spatial distributions of terrain and vegetation types are also shown in the figure.
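The gridding step can be illustrated with a short NumPy sketch that bins point OMB values onto a regular 0.25° grid and computes the per-cell mean (bias), standard deviation, and count. The function name, the domain arguments (`lat_min`, `lon_min`, `n_lat`, `n_lon`), and the choice of a population standard deviation are assumptions made for illustration; the text does not describe the gridding procedure in detail.

```python
import numpy as np

def grid_omb_stats(lat, lon, omb, lat_min, lon_min, n_lat, n_lon, res=0.25):
    """Bin point OMB values onto a regular grid; return mean, std, count."""
    lat = np.asarray(lat, dtype=float)
    lon = np.asarray(lon, dtype=float)
    omb = np.asarray(omb, dtype=float)

    # Row/column index of the grid cell that contains each observation
    i = np.floor((lat - lat_min) / res).astype(int)
    j = np.floor((lon - lon_min) / res).astype(int)
    inside = (i >= 0) & (i < n_lat) & (j >= 0) & (j < n_lon)

    flat = i[inside] * n_lon + j[inside]   # flattened cell index
    vals = omb[inside]
    n_cells = n_lat * n_lon

    count = np.bincount(flat, minlength=n_cells).astype(float)
    s1 = np.bincount(flat, weights=vals, minlength=n_cells)
    s2 = np.bincount(flat, weights=vals ** 2, minlength=n_cells)

    # Empty cells produce 0/0, which is deliberately left as NaN
    with np.errstate(invalid="ignore", divide="ignore"):
        mean = s1 / count
        var = s2 / count - mean ** 2       # population variance per cell
    std = np.sqrt(np.clip(var, 0.0, None)) # clip tiny negative round-off

    shape = (n_lat, n_lon)
    return mean.reshape(shape), std.reshape(shape), count.reshape(shape).astype(int)
```

Cells without observations are returned as NaN, so they can be masked when the maps are plotted.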

**Figure 9.** Spatial distributions of the terrain heights (**a**), surface types (**b**), standard deviations (**c**,**d**), and bias (**e**,**f**) before (**c**,**e**) and after the restoration (**d**,**f**) in the analyzed land area.

The topographical distribution in Figure 9a shows that the topography in the study domain was complex, exhibiting a large gradient in which eastern areas were higher than western areas. The elevation of the Appalachian Mountains in the eastern study domain was relatively high, ranging from approximately 1000 to 1500 m. The elevations in the Mississippi River Plain in the west and the Gulf Coast Plain in the south were lower in comparison. As seen from the surface-type distribution (Figure 9b), the study domain mainly consisted of pine trees, brush forests, and a small area of low vegetation. The STD of all observed data in the 6.9-GHz-V channel (Figure 9c) was abnormally large (above 4.0 K), indicating that the whole study domain was seriously affected by RFI before the restoration was applied. The largest STD in the domain, approximately 7.0 K, was found in the Mississippi River Plain.

After the restoration of the disturbed brightness temperature data, the standard deviations in this region were significantly reduced (Figure 9d). The standard deviation in the Mississippi River Plain area was approximately 3.0 K, and it was reduced to approximately 2.0 K in the other areas. The standard deviation in the Appalachian Mountain region decreased to less than 1.0 K after the restoration. This value was lower than that of the plain region because the low-frequency AMSR-2 observations are highly sensitive to soil moisture variations, which were relatively small in the mountainous region. The observation bias was correspondingly large due to the strong RFI effect, as seen from its distribution (Figure 9e), with the highest mean value, approximately 8.0 K, located in the Appalachian Mountains. The bias in the plain area was relatively low, with values between −3.0 and 3.0 K. After the restoration, the biases in most areas decreased significantly, approaching 0.0 K, but a positive bias remained in high-terrain areas, whereas a negative bias persisted in coastal and central low-terrain areas.

Figure 9 shows that the spatial distribution of bias and standard deviations was very similar to that of vegetation types. In order to further clarify the impact of RFI restoration on different vegetation types, Figure 10 presents the OMB mean values and standard deviations before and after the restoration of RFI-affected data under different vegetation types.

**Figure 10.** Comparison of standard deviation (**a**) and mean values (**b**) of OMB before (red bar) and after (blue bar) the restoration of RFI-affected data for the AMSR-2 6.9-GHz-V channel in 2016 over brush, pine forest, and low vegetation within the selected domain.

Figure 10a shows that the STDs were clearly reduced after RFI restoration for all surface types. Among these, the restoration effect in brush-covered areas was the most significant, with the STD decreasing from 8.0 K to about 3.6 K. The STD decreased from 6.3 K to about 3.6 K within pine-forest-covered areas. The STD of low-vegetation regions was the highest after restoration, around 4.8 K. This is because increased vegetation cover and surface roughness reduce the sensitivity of microwave observations to soil moisture, leading to greater uncertainty in the background simulation [30].

The bias of the restored data was also significantly lower than before. The bias of pine and brush forest regions decreased from around 4.0 K to about 0.0 K. Over low-vegetation areas, the bias changed from 1 K to −2.0 K after the repair process.

In addition to vegetation types, the rapidly changing topographic height is another important feature that distinguishes the land surface from the ocean surface. To further evaluate the characteristics of data errors over land, the bias and STDs are also presented here for data under different terrain heights and different vegetation types (Figure 11). In contrast with Figure 10, the statistics here include all RFI-affected and RFI-free data, so that the statistical results can be directly applied to the actual data assimilation process.

**Figure 11.** Variation curves of the OMB mean values (**b**,**d**,**f**) and standard deviations (**a**,**c**,**e**) characterizing the 6.9-GHz-V AMSR-2 channel with terrain height in the U.S. in 2016. The red and blue lines represent the pre- and post-repair results, respectively.

Figure 11 shows the variation characteristics of the STDs and bias obtained before and after the restoration with varying terrain heights and surface types, with pink reticulated bars indicating the amount of data processed. In this study, we analyzed three major ground types that corresponded to large amounts of data, namely, pine forests, brush regions, and low vegetation. The results revealed obvious differences in the influence of RFI on the brightness temperatures for different vegetation types at different terrain heights. In pine-forest- and brush-covered areas, the restoration method clearly improved the STD and bias values at all elevations. The STD reached 8.0 K before the restoration, whereas it remained between 2.0 and 3.0 K following the restoration, decreasing gradually with increasing terrain height. The bias increased with increasing terrain height, a trend opposite to that of the STDs, and it increased rapidly with terrain height below 700 m. At elevations above 700 m, the bias was reduced from 8 K to generally below 4.0 K following the repair process. In the area covered by low vegetation (Figure 11e,f), RFI was most serious at elevations below 500 m, where the STD reached 12.0 K; this value decreased to approximately 4.0 K following the restoration. In the areas with elevations over 500 m, the STDs obtained before and after the restoration were similar, both approximately 4.0 K, and they barely changed with terrain height. The maximum bias obtained for elevations below 500 m before restoration was 6.0 K, and the bias stabilized between −4.0 K and −2.0 K following the restoration process.
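A stratification of this kind, by surface type and terrain-height bin, can be sketched as follows. This is an illustrative outline under stated assumptions: the function name `stratified_omb_stats`, the label strings, and the bin edges are hypothetical, and the text does not specify the elevation bin width actually used for Figure 11.

```python
import numpy as np

def stratified_omb_stats(omb, elevation, surface_type, bin_edges):
    """Bias, standard deviation, and count of OMB per surface type and height bin.

    bin_edges : increasing elevation bin edges in metres (HYPOTHETICAL values;
                the source does not state the bins used)
    """
    omb = np.asarray(omb, dtype=float)
    elevation = np.asarray(elevation, dtype=float)
    surface_type = np.asarray(surface_type)
    bin_edges = np.asarray(bin_edges, dtype=float)

    # digitize returns k with edges[k-1] <= x < edges[k]; shift to 0-based bins
    bin_idx = np.digitize(elevation, bin_edges) - 1

    out = {}
    for st in np.unique(surface_type):
        for b in range(len(bin_edges) - 1):   # out-of-range samples are ignored
            sel = (surface_type == st) & (bin_idx == b)
            n = int(sel.sum())
            if n < 2:                         # too few samples for a sample std
                continue
            key = (str(st), (float(bin_edges[b]), float(bin_edges[b + 1])))
            out[key] = (
                float(omb[sel].mean()),       # bias
                float(omb[sel].std(ddof=1)),  # sample standard deviation
                n,
            )
    return out
```

Running the function once on the pre-restoration OMB values and once on the post-restoration values would give the two curves compared in each panel, with the counts corresponding to the data-amount bars.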

#### **4. Discussion**

The data obtained from microwave radiometer observations have important application value, especially in the case of low-frequency-channel observations, which play a crucial role in the surface parameter retrieval and data assimilation required in NWP; however, the effects of large-range RFI signals in these low-frequency channels lead to a large amount of observation data being wasted.

To obtain more effective observational data that are applicable to retrieval and assimilation tasks, an iterative PCA method was proposed to repair the RFI-affected data. Although it is clear that the spatial continuity of the brightness temperature data was improved through the restoration, the question of whether this restoration method can retain the STD and bias characteristics of the observational data is crucial for subsequent targeted bias-correction and observational weight-setting research in data assimilation.

Based on AMSR-2 observations from 1 January to 31 December 2016, we used the NPCA method to identify RFI-affected data in the C-band (6.9 GHz) over the central and southeastern United States and then applied an iterative PCA method to repair the corrupted data.

Finally, the STD and bias characteristics of the data obtained before and after the repair and of the RFI-free data collected from the 6.9-GHz-V channel were analyzed in detail. In particular, the variation of the STD and bias over land with terrain height and surface type was examined, providing a reference for subsequent data assimilation tasks involving low-frequency-channel data from AMSR-2 in land areas.

The long-term restoration results obtained herein show that the applied restoration method was not affected by the terrain height, vegetation type, or seasonal differences. Therefore, the next step will involve assimilating the restored brightness temperatures into numerical models to explore the impacts of the brightness temperature restoration process on the data assimilation.
