1. Introduction
Sea surface temperature (SST) is a geophysical quantity of fundamental importance in the Earth system, since it is a controlling factor in air-sea fluxes [1,2] and therefore profoundly influences atmospheric and oceanographic thermodynamics [3], dynamics [4,5] and coupled interactions [6]. Near-real-time estimation of global SST at adequate spatial resolution is crucial to weather forecasting by numerical weather prediction (NWP, [7]), and errors in knowledge of SST can materially degrade weather forecast skill [8,9]. SST is used as the measure of Earth’s surface temperature over oceans [10,11,12] and is therefore a key metric of climatic variability and change whose global evolution can be estimated back to the mid-19th century [12]. Historic observations of SST are relatively sparse prior to the satellite era [13], and centennial-scale reconstructions draw heavily on the relative completeness and detail of remotely sensed SST [14]. The series of Advanced Very High Resolution Radiometers (AVHRRs) has been operated since 1979 with channels supporting SST estimation, using differential-absorption-based techniques to account for the influence of the atmosphere on infra-red (IR) brightness temperatures [15,16,17,18]. Thus, reprocessing of multi-decadal satellite SST datasets has concentrated on IR sensors, namely, the AVHRRs [19] and Along Track Scanning Radiometers (ATSRs; [20]). Merchant et al. [21] more recently used both AVHRRs and ATSRs jointly to develop a blended, gap-filled analysis for climate applications, analogous to the SST analyses produced operationally for NWP [9,22], but with more attention to long-term stability.
Microwave (MW) observations of SST were first attempted with the Scanning Multichannel Microwave Radiometer (SMMR) launched in 1978, and in 1997 the Tropical Rainfall Measuring Mission’s (TRMM’s) Microwave Imager began delivering SSTs of useful accuracy across the tropics. The record of globally SST-capable microwave radiometers is shorter, having commenced with the Advanced Microwave Scanning Radiometer-E (AMSR-E) in 2002. MW radiometry for SST has strengths and weaknesses relative to IR radiometry. The primary advantage is coverage [23]: MW SSTs are available over the open ocean under non-precipitating cloud cover, whereas both precipitation and cloud cover strongly limit the sampling available in the IR. MW SST is not available near coasts, near sea-ice and in areas of persistent radio-frequency interference (RFI). The spatial resolution of MW SST is typically 50 km [24] compared to 1 km for IR, limiting the precision with which thermal ocean fronts can be located in MW imagery. The potential for confounding of SST signals by wind variability (via emissivity effects) is greater for MW SSTs than for IR SSTs. Nonetheless, since cloud cover is persistent in some seasons in climatologically significant regions, the coverage advantage of MW radiometry is such that the blending of MW and IR SSTs for climate data records should be considered.
AMSR2 is a microwave radiometer flying on board the Japan Aerospace Exploration Agency’s (JAXA) Global Change Observation Mission 1st-Water (GCOM-W1) satellite, launched in 2012. This forms part of the “A-train” [25] series of satellites that fly in the same orbit separated by a few minutes. AMSR2 observes at 6.9, 7.3, 10.65, 18.7, 23.8, 36.5 and 89.0 GHz in both H and V polarizations. The 7.3 GHz channel is an addition compared to the predecessor AMSR-E instrument on Aqua and improves detection of radio frequency interference (RFI) from artificial sources.
This paper provides an information content analysis for the AMSR2 radiometer. Our aims are to establish the fundamental limits of retrieval uncertainty for AMSR2 SST retrieval in the framework of optimal estimation (OE), and to inform strategies for channel selection when developing a new MW SST product, ultimately intended for joint use with IR products in a climate data record. A previous study with similar objectives [26] neglected the importance of variable cloud liquid water in MW SST retrieval and did not address the prioritisation of channels; both are addressed here.
In Section 2, we review some of the underlying physics relevant to MW SST retrieval, contrasting the MW case with the IR case. Section 3 reviews some background theory relating to information content analysis and OE. These are applied to SST retrieval from the AMSR2 instrument in Section 4 and Section 5.
2. Physical Considerations
Microwave thermal emission from the ocean surface occurs in the Rayleigh–Jeans tail of the Planck function. This is in contrast to the thermal IR, where the peak of the Planck function is in the 10.5–12.5 μm window that is often used for SST remote sensing. The ocean surface emissivity ($\epsilon$) for the low-frequency AMSR2 channels is ∼0.5, compared to an emissivity of ∼1 in the IR. The intensity of MW radiation at the top of atmosphere (TOA) is low, which is mitigated somewhat by the ability to use large (∼m) antennae for microwave instruments. Despite this, the effective noise equivalent temperature difference (NEdT) is larger in the MW region than in the IR. The longer wavelengths involved also give rise to diffraction effects that limit the spatial resolution of AMSR2 to ∼50 km. The MW emissivity of land and ice is significantly higher than that of the ocean. With contemporary instruments, this leads to side-lobe contamination of the ocean MW signal close to coasts and ice edges and prevents accurate SST retrievals in these areas. There is also a larger change in emissivity with polarisation over ocean compared to ice. This can be exploited for ice detection and classification [27].
A significant advantage of using MW measurements when attempting to achieve global coverage of SST is that microwaves can penetrate cloud, so they can observe the surface signal under cloudy conditions where IR instruments cannot. This is particularly useful in persistently cloudy regions such as the winter high latitudes. Here, the restriction of IR instruments to clear-sky conditions decreases the temporal frequency of the observations and thus increases sampling errors.
This study utilises simulations of AMSR2 brightness temperatures by the fast radiative transfer model “Radiative Transfer for TOVS” (RTTOV; whose acronym has evolved into a name). We use the v11.3 software package [28,29,30,31] to carry out the simulations in Section 4 and Section 5. In the MW region, this uses the FAST EMissivity (FASTEM) code to calculate the surface emissivity which, for version 4, is described by Liu et al. [32]. In this study, we use the latest version, FASTEM-6. The MW emissivity model involves a complex calculation, which we summarise below.
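There are several models for the emissivity and permittivity of seawater [33,34,35,36,37,38,39]. FASTEM-6 uses a method that starts from a formulation for the permittivity based on Ellison et al. [33]. This describes the complex permittivity ($\varepsilon$) with a double Debye model:
$$\varepsilon = \varepsilon_\infty + \frac{\varepsilon_s - \varepsilon_1}{1 - i\,2\pi\nu\tau_1} + \frac{\varepsilon_1 - \varepsilon_\infty}{1 - i\,2\pi\nu\tau_2} + \frac{i\,\sigma}{2\pi\varepsilon_0\nu}$$
Here, $\varepsilon_0$ is the permittivity of free space and $\nu$ the frequency of the electromagnetic wave. The other parameters have been derived by fitting to measurements: $\varepsilon_\infty$ has a linear dependence on temperature; $\varepsilon_s$, $\varepsilon_1$, $\tau_1$ and $\tau_2$ are represented by polynomial fits to temperature ($T$) and salinity ($S$); and the conductivity $\sigma$ has a mixed polynomial and exponential dependence on temperature and salinity.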
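As an illustration of how a double Debye model is evaluated, the following minimal sketch computes a complex permittivity from a set of Debye parameters. The function name `double_debye`, the sign convention (time dependence exp(-iωt), so that losses give a positive imaginary part) and all numerical parameter values are assumptions made for this example; they are not the fitted coefficients of Ellison et al. or FASTEM-6, which depend on temperature and salinity.

```python
import math

EPS0 = 8.8541878128e-12  # permittivity of free space, F/m


def double_debye(nu_hz, eps_inf, eps_s, eps_1, tau_1, tau_2, sigma):
    """Complex relative permittivity from a double Debye model.

    Assumes the exp(-i*omega*t) convention, so losses give a positive
    imaginary part. All fitted parameters are supplied by the caller.
    """
    omega = 2.0 * math.pi * nu_hz
    relax_1 = (eps_s - eps_1) / (1.0 - 1j * omega * tau_1)
    relax_2 = (eps_1 - eps_inf) / (1.0 - 1j * omega * tau_2)
    conduction = 1j * sigma / (omega * EPS0)
    return eps_inf + relax_1 + relax_2 + conduction


# Placeholder values of a plausible order of magnitude for sea water.
# They are NOT the Ellison et al. / FASTEM-6 fitted coefficients.
params = dict(eps_inf=4.0, eps_s=75.0, eps_1=6.0,
              tau_1=1.0e-11, tau_2=1.0e-12, sigma=5.0)

for ghz in (6.9, 10.65, 36.5, 89.0):
    eps = double_debye(ghz * 1e9, **params)
    print(f"{ghz:6.2f} GHz  eps = {eps.real:6.2f} + {eps.imag:6.2f}i")
```

In a full emissivity model, each of the keyword arguments would itself be a function of temperature and salinity.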
The modelled permittivity is used to calculate Fresnel reflectivities ($R_p$, where $p$ is $v$ or $h$ for the vertical and horizontal polarization components, respectively) from the standard Fresnel equations. These are subsequently modified to effective values that account for other factors such as foam and surface roughness. In general, these factors add a dependency of the final emissivity on the wind vector. Surface roughness causes MW energy to be scattered both into and out of the direct line of sight of the surface by quasi-specular reflection events. FASTEM represents these with a two-scale model [32,40]. The small-scale waves have a size close to the wavelength of the emitted radiation. These small waves ride on the large-scale undulations of gravity waves. The correction to $R_p$ for the small-scale features takes the form of a multiplicative factor $\exp(-y\cos^2\theta)$, where $y$ is a polynomial fit to wind speed and frequency and $\theta$ is the zenith angle of the observation. The large-scale correction takes the form of an additive term with a polynomial fit to frequency, wind speed and zenith angle. The wave orientation is accounted for by adding three cosine harmonics for the relative azimuth angle between the observation and wind vectors. The wind-speed factors here act as a proxy for what is in reality the mechanical stress on the ocean due to the wind. This drives the creation of small-scale waves and thus changes the effective surface area.
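The first step of this chain can be sketched with the standard Fresnel equations for a flat interface between air and a medium of complex relative permittivity. In the sketch below, the function names, the example permittivity, the zenith angle and the roughness parameter `y_example` are hypothetical values chosen for illustration, and the small-scale roughness correction is applied as a multiplicative factor of the assumed form exp(-y cos²θ). It is not the FASTEM implementation.

```python
import cmath
import math


def fresnel_reflectivities(eps, theta_deg):
    """Power reflectivities (R_v, R_h) of a flat surface seen from air.

    eps is the complex relative permittivity and theta_deg the zenith
    (incidence) angle in degrees.
    """
    theta = math.radians(theta_deg)
    cos_t = math.cos(theta)
    root = cmath.sqrt(eps - math.sin(theta) ** 2)
    r_h = (cos_t - root) / (cos_t + root)
    r_v = (eps * cos_t - root) / (eps * cos_t + root)
    return abs(r_v) ** 2, abs(r_h) ** 2


def small_scale_factor(y, theta_deg):
    """Assumed multiplicative roughness factor exp(-y cos^2(theta))."""
    return math.exp(-y * math.cos(math.radians(theta_deg)) ** 2)


eps_example = 64.0 + 38.0j   # illustrative permittivity, not a fitted value
theta = 55.0                 # zenith angle in degrees, assumed for illustration
y_example = 0.05             # hypothetical roughness parameter

for pol, refl in zip("vh", fresnel_reflectivities(eps_example, theta)):
    rough = refl * small_scale_factor(y_example, theta)
    print(f"{pol}: R = {refl:.3f}, specular emissivity = {1 - refl:.3f}, "
          f"roughened R = {rough:.3f}")
```

For a specular surface the emissivity is simply one minus the reflectivity, which is why the vertical polarization, with its lower reflectivity at oblique angles, has the higher emissivity.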
Above wind speeds of a few metres per second, foam begins to form on the sea surface [38]. This is principally a mixture of water with air bubbles. FASTEM-6 calculates the fraction of the surface covered by foam ($f$) using the wind-speed-dependent expression of Monahan et al. [41]. (An alternative form by Tang [42] is used in FASTEM-4.) The model then computes area-weighted mean values of the foam emissivities and the modified sea water emissivities. The foam emissivities are calculated using a combination of the zenith angle polynomial fit of Kazumori et al. [43] with the linear frequency dependence from Stogryn [44]. The final effective emissivities thus combine the Fresnel reflectivities, modified by the small-scale, large-scale and azimuthal correction factors and weighted by the foam-free fraction of the surface, with the foam emissivities, weighted by the foam fraction. Through these terms, the effective emissivities depend on temperature, salinity, frequency, zenith angle, wind speed and relative azimuth angle.
2.1. Cosmic Microwave Background
The cosmic microwave background (CMB) is radiation from the recombination era of the early universe that has subsequently cooled due to the expansion of the universe and now forms a near-isotropic source of background photons [45,46]. Its spectrum is characterised by an effective temperature of ∼2.73 K [47]. We can make a simple estimate of the intensity of this source relative to emission from the Earth from the ratio $B_\nu(T_{\mathrm{CMB}})/[\epsilon B_\nu(T_s)]$ of the black-body functions $B_\nu$ for the two sources, where $T_s$ is the sea surface temperature and $\epsilon$ the surface emissivity. Although we have neglected surface roughness and atmospheric effects, this estimate demonstrates that the contribution of the CMB to the observed TOA flux, although small, is not negligible and must be included in MW radiative transfer modelling.
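The size of this ratio can be checked with a few lines of code. The sketch below evaluates the Planck function for the CMB and for an emitting ocean surface. The surface temperature of 290 K and emissivity of 0.5 are assumed values chosen for illustration, and the function names are hypothetical.

```python
import math

H = 6.62607015e-34   # Planck constant, J s
K_B = 1.380649e-23   # Boltzmann constant, J/K
C = 2.99792458e8     # speed of light, m/s


def planck(nu_hz, temp_k):
    """Black-body spectral radiance B_nu(T) in W m^-2 sr^-1 Hz^-1."""
    x = H * nu_hz / (K_B * temp_k)
    return 2.0 * H * nu_hz ** 3 / C ** 2 / math.expm1(x)


def cmb_to_surface_ratio(nu_hz, sst_k=290.0, emissivity=0.5, t_cmb=2.73):
    """Ratio of CMB radiance to emissivity-weighted surface emission.

    sst_k and emissivity are illustrative assumptions.
    """
    return planck(nu_hz, t_cmb) / (emissivity * planck(nu_hz, sst_k))


for ghz in (6.9, 10.65, 36.5, 89.0):
    ratio = cmb_to_surface_ratio(ghz * 1e9)
    print(f"{ghz:6.2f} GHz: CMB / surface emission = {100 * ratio:.2f}%")
```

Under these assumptions the ratio is of order one to two percent at the lowest AMSR2 frequencies and falls with increasing frequency as the CMB spectrum departs from the Rayleigh–Jeans limit.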
2.2. Skin Depth
There is typically a cooling of order 0.2 K from a depth of ∼1 mm at the top of the ocean (the sub-skin) to the interface where the atmosphere and ocean meet. At IR wavelengths, electromagnetic waves are absorbed in a distance of order 10 μm; they sample the ocean at the top of the skin layer and are thus sensitive to “SST-skin”. In contrast, microwaves have a frequency-dependent penetration depth measured in millimeters, so observations here are sensitive to SST-sub-skin. To compare or harmonise measurements made in the two wavelength regions with those from in situ sources, retrievals must be corrected to the depth of the in situ measurements. This requires a model for the skin effect and the diurnal warming.
Robinson [48] gives an expression for the apparent temperature seen by a radiometer, assuming an exponential form for the temperature profile in the skin layer between the surface (interface) temperature and the sub-skin temperature, and an e-folding distance for the absorption of radiation at the surface. In this case, the apparent temperature lies between the surface and sub-skin temperatures, with a weighting determined by the ratio of the absorption distance to the depth scale of the temperature profile.
If the cooling across the skin layer is due to molecular conduction, we might expect the temperature profile through the skin layer to be linear. A similar derivation using a total skin thickness and such a linear assumption for the temperature profile yields an analogous expression in terms of the ratio of the skin thickness to the absorption distance.
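The two cases can be worked through explicitly. Weighting a temperature profile $T(z)$ by the absorption kernel $e^{-z/d}/d$, an exponential profile $T(z) = T_{sub} + (T_0 - T_{sub})e^{-z/\delta}$ gives an apparent temperature $T_{sub} + (T_0 - T_{sub})/(1 + d/\delta)$, while a linear profile over a skin of thickness $\Delta$ gives $T_0 + (T_{sub} - T_0)(1 - e^{-r})/r$ with $r = \Delta/d$. The sketch below implements these closed forms and a numerical integral to check them. The symbols, function names and example depths are assumptions of this sketch and need not match the notation of Robinson.

```python
import math


def apparent_temp_exponential(t_surf, t_sub, profile_scale, absorb_depth):
    """Closed form for T(z) = t_sub + (t_surf - t_sub) * exp(-z / profile_scale)."""
    weight = 1.0 / (1.0 + absorb_depth / profile_scale)
    return t_sub + (t_surf - t_sub) * weight


def apparent_temp_linear(t_surf, t_sub, skin_thickness, absorb_depth):
    """Closed form for a linear profile across the skin, constant below it."""
    r = skin_thickness / absorb_depth
    return t_surf + (t_sub - t_surf) * (1.0 - math.exp(-r)) / r


def apparent_temp_numeric(profile, absorb_depth, n=100000, z_max_factor=40.0):
    """Midpoint-rule integral of T(z) * exp(-z/d) / d over depth z."""
    dz = z_max_factor * absorb_depth / n
    total = 0.0
    for k in range(n):
        z = (k + 0.5) * dz
        total += profile(z) * math.exp(-z / absorb_depth) / absorb_depth * dz
    return total


# Illustrative values: 0.2 K cool skin, 1 mm skin, assumed absorption depths.
t_surf, t_sub, skin = 289.8, 290.0, 1.0e-3
for label, depth in (("IR (10 um)", 1.0e-5), ("MW (1 mm)", 1.0e-3)):
    t_app = apparent_temp_linear(t_surf, t_sub, skin, depth)
    print(f"{label}: apparent temperature = {t_app:.3f} K")
```

With a very short absorption distance the apparent temperature tends to the interface temperature, whereas a long absorption distance pushes it towards the sub-skin value, consistent with the IR and MW sensitivities described above.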
2.3. Salinity
Salinity has a negligible effect on emissivity in the IR region but can be significant at MW wavelengths. Figure 1 shows the change in brightness temperature with salinity for a given atmospheric profile. For the most SST-sensitive, low-frequency channels, the effect is relatively small across the typical range of global oceanic salinity (33–37 PSU). The effect is more significant, however, for the higher frequency channels and is temperature dependent. Including this effect in modelling would be more important in areas with a strong freshwater influence.
2.4. Emissivity Dependence on Wind
As noted at the start of this section, the ocean emissivity in the MW region is affected by wind speed through the generation of foam and large- and small-scale waves. Accurate modelling of these processes is difficult, particularly at low frequencies, and is an ongoing area of research. Figure 2 shows the change in emissivity with wind speed for each of the channels for an SST of 297 K. The deviation from this azimuthal-mean emissivity value at a given wind speed is displayed against the separate wind-speed components in Figure 3. The lack of azimuthal symmetry means that it is possible, in principle, to derive some information about the separate wind components from MW observations. The small size of the deviation, however, implies that this is a weak constraint.
2.5. Top-of-Atmosphere Radiance Dependence on Total Column Water Vapour
Water vapour acts as an additional source of absorption for radiation traveling through the atmosphere at both MW and IR wavelengths. There are interesting differences between the two regions, however. For illustrative purposes, consider radiative transfer for microwaves using a simple slab model of the atmosphere with absorptivity $a$ (equal to its emissivity) and temperature $T_a$. Being in the Rayleigh–Jeans tail, $B_\nu(T) \propto T$ and, for convenience in this section, we absorb the constants of proportionality into the temperature units. The radiance of the upward emission by the atmosphere at temperature $T_a$ is then
$$I_\uparrow = a T_a$$
and, similarly, the downward emission by the atmosphere is
$$I_\downarrow = a T_a$$
The radiance from the surface emission at temperature $T_s$ with emissivity $\epsilon$, and the amount that is transmitted through to the top of the atmosphere, is
$$I_s = (1 - a)\,\epsilon T_s$$
The total outward radiance, including the downward emission that is reflected by the surface and transmitted back through the atmosphere, is thus
$$I = a T_a + (1-a)\left[\epsilon T_s + (1-\epsilon)\,a T_a\right] = \epsilon T_s + a\left[\left(1 + (1-\epsilon)(1-a)\right)T_a - \epsilon T_s\right]$$
For a given column with fixed $T_a$ and $T_s$, $I$ can either increase or decrease with atmospheric absorption according to the sign of the final bracket. For $\epsilon \approx 1$ (as in the IR part of the spectrum), the bracket reduces to $T_a - T_s$, so for an atmosphere colder than the surface $I$ will always decrease as the absorption in the atmosphere increases. In the MW region, however, where $\epsilon \approx 0.5$, $I$ can increase with increasing absorption.
In reality, the situation is obviously more complex. Not only is the atmosphere not isothermal, but, across the global ocean, there is a large-scale correlation between the total column water vapour (TCWV) and SST. This sign of relationship, however, does occur and is counter to the behaviour at IR wavelengths.
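The contrast between the two spectral regions in the slab model can be sketched numerically. The code below assumes the total radiance is the sum of the upward atmospheric emission, the transmitted surface emission and the downward atmospheric emission reflected by the surface, with temperatures standing in for radiances as in the Rayleigh–Jeans limit. The function name and the temperatures of 260 K and 290 K are illustrative assumptions, and the CMB contribution is neglected.

```python
def toa_radiance(a, t_atm, t_surf, emissivity):
    """Slab-model TOA radiance in temperature units (Rayleigh-Jeans limit).

    a is the slab absorptivity; the surface reflectivity is taken as
    (1 - emissivity).
    """
    upward = a * t_atm
    surface = (1.0 - a) * emissivity * t_surf
    reflected = (1.0 - a) * (1.0 - emissivity) * a * t_atm
    return upward + surface + reflected


t_atm, t_surf = 260.0, 290.0  # illustrative slab and surface temperatures, K
for label, emis in (("IR-like (emissivity 1.0)", 1.0),
                    ("MW-like (emissivity 0.5)", 0.5)):
    values = [toa_radiance(a, t_atm, t_surf, emis) for a in (0.0, 0.1, 0.2)]
    trend = "increases" if values[-1] > values[0] else "decreases"
    print(f"{label}: {values[0]:.1f} -> {values[-1]:.1f} K, {trend}")
```

With a near-black surface, extra absorption replaces warm surface emission with colder atmospheric emission, whereas over a reflective surface the atmosphere adds more radiance than it removes.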
2.6. Top-of-Atmosphere Radiance Dependence on Total Cloud Liquid Water
At IR wavelengths, clouds are largely opaque, thus rendering observations of the surface impossible except perhaps in instances of thin cirrus. Microwaves penetrate non-precipitating clouds, although measured radiances are sensitive to the cloud liquid water content, which must be included in any radiative transfer modelling. Figure 4 shows the change in modelled brightness temperature for the same conditions but with the cloud liquid water profile scaled to achieve different total cloud liquid water (TCLW) values. There is a significant effect on all of the channels as well as clear differences in the sensitivity between channels. Not only does this emphasise the importance of including these effects in any modelling, but it also suggests that TCLW can be retrieved to some degree.
3. Information Content and Optimal Estimation
OE provides a means to combine measured values from an instrument with initial a priori estimates of physical quantities of interest to provide a best estimate of the true value of the physical quantities. It does this by weighting the observations and a priori values via the appropriate covariance matrices of their uncertainties. The solution is always an optimised (minimised) function of the squares of residuals between observation and solution.
From Rodgers [49], the optimal estimate of the physical quantities in the state vector $\mathbf{x}$ is given by
$$\hat{\mathbf{x}} = \mathbf{x}_a + \mathbf{S}_a\mathbf{K}^{\mathrm{T}}\left(\mathbf{K}\mathbf{S}_a\mathbf{K}^{\mathrm{T}} + \mathbf{S}_\epsilon\right)^{-1}\left(\mathbf{y} - F(\mathbf{x}_a)\right) \tag{13}$$
This is the solution with maximum a posteriori probability given the a priori information and its uncertainty. In Equation (13), $\mathbf{y}$ is a vector containing the observations, $\mathbf{K}$ is the Jacobian matrix describing the sensitivity of each of the measurements to each physical quantity, $\mathbf{S}_a$ is the uncertainty covariance matrix of the a priori values for the physical quantities and $\mathbf{S}_\epsilon$ is the uncertainty covariance matrix for the measurements. The quantity $F(\mathbf{x}_a)$ is the observation vector that would result from the a priori state $\mathbf{x}_a$. This must be calculated using a forward model, and $F$ is treated as linear in the region of $\mathbf{x}_a$. This equation can be interpreted as a form of multi-dimensional “weighted average” between the a priori values for the retrieved quantities and the values of the retrieved quantities that would give rise to the observations. Consider very small values of the a priori uncertainties. Here, the second term vanishes and the best estimate of the retrieval vector is the initial a priori values. Conversely, for large a priori uncertainties or very low measurement uncertainties, the best estimate is dominated by the observation vector. The degree to which observations and modelled values in the final bracket differ is translated from observation space into physical-quantity space by the preceding matrices. No assumption about the Gaussianity or otherwise of the uncertainty distributions is required in the derivation of this equation. In the particular case of Gaussian uncertainty distributions, the maximum a posteriori solution is also the solution with minimum error variance.
The expected uncertainty covariance matrix for the retrieved variables is
$$\hat{\mathbf{S}} = \mathbf{S}_a - \mathbf{S}_a\mathbf{K}^{\mathrm{T}}\left(\mathbf{K}\mathbf{S}_a\mathbf{K}^{\mathrm{T}} + \mathbf{S}_\epsilon\right)^{-1}\mathbf{K}\mathbf{S}_a \tag{14}$$
In principle, this approach allows all sources of information about a problem to be combined with the correct weighting no matter how weak their sensitivity to the variables we are interested in. In practice, imperfect forward modelling and the lack of exact knowledge of appropriate covariance matrices limit the degree to which additional observations improve the accuracy of the retrieved quantities.
Without performing any retrievals, we can calculate the degrees of freedom for signal in a measurement system from
$$d_s = \mathrm{tr}\left[\mathbf{S}_a\mathbf{K}^{\mathrm{T}}\left(\mathbf{K}\mathbf{S}_a\mathbf{K}^{\mathrm{T}} + \mathbf{S}_\epsilon\right)^{-1}\mathbf{K}\right]$$
$d_s$ gives an estimate of the number of distinct quantities that may be inferred from the measurements. It is not, in general, an integer because usually retrieved variables are only partially constrained rather than precisely determined. A fuller description of optimal estimation as applied to retrieval of SST is given by Merchant et al. [50].
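The quantities in this section can be sketched in a few lines of NumPy for a linear problem. The function below computes the optimal estimate, its expected uncertainty covariance and the degrees of freedom for signal from a gain matrix. The toy Jacobian, covariances and a priori observation vector are made-up values for illustration only, and the function name is hypothetical.

```python
import numpy as np


def optimal_estimate(y, f_xa, x_a, K, S_a, S_e):
    """Linear OE update; returns (x_hat, S_hat, d_s)."""
    gain = S_a @ K.T @ np.linalg.inv(K @ S_a @ K.T + S_e)
    x_hat = x_a + gain @ (y - f_xa)
    S_hat = S_a - gain @ K @ S_a           # expected retrieval covariance
    d_s = float(np.trace(gain @ K))        # degrees of freedom for signal
    return x_hat, S_hat, d_s


# Toy problem: 2 state variables observed by 3 channels (made-up numbers).
K = np.array([[0.6, 0.3], [0.5, -0.2], [0.1, 0.8]])
S_a = np.diag([1.0 ** 2, 2.0 ** 2])
S_e = np.diag([0.3 ** 2] * 3)
x_a = np.array([290.0, 5.0])
f_xa = np.array([170.0, 165.0, 200.0])     # stands in for F(x_a)
x_true = np.array([290.5, 6.0])
y = f_xa + K @ (x_true - x_a)              # noise-free linear observation

x_hat, S_hat, d_s = optimal_estimate(y, f_xa, x_a, K, S_a, S_e)
print("estimate:", x_hat)
print("1-sigma uncertainties:", np.sqrt(np.diag(S_hat)))
print("degrees of freedom for signal:", round(d_s, 3))
```

Because the a priori constraint pulls the solution towards `x_a`, the estimate does not reproduce `x_true` exactly even without noise, and `d_s` is a non-integer value below the number of state variables.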
In the following sections, these techniques are applied to simulations using 2680 profiles over ocean taken from the EUMETSAT Satellite Application Facility on Numerical Weather Prediction (NWP SAF) 91-level dataset [51] sampled for specific humidity. The RTTOV simulation code is used as the forward model to generate the simulated observations $F(\mathbf{x}_a)$ and the Jacobians $\mathbf{K}$ appropriate to the AMSR2 instrument. A constant salinity of 35 PSU is assumed for all the profiles.
Prigent et al. [26] carried out a similar analysis for a new mission concept, Microwat, simulating retrievals based on AMSR-E channel sensitivities. They retrieved SST and wind speed assuming initial uncertainties on these two quantities of 3.31 K and 1.33 m·s⁻¹, respectively. They also carried out an information content analysis including water vapour content uncertainties of 10% on model levels. To provide comparability, we conduct an analysis below based on this specification using, as did Prigent et al. [26], a retrieval vector containing the four variables SST ($T_s$), the natural logarithm of TCWV ($W$) and the two wind-speed components ($u$, $v$):
$$\mathbf{x} = (T_s, W, u, v)^{\mathrm{T}}$$
with an assumed-diagonal a priori covariance matrix $\mathbf{S}_a$ populated with a priori uncertainties of 3.31 K in SST, 10% in TCWV and 0.94 m·s⁻¹ for each wind component. We also extend this approach using a retrieval vector with five variables:
$$\mathbf{x} = (T_s, W, u, v, L)^{\mathrm{T}}$$
that includes the logarithm of TCLW ($L$). With this formulation, we use a priori uncertainties of 1 K in SST, 10% in TCWV, 1.41 m·s⁻¹ in each wind component and 10% in TCLW. Retrieving the logarithm of the integrated column values avoids retrieving unphysical negative estimates for quantities bounded at zero. The fractional uncertainties expressed on the quantities TCWV and TCLW transform into absolute uncertainties when expressed in log-space since, for a fractional uncertainty $f$ on a quantity $a$, where $L = \ln a$, the absolute uncertainty in $L$ is
$$\delta L = \frac{\partial L}{\partial a}\, f a = f$$
The measurement covariance matrix $\mathbf{S}_\epsilon$ is also assumed to be diagonal, with values filled by the NEdT for each AMSR2 channel. In ascending order of frequency, these are (0.34, 0.43, 0.7, 0.7, 0.6, 0.7, 1.2) K with both H- and V-components having the same value [52].
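As a sketch of how this specification might be turned into matrices, the code below builds diagonal covariances for the five-variable case from the quoted uncertainties. The ordering of the state vector, the ordering of the channels (V then H at each frequency) and the use of squared NEdT values as variances are assumptions of this example rather than details confirmed by the text.

```python
import math

import numpy as np

# A priori 1-sigma uncertainties for the five-variable vector, in the
# assumed order (SST [K], ln TCWV, u [m/s], v [m/s], ln TCLW).
sigma_a = np.array([1.0, 0.10, 1.41, 1.41, 0.10])
S_a = np.diag(sigma_a ** 2)

# NEdT per frequency in ascending order of frequency, same for H and V.
nedt_per_freq = np.array([0.34, 0.43, 0.7, 0.7, 0.6, 0.7, 1.2])
nedt = np.repeat(nedt_per_freq, 2)   # assumed channel order: V, H per frequency
S_e = np.diag(nedt ** 2)             # variances assumed to be NEdT squared


def log_space_uncertainty(fraction):
    """Change in ln(a) when a increases by the given fraction."""
    return math.log1p(fraction)


print("S_a diagonal:", np.diag(S_a))
print("S_e shape:", S_e.shape)
print("10% in linear space is", round(log_space_uncertainty(0.10), 4),
      "in log space")
```

The last line illustrates that a 10% fractional uncertainty corresponds to an absolute uncertainty of approximately 0.1 in the logarithm, which is why the fractional values can be used directly in the log-space covariance.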
4. Information Content Analysis
The degrees of freedom for signal, $d_s$, using all 14 channels, for each of the considered profiles, is shown in Figure 5. The mean value of $d_s$ is lower than the number of retrieved quantities for both the four-variable and the five-variable retrieval vectors, which likely reflects the weak constraint that the observations place on the separate wind-speed components. There is also a noticeably wider spread of $d_s$ values for the five-variable retrievals compared to the four-variable cases.
The estimated retrieval uncertainty matrix was calculated from Equation (14) for every profile for all possible channel combinations. For a given channel combination, we define the estimated average SST retrieval uncertainty ($s$) as the root mean squared expected uncertainty for SST across the profile set, i.e.,
$$s = \sqrt{\frac{1}{n}\sum_{i=1}^{n}\hat{\sigma}_i^2}$$
where $\hat{\sigma}_i^2$ is the SST variance on the diagonal of the estimated retrieval uncertainty matrix for profile $i$ and $n$ is the number of profiles (2680). Figure 6 shows $s$ for the single-channel-only retrievals, illustrating which channels make the greatest individual contribution to reducing uncertainty in retrieved values of SST. Figure 7 shows the smallest value of $s$ when a given number of channels is included in the observation vector along with the best channel to add. This is summarised in Table 1 and Table 2.
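The exhaustive search over channel combinations can be sketched as follows. The example uses randomly generated Jacobians for a small number of hypothetical channels and profiles in place of RTTOV output, assumes SST is the first element of the state vector, and reports the combination with the smallest root mean squared expected SST uncertainty for each number of channels. All names and numbers are illustrative.

```python
import itertools

import numpy as np


def sst_variance(K, S_a, S_e, channels):
    """Expected SST variance for one profile using a subset of channels."""
    idx = list(channels)
    K_c = K[idx, :]
    S_e_c = S_e[np.ix_(idx, idx)]
    gain = S_a @ K_c.T @ np.linalg.inv(K_c @ S_a @ K_c.T + S_e_c)
    S_hat = S_a - gain @ K_c @ S_a
    return S_hat[0, 0]   # SST assumed to be state element 0


def best_combinations(jacobians, S_a, S_e):
    """Map each combination size to (smallest RMS uncertainty, channels)."""
    n_chan = S_e.shape[0]
    best = {}
    for size in range(1, n_chan + 1):
        for combo in itertools.combinations(range(n_chan), size):
            var = [sst_variance(K, S_a, S_e, combo) for K in jacobians]
            s = float(np.sqrt(np.mean(var)))
            if size not in best or s < best[size][0]:
                best[size] = (s, combo)
    return best


rng = np.random.default_rng(0)
n_profiles, n_chan, n_state = 25, 6, 3
jacobians = rng.normal(0.0, 0.5, size=(n_profiles, n_chan, n_state))
S_a = np.diag(np.array([1.0, 0.1, 1.41]) ** 2)
S_e = np.diag(np.full(n_chan, 0.5) ** 2)

best = best_combinations(jacobians, S_a, S_e)
for size, (s, combo) in best.items():
    print(f"{size} channel(s): s = {s:.3f} K using channels {combo}")
```

With 14 channels the same loop visits 16,383 combinations per profile, which remains tractable because each evaluation only involves small matrix operations.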
5. Simulated Retrieval
Simulated retrievals were carried out by randomly perturbing the NWP SAF profiles according to the uncertainties for the two cases. A 10% variation was also applied to the total cloud liquid water (TCLW) profiles for the four-variable case even though this was not a retrieved variable. The water vapour and CLW values on each level of the profiles were uniformly scaled to give the perturbed TCWV and TCLW values. These perturbed profiles were treated as the unknown true values and corresponding simulated observations were generated using RTTOV with random noise added consistent with the measurement uncertainty covariance matrix $\mathbf{S}_\epsilon$. The unperturbed profiles were used both as the a priori state and linearisation point from which the a priori observations $F(\mathbf{x}_a)$ and Jacobians $\mathbf{K}$ were generated, again using values obtained from RTTOV.
The simulated retrieval error was calculated for every profile for all possible channel combinations. For a given channel combination, we define the simulated uncertainty ($\sigma$) as the standard deviation of the SST retrieval errors ($e$) across the profile set. Thus, for any retrieval
$$e = \hat{T}_s - T_s^{\mathrm{true}}$$
where $\hat{T}_s$ is the retrieved SST and $T_s^{\mathrm{true}}$ is the SST of the perturbed (true) profile and, for a given channel combination,
$$\sigma = \sqrt{\left\langle \left(e - \langle e \rangle\right)^2 \right\rangle}$$
where the angle brackets denote the mean across the profile set. Figure 8 shows the values of $\sigma$ for single-channel-only retrievals, again illustrating which channels make the greatest individual contribution to a retrieval of SST. Figure 9 shows the smallest value of $\sigma$ for a given number of channels included in the observation vector along with the best new channel to add to the existing set. These results are also summarised alongside the information content analysis in Table 1 and Table 2.
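The logic of such a simulated retrieval experiment can be sketched with a purely linear toy forward model in place of RTTOV. True states are drawn around the a priori, noisy observations are generated, and the standard deviation of the SST retrieval error is compared with the expected uncertainty. Optionally, variability from a variable that is not in the retrieval vector is added to the observations, mimicking a missing TCLW term. All matrices and magnitudes below are invented for illustration.

```python
import numpy as np

rng = np.random.default_rng(42)
n_trials, n_chan = 20000, 6
K = rng.normal(0.0, 0.5, size=(n_chan, 3))     # Jacobian of retrieved variables
k_extra = rng.normal(0.0, 0.5, size=n_chan)    # sensitivity to unretrieved variable
S_a = np.diag(np.array([1.0, 0.1, 1.41]) ** 2)
S_e = np.diag(np.full(n_chan, 0.5) ** 2)
extra_sigma = 0.5                              # spread of the unretrieved variable

gain = S_a @ K.T @ np.linalg.inv(K @ S_a @ K.T + S_e)
S_hat = S_a - gain @ K @ S_a
s_expected = np.sqrt(S_hat[0, 0])
# Analytic error spread when the extra variable is ignored by the retrieval.
s_neglect = np.sqrt(S_hat[0, 0] + (extra_sigma * gain[0] @ k_extra) ** 2)


def simulated_sigma(include_extra):
    """Standard deviation of SST retrieval error over random trials."""
    dx = rng.multivariate_normal(np.zeros(3), S_a, size=n_trials)
    noise = rng.multivariate_normal(np.zeros(n_chan), S_e, size=n_trials)
    dy = dx @ K.T + noise                      # y - F(x_a) for a linear model
    if include_extra:
        dy += np.outer(rng.normal(0.0, extra_sigma, n_trials), k_extra)
    dx_hat = dy @ gain.T                       # x_hat - x_a
    e = dx_hat[:, 0] - dx[:, 0]                # SST retrieval error
    return e.std()


print(f"expected s = {s_expected:.3f} K")
print(f"simulated sigma, consistent case = {simulated_sigma(False):.3f} K")
print(f"simulated sigma, neglected variable = {simulated_sigma(True):.3f} K")
```

When the observations contain only the variability described by the covariances, the simulated and expected uncertainties agree to within sampling error. The neglected variable inflates the simulated error by an amount set by the projection of its sensitivity onto the SST row of the gain matrix.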
6. Discussion
The OE framework provides a mechanism for combining all available information relating to an inverse problem with appropriate weighting. Since each channel brings some information, adding more channels to the observation vector results in progressively improving retrieval uncertainties if all sources of uncertainty are well-described by the error covariances used, and if the retrieved variables account for all significant variability in the observations. This is the behaviour that we see in the information content analyses summarised in Table 1 and Figure 7, where the predicted uncertainty monotonically decreases to the all-channel value at a declining rate as less informative channels are added.
The simulated uncertainty using the four-variable retrieval vector shows different behaviour, with the uncertainty increasing with added channels after the 10th. This arises because TCLW is missing from the retrieval vector. The OE method uses a forward model run with the a priori values for the quantities in the state vector to generate simulated observations. The differences between the simulated and observed values are ascribed to deviations between the a priori values in the state vector and their true values. However, if the observed radiances additionally include variability due to TCLW (which is not in the 4-variable state vector), the scheme can only interpret any observational differences in terms of the four state-vector variables. This misattribution is naturally largest for those channels that are most sensitive to TCLW, where the “observed” values are most affected and which therefore result in the largest retrieval errors. These channels thus drop down the ranking of the best channel to add to the scheme. This effect is most obviously demonstrated in that adding the four least-favoured channels actually increases the SST retrieval error.
When TCLW is included in the retrieval vector, there is consistency between the behaviour of the estimated uncertainty ($s$) and simulated uncertainty ($\sigma$). Figure 5 bears out the above interpretation. Here, the degrees of freedom for signal of the four-variable retrieval has a lower mean value across the profile set, while the five-variable retrieval has a larger mean and a wider spread of values. In the five-variable retrieval case, the degrees of freedom for signal steadily increase with TCLW up to approximately 0.3 kg·m⁻² (which includes 90% of the profiles) before plateauing. They then slowly decline again above about 1 kg·m⁻² (4% of the profiles). The five-variable results indicate that a retrieval uncertainty for SST of ∼0.37 K may be achievable if TCLW is explicitly accounted for, whereas neglecting that aspect of variability would limit the achievable SST uncertainty to ∼0.45 K.
To check that the above difference is a result of including TCLW in the retrieval vector rather than merely an effect of the different a priori error covariance assumptions in the two case studies, we calculated results for a third configuration (not shown). This used the 5-variable retrieval vector with the error covariance assumptions used in the 4-variable case study. The error covariance assumption for $L$ was as used in the 5-variable case. When including all 14 channels, the estimated and simulated SST uncertainties are comparable to the 5-variable case. The simulated uncertainty also decreases monotonically as channels are added to the scheme. This comparison indicates that expanding the retrieval vector is more critical than the error covariance assumptions.
The analyses suggest a preferential ordering of channels for inclusion in the observation vector. We can interpret the channel ordering through Figure 10 and Figure 11 for low- and high-TCLW profiles, respectively. In these figures, the axes represent the brightness temperature in pairs of channels in the order suggested by the five-variable information content analysis. The sensitivity of the two channels with respect to the retrieved quantities is scaled to a “typical” change in brightness temperature by multiplying by the a priori uncertainty on the quantities. In Figure 10, panel (a) shows that the leading two channels (6.9 V and 7.3 V) are principally sensitive to SST and TCWV, with only small contributions from the other variables. In this case, the difference between the modelled and measured observation vectors is interpreted in proportion to the a priori uncertainties expressed as radiances. Panel (b) shows the next pair (7.3 V and 36.5 H) with very different responses for SST and TCWV. The 36.5 H channel is largely insensitive to SST in comparison to large changes due to TCWV, and it is consequently possible to remove the previous ambiguity and separate the two variables in the retrieval. It is not until the third pair (36.5 H and 6.9 H) that it begins to be possible to resolve wind speed effects and thus refine the small contributions they made to brightness temperature changes in the earlier channel combinations. The fact that the two wind-speed components are largely co-linear suggests that it is difficult to discriminate their individual contributions. This is the main reason that the degrees of freedom for signal is less than the number of state-vector variables.
The high TCLW profile shown in Figure 11 suggests significant ambiguity for brightness temperature changes between SST, TCWV and TCLW for the 6.9 and 7.3 V pair of channels. While the remaining panels show the effect of SST now being distinguishable from both TCWV and TCLW, these latter two variables remain largely co-linear. This figure also shows different sensitivities for the two wind-speed components, largely indicative of the change in wind-speed sensitivity with wind speed. The u-component of the wind speed in this case is significantly smaller than the v-component. Consequently, the u-component sensitivity arrow is barely visible, whereas the v-component shows changes in some of the channel combinations comparable to TCLW.
As alluded to in Section 2, modelling the emissivity in the MW region, and particularly the wind speed dependency, is a difficult task. In an effort to assess the effect of any shortcomings of the forward model in this respect, the information content and retrieval analyses were rerun with the sensitivity of brightness temperature to each of the wind components in the Jacobian matrix doubled. The results are summarised in Table 3. From the information content analysis, the expected SST uncertainties for both retrieval vectors with all channels included change by around 0.01 K and, although there is some slight reordering of the channels, the top five remain the best five to include. For the simulated retrievals with a five-variable retrieval vector, the 10.7 H channel has been promoted into the top five, but the best 14-channel retrieval changes by only 0.001 K. In the four-variable simulated retrieval case, the change in the best retrieval error value is similarly small. Here, though, there are no changes to the channel order down to 7th place, perhaps reflecting that the absence of TCLW from the retrieval vector dominates the ordering.
As mentioned in relation to the increasing retrieval errors for the four-variable retrieval, including all channels in the retrieval is not necessarily the best approach in practice since there may be unrepresented physical processes (such as calibration errors) or poorly-estimated covariance matrices. Given the reasonable consistency of the channel ordering for the five-variable retrievals, we conclude that including the top five or six channels here is the optimum approach in practice when estimating SST using AMSR2.