1. Introduction
The remote sounding of atmospheric composition and temperature typically involves an under-constrained retrieval process. This process includes a forward model inversion and a cost function minimization, whereby the latter must be converted from an under-constrained problem to a constrained one using external information [1]. The constraining information can either consist of an a priori estimate of the atmospheric state and its covariance, such as in the Optimal Estimation (OE) approach, or a continuity constraint, such as in Phillips–Tikhonov-like approaches. In both cases, the resulting retrieved state unavoidably consists of a mixture of information contributed by the measurement and by the a priori constraints.
The presence of prior information in atmospheric state retrievals complicates their scientific application and interpretation in several ways [2]. In comparisons with other measured or modeled data, the differences and their uncertainties contain several prior-induced terms that can be hard to estimate [3]. These additional terms affect not only the quantitative validation of remotely sensed data, but also their visual inspection. It is virtually impossible to tell whether significant structures originate from the measurement or from the a priori [2]. Furthermore, depending on the nature of the a priori information, profile retrievals at different locations are no longer statistically independent, which complicates their averaging and assimilation [4, 5]. Finally, the data volume of a complete set of diagnostic parameters can be enormous, or users might disregard this diagnostic information, risking misinterpretation of the data. It is, therefore, often desirable to remove the a priori information from a retrieved product.
Optimal estimation retrievals have the advantage that their information mixing is fully quantified by the retrieval’s averaging kernels [1], represented by a single vector kernel for column retrievals, or by a square averaging kernel matrix for vertically resolved profile retrievals. Consequently, it is, in principle, possible to perform a deconvolution operation that removes the prior information from the retrieval, resulting in a so-called information-centered representation of the retrieved atmospheric state, where each datapoint represents one independent degree of freedom (DOF) [2]. In practice, however, a pure deconvolution is hampered by the presence of the retrieval uncertainty, as expressed by its covariance (matrix).
von Clarmann and Grabowski [2] have nevertheless developed a methodology to convert a given OE profile retrieval into its information-centered representation by imposing a staircase or triangular profile representation. Keppens et al. [3] have later argued that this result can also be obtained from the complete data fusion framework if the latter includes a regridding operation. The complete data fusion framework provides a method for combining retrieved atmospheric states that is equivalent to their simultaneous retrieval [6]. This method, and its ability to combine the replacement of prior information with an interpolation (and a corresponding uncertainty assessment), was developed by Ceccherini et al. [6, 7].
In this work, it is demonstrated (next section) that the information-centered representation of a retrieved atmospheric state can also be obtained iteratively by performing a Wiener deconvolution on the retrieved profile. This deconvolution explicitly considers the convoluted state’s uncertainty by minimization of an uncertainty-related cost function [8]. In Section 3, the developed method is applied to a selection of satellite and ground-based atmospheric state retrievals, both from simulated and real data, which are also compared with ozonesonde data to assess the validity of their information-centered representations. The last section provides an additional discussion on the applicability of the developed deconvolution method and conclusions.
2. Methodology
In the absence of uncertainties, an optimal estimation retrieval yields a retrieved profile $\hat{\mathbf{x}}$ given by [1]:
$$\hat{\mathbf{x}} = \mathbf{A}\,\mathbf{x}_t + (\mathbf{I} - \mathbf{A})\,\mathbf{x}_a \tag{1}$$
which combines measurement information from the true profile $\mathbf{x}_t$ with prior information from the a priori profile $\mathbf{x}_a$. The weighting matrix A is called the averaging kernel matrix and is usually non-diagonal, meaning that the elements of the state vector are not mutually independent; I is the unit matrix. As the measurement information from the true profile in terms of the number of degrees of freedom of the retrieval is typically (much) smaller than the number of retrieved profile levels or layers, both input profiles are strongly convoluted by the averaging kernel matrix A and by $\mathbf{I} - \mathbf{A}$, respectively. In theory, however, by solving Equation (1) for $\mathbf{x}_t$, one can perform a simple deconvolution operation to reconstruct the true profile for a given vertical grid:
$$\mathbf{x}_t = \mathbf{A}^{-1}\left[\hat{\mathbf{x}} - (\mathbf{I} - \mathbf{A})\,\mathbf{x}_a\right] = \mathbf{A}^{-1}\,\hat{\mathbf{x}}' \tag{2}$$
with $\hat{\mathbf{x}}' = \hat{\mathbf{x}} - (\mathbf{I} - \mathbf{A})\,\mathbf{x}_a$ representing the retrieval after (non-optimized) correction for the prior profile contribution [3].
In reality, the deconvolution operation in Equation (2) is hampered in two ways. First, the under-constrained retrieval typically makes the straightforward inversion of the averaging kernel matrix A impossible. Second, systematic and random uncertainties $\boldsymbol{\epsilon}$, originating from both the remote measurement and the retrieval process, contribute to the retrieved profile, resulting in an additional term in Equation (1):
$$\hat{\mathbf{x}} = \mathbf{A}\,\mathbf{x}_t + (\mathbf{I} - \mathbf{A})\,\mathbf{x}_a + \boldsymbol{\epsilon} \tag{3}$$
and making the solution for $\mathbf{x}_t$ undetermined by a term
that includes contributions from the chosen a priori and its constraints. In order to obtain a direct estimate
of the true profile in the absence of prior information, i.e., obtain an information-centered representation of
, one needs a different approach.
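The two obstacles described above can be illustrated numerically. The following minimal sketch assumes the standard optimal estimation relation (retrieval = A · truth + (I − A) · prior + error) and a purely hypothetical Gaussian averaging kernel; the grid size, kernel width, profiles, and noise level are illustrative choices and not values from this work.

```python
import numpy as np

def gaussian_averaging_kernel(n_levels, width, peak):
    """Hypothetical averaging kernel: Gaussian rows whose sums equal `peak`."""
    idx = np.arange(n_levels)
    rows = np.exp(-0.5 * ((idx[:, None] - idx[None, :]) / width) ** 2)
    rows /= rows.sum(axis=1, keepdims=True)
    return peak * rows

rng = np.random.default_rng(0)
n = 20                                                     # retrieval levels (assumed)
A = gaussian_averaging_kernel(n, width=1.5, peak=0.8)
I = np.eye(n)

x_true = 5.0 + np.sin(np.linspace(0.0, 3.0 * np.pi, n))   # synthetic "true" profile
x_prior = np.full(n, 5.0)                                  # flat a priori profile

# Noise-free retrieval: a mixture of measurement and prior information.
x_hat = A @ x_true + (I - A) @ x_prior
dof = np.trace(A)                                          # degrees of freedom

# Naive deconvolution: subtract the prior contribution, then invert A.
x_corr = x_hat - (I - A) @ x_prior
x_naive = np.linalg.solve(A, x_corr)

# The same operation after adding a small random retrieval error.
eps = 0.01 * rng.standard_normal(n)
x_naive_noisy = np.linalg.solve(A, x_corr + eps)

err_clean = np.max(np.abs(x_naive - x_true))
err_noisy = np.max(np.abs(x_naive_noisy - x_true))
print(f"DOF = {dof:.2f} on {n} levels")
print(f"max error without noise: {err_clean:.2e}")
print(f"max error with noise:    {err_noisy:.2e}")
```

In this synthetic setting the kernel is still invertible, so the noise-free profile is recovered, but even a small retrieval error is strongly amplified by the inversion, which is why a direct inversion of A is not a practical route.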
von Clarmann and Grabowski [2] make use of the Optimal Estimation retrieval theory developed by Rodgers [1] to combine a prior replacement operation with a vertical regridding operation:
with
representing the retrieval covariance matrix associated with
(thus omitting the vertical smoothing error covariance because of the subsequent regridding operation; see the end of this section).
and
are the new prior profile and prior covariance matrix of choice, respectively, that have to replace the retrieval’s initial
and
. The regridding matrix W converts the vertical grid z of
to
of x. The information-centered representation of
is the one that defines W,
and
in such a way that x no longer contains any prior information. This is achieved by setting
and determining the matrices W and
that satisfy
non-trivially, with
being the least-squares pseudo-inverse of W. von Clarmann and Grabowski [2] provide two approaches for this, using either a staircase or triangular representation of
.
In this work, we present an alternative approach using a Wiener deconvolution in the vertical domain that minimizes an uncertainty-related cost function [8]. Based on [1], the required cost function
was developed within the complete data fusion framework [6]:
with
and
the new prior profile and prior covariance matrix that constrain the solution of the cost function minimization (and note the use of
, as defined in Equation (2), instead of
). In contrast with Equation (4), however,
and
can immediately be defined on a different vertical grid than the initial retrieval
. In that case, it suffices to replace both terms
by
in Equation (5). Minimizing
then yields [7]:
The constraint
on the new vertical grid can hence be straightforwardly imposed within the complete data fusion framework. Insertion in Equation (6) results in:
with
. The latter identification demonstrates that Equation (7) can still be considered a deconvolution, with $\mathbf{A}^{-1}$ in Equation (2) being replaced by a Wiener-like deconvolution matrix P that also includes a regridding operation. P equals the least-squares inverse of
if
.
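A deconvolution matrix of this kind can be sketched numerically. Because the explicit expressions are not reproduced in the text above, the block below assumes a generic weighted least-squares form: it minimizes (y − A V x)ᵀ S⁻¹ (y − A V x) without any prior term, where y stands for the prior-corrected retrieval, S for its covariance, and V for a hypothetical linear-interpolation matrix from the coarse target grid to the retrieval grid. All function names, grids, and the Gaussian kernel are illustrative assumptions.

```python
import numpy as np

def gaussian_averaging_kernel(n_levels, width, peak):
    """Hypothetical averaging kernel: Gaussian rows whose sums equal `peak`."""
    idx = np.arange(n_levels)
    rows = np.exp(-0.5 * ((idx[:, None] - idx[None, :]) / width) ** 2)
    rows /= rows.sum(axis=1, keepdims=True)
    return peak * rows

def interpolation_matrix(fine_grid, coarse_grid):
    """Linear interpolation from coarse-grid values to the fine grid (n x d)."""
    columns = [np.interp(fine_grid, coarse_grid, unit)
               for unit in np.eye(coarse_grid.size)]
    return np.stack(columns, axis=1)

def deconvolution_matrix(A, S, V):
    """Assumed form P = (V^T A^T S^-1 A V)^-1 V^T A^T S^-1 (S symmetric)."""
    AV = A @ V
    S_inv_AV = np.linalg.solve(S, AV)                     # S^-1 A V
    return np.linalg.solve(AV.T @ S_inv_AV, S_inv_AV.T)   # d x n

n, d = 20, 4                                  # fine and coarse grid sizes (assumed)
z_fine = np.linspace(0.0, 57.0, n)            # altitudes in km (illustrative)
z_coarse = np.linspace(0.0, 57.0, d)
A = gaussian_averaging_kernel(n, width=1.5, peak=0.8)
S = np.diag(np.linspace(0.01, 0.03, n) ** 2)  # synthetic retrieval covariance
V = interpolation_matrix(z_fine, z_coarse)

P = deconvolution_matrix(A, S, V)
A_x = P @ A @ V          # averaging kernel of the deconvoluted profile
S_x = P @ S @ P.T        # its covariance, by linear error propagation

# With a unit covariance, P reduces to the least-squares inverse of A V.
P_unweighted = deconvolution_matrix(A, np.eye(n), V)
print(np.round(A_x, 6))
```

Under these assumptions the averaging kernel of the deconvoluted profile equals the unit matrix by construction, and the unweighted case coincides with the Moore–Penrose pseudo-inverse of A V.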
Equation (7) is mathematically equivalent to Equation (4) with
[3], but has the important advantage that only W has to be determined. This can be carried out quite straightforwardly from the observation that the averaging kernel matrix of the information-centered representation has to equal the unit matrix. The averaging kernel matrix
that corresponds with x is given by the application of the deconvolution P to
or
[3, 7]. Within the complete data fusion framework, the information-centered representation thus depends on obtaining a regridding matrix
that fulfills
by construction, or
. One, hence, immediately obtains an analytical yet recursive expression for the sought regridding matrix W:
which cannot be trivially solved for W, as even its dimensions are undetermined. However, considering that
has to reproduce the unit matrix, one can preset the dimensions of W by
, with d being the rounding of the number of degrees of freedom of the initial retrieval (although this number slightly depends on the retrieval constraints; see the next section).
Fixing the dimensions of W, Equation (9) can easily be solved iteratively. For example, one can opt for a pseudo-inverse linear or mass-conserving regridding [3, 9] from the initial retrieval grid towards d equidistant levels spanning the same vertical range as a first estimate
, and hence apply:
until a converged solution of Equation (9) is reached within a given limit. As such, apart from its number of elements, the vertical target grid
is optimally determined by the iteration process and need not be assumed to be a subset of the initial vertical retrieval grid z, as in the approach developed by von Clarmann and Grabowski [2]. The deconvoluted profile
on
and the corresponding covariance matrix
immediately follow from [3], while
by construction.
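The iteration can be sketched as a fixed-point loop. Since Equations (9) and (10) are not reproduced in this text, the update rule below is an assumption rather than the authors' exact scheme: the interpolation matrix V is taken as the pseudo-inverse of the current W, a weighted least-squares deconvolution matrix P is built from it, and W is replaced by P A until it stops changing. The Gaussian kernel, grids, covariance, tolerance, and all function names are hypothetical, and the number of levels is set to the lower integer of the degrees of freedom.

```python
import numpy as np

def gaussian_averaging_kernel(n_levels, width, peak):
    """Hypothetical averaging kernel: Gaussian rows whose sums equal `peak`."""
    idx = np.arange(n_levels)
    rows = np.exp(-0.5 * ((idx[:, None] - idx[None, :]) / width) ** 2)
    rows /= rows.sum(axis=1, keepdims=True)
    return peak * rows

def interpolation_matrix(fine_grid, coarse_grid):
    """Linear interpolation from coarse-grid values to the fine grid (n x d)."""
    columns = [np.interp(fine_grid, coarse_grid, unit)
               for unit in np.eye(coarse_grid.size)]
    return np.stack(columns, axis=1)

def deconvolution_matrix(A, S, V):
    """Assumed form P = (V^T A^T S^-1 A V)^-1 V^T A^T S^-1 (S symmetric)."""
    AV = A @ V
    S_inv_AV = np.linalg.solve(S, AV)
    return np.linalg.solve(AV.T @ S_inv_AV, S_inv_AV.T)

def iterate_regridding(A, S, W0, tol=1e-10, max_iter=1000):
    """Assumed fixed-point update W <- P(W) A; returns W and the step sizes."""
    W, steps = W0, []
    for _ in range(max_iter):
        V = np.linalg.pinv(W)                  # coarse-to-fine map implied by W
        W_new = deconvolution_matrix(A, S, V) @ A
        steps.append(np.max(np.abs(W_new - W)))
        W = W_new
        if steps[-1] < tol:
            break
    return W, steps

n = 20
z = np.linspace(0.0, 57.0, n)                  # retrieval grid in km (illustrative)
A = gaussian_averaging_kernel(n, width=1.5, peak=0.8)
S = 0.01 ** 2 * np.eye(n)                      # synthetic retrieval covariance
d = int(np.floor(np.trace(A)))                 # lower integer of the DOF

# First estimate: pseudo-inverse of a linear interpolation from d equidistant levels.
W0 = np.linalg.pinv(interpolation_matrix(z, np.linspace(z[0], z[-1], d)))
W, steps = iterate_regridding(A, S, W0)

V = np.linalg.pinv(W)
P = deconvolution_matrix(A, S, V)
A_x = P @ A @ V                                # should reproduce the unit matrix
print(f"d = {d}, iterations = {len(steps)}, last step = {steps[-1]:.1e}")
```

In this sketch the final W differs from its equidistant first estimate, which mirrors the statement that the iteration itself, not the starting grid, fixes the target representation. The sketch does not derive the target altitudes from W.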
The vertical smoothing difference error [2, 3] or vertical interpolation error [7] introduced by combining the deconvolution operation with a regridding operation equals
on the initial vertical retrieval grid z, provided that the covariance matrix
of choice fully characterizes the true atmospheric variability [2, 3, 7]. This uncertainty contribution, however, disappears when represented on the coarser grid
of the deconvoluted profile x if the atmospheric variability is still sufficiently well-characterized by
[2]. This condition is assumed to be fulfilled in the following, with
thus representing the full ex-ante uncertainty on x.
4. Conclusions
This work explores and develops a novel method for obtaining an information-centered representation of an optimally estimated atmospheric state, stripped of all a priori constraint information. The method essentially consists of a Wiener deconvolution of the retrieved atmospheric profile, which accounts for the convoluted state’s uncertainty by minimization of a cost function. The required cost function was taken from the complete data fusion framework, with its converted prior constraint set to zero. Additionally requiring that the deconvoluted averaging kernel matrix equal the unit matrix results in an iterative procedure for determining the deconvolution matrix P. In contrast with previous approaches, this iteration process automatically shifts the levels or layers of the information-centered profile to their optimal deconvoluted positions, i.e., where most of the retrieval information is located.
This deconvolution method has been applied, by way of demonstration, to simulated ozone retrievals and to real ozone profile observations at the Izaña ground station in April 2012. The simulated Metop-SG-A1 UVNS (Sentinel-5) and IASI-NG TIR retrievals revealed the necessity of presetting the number of deconvolution levels or layers to the lower integer of each retrieval’s degrees of freedom. This approach results in some loss of retrieved information (less than one DOF), but it also avoids strong prior-induced fluctuations in the deconvoluted profiles. The deconvolution of real FTIR, GOME-2A, IASI-A, and MIPAS ozone profile observations at Izaña confirmed that individual deconvoluted profiles often deviate strongly from their initial retrievals. This is an expected result: a priori information is inserted (as a retrieval constraint) for a reason in the first place. The number of outliers seems to be slightly reduced in the layer representation of the deconvoluted profiles with respect to their level representation, but negative ozone concentrations still occur outside of the stratospheric ozone layer.
It is, therefore, suggested that the deconvolution process rather be applied when creating spatiotemporally averaged (Level-3-like) data. Although resampling to a common vertical grid is required, the preceding retrieval deconvolution avoids smoothing difference errors and difficulties with averaged averaging kernels. However, if the retrieval DOF or the number of profiles to be averaged is low (e.g., below 5 and about 50, respectively), it might be necessary to reinsert common prior information in the averaged profile. Doing this essentially comes down to the application of the complete data fusion framework for data harmonization, while this work exploits its limiting case for
(cf. Section 2). Both options are currently being considered by the authors in their contributions to the Committee on Earth Observation Satellites (CEOS) tropospheric ozone activity (VC-20-01: Tropospheric Ozone Dataset Validation and Harmonization) and to the second phase of the Tropospheric Ozone Assessment Report (TOAR-II, 2020–2024) of the International Global Atmospheric Chemistry (IGAC) project. Some of the data harmonization shortcomings revealed in the first phase of TOAR (2014–2019) are, hence, explicitly addressed [18].