## *5.1. Retrieval by Search on Pre-Calculated Datacube*

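The previous observation underlines that the bottleneck of the retrieval process is the iterative generation of simulated spectra during fitting. Thus, a great improvement in efficiency could, in principle, be obtained by avoiding that process. This is possible if the spectra are pre-calculated over a grid of (*Tg*, *Qg*) values and stored in a datacube, so that retrieval at each pixel reduces to searching that datacube for the simulated spectrum that best matches the experimental one.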


To test this procedure, simulated spectra datacubes with a spectral resolution of 1 cm<sup>−1</sup> and *Tb* = 350 °C were calculated for each of the three pollutant gases studied. Gas temperatures covered the range 0 °C ≤ *Tg* < 69 °C with a step Δ*Tg* = 1 °C, and the range of column densities was 70 ppm·m, centered for each gas at its expected column density, with a step Δ*Qg* = 1 ppm·m.

Results are shown in Table 3, under the heading SSD (simulated spectra datacube). Comparison with nominal values gives relative errors of −7.8% for CH4, +5.1% for N2O and −7.4% for C3H8, similar to those of the iterative fitting method except for a larger value for CH4. Standard deviations are of the same order as those obtained previously with PCA-processed spectra.

**Table 3.** Column density retrieved for air pollutants in a 7 × 7 square at the center of the gas cell. Values obtained by search in simulated spectra datacube (SSD) and in simulated PC datacube (SPCD).


Generation of each simulated spectra datacube took 23.2 s of CPU time on a computer with a six-core Intel i7 processor at 3.2 GHz and 64 GB of RAM. Producing a column density map over a region of 70 × 70 pixels then took 5630 s of CPU time. This result was unexpected, since it is longer than the 1460 s of CPU time required for the same task by pixel-by-pixel iterative fitting.

The explanation is that an exhaustive search was used to find the (*Tg*, *Qg*) couple at each pixel, i.e., the SSE was calculated between the experimental spectrum and *all* the spectra in the simulated datacube. This is a very inefficient strategy, and time can be reduced by at least an order of magnitude if a gradient search algorithm is used. Clearly, time will also be shorter if the simulated spectra datacube is made smaller, either by increasing the steps (Δ*Tg*, Δ*Qg*) or by reducing the range of (*Tg*, *Qg*). No attempt at improvement along these lines has been made, however, since the PCA-based approach described in the following section is much more powerful.
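The exhaustive search described above can be illustrated with a minimal NumPy sketch. It is not the code used in this work: `toy_spectrum` is a hypothetical stand-in for the radiative model, and the band position, band width, wavenumber axis and expected column density of 100 ppm·m are assumptions chosen only so that the example runs. Only the grid steps and ranges follow the text.

```python
import numpy as np

def toy_spectrum(t_gas, q_gas, wavenumbers):
    """Hypothetical forward model (NOT the radiative model of this work)."""
    # Made-up Gaussian absorption band centered at 1300 cm^-1.
    band = np.exp(-0.5 * ((wavenumbers - 1300.0) / 15.0) ** 2)
    absorptance = 1.0 - np.exp(-5e-3 * q_gas * band)
    # Made-up linear dependence of the signal on gas temperature.
    return (1.0 + 0.01 * t_gas) * absorptance

def build_datacube(t_grid, q_grid, wavenumbers):
    """Simulated spectra datacube with axes (Tg, Qg, wavenumber)."""
    return toy_spectrum(t_grid[:, None, None],
                        q_grid[None, :, None],
                        wavenumbers[None, None, :])

def exhaustive_search(spectrum, cube, t_grid, q_grid):
    """Return the (Tg, Qg) grid point with the smallest SSE."""
    # SSE between the measured spectrum and every simulated spectrum.
    sse = np.sum((cube - spectrum) ** 2, axis=-1)
    i, j = np.unravel_index(np.argmin(sse), sse.shape)
    return t_grid[i], q_grid[j]

# Grids following the text: 1 degC and 1 ppm*m steps.
wavenumbers = np.arange(1200.0, 1400.0, 1.0)       # assumed axis, 1 cm^-1 step
t_grid = np.arange(0.0, 69.0, 1.0)                 # 0 <= Tg < 69
q_grid = 100.0 + np.arange(-35.0, 36.0, 1.0)       # 70 ppm*m wide, assumed center
cube = build_datacube(t_grid, q_grid, wavenumbers)
```

Every pixel requires one SSE evaluation per grid point and per spectral channel, which is why the cost grows with the full size of the datacube.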

## *5.2. Simulated PC Datacube*

The retrieval strategy just described compares experimental spectra as measured (i.e., in the spectral space) with the simulated ones. However, it can be made faster by the use of principal components.

If a PCA is performed on the simulated spectra datacube, its z dimension can be drastically reduced. The datacube thus obtained will be called the *simulated PC datacube*. Now, the number *p* of PCs needed may be larger than 2, since spectra in the simulated datacube have a larger variability than those of the gas cell, because of the much wider interval of temperatures and column densities involved. However, the absence of noise reduces the variance of the simulated spectra, and, in our case, *p* = 2 is still enough to account for more than 99.95% of the total variance.
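As an illustration of this reduction, the following sketch compresses the spectral axis of a datacube to *p* scores using an SVD-based PCA. It is a sketch under stated assumptions rather than the implementation used here: the function name `pca_reduce`, the (Tg, Qg, wavenumber) axis order and the subtraction of the mean spectrum before decomposition are all choices made for the example.

```python
import numpy as np

def pca_reduce(cube, p):
    """Reduce the spectral (last) axis of a datacube to p principal components.

    Returns the mean spectrum, the p unit-norm eigenvectors (rows), the
    PC datacube of shape (n_T, n_Q, p), and the fraction of the total
    variance captured by the first p components.
    """
    n_t, n_q, n_nu = cube.shape
    flat = cube.reshape(-1, n_nu)            # one simulated spectrum per row
    mean = flat.mean(axis=0)
    centered = flat - mean                   # assumption: mean-centered PCA
    # Rows of vt are the eigenvectors of the covariance matrix;
    # the squared singular values are proportional to the eigenvalues.
    _, s, vt = np.linalg.svd(centered, full_matrices=False)
    explained = np.cumsum(s ** 2) / np.sum(s ** 2)
    components = vt[:p]
    scores = centered @ components.T         # projection onto the eigenvectors
    return mean, components, scores.reshape(n_t, n_q, p), explained[p - 1]
```

The returned variance fraction is the quantity to compare against a threshold such as the 99.95% quoted above when choosing *p*.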

Now, to retrieve the values of *Tg* and *Qg* for a pixel, the experimental spectrum is projected onto the first *p* eigenvectors of the simulated spectra datacube, in order to obtain its PCs (scores), and these *p* numbers are compared by a simple exhaustive search with those in the simulated PC datacube to find the (*Tg*, *Qg*) couple with optimal agreement. It is important, however, not to compare the scores directly, but rather to multiply them by the magnitude of the corresponding eigenvector, so as to calculate correctly the distance between the experimental and the simulated spectra in the PC space.
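The projection and search step can be sketched as follows. This is one possible reading of the procedure, not the code used in this work: it assumes that the eigenvectors are mutually orthogonal but not necessarily of unit length, that scores are stored as expansion coefficients on those eigenvectors, and that the mean simulated spectrum is subtracted before projection. The function and argument names are hypothetical.

```python
import numpy as np

def search_pc_datacube(spectrum, mean, components, pc_cube, t_grid, q_grid):
    """Return the (Tg, Qg) grid point whose simulated scores are closest.

    components: (p, n_nu) mutually orthogonal eigenvectors (rows), any norm.
    pc_cube:    (n_T, n_Q, p) scores of the simulated spectra, expressed as
                expansion coefficients on those eigenvectors.
    """
    norms = np.linalg.norm(components, axis=1)        # eigenvector magnitudes
    # Expansion coefficients (scores) of the measured spectrum.
    scores = components @ (spectrum - mean) / norms ** 2
    # Scale score differences by the eigenvector magnitudes so that the
    # result is the Euclidean distance between spectra in the PC subspace.
    diff = (pc_cube - scores) * norms
    dist2 = np.sum(diff ** 2, axis=-1)
    i, j = np.unravel_index(np.argmin(dist2), dist2.shape)
    return t_grid[i], q_grid[j]
```

With unit-norm eigenvectors the scaling factor is 1 and the search reduces to a plain Euclidean comparison of scores. Either way, each pixel now costs *p* operations per grid point instead of one per spectral channel.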

Retrieval of *Qg* and *Tg* is dramatically faster with this procedure. Generation of the simulated PC datacube from the simulated spectra datacube took 2.3 s of CPU time. Then, creation of a column density map over the same 70 × 70 region as above took only 1.0 s of CPU time.

Results are shown in Table 3, under the heading SPCD (simulated PC datacube). Relative errors as compared to nominal values are now much smaller than previously: −1.9% for CH4, +2.7% for N2O and 1.4% for C3H8. Standard deviations are of the same order, being somewhat smaller for CH4 and larger for C3H8.

Retrieved temperatures are also more accurate, and nearly identical for the three gases: 305.1 ± 2.7 K for CH4, 305.7 ± 1.5 K for N2O and 304.7 ± 5.1 K for C3H8.

A point worth noting is that, since this approach is based on a PCA performed on simulated spectra rather than on experimental ones, it can be applied as well to nonimaging spectrometers.
