#### **3. Experimental Measurements**

The experimental setup reproduces the scheme of Figure 1, but with the gas to be measured confined to a gas cell so that the optical path is precisely known (see Figure 2). The three main elements are a blackbody radiator acting as a temperature-controlled background, a gas cell for the pollutant to be characterized, and the imaging Fourier transform spectrometer (IFTS) that captures both spectral and spatial information of the scene.

**Figure 2.** (**Left**) Overall view of the experimental setup. (**Right**) A close-up view of the gas cell without the gas supply tubes.

Specifically, an extended-area (15 × 15 cm) blackbody radiator from Santa Barbara Infrared, Inc., with a nominal emissivity of 0.9, was placed as a uniform background, and a 43 cm long stainless steel gas cell with two 38 mm diameter sapphire optical windows was used to enclose the gas under test. The cell has two valves, 20 cm apart, for gas input and output.

The experimental spectra were acquired with a Telops FIRST-MW Hypercam IFTS [16,17] placed two meters from the blackbody radiator, with the 43 cm metallic gas cell in between. In this instrument, the incoming radiance is modulated by a Michelson interferometer and then detected by an InSb 320 × 256 focal plane array (IFOV = 0.35 mrad) sensitive in the mid-infrared (1850 to 6667 cm<sup>−1</sup>). An interferogram is acquired for each pixel and, after processing, provides a spectrum with a maximum resolution of 0.25 cm<sup>−1</sup>.

To reduce the acquisition time to ≈ 1 min, the spectral resolution of the measurements in this work was set to 1 cm<sup>−1</sup> and a spatial sub-window of 256 × 160 pixels was used. Integration time was 10 μs. Four interferograms were acquired for each measurement; the dataset was pre-processed by calculating its median and then Fourier-transformed to obtain the radiance spectra. Processing of the interferograms includes triangular apodization, zero-padding to obtain experimental spectra at the same wavenumbers as the theoretical ones, and off-axis correction [18]. All the processing steps are described in [10].
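The pre-processing chain just described can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the processing of [10]: the function name `interferograms_to_spectrum`, the `pad_factor` parameter and the synthetic test signal are hypothetical, the interferogram is assumed to be double-sided and centred on zero path difference, and off-axis correction and radiometric calibration are omitted.

```python
import numpy as np

def interferograms_to_spectrum(interferograms, pad_factor=2):
    """Median-combine, apodize, zero-pad and Fourier-transform interferograms.

    interferograms: array of shape (n_acquisitions, n_samples), assumed
    double-sided with zero path difference (ZPD) at index n_samples // 2.
    Returns an uncalibrated magnitude spectrum.
    """
    igm = np.median(interferograms, axis=0)   # median over the acquisitions
    igm = igm - igm.mean()                    # remove the DC offset
    n = igm.size
    igm = igm * np.bartlett(n)                # triangular apodization
    padded = np.zeros(pad_factor * n)
    start = (padded.size - n) // 2
    padded[start:start + n] = igm             # symmetric zero-padding
    # ifftshift moves the ZPD sample to index 0 before the transform
    return np.abs(np.fft.rfft(np.fft.ifftshift(padded)))

# Synthetic check: four noisy copies of a single cosine "line" (hypothetical data).
rng = np.random.default_rng(0)
n_samples, line_bin = 256, 32
x = np.arange(n_samples) - n_samples // 2
clean = np.cos(2 * np.pi * line_bin * x / n_samples)
acquisitions = clean + 0.05 * rng.standard_normal((4, n_samples))
spectrum = interferograms_to_spectrum(acquisitions, pad_factor=2)
peak_bin = int(spectrum.argmax())             # line expected at pad_factor * line_bin
```

Zero-padding does not add information; it only interpolates the spectrum onto a finer wavenumber grid, which is what allows the experimental and theoretical spectra to be compared point by point.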

Radiance spectra were obtained for both the reference (gas cell filled with N2) and the pollutant gas, and the pollutant spectrum was divided by the reference spectrum according to Equation (4) to give a nominal transmittance spectrum.

Measurements were carried out with the gas at ambient temperature and the blackbody background at 350 °C, for methane (CH4), nitrous oxide (N2O) and propane (C3H8) at the concentrations and in the spectral regions detailed in Table 1. The bottles were prepared by the Spanish Metrology Institute (CEM, Centro Español de Metrología), ensuring high accuracy in the concentration values.

**Table 1.** Air pollutants under test.


#### **4. Noise Filtering by Principal Component Analysis**

Experimental radiance spectra for the three gases studied are shown in the left-hand graphs of Figure 3. These spectra, divided by the reference spectrum obtained with the gas cell full of N2, give the transmittance spectra of the right-hand side. The best fitting by theoretical spectra (achieved with the iterative algorithm as explained in Section 2.3) is also shown.

It is well known that when two noisy spectra are divided, the signal-to-noise ratio (SNR) decreases greatly. It would therefore be very convenient to reduce the noise level of the radiance spectra before calculating transmittance. This can be done by acquiring more interferograms, at the cost of a longer measuring time, or by averaging over neighboring pixels, at the cost of spatial resolution.
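The loss of SNR on division follows from first-order error propagation: for independent noise, the relative variances of numerator and denominator add, so the relative noise of the ratio is the quadrature sum of the two relative noises. The short simulation below illustrates this; the radiance levels and noise amplitude are arbitrary, hypothetical values chosen for the example, not measured ones.

```python
import numpy as np

rng = np.random.default_rng(1)
n_draws = 200_000
gas_true, ref_true, sigma = 0.8, 1.0, 0.05   # hypothetical radiances and noise level

gas = gas_true + sigma * rng.standard_normal(n_draws)   # noisy pollutant radiance
ref = ref_true + sigma * rng.standard_normal(n_draws)   # noisy reference radiance
tau = gas / ref                                         # nominal transmittance

def snr(samples):
    """Signal-to-noise ratio taken here as mean over standard deviation."""
    return samples.mean() / samples.std()

snr_gas, snr_ref, snr_tau = snr(gas), snr(ref), snr(tau)

# First-order prediction: relative noises add in quadrature.
predicted_rel_noise = np.hypot(sigma / gas_true, sigma / ref_true)
print(f"SNR gas {snr_gas:.1f}, ref {snr_ref:.1f}, ratio {snr_tau:.1f}")
```

The ratio is always noisier than the noisier of its two inputs, and the penalty grows in spectral regions where either radiance is small.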

There is, however, a better solution provided by principal component analysis (PCA) [19]. This is a well-known statistical technique used to reduce the dimensionality of sets of multivariate data. If we have *n* measurements, each of *m* variables, the data can be interpreted as a cloud of *n* points in an *m*-dimensional *variable space*. PCA generates a new orthogonal basis in this space, optimally adapted to the data in the sense that (a) its origin coincides with the center of mass of the points and (b) the new (sometimes called "main") axes are oriented so that the projections of the data on them are uncorrelated (i.e., in the new axes, the covariance matrix of the data is diagonal). The unit vectors corresponding to these axes are the eigenvectors of the covariance matrix, and PCA provides them in decreasing order of the associated eigenvalue. This means that the first principal direction is that along which the variance of the data is a maximum; the second principal direction is, among the vectors perpendicular to the first, the one along which the variance is largest, and so on. The coordinates of a point in the variable space with respect to the new basis are called principal components (PCs), or sometimes scores, and are obtained by subtracting the coordinates of the center of mass and then projecting on the basis of eigenvectors.
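The construction described above can be written compactly with NumPy. The sketch below is a generic textbook implementation, not the code used in this work; the function name `pca` and the random test data are hypothetical.

```python
import numpy as np

def pca(data):
    """PCA by eigen-decomposition of the covariance matrix.

    data: (n, m) array with n measurements of m variables.
    Returns the center of mass, the eigenvalues in decreasing order,
    the eigenvectors (as columns, same order) and the scores.
    """
    mean = data.mean(axis=0)                  # center of mass of the cloud
    centered = data - mean
    cov = np.cov(centered, rowvar=False)      # (m, m) covariance matrix
    eigvals, eigvecs = np.linalg.eigh(cov)    # eigh returns ascending order
    order = np.argsort(eigvals)[::-1]         # reorder to decreasing variance
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    scores = centered @ eigvecs               # principal components of each point
    return mean, eigvals, eigvecs, scores

# Hypothetical correlated data: 500 points in a 4-dimensional variable space.
rng = np.random.default_rng(3)
data = rng.standard_normal((500, 4)) @ rng.standard_normal((4, 4)) + 10.0
mean, eigvals, eigvecs, scores = pca(data)
score_cov = np.cov(scores, rowvar=False)      # diagonal, equal to diag(eigvals)
```

For spectra with many thousands of wavenumbers, forming the full *m* × *m* covariance matrix is costly; a singular value decomposition of the centered data gives the same axes and is usually preferred in practice.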

**Figure 3.** Experimental spectra of air pollutants: radiance (**left**) and nominal transmittance, with best fit (**right**).

Since most of the variance of the data is found in the first principal components, a good approximation to the original data set can be made by considering only a small number of principal components, say *p*. This is equivalent to projecting the data set in the *p*-dimensional sub-space built from the first *p* main axes, and achieves a reduction in the dimensionality of the data set from *m* to *p*.

In our case, the original data are the spectra (each one with *m* wavenumbers, *m* ∼ 15,000 for 1 cm<sup>−1</sup> resolution) from a region of *n* pixels corresponding to the gas cell. Since the spectra depend on two variables, T and Q, we can conjecture that the data should have an intrinsic dimensionality close to two. They should all, therefore, lie very close to a surface in the variable space, although this surface will not be a plane, since transmittance is not linear in Q or T. However, if the range of variation of T and Q in the data is relatively small, the corresponding surface region will be approximately flat, so that two principal components should be enough to describe, to a good approximation, all the variability of the original data (*p* = 2). When T and Q vary more widely, it will be necessary to take *p* > 2, but in any case the high-order principal components will contain mainly noise. In summary, selecting the subspace spanned by the first principal components not only dramatically reduces data volume, but also results in efficient noise filtering [20,21].
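The noise-filtering effect of keeping only the first *p* principal components can be illustrated on synthetic data. In the sketch below, the function name `pca_filter`, the two Gaussian "bands" and the noise level are hypothetical and stand in for the real spectra; the data are built to be exactly two-dimensional plus white noise, which is the idealized situation conjectured above.

```python
import numpy as np

def pca_filter(spectra, p=2):
    """Reconstruct each spectrum from its first p principal components.

    spectra: (n_pixels, n_wavenumbers) array. The SVD of the centered data
    yields the same principal axes as the covariance eigen-decomposition.
    """
    mean = spectra.mean(axis=0)
    centered = spectra - mean
    u, s, vt = np.linalg.svd(centered, full_matrices=False)
    return mean + (u[:, :p] * s[:p]) @ vt[:p]  # projection on the first p axes

# Synthetic rank-2 "spectra" plus white noise (hypothetical values).
rng = np.random.default_rng(2)
n_pixels, n_wavenumbers, sigma = 300, 400, 0.05
grid = np.arange(n_wavenumbers)
band_a = np.exp(-0.5 * ((grid - 120) / 20.0) ** 2)
band_b = np.exp(-0.5 * ((grid - 280) / 30.0) ** 2)
clean = (1.0
         + rng.normal(0.0, 0.2, (n_pixels, 1)) * band_a
         + rng.normal(0.0, 0.2, (n_pixels, 1)) * band_b)
noisy = clean + sigma * rng.standard_normal(clean.shape)

filtered = pca_filter(noisy, p=2)
rms_before = np.sqrt(np.mean((noisy - clean) ** 2))
rms_after = np.sqrt(np.mean((filtered - clean) ** 2))
print(f"RMS error before {rms_before:.4f}, after {rms_after:.4f}")
```

White noise spreads its variance evenly over all directions of the variable space, so discarding all but *p* of them removes most of the noise while leaving the signal, which lives in those *p* directions, essentially untouched.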

To apply PCA to our experimental data, a preliminary scene classification is performed by a standard k-means algorithm [22,23] to select the region of the image that corresponds to the gas in the cell. After applying PCA to the radiance spectra in that region, it is found that eigenvalues decrease sharply (Figure 4), so that for all the gases studied the first two account for more than 99.95% of the trace of the covariance matrix (i.e., the total variance of the data). This confirms our conjecture and suggests that a good spectrum reconstruction should be obtained with only two principal components. Indeed, Figure 5 (left-hand side) shows that the reconstructed radiance spectra reproduce with high fidelity the original ones (shown in Figure 3), but with noise filtered out; as expected, the effect is stronger in transmittance (Figure 5, right-hand side). The results of iterative fitting of these spectra are shown also in the right-hand side of Figure 5.
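The fraction of the total variance captured by the first components, which is the criterion used above to choose *p*, can be computed directly from the singular values of the centered data. The following sketch assumes that the pixels of the gas cell region have already been selected (the k-means step is not reproduced here); the function name `explained_variance_fraction` and the synthetic input are hypothetical.

```python
import numpy as np

def explained_variance_fraction(spectra):
    """Eigenvalues of the covariance matrix and their cumulative share of its trace.

    spectra: (n_pixels, n_wavenumbers) array of spectra from the selected region.
    """
    centered = spectra - spectra.mean(axis=0)
    s = np.linalg.svd(centered, compute_uv=False)     # singular values, descending
    eigvals = s ** 2 / (spectra.shape[0] - 1)         # covariance eigenvalues
    return eigvals, np.cumsum(eigvals) / eigvals.sum()

# Hypothetical region: two underlying degrees of freedom plus weak noise.
rng = np.random.default_rng(4)
grid = np.arange(400)
band_a = np.exp(-0.5 * ((grid - 120) / 20.0) ** 2)
band_b = np.exp(-0.5 * ((grid - 280) / 30.0) ** 2)
region = (1.0
          + rng.normal(0.0, 0.2, (300, 1)) * band_a
          + rng.normal(0.0, 0.2, (300, 1)) * band_b
          + 0.001 * rng.standard_normal((300, 400)))
eigvals, cumulative = explained_variance_fraction(region)
share_of_first_two = cumulative[1]                    # compare with a chosen threshold
```

A sharp drop after the second eigenvalue, as in Figure 4, is the signature of data that are effectively two-dimensional.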

**Figure 4.** Values of the first 5 eigenvalues for the covariance matrix of the radiance spectra of the three gases studied.

**Figure 5.** PCA-processed experimental spectra of air pollutants: radiance (**left**) and nominal transmittance, with best fit (**right**).

By fitting spectra over the whole field of view of the instrument, a map of retrieved Q is created. Figure 6 compares the C3H8 maps obtained from unprocessed spectra (left) and PCA-filtered spectra (right). As expected, only the round cell window regions have meaningful values, and they are quite similar in both cases, although the PCA-processed map is more uniform.

**Figure 6.** Maps of Q values retrieved by iterative fitting from *τnom*, unprocessed (**left**) and PCA-filtered (**right**). The scale is in ppm·m; the size of the field of view is 5.5 cm × 5.5 cm. Retrieved values of Q only have physical meaning in the central round region that corresponds to the gas cell window; it is clear that PCA filtering improves uniformity in that region.

Retrieved Q values are summarized in Table 2, both for PCA-filtered (Figure 5) and unfiltered spectra (Figure 3). Values are the mean ± the standard deviation in a square of 7 × 7 pixels at the center of the gas cell. Signal-to-noise ratios in dB are also tabulated. PCA increases SNR in all cases, and the effect is larger the noisier the original spectrum: the dB value is multiplied by 3.2 for CH4, by 2.1 for C3H8, and by 1.1 for N2O. It must be pointed out that this improvement does not come at the expense of spatial resolution (which is not degraded) or acquisition time (which is not increased), since no spatial or time averaging is involved.

Comparison of the retrieved Q values with the nominal ones gives relative errors of −2.6% for CH4, +4.8% for N2O and −9.2% for C3H8 for non-PCA-processed spectra and similar values for the PCA-processed, except for a slightly better value for C3H8 (relative error −7.1%). These results, however, do not mean that PCA does not improve the measurement of Q. Since they have been obtained by spatially averaging over a uniform region, the most relevant parameter here is standard deviation, which is much smaller for PCA-filtered spectra. The conclusion to be extracted is that the main effect of PCA processing has been to improve the precision of retrieval rather than its accuracy.
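The distinction between accuracy and precision drawn here maps onto two numbers computed from the same 7 × 7 window: the relative error of the mean with respect to the nominal value, and the standard deviation. A minimal sketch follows; the function name `central_window_stats`, the assumption that the cell center coincides with the center of the map, the use of the sample standard deviation (`ddof=1`) and the example values are all hypothetical.

```python
import numpy as np

def central_window_stats(q_map, nominal_q, half_width=3):
    """Mean, standard deviation and relative error (%) of Q in a central square.

    q_map: 2-D array of retrieved Q values (e.g., in ppm·m).
    half_width=3 gives a 7 x 7 pixel window around the central pixel.
    """
    r0, c0 = q_map.shape[0] // 2, q_map.shape[1] // 2
    window = q_map[r0 - half_width:r0 + half_width + 1,
                   c0 - half_width:c0 + half_width + 1]
    mean = window.mean()                      # accuracy is judged from the mean
    std = window.std(ddof=1)                  # precision is judged from the spread
    rel_error_pct = 100.0 * (mean - nominal_q) / nominal_q
    return mean, std, rel_error_pct

# Hypothetical uniform map: every pixel retrieves 95 for a nominal value of 100.
q_map = np.full((21, 21), 95.0)
mean_q, std_q, rel_err = central_window_stats(q_map, nominal_q=100.0)
```

Filtering that leaves the mean unchanged but shrinks the standard deviation, as reported for the PCA-processed spectra, improves the second number without affecting the first.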

Regarding the retrieved temperatures, for a room *Tg* ≈ 302 K, results for CH4, N2O and C3H8 were, respectively, 310.6 ± 25.6 K, 305.0 ± 2.2 K and 312.6 ± 16.7 K for non-PCA-processed spectra, and 306.7 ± 2.4 K, 305.4 ± 1.4 K and 312.5 ± 8.7 K for the PCA-processed ones. These values behave similarly to those of Q: PCA processing has only slightly improved the value of T for CH4 but has achieved an important reduction in standard deviations, i.e., it gives better results regarding uniformity.

**Table 2.** Column density values retrieved and signal to noise ratio for air pollutants in a 7 × 7 square at the center of the gas cell. Values obtained by iterative search using as-measured experimental spectra and PCA-processed experimental spectra.

