*3.3. Evaluation Procedure*

Different metrics were defined to evaluate the proposed reconstruction method, as requirements of a successful approach are also diverse. Noise should be reduced while changes related to vegetation dynamics of interest should be retained. A first metric was defined to quantify the smoothness, the time-series smoothness index (TSI) as proposed in [43]. It is calculated as the average absolute difference between the reconstructed reflectance value Rˆ *<sup>λ</sup>*,*<sup>i</sup>* at acquisition time *i* and the mean of the corresponding values before and after that acquisition. The difference is then averaged over all *T* acquisitions in the time series for each pixel (*p*):

$$\text{TSI}\_{\lambda}(p) = \frac{1}{T} \sum\_{i=1}^{T} |\hat{\mathbb{R}}\_{\lambda,i}(p) - \frac{\mathbb{R}\_{\lambda,i-1}(p) + \mathbb{R}\_{\lambda,i-1}(p)}{2}| \tag{2}$$

Low TSI values indicate smooth time series and high TSI values indicate noisy time series for a specific spectral band *λ*. A similar evaluation method was conducted in [42,44]. Reconstruction methods with a high level of smoothing are expected to perform well on the TSI metric. However, they risk concealing the temporal changes of interest. Clear observations with low noise content should be represented in the reconstructed time series. A second evaluation metric was therefore based on the error between the reconstructed time series Rˆ *<sup>λ</sup>*,*<sup>i</sup>* and the original observation R*λ*,*<sup>i</sup>* in clear conditions (*i* = *i*clear). For each test site, *i*clear was selected as the acquisition with the maximum number of clear pixels. The reconstructed time series Rˆ *<sup>λ</sup>*,*<sup>i</sup>* was then compared to the observed time series R*λ*,*<sup>i</sup>* at acquisition time *i*clear (see top of Figure 4). Scatterplots were created per spectral band based on all clear pixels *p* in a test site (*N* in total). Pixels masked as not clear were not taken into account. In addition, the root mean squared error (RMSE0) was calculated as:

$$\text{RMSE}\_{\lambda}^{0} = \sqrt{\frac{\sum\_{p=1}^{N} (\hat{\mathbf{R}}\_{\lambda, \dot{i}\_{\text{clear}}}(p) - \mathbf{R}\_{\lambda, \dot{i}\_{\text{clear}}}(p))^{2}}{N}} \tag{3}$$

Next, the most clear observation at acquisition time *i*clear was masked, resulting in the metric RMSE<sup>1</sup> (see center of Figure 4). This prevented the reconstruction algorithm using it as a trustworthy observation. Notice the extra gap that was introduced at acquisition time *i*clear in addition to the existing gaps due to the already masked pixels. Similarly, RMSE<sup>3</sup> was calculated by introducing three consecutive gaps in the time series, i.e., masking the most clear and its two surrounding observations (see bottom Figure 4).

A final evaluation metric, RMSEMVC, was defined based on the maximum NDVI composite over the four-month period 1 May 2019–31 August 2019. This is illustrated in Figure 5. With an average of 41 available observations, the maximum NDVI value composite of the original time series is expected to minimize problems common to singledate remote sensing studies, such as cloud contamination, atmospheric attenuation, surface directional reflectance, and view and illumination geometry [19]. Reconstructed time series that retain high-quality pixel observations should have a similar maximum NDVI composite and result in low values of RMSEMVC. On the other hand, higher values of RMSEMVC are expected for reconstruction methods that alter high-quality observations.

**Figure 4.** Metrics RMSE0, RMSE1, and RMSE<sup>3</sup> to evaluate the reconstruction method on the ability to retain high-quality pixels using the most clear observation as a reference.

**Figure 5.** Evaluation metric based on the maximum NDVI value composite over the four-month period 1 May 2019–31 August 2019.

#### **4. Results**

The RTSR reconstruction algorithm was implemented in Python using the open source pyjeo [45] library. Processing was performed on the Big Data Analytics Platform (BDAP [46], the in-house storage and computing platform of the Joint Research Centre (JRC) of the European Commission (formerly known as the JEODPP). The results confirmed the potential of the adaptive smoothing using a vegetation index as a proxy for reliability of the reflectance values. A more detailed analysis of the test site near Montpellier is presented first, followed by an analysis of the metrics based on all test sites.
