*4.1. Evaluation of Test Site near Montpellier*

In Figure 6, a subset of true color images for the reconstructed time series is shown. The subset corresponds to the eight acquisitions that are also shown in Figure 2 and illustrates qualitatively that cloudy and shadow pixels were reconstructed successfully.

**Figure 6.** Reconstructed true color images corresponding to the acquisitions in Figure 2.

Scatterplots for the spectral bands under study were then created for the same test site (Figure 7). The pixel values of the observed and the reconstructed time series are compared at acquisition time *i*clear and visualized based on the kernel density estimate. It is shown that the proposed RTSR method outperforms SG and DTS. The linear regression model indicates a better fit for RTSR, with a strong correlation (*R*2). In general, the SG- and DTSreconstructed surface reflectance values over-estimated the observed surface reflectance value. This is also illustrated for the example pixel indicated by the red square in Figure 7 (except for DTS in B8). The time series in Figure 3f is based on the same pixel. Both SGand DTS-reconstructed values for the clear observation on 23 May 2019 (first vertical line) indeed over-estimate the observation in B4 (lower graph). For this particular pixel and acquisition time, only SG over-estimated the observation in B8. The DTS-reconstructed surface reflectance in B8 was slightly lower than the clear observation, as can seen in the upper graph in Figure 3f. The RTSR reconstructed value shows little bias in any of the bands.

Furthermore, the RMSE based on the difference between the reconstructed and clear observations on 23 May 2019 was lower for RTSR than for the other two method, except for B8, where it was slightly larger the for this specific test site (0.22 with respect to 0.20 for SG and DTS). However, it will be shown that for the majority of test sites, RTSR has a smaller RMSE for all tested bands.

**Figure 7.** Scatterplot comparing the pixel values of the observed and the reconstructed time series for each spectral bands at acquisition time *i*clear. The red square corresponds to the pixel that is also indicated in Figures 2 and 6. The RMSE value, slope, intercept, and coefficient of determination (*R*2) of the linear regression are also shown.

#### *4.2. Smoothness Index for Aggregated Test Sites*

The smoothness for the three methods was evaluated by calculating TSI according to Equation (2). A cumulative frequency distribution representing all pixels aggregated for all test sites is shown in Figure 8. The results show little difference in the smoothness between the three reconstructed methods, with values for the 95 percentile that are almost identical. Most variation was found for spectral band B2, with smoothest results obtained for the non-adaptive Savitzky–Golay filter (SG) followed by the proposed RTSR and DTS. This is not surprising, as the SG-reconstructed time series represents the long-term change trend with a relatively wide half-width of the smoothing window. Both RTSR and DTS constrain the smoothing filter to capture dynamic changes of interest in vegetation.

**Figure 8.** Cumulative frequency distribution for the time-series smoothness index in the visible (B2, B3, B4), near infrared (B8) and short-wave infrared (B12) calculated for three reconstruction methods (SG, DTS, and the proposed RTSR). Small values indicate smoother time series. The 95 percentile is indicated in brackets.

### *4.3. Statistical Analysis of RMSE Metrics for Aggregated Test Sites*

As shown by the box plot in Figure 9a, the RMSE for RTSR was found to be smaller than for SG and DTS for the majority of test sites. This was the case for all spectral bands. The difference was most distinct in the visual to near infrared part of the electomagnetic spectrum. Furthermore, in the short wave infrared (band B12), the minimum as well as the 25 percentile value (Q1) and the median value of the RMSE was minimum over all test sites

for the proposed RTSR. However, for some test sites the RMSE was higher for the RTSR than for the SG without adjusting the smoothed value, as expressed by the slightly higher 75 percentile value (Q3) and upper whisker (Q3 + 1.5 × (Q3 − Q1)).

Following the addition of an extra gap in the time series, the three tested reconstruction methods performed more similarly (see Figure 9b). This can be expected given that the gap introduced exactly at acquisition time *i*clear was also used as a reference. This trustworthy reflectance value in the time series was not available to adjust the smoothed value in the RTSR-reconstructed time series. Only clear observations from nearby acquisition times were able to contribute to the reconstructed time series. When introducing three consecutive gaps (*i*clear − 1, *i*clear, *i*clear + 1), the difference between the compared methods was even more reduced, resulting in a comparable RMSE3 (see Figure 9c).

The results for the maximum-value composite over the four-month period 1 May 2019– 31 August 2019 are shown in Figure 9d. The advantage of the proposed RTSR reconstruction method with respect to the other two reference methods is more expressed than in the previous metrics. A potential explanation is that only a single observation was used for the previous metrics. Although the most clear observation was selected, varying atmospheric conditions over the test site result in reflectance values that are not homogeneous. The pixels in this reference image contain some noise, which blurs differences in the evaluation metrics. As motivated in other evaluation studies that obtained reference data from the 10-year average (2011–2020) of NDVI data [25], aggregating data over time reduces noise in the reference data.

**Figure 9.** RMSE based on the difference between the reconstructed image and the original observation for the most clear observation (RMSE0, RMSE1, and RMSE3) and for the maximum NDVI value composite over a four-month period (1 May 2019–31 August 2019).
