*5.1. Reference Data and Model Parameters*

A major difficulty with the quantitative evaluation of reconstructed time series is the lack of reference data, which must match the test data in temporal, spatial, and spectral dimensions. Several studies have used modeled time series based on the average of multiyear data as a reference [13,25,47,48]. Noise is often added to the modeled reference time series to evaluate the performance of the reconstruction method. This assumes that the modeled data and added noise realistically represent the actual data that will be used for the model that is evaluated. In [25], reference data were obtained from the 10-year average (2011–2020) of NDVI data. The advantage of this is a relatively noise-free reference time series. Taking the average produces a smooth signal as frequent changes related to, e.g., bidirectional effects, crop-harvesting practice, and sudden changes in land cover, are smoothed out. Using averaged data as a reference risks favoring reconstruction algorithms that include a higher level of smoothing. There is a trade-off between reconstructing a smooth noise-free signal on one hand and retaining the original values of high-quality pixels that have been observed in clear conditions on the other. The optimal approach depends on the application. This is also reflected by the tuning parameters that often accompany reconstruction algorithms [20,33]. The idea is that users can then optimize these parameters for their application at hand. However, optimizing these parameters adds to the complexity of a fair comparison of different techniques. The DTS algorithm proposed in [33] contains 13 parameters, wherein the authors optimized these parameters using a reference dataset that captured a range of ecosystems and land use/cover types. They claim the parameters should therefore be fairly robust for applications using vegetation indices such as the EVI2 or NDVI. However, as reflectance values can have a different variance structure or dynamic ranges, different optimal parameters might apply.

In a production environment where products are created at regional and continental scales, the fine-tuning of parameters for individual areas can become tedious and potentially result in unwanted border effects between areas where parameters differ. Therefore, the parameters for the three methods under comparison were fixed. A first parameter of the proposed RTSR is related to the cloud buffer. Due to omission errors of the Sen2Cor cloud detection algorithm, not all cloudy pixels have been identified as such. The observation corresponding to the acquisition on 7 June 2019 (second vertical line in Figure 3) can also be considered as an outlier, but was identified by the SCL band as clear and thus not masked. Cloudy or shadow pixels that are not masked will not be interpolated during the gap-filling step of the method. The corresponding surface-reflectance values will be taken into account and add to the noise in the time series. Because omission errors are more frequent on cloud edges [49], an extra buffer around clouds masks is commonly used [38]. Here, a relatively small buffer of five pixels (50 m) was added. It was able to mask the observation corresponding to the acquisition on 12 June 2019 that was considered by the SCL band as clear (third vertical line in Figure 3). Using a larger buffer provides the potential to mask more pixels. For instance, buffer sizes of 100–300 m were proposed in [38]. However, as suggested in [39], this also removes a large amount of usable imagery. The impact on the reconstructed time series will be reduced, because the observed surface-reflectance value near the cloud edge will most likely have a relatively low calculated NDVI value. The iterative RTSR algorithm will then reject this value and keep the smoothed value.

The parameters of the smoothing filters SG<sup>1</sup> and SG2, were fixed based on the values found in the literature and experimental results. For the half-width of the smoothing window, values between 4 and 7 were suggested in [20]. Small values result in less smoothing with the risk of over-fitting the data points, whereas larger values risk not picking up important variations in the time series. For the first version of the SG filter, SG1, a value of 7 was selected with a polynomial degree of 2 (suggested between 2 and 4 in [20]). This creates a relatively smooth signal that is able to represent the slowly varying long-term change trend. A lower filter width and a higher polynomial degree was set for SG2 (3 and 3 instead of 4 and 2) so that important variations potentially smoothed out with SG1, can still be picked up by the iterative algorithm. This is illustrated in Figure 3c,f. In contrast to the long-term change trend (in orange), the reconstructed time series (in blue) represents well the original values near the green peak (23 May 2019). The reconstruction based on DTS (in green) neither represents the original values near the maximum nor the minimum

greenness. The relatively poor performance of DTS is potentially due its application to surface reflectance data. Although the code in [32] is claimed to be applicable to time series of satellite observations in general, the DTS presented in [33] was applied to EVI2 time series and involved an adjustment to the upper envelope. As for the NDVI, the assumption is that relatively high values correspond to a trustworthy observation. This assumption does not hold for surface reflectance. The DTS-reconstructed time series was obtained with the original code in [32] without this adjustment. Nevertheless, for most test sites the RMSE values of DTS are still in line with the RMSE values obtained in other studies. In [28], a reconstruction algorithm on Sentinel-2 surface reflectance data was evaluated using a metric similar to RMSE<sup>0</sup> and obtained RMSE values of 0.0426 (B2), 0.0407 (B3), 0.0405 (B4), and 0.0449 (B8). In [30], reconstructed MODIS Terra MOD09A1 data were evaluated on cloud-free MODIS Aqua MYD09A1 data: 0.0366 (red band), 0.0519 (near infrared band), and 0.0499 (short-wave infrared band).
