*3.3. Exploring the Impact of Date Selection on Wheat Yield Prediction Using Backscatter Information Derived from Sentinel-1*

The feasibility of using backscatter information obtained from S1 on different dates to train and test machine learning models was evaluated. The R<sup>2</sup> and RMSE values obtained during testing are presented in Figure 5.
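For reference, the two metrics reported in Figure 5 can be computed from observed and predicted yields as in the minimal sketch below. It assumes the standard definitions (RMSE as the square root of the mean squared error, and R<sup>2</sup> as one minus the ratio of the residual to the total sum of squares); the function names and the sample values are hypothetical and are not taken from the study.

```python
import math


def rmse(y_true, y_pred):
    """Root mean squared error, expressed in the units of the target."""
    n = len(y_true)
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / n)


def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Hypothetical observed and predicted yields, for illustration only.
observed = [4.0, 5.0, 6.0, 7.0]
predicted = [4.5, 4.5, 6.5, 6.5]

test_rmse = rmse(observed, predicted)
test_r2 = r2_score(observed, predicted)
```

Under these definitions, RMSE is reported in the units of the yield variable, whereas R<sup>2</sup> is unitless and expresses the share of the variance in the observed yields that the predictions explain.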

**Figure 5.** R<sup>2</sup> and RMSE of the four algorithms (MLR, Multiple Linear Regression; RF, Random Forest; SVM, Support Vector Machine; CatBoost) trained with VV and VH polarization backscatter information derived from S1 for three individual dates and for their combination.

The pattern observed with S2 was repeated with the S1 data: the best results were obtained with CatBoost and the worst with MLR. When a single date was used, the results varied notably depending on the date selected. For example, with CatBoost, the R<sup>2</sup> value was 0.36 for Day 2 but decreased to 0.08 for Day 3.

For the S1 data, combining multiple dates generally improved the results compared to using a single date. The highest R<sup>2</sup> values were obtained when using information from all three days (Days 1–3). Among the algorithms tested, CatBoost showed the best results with an R<sup>2</sup> of 0.69, while the lowest R<sup>2</sup> value, 0.20, was obtained with the MLR model (Figure 5). The RF and SVM models showed similar results, with the latter performing slightly better.
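The comparison between single dates and date combinations can be organized as in the following sketch, which enumerates every non-empty subset of the three dates and lists the VV and VH features that each subset would contribute to a model. The column naming scheme (`VV_day1`, `VH_day1`, and so on) and the function name are assumptions made for illustration; the exact set of combinations evaluated in the study is the one shown in Figure 5.

```python
from itertools import combinations

# Assumed labels for the two polarizations and the three acquisition dates.
POLARIZATIONS = ("VV", "VH")
DAYS = (1, 2, 3)


def feature_sets(days=DAYS, polarizations=POLARIZATIONS):
    """Map each non-empty subset of dates to its feature column names."""
    sets = {}
    for k in range(1, len(days) + 1):
        for subset in combinations(days, k):
            label = "Day " + "+".join(str(d) for d in subset)
            sets[label] = [
                f"{pol}_day{d}" for d in subset for pol in polarizations
            ]
    return sets


candidate_sets = feature_sets()
```

Each entry of `candidate_sets` could then be used to select the input columns for one training and testing run per algorithm, so that all models are compared on the same date configurations.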

It is noteworthy that combining data from multiple dates did not always result in better performance than using data from a single date. For example, with the CatBoost algorithm, the RMSE was 1.34 for Day 2 but 1.59 for the combination of Days 1–3.

Additionally, the differences in RMSE and R<sup>2</sup> between the algorithms that can model non-linear relationships (RF, SVM, and CatBoost) and the one restricted to linear relationships (MLR) were greater with the S1 data than with the S2 data. In all cases, the non-linear algorithms showed better results (Figure 5).
