#### *3.3. Accuracy Assessment*

The ability to discriminate LC classes was first assessed using optical NDVI profiles, radar backscatter coefficients (i.e., VV and VH), and the JM distance, which measures the statistical separability between two distributions.
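As a minimal illustration of how a JM distance can be computed, the sketch below handles a single feature and assumes that each class is normally distributed. Both simplifications are assumptions of this example, not details taken from the study. The function name `jm_distance` is hypothetical. In the multivariate case, the variances would be replaced by class covariance matrices.

```python
import math
from statistics import fmean, variance


def jm_distance(sample_a, sample_b):
    """Jeffries-Matusita distance between two classes for ONE feature.

    Assumes each class is normally distributed and has non-zero variance
    (assumptions of this sketch). The result lies in [0, 2]; larger values
    mean better separability.
    """
    mean_a, mean_b = fmean(sample_a), fmean(sample_b)
    var_a, var_b = variance(sample_a), variance(sample_b)  # sample variances
    pooled = (var_a + var_b) / 2.0
    # Bhattacharyya distance between two univariate Gaussians
    bhattacharyya = (
        (mean_a - mean_b) ** 2 / (8.0 * pooled)
        + 0.5 * math.log(pooled / math.sqrt(var_a * var_b))
    )
    return 2.0 * (1.0 - math.exp(-bhattacharyya))
```

Two classes with identical samples give a distance of 0, whereas two well-separated classes approach the upper bound of 2.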

Afterwards, the validation protocol for the different classification scenarios used a stratified random sample of 70% of the reference pixels for training and the remaining 30% for validation [33]. Because twenty random splits of the training and validation data were performed, the mean overall accuracy (OA) is reported with confidence intervals [11]. Besides OA, two simple measures, quantity disagreement (QD) and allocation disagreement (AD) [63], were used in this research. As reported by Stehman and Foody [64], the Kappa coefficient is highly correlated with OA; therefore, we opted for QD and AD. QD refers to the difference in the number of pixels of the same class, and AD refers to the spatial location mismatch for every LC class, between the training and test datasets [65]. Additionally, the user's accuracy (UA), as a measure of the reliability of the map, and the producer's accuracy (PA), as a measure of how well the reference pixels were classified, were computed for individual LC classes [66].
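The measures named above can all be derived from a single confusion matrix. The following sketch uses the usual definitions of quantity and allocation disagreement and is only an illustration. It assumes that rows hold the classified labels and columns hold the reference labels, and that raw sample counts are used without area weighting. The function name `agreement_measures` is hypothetical.

```python
def agreement_measures(matrix):
    """OA, QD, AD, UA and PA from a square confusion matrix of counts.

    Assumed layout (not taken from the study): rows = classified labels,
    columns = reference labels.
    """
    n = len(matrix)
    total = sum(sum(row) for row in matrix)
    row_tot = [sum(row) / total for row in matrix]  # classified proportions
    col_tot = [sum(matrix[i][j] for i in range(n)) / total for j in range(n)]
    diag = [matrix[g][g] / total for g in range(n)]  # agreement per class

    overall = sum(diag)
    # Quantity disagreement: mismatch in class proportions
    quantity = sum(abs(row_tot[g] - col_tot[g]) for g in range(n)) / 2.0
    # Allocation disagreement: mismatch in spatial assignment
    allocation = sum(
        2.0 * min(row_tot[g] - diag[g], col_tot[g] - diag[g]) for g in range(n)
    ) / 2.0
    users = [diag[g] / row_tot[g] if row_tot[g] else float("nan") for g in range(n)]
    producers = [diag[g] / col_tot[g] if col_tot[g] else float("nan") for g in range(n)]
    return {"OA": overall, "QD": quantity, "AD": allocation,
            "UA": users, "PA": producers}
```

For the made-up matrix `[[45, 5, 0], [10, 25, 5], [0, 0, 10]]`, OA is 0.80 and QD and AD are both 0.10, which illustrates that QD and AD together account for the total disagreement (1 − OA).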

In this research, the classification scenarios (package 'randomForest' [67]) and accuracy assessments (package 'caret' [68]) were conducted in the R programming language, version 4.0.3, through RStudio, version 1.3.1093.
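Although the study itself was carried out in R, the repeated-split protocol of this section can be outlined in a language-neutral way. The Python sketch below is only an assumption-laden illustration: `stratified_split` and `mean_with_ci` are hypothetical helper names, the 95% level is assumed, and the interval uses a normal approximation (a t-quantile would be slightly wider for twenty splits).

```python
import random
from collections import defaultdict
from statistics import NormalDist, fmean, stdev


def stratified_split(labels, train_fraction=0.7, seed=0):
    """Split sample indices into train/test while keeping class proportions."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for index, label in enumerate(labels):
        by_class[label].append(index)
    train, test = [], []
    for indices in by_class.values():
        rng.shuffle(indices)
        cut = round(train_fraction * len(indices))
        train.extend(indices[:cut])
        test.extend(indices[cut:])
    return train, test


def mean_with_ci(values, level=0.95):
    """Mean and normal-approximation confidence interval of repeated scores."""
    z = NormalDist().inv_cdf(0.5 + level / 2.0)
    half_width = z * stdev(values) / len(values) ** 0.5
    centre = fmean(values)
    return centre, centre - half_width, centre + half_width
```

In use, one would call `stratified_split(labels, 0.7, seed=s)` for twenty different seeds, train and score a classifier on each split, and pass the twenty OA values to `mean_with_ci`.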

#### **4. Results and Discussion**

#### *4.1. Optical NDVI and Radar Backscatter (VV, VH) Time-Series*

As shown in Figure 2, the water and forest classes can easily be detected throughout the whole season, whereas the built-up and bare soil classes show a similar pattern, except in August; this overlap can be resolved by using additional spectral indices (e.g., SAVI, normalized difference built-up index (NDBI) [69]). Since the cropland class in the study area consists mostly of single-cropping systems (e.g., cereals, maize, and potato), a characteristic crop phenology pattern can be recognized, which consists of sowing (March), growth (from April to August), and harvest (September). The largest inter-class overlap among the vegetation classes occurs between grassland, orchard, and vineyard. Therefore, in this research, the Jeffries–Matusita (JM) distance was used as a spectral separability measure [10,70].

**Figure 2.** Temporal behavior of optical NDVI time-series profiles for LC class analysis.

Since the backscatter signal is affected by soil moisture, surface roughness, and terrain topography, an analysis of the VV and VH polarization bands is presented in Figure 3. Overall, the water class has the lowest VV and VH values, since only a very small proportion of the backscatter is returned to the sensor due to the side-looking geometry [71]. On the other hand, the built-up class has the highest mean VV and VH values, due to the double-bounce effect in urban areas [72]. Vegetation classes tend to overlap within the VV and VH bands due to volume scattering, whereas the backscatter values are higher in VV than in VH due to a combination of single-bounce (e.g., leaves, stems) and bare soil double-bounce backscatter [73].

**Figure 3.** Mean (**a**) VV and (**b**) VH backscatter values for each LC class investigated. The classes are represented as follows: 1 = Cropland; 2 = Forest; 3 = Water; 4 = Built-up; 5 = Bare soil; 6 = Grassland; 7 = Orchard; 8 = Vineyard.

#### *4.2. Jeffries–Matusita (JM) Distance Variability Results of Each Class*

The JM distance results for each class and each sensor are shown in Table 4. For both sensors used in this research (i.e., S1 and S2), the water class was the only LC class with JM values above 1.7, which indicates good separability from the other classes. This class separability was also confirmed by the calculated NDVI profiles (Figure 2). Furthermore, fairly good separation can be found for the forest class using the S1 polarization bands, whereas bare soil class separability is noticeable for the S2 bands. Consistent with the NDVI profiles and radar backscatter (VV, VH) values, the vegetation classes yielded low JM distance values, indicating that additional features (e.g., spectral indices, GLCM textures) should be used for better class differentiation.

**Table 4.** JM \* distance values of each LC class used in this research calculated for S1 (blue color) and S2 (green color) sensors.


\* JM values range from 0 to 2, where distance values greater than 1.7 indicate good separability between the LC classes. # The ID represents the following classes: 1 = Cropland; 2 = Forest; 3 = Water; 4 = Built-up; 5 = Bare soil; 6 = Grassland; 7 = Orchard; 8 = Vineyard.

Since this research used reference data from higher thematic levels of the CORINE, LUCAS, and LPIS databases, the aforementioned eight LC classes were used for the different classification scenarios and for comparison with similar research. According to Dabboor et al. [74], the JM distance measure is widely used in the case of high-dimensional feature spaces, mostly when hyperspectral imagery is used. In this research, different texture measures were used to increase the class separability, as noted in the research by Klein et al. [75].

#### *4.3. Random Forest Hyperparameter Tuning Results*

To optimize the RF hyperparameters, a grid search with *k*-fold cross-validation was performed (Table 5), with *k* set to 5. Although Cánovas-García and Alonso-Sarría [57] mentioned that RF is not very sensitive to its hyperparameters, the *ntree* and *mtry* values were set to 1000 and one half of the input variables, respectively, for each classification scenario. A larger number of trees in the forest led to a more stable classification, although it can increase the computational time for vegetation mapping at regional to global scales.
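The mechanics of a cross-validated grid search can be sketched as follows. This is a generic outline under stated assumptions, not the procedure as implemented in the study: the folds are unstratified, `k_fold_indices` and `grid_search` are hypothetical names, and `fit_and_score` stands in for whatever routine trains an RF with the given parameters on the training indices and returns the OA on the held-out fold.

```python
import random
from itertools import product
from statistics import fmean


def k_fold_indices(n_samples, k=5, seed=0):
    """Yield (train, test) index lists for k folds of a shuffled sample."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    folds = [indices[i::k] for i in range(k)]
    for held_out in range(k):
        train = [idx for f, fold in enumerate(folds) if f != held_out
                 for idx in fold]
        yield train, folds[held_out]


def grid_search(n_samples, grid, fit_and_score, k=5):
    """Return the best parameter set and the mean CV score of every set.

    `fit_and_score(params, train, test)` is a user-supplied (hypothetical)
    callable that trains a model and returns its accuracy on `test`.
    """
    names = list(grid)
    results = {}
    for combo in product(*(grid[name] for name in names)):
        params = dict(zip(names, combo))
        scores = [fit_and_score(params, train, test)
                  for train, test in k_fold_indices(n_samples, k)]
        results[combo] = fmean(scores)
    best = max(results, key=results.get)
    return dict(zip(names, best)), results
```

A call such as `grid_search(n, {"ntree": [...], "mtry": [...]}, fit_and_score)` would then evaluate every combination on the same five folds; the candidate values themselves are left open here because they are not listed in this section.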

**Table 5.** The cross-validated grid search relationship between the overall accuracy (%) and hyperparameters (*mtry* and *ntree*) of the RF classifier.


#### *4.4. Importance and Selection of S1 and S2 Input Features for Vegetation Mapping*

Before any classification scenario was conducted, feature selection was performed for the SAR (i.e., S1) and optical (i.e., S2) time-series data, as well as their ancillary features. As shown in Figure 4, major improvements in the overall accuracy are perceptible up to 50 features. Adding features beyond this number provides a negligible improvement in the OA relative to the additional computational cost and processing requirements. Therefore, the most important one-fourth of the input features available for each classification scenario was used in this research.
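Keeping a fixed share of the highest-ranked features can be expressed in a few lines. The sketch below assumes that an importance score per feature is already available; the function name `select_top_fraction` and the rounding rule (rounding up, keeping at least one feature) are choices made for this example only.

```python
import math


def select_top_fraction(importance, fraction=0.25):
    """Return the names of the most important features, best first.

    `importance` maps a feature name to its importance score (e.g., MDA).
    """
    keep = max(1, math.ceil(fraction * len(importance)))
    ranked = sorted(importance, key=importance.get, reverse=True)
    return ranked[:keep]
```

With roughly 200 candidate features, a fraction of 0.25 corresponds to the 50 features retained here.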

**Figure 4.** Mean overall accuracy (OA) for the combined S1 and S2 time-series as a function of the number of input features.

According to the feature importance approach described in Section 3.2.2, Figure 5 shows the first 50 features sorted by decreasing MDA. Color coding was used, depending on the source of the input feature (e.g., S1 or S2 band, derived ancillary features from S1 and S2). The digital elevation model (DEM) was the most important input feature for the classification, followed by the summer B4 (i.e., Red) S2 band and the winter MSAVI and NDWI S2 indices. Overall, for S1, the VH polarization band was the most important feature among the first 50 features, whereas GLCM Mean, Variance, and Correlation were the most important among the nine textural features used in this research. The VH band is expected to be included in the final classification model, since it contains volume scattering information [76], whereas the GLCM features have already proven useful for vegetation mapping [20,41].
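The idea behind a mean decrease in accuracy (MDA) ranking can be illustrated with a simple permutation test: the values of one feature are shuffled, and the resulting drop in accuracy is recorded. The sketch below is a simplified stand-in and not the exact procedure of Section 3.2.2. It permutes features on a held-out sample rather than on the out-of-bag samples of each tree, and `predict` is a hypothetical callable that returns one label per input row.

```python
import random
from statistics import fmean


def accuracy(predict, rows, labels):
    """Share of rows whose predicted label matches the reference label."""
    predictions = predict(rows)
    return sum(p == t for p, t in zip(predictions, labels)) / len(labels)


def mean_decrease_accuracy(predict, rows, labels, n_repeats=10, seed=0):
    """Permutation-based accuracy drop for every feature column.

    `rows` is a list of feature lists; `predict` is any fitted model's
    prediction function (hypothetical here).
    """
    rng = random.Random(seed)
    baseline = accuracy(predict, rows, labels)
    importances = []
    for j in range(len(rows[0])):
        drops = []
        for _ in range(n_repeats):
            column = [row[j] for row in rows]
            rng.shuffle(column)  # break the link between feature j and labels
            permuted = [row[:j] + [value] + row[j + 1:]
                        for row, value in zip(rows, column)]
            drops.append(baseline - accuracy(predict, permuted, labels))
        importances.append(fmean(drops))
    return importances
```

A feature that the model does not rely on yields a drop of zero, whereas informative features yield larger drops and therefore rank higher.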

In terms of S2, B12, B11, and B5 (i.e., SWIR2, SWIR1, and RE1) are the most frequently represented spectral bands. These results coincide with similar research [77], e.g., Abdi [78], where nearly half of the input S2 variables belonged to the RE and SWIR bands from spring and summer scenes. The high importance of the RE1 band could be associated with the mapping of different crop types [79], whereas SWIR bands were found to be important for mapping the forest class [80]. In the research by Immitzer et al. [81], the aforementioned S2 bands were the most important for tree species and crop type mapping using single-date S2 imagery. In this research, the spectral indices were the most represented feature group, with 28 of them among the 50 input features. Within this group, NDVI, NDWI, SAVI, and MSAVI were represented the most in the classification model, whereas NDI45, MCARI, and GNDVI were not included in the model at all. This is expected, since vegetation phenology in the time-series can be well represented with NDVI [82], and the other indices provided good separation between the other LC classes. In terms of the relevance of time periods, spring dates are the most frequent among the features connected with the vegetation classes, whereas December appeared to be the most important month for discriminating the non-vegetation LC classes.

**Figure 5.** S1 and S2 time-series feature importance sorted by decreasing MDA. Error bars indicate 95% confidence intervals.

The aforementioned feature selection results coincide with similar research. Jin et al. [19] evaluated the variable importance of Landsat imagery through the MDA and the Gini index using RF for LC classification. Summer NDVI and the NIR band, DEM, GLCM mean, and contrast were the key input features for the classification, which achieved an OA of 88.9%. Abdi [78] classified a boreal landscape using S2 imagery and the RF, support vector machine (SVM), extreme gradient boosting (XGB), and deep learning (DL) classifiers. The RE and SWIR S2 bands were the most important features, and interestingly, none of the spectral indices were ranked highly in his research, probably because of their high correlation with the red edge bands. RF achieved an overall accuracy of 73.9%. Tavares et al. [77] used S1 and S2 data for urban area classification. The Red and SWIR S2 bands were identified as the most significant contributors to the classification, whereas VV and GLCM mean were the most important features among the S1 and texture features, respectively. The authors agree that a DEM should be included for major classification enhancements, which was done in our research. The integration of S1 and S2 data yielded the highest OA of 91.07% [74].
