Article

Enhancement of Comparative Assessment Approaches for Synthetic Aperture Radar (SAR) Vegetation Indices for Crop Monitoring and Identification—Khabarovsk Territory (Russia) Case Study

by Aleksei Sorokin 1,*, Alexey Stepanov 1,2, Konstantin Dubrovin 1 and Andrey Verkhoturov 1

1 Computing Center Far Eastern Branch of the Russian Academy of Sciences, 680000 Khabarovsk, Russia
2 Far-Eastern Agriculture Research Institute, 680521 Vostochnoe, Russia
* Author to whom correspondence should be addressed.
Remote Sens. 2024, 16(14), 2532; https://doi.org/10.3390/rs16142532
Submission received: 27 May 2024 / Revised: 4 July 2024 / Accepted: 8 July 2024 / Published: 10 July 2024
(This article belongs to the Special Issue Remote Sensing in Land Management)

Abstract

Crop identification at the field level using remote sensing data is an important task. However, multispectral data for the construction of vegetation indices are sometimes unavailable or limited. In such situations, time series of synthetic aperture radar (SAR) indices are a promising alternative: they eliminate the problems associated with cloudiness and allow crop development to be characterized throughout the growing season. It is therefore important to choose the SAR index that is the most stable and has the lowest spatial variability throughout the growing season while remaining comparable to the normalized difference vegetation index (NDVI). In this study, the spatial variabilities of different SAR index time series were compared for a single region for the first time to identify the most stable index for use in precision agriculture, including the assessment of in-field heterogeneity of crop sites, crop rotation control, mapping, and other tasks in various agricultural areas. Seventeen Sentinel-1B images of the southern part of the Khabarovsk Territory in the Russian Far East, at a spatial resolution of 20 m and a temporal resolution of 12 days for the period between 14 April 2021 and 1 November 2021, were obtained and processed to generate vertical–horizontal/vertical–vertical polarization ratio (VH/VV), radar vegetation index (RVI), and dual polarimetric radar vegetation index (DpRVI) time series. NDVI time series were constructed from multispectral Sentinel-2 images using a cloud cover mask. The characteristics of the time series maxima were calculated for different crops: soybean, oat, buckwheat, and timothy grass. The DpRVI exhibited the highest stability, with coefficients of variation significantly lower than those of RVI and VH/VV. The main characteristics of the SAR and NDVI time series (the maximum values, the dates of the maximum values, and the variability of these indices) were compared. The variabilities of the maximum values and of their dates were lower for DpRVI than for RVI and VH/VV, and were comparable for DpRVI and NDVI. On the basis of the DpRVI, classifications were carried out using seven machine learning methods (fine tree, quadratic discriminant, Gaussian naïve Bayes, fine k nearest neighbors (KNN), random under-sampling boosting (RUSBoost), random forest, and support vector machine) for experimental sites covering a total area of 1009.8 ha. The quadratic discriminant method yielded the best results, with a pixel classification accuracy of approximately 82% and a kappa value of 0.67. Overall, 90% of soybean, 74.1% of oat, 68.9% of buckwheat, and 57.6% of timothy grass pixels were correctly classified. At the field level, 94% of the fields included in the test dataset were correctly classified. The results show that the DpRVI can be used in cases where the NDVI is limited, allowing for the monitoring of phenological development and crop mapping. The findings can be applied in the south of Khabarovsk Territory and in neighboring regions.

1. Introduction

Crop rotation control is relevant for both agricultural departments and insurance companies. In countries with intensive agriculture and large numbers of small fields, crop identification and monitoring are not possible without the use of remote sensing methods. In India, for example, the Mahalanobis National Crop Forecast Centre (MNCFC) has supported pilot studies in accurate cropland classification and yield estimation throughout the country using remote sensing and crop simulation modeling [1]. Crop identification and monitoring are also crucial problems for regions where the existing cropland databases contain a relatively large number of errors. An example of this is the Russian Far East, where the huge distances, poor transport logistics, waterlogging of land, and significant area of unutilized agricultural land in the southern part of the region necessitate a preliminary assessment of the cropland using satellite data.
Recently, vegetation indices (VIs) calculated based on optical and radar images have been used to identify and classify cropland. This has been achieved using sets of various optical vegetation indices, such as the normalized difference vegetation index (NDVI), perpendicular moisture index (PMI), and normalized difference residue index (NDRI). Classification methods such as Random Forest (RF), support vector machines (SVM), and decision trees are often used [2,3], including neural network classification before the peak of vegetation is reached [4]. The most popular optical vegetation index is the NDVI. Using the decision tree method, Chen et al. [5] applied 16-day MODIS NDVI composites to extract crop rotation patterns and evaluate crop area in the state of Mato Grosso in Brazil. Bellon et al. [6] utilized higher-resolution Landsat-8 data to calculate the mean NDVI values for each field in a study area in Brazil and identified 10 existing crops using the k nearest neighbors method. Later, an automated cropland classification was carried out in three provinces of Germany using RF to identify 12 crops according to the NDVI obtained by Landsat-8 and Sentinel-2 imagery [7]. In cloudy conditions, unmanned air vehicles (UAVs) are often used to obtain remote sensing data [8,9,10]. In India, classification using neural convolutional networks based on UAV imagery was performed to identify five crops [11]. However, the use of UAVs over a large area increases the cost of acquisition and data processing (including ground data monitoring and technical support).
Synthetic aperture radar (SAR) images can act as an alternative data source for crop classification. The main advantage of SAR data compared to optical indices is their lower sensitivity to weather conditions, including clouds. Moreover, the backscatter coefficient is very sensitive to the structural characteristics and dielectric properties of the targets. This makes such data suitable for distinguishing among different vegetation types, because the response is crop-specific due to factors such as the plants’ size and shape, and the orientation of the leaves and stems.
The use of SAR data in cropland mapping began with the use of individual polarizations and their ratio. In [12], the sensitivity of the backscatter coefficient ratio of the dual-pol VH/VV (vertical–horizontal/vertical–vertical polarizations) to crop growth dynamics was investigated. This ratio had a better correlation with biomass than separate VV and VH. McNairn et al. [13] indicated that this ratio is also applicable to assessments of phenological stages, the characterization of vegetation, and the mapping of corn, soybean, and sunflower before harvesting. Numerous subsequent studies have shown that SAR data are suitable for crop type classification [14,15,16,17,18,19].
Polarimetric parameters have been used for crop monitoring and mapping in other studies. Shannon entropy, alpha, and anisotropy from SAR data are suitable for use in crop growth stage identification and crop mapping [20,21,22]. The use of interferometric coherence with polarimetric features has also allowed for an increase in the number of detectable phenological events in the phenological cycles of crops [23].
A number of vegetation indices based on SAR data have been developed for crop monitoring purposes, such as the modified radar vegetation index [24,25], dual polarization SAR vegetation index [26], compact-pol radar vegetation index [27], generalized radar vegetation index [28], and SAR-calibrated NDVI index [29]. These indices are usually calculated based on full-polarimetric SAR images from satellites such as RADARSAT-2 and ALOS-2. Several studies have been carried out using SAR indices for cropland mapping [14,18].
Full-polarimetric (full-pol) SAR data, which are acquired in various polarimetric combinations, are related to the electrical and geometric properties of the surface. This provides complete, high-quality information about the characterization of the probed surface. However, full-pol images have a smaller swath width compared to single- and dual-polarization SAR images and have higher system requirements for their acquisition [27]. The main advantage of compact-polarization (compact-pol) SAR data is that they provide enhanced target information compared to standard single- and dual-polarization SAR systems while covering a much wider swath width than full-pol systems. The computational complexity of calculating vegetation indices from compact-pol data is even higher than that from full-pol SAR data.
Mandal et al. [30] proposed the dual polarimetric radar vegetation index (DpRVI). Unlike VH/VV and RVI, DpRVI can be derived from Sentinel-1 Level-1 Single-Look Complex (SLC) products with only dual polarization (VV and VH). These data contain information about the scattering of the SAR signal, which is characterized by the degree of polarization and provides a measure of the dominant scattering mechanism. In turn, because of these indicators, DpRVI becomes more sensitive to crop growth and is used as a relatively simple and physically interpretable descriptor of vegetation, as noted by Mandal et al. [30].
Along with the development of SAR satellites, the lack of comprehensive studies dedicated to using SAR data for crop mapping in different regions is an urgent problem. Widely used optical indices are not always applicable. It is necessary to analyze and evaluate SAR vegetation indices, compare their stability in time series, establish the characteristic type of the seasonal growth curve for each crop, and evaluate the possibility and accuracy of crop classification and identification using SAR indices. The presented work is devoted to the study of these issues in the southern part of the Russian Far East on the basis of Sentinel-1 data.

2. Materials and Methods

2.1. Study Area

The experimental fields of the Far Eastern Agriculture Research Institute (FEARI) extend from 48°28′N to 48°33′N and from 135°13′E to 135°24′E in the Khabarovsk Municipal District in Russia, between the villages of Mirnoe and Sergeevka. The region displays monsoon features and is characterized by moderately cold winters with little snow and warm, excessively moist summers. The alluvial and meadow alluvial soils of the Amur River valley are very suitable for agriculture. In addition, the abundance of moisture, sunlight, and suitable soils allows the cultivation of a wide range of crops, including cereals and legumes [31]. Figure 1 shows the study area with marked crop fields.
Data on crop rotation for 43 experimental fields were provided by FEARI employees and were checked via a ground visual inspection (Table 1). Field borders were obtained from the Unified Federal Agricultural Land Information System (https://efis.mcx.ru (accessed on 15 April 2024)) and revised after a mid-season field inspection in 2021. The aim of such inspections is to identify any substantial discrepancies between formal and natural field borders. The major crops in the FEARI fields were soybean, oat, buckwheat, and perennial grasses, such as timothy grass, and these four crops were included in this study. The climatic conditions in the Russian Far East allow only spring crops to grow. The crop season starts at the end of April and finishes at the beginning of November, with a length of 193 days. The full maturation of crops occurs before the middle of September. The total area of the fields was 1009.8 ha, with an average field size of 26.5 ha. The most common crop in the FEARI (as well as in the Far East of Russia as a whole) was soybean. It was grown in 18 fields with a total area of 460.8 ha and an average field size of 25.6 ha. Buckwheat grew on twelve fields with a total area of 243.1 ha and an average field size of 20.3 ha, and oat was sown in nine fields with a total area of 181.4 ha and an average field size of 16.5 ha. Timothy grass occupied four fields, covering 124.5 ha of cropland with an average field size of 31.1 ha.

2.2. Data Acquisition and Processing

SAR images from Sentinel-1B covering the study area, with a spatial resolution of 20 m and a 12-day revisit interval, were obtained for the period from 14 April 2021 to 1 November 2021 from the Alaska Satellite Facility Distributed Active Archive Center (https://search.asf.alaska.edu/ (accessed on 11 January 2024)), which contains modified Copernicus Sentinel data from 2015 processed by the European Space Agency. The two polarizations (VH and VV) of 17 Sentinel-1B products were processed to generate the DpRVI, RVI, and VH/VV time series. The output consisted of time series of SAR indices in raster format (*.tif) for each crop field. The technique used to obtain such series is based on the polarimetric decomposition of complex values acquired simultaneously in several polarization channels, which contain all the information about the polarimetric scattering properties of the probed surface. The basic formulas for calculating the VH/VV and RVI indices are as follows:
$$\sigma_{VH/VV} = \frac{\sigma^{0}_{VH}}{\sigma^{0}_{VV}} \tag{1}$$
$$RVI = \frac{4\,\sigma^{0}_{VH}}{\sigma^{0}_{VV} + \sigma^{0}_{VH}} \tag{2}$$
where
$\sigma^{0}_{VH}$—VH polarization backscatter intensity;
$\sigma^{0}_{VV}$—VV polarization backscatter intensity.
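As a simple illustration of Equations (1) and (2), the NumPy sketch below computes the two ratios per pixel, assuming calibrated σ0 backscatter intensities in linear power units (not dB); the function names and numeric values are purely illustrative.

```python
import numpy as np

def vh_vv_ratio(sigma0_vh, sigma0_vv):
    """VH/VV ratio of backscatter intensities (linear power units, not dB)."""
    return sigma0_vh / sigma0_vv

def rvi(sigma0_vh, sigma0_vv):
    """Dual-pol radar vegetation index: RVI = 4*sigma0_VH / (sigma0_VV + sigma0_VH)."""
    return 4.0 * sigma0_vh / (sigma0_vv + sigma0_vh)

# Purely illustrative per-pixel backscatter intensities
sigma0_vh = np.array([0.012, 0.020, 0.031])
sigma0_vv = np.array([0.060, 0.075, 0.090])
print(vh_vv_ratio(sigma0_vh, sigma0_vv))
print(rvi(sigma0_vh, sigma0_vv))
```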

2.3. Generation of the DpRVI Time-Series Product

The calculation of the DpRVI index using Sentinel-1 mission data was based on the transformation of complex polarimetric SAR data obtained simultaneously in two polarization channels, VV and VH [30]. All SAR images were processed using the SNAP 9.0 Sentinel-1 Toolbox (http://step.esa.int/main/, accessed 5 February 2024), with three main stages of processing. First, for each sub-swath, those bursts that covered the study area were selected using the S1 Terrain Observation with Progressive Scans (TOPS) split operator. Next, the precise orbit state vector files, which contained accurate information about the satellite position and speed, were downloaded and applied. Then, radiometric calibration was performed, during which it was important to keep the data in a complex-valued format because complex values were needed to calculate the covariance matrix for dual-polarization C2 in the following steps. At this stage, data processing was performed in batch mode for all interferometric wide swath products separately.
In the second step, the TOPS Deburst operator was applied to each stack of interferometric wide swath products. Then, using the S1 TOPS merge operator, all bursts were merged into one SLC SAR image. Next, the operator for multi-looking by 3 × 1 (in the range and azimuth directions, respectively) was applied. This operation allows the user to reduce the level of speckle noise and make the pixels “square” in size. The spatial resolution of the output was 14 m. Then, for each date for the entire data stack, a dual-polarization 2 × 2 covariance matrix C2 was formed, as shown in Equation (3):
$$C2 = \begin{bmatrix} C_{11} & C_{12} \\ C_{21} & C_{22} \end{bmatrix} = \begin{bmatrix} \langle |S_{VV}|^{2} \rangle & \langle S_{VV}S_{VH}^{*} \rangle \\ \langle S_{VH}S_{VV}^{*} \rangle & \langle |S_{VH}|^{2} \rangle \end{bmatrix} \tag{3}$$
where the * symbol indicates complex conjugation and 〈〉 indicates the spatial average over a moving window.
All elements of matrix C2 were also subjected to a noise-mitigation procedure using the refined Lee adaptive filter [32]. The elements of the matrix were complex values containing all the information about the polarimetric scattering properties of the targets. The polarimetric decomposition technique was then used to obtain the parameters of the state of polarization of an electromagnetic wave, which was characterized by the degree of polarization m (0 ≤ m ≤ 1) [30,33] and the measure of the dominant scattering mechanism β of the reflecting target as follows:
$$m = \sqrt{1 - \frac{4\,|C2|}{\big(\mathrm{Tr}(C2)\big)^{2}}}, \qquad \beta = \frac{\lambda_{1}}{\lambda_{1} + \lambda_{2}} \tag{4}$$
where Tr is the sum of the diagonal elements of the matrix, | | is the determinant of a matrix, and m is the degree of polarization, which is defined as the ratio of the average intensity of the polarized portion of the wave to the average total intensity of the wave. The measure of the dominant scattering mechanism is represented by β, which was determined based on the decomposition of the matrix C2 into two non-negative eigenvalues (λ1 ≥ λ2 ≥ 0), as follows:
$$C2 = U_{2}\,\Sigma\,U_{2}^{-1}, \quad \text{where } \Sigma = \begin{bmatrix} \lambda_{1} & 0 \\ 0 & \lambda_{2} \end{bmatrix} \tag{5}$$
The DpRVI index was calculated for each date using Equation (6):
$$DpRVI = 1 - m\beta = 1 - \sqrt{1 - \frac{4\,|C2|}{\big(\mathrm{Tr}(C2)\big)^{2}}} \cdot \frac{\lambda_{1}}{\lambda_{1} + \lambda_{2}} \tag{6}$$
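For clarity, the sketch below reproduces Equations (4)–(6) with NumPy for a single pixel, assuming the spatially averaged and speckle-filtered C2 elements are already available; the function name and numeric values are illustrative and are not part of the SNAP workflow described above.

```python
import numpy as np

def dprvi_from_c2(c11, c12, c22):
    """DpRVI from the elements of a spatially averaged dual-pol covariance matrix C2.

    c11 = <|S_VV|^2> and c22 = <|S_VH|^2> are real; c12 = <S_VV * conj(S_VH)> is complex.
    Returns DpRVI = 1 - m*beta, where m is the degree of polarization (Equation (4))
    and beta = lambda1 / (lambda1 + lambda2) is the dominant-eigenvalue fraction.
    """
    trace = c11 + c22
    det = c11 * c22 - np.abs(c12) ** 2                                # determinant |C2|
    m = np.sqrt(np.clip(1.0 - 4.0 * det / trace**2, 0.0, 1.0))        # degree of polarization
    # Eigenvalues of the 2x2 Hermitian matrix C2 (lambda1 >= lambda2 >= 0)
    disc = np.sqrt((c11 - c22) ** 2 + 4.0 * np.abs(c12) ** 2)
    lam1 = 0.5 * (trace + disc)
    lam2 = 0.5 * (trace - disc)
    beta = lam1 / (lam1 + lam2)
    return 1.0 - m * beta

# Illustrative single-pixel covariance values
print(dprvi_from_c2(c11=0.08, c12=0.004 + 0.002j, c22=0.02))
```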
In the third step, a stack of DpRVI time series was created, and the study area subset was chosen. All SAR images were geocoded to a WGS84 UTM projected coordinate system using the Range Doppler Terrain correction operator. The shapefiles of the fields were imported, and the data for each separate field were saved as rasters in the GeoTiff format. Finally, using the GeoPandas library in Python, the raster files were converted to numeric format.
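A possible Python sketch of this final export step is given below, assuming per-date DpRVI rasters and the field shapefile described above. The paper mentions GeoPandas; rasterio is added here as an assumption for reading the GeoTIFF rasters, and the file names and the field_id attribute are hypothetical.

```python
import geopandas as gpd
import numpy as np
import rasterio
from rasterio.mask import mask

fields = gpd.read_file("fields.shp")                 # hypothetical field polygons
with rasterio.open("dprvi_2021-07-15.tif") as src:   # hypothetical float DpRVI raster for one date
    fields = fields.to_crs(src.crs)                  # match the raster projection (WGS84 UTM)
    for _, field in fields.iterrows():
        clipped, _ = mask(src, [field.geometry], crop=True, nodata=np.nan)
        pixels = clipped[0][~np.isnan(clipped[0])]   # valid DpRVI pixels inside the field
        print(field.get("field_id", "?"), len(pixels), pixels.mean())
```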

2.4. NDVI Time-Series

For the study area, 102 multispectral images with 10 m resolution, acquired from the Sentinel-2A/B satellites from 14 April to 1 November 2021, were considered. NDVI time series were then formed, with the index value for each date determined using the following formula:
$$NDVI = \frac{NIR - RED}{NIR + RED} \tag{7}$$
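A minimal sketch of the per-pixel NDVI computation, assuming reflectances for the Sentinel-2 NIR (band 8) and red (band 4) bands and that cloud-masked pixels have already been removed; the values are illustrative.

```python
import numpy as np

def ndvi(nir, red):
    """NDVI = (NIR - RED) / (NIR + RED), computed per pixel."""
    nir = nir.astype("float64")
    red = red.astype("float64")
    return (nir - red) / (nir + red)

# Illustrative Sentinel-2 reflectances (band 8 = NIR, band 4 = red)
nir = np.array([0.42, 0.35, 0.10])
red = np.array([0.08, 0.07, 0.09])
print(ndvi(nir, red))   # higher values indicate denser green vegetation
```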

2.5. Estimation of SAR Time Series

The spatial stabilities of SAR indices for different crop sites were evaluated using their coefficients of variation. VH/VV, RVI, DpRVI, and NDVI time series served as input information. The formulas for calculations of the estimated parameters for one crop species are as follows:
$$VAR_{VI_{i}} = \frac{\sigma(VI_{i})}{\overline{VI}_{i}} \cdot 100\%, \qquad i \in (1 \ldots n) \tag{8}$$
where
VAR—coefficient of variation;
$VI_{i}$—vegetation index on a certain date (for VH/VV, RVI, DpRVI, or NDVI);
n—number of satellite imagery dates;
$\sigma(VI_{i})$—standard deviation of VI for all fields at a given date for a single crop;
$\overline{VI}_{i}$—average vegetation index on a given date for all fields with the same crop (for VH/VV, RVI, DpRVI, or NDVI).
The mean values of the maximum VI and of the day of maximum VI ($\overline{VI}_{max}$ and $\overline{DOY}_{max}$) for each crop were calculated, and their confidence intervals were estimated using the following formulas:
$$\Delta\overline{VI}_{max} = t(p,f) \cdot \sigma(VI_{max}) \tag{9}$$
$$\Delta\overline{DOY}_{max} = t(p,f) \cdot \sigma(DOY_{max}) \tag{10}$$
where
$\sigma(VI_{max})$—standard deviation of $VI_{max}$ across all fields for a single crop;
$\sigma(DOY_{max})$—standard deviation of $DOY_{max}$ across all fields for a single crop;
t(p,f)—t-value;
p = 0.05.
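The sketch below illustrates Equations (8)–(10), assuming the field-averaged index values have already been extracted; scipy is used only to obtain the t-value, the treatment of Equations (9) and (10) as confidence half-widths is an interpretation, and all numbers are illustrative.

```python
import numpy as np
from scipy import stats

def coefficient_of_variation(vi_values):
    """VAR = sigma(VI_i) / mean(VI_i) * 100% for one date and one crop (Equation (8))."""
    return np.std(vi_values, ddof=1) / np.mean(vi_values) * 100.0

# Illustrative field-averaged DpRVI values for five soybean fields on one date
dprvi_one_date = np.array([0.52, 0.55, 0.49, 0.57, 0.53])
print(coefficient_of_variation(dprvi_one_date))

# Confidence half-width for the seasonal maxima (Equations (9) and (10)), p = 0.05
vi_max = np.array([0.61, 0.66, 0.58, 0.64, 0.63])     # per-field seasonal maxima
t_value = stats.t.ppf(1 - 0.05 / 2, df=len(vi_max) - 1)
print(t_value * np.std(vi_max, ddof=1))
```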

2.6. Sample Filtering

Figure 2 shows a flowchart of the study effort, which included the preparation of the DpRVI time series, classification using machine learning, and the evaluation of classification accuracy using different methods.
Bias caused by atmospheric conditions is a prevalent feature of optical indexes. SAR data are less susceptible to such distortions. However, uneven sowing or plant growth may cause substantial variations in DpRVI data. Therefore, the correction of abnormal DpRVI values is a necessary measure aimed at preparing a high-quality dataset for cropland classification by applying ML methods. The main requirement for the training dataset construction is the reliability of the data; hence, it is advisable to include only relevant (not abnormal) samples from each field. Determining the relevance criteria is also extremely important. During the course of the study, we calculated the average values of the DpRVI for each field. Abnormal values were defined as values for each field that did not fall within the two-sigma intervals from the average DpRVI for that field. All samples with abnormal values were excluded from the final dataset. The normality of the DpRVI time series was verified by applying Shapiro–Wilk’s test (p < 0.05) to the averaged crop series [34]. The test statistics and p-values are shown in Table 2.
The normality of the data allowed for the use of both nonparametric and parametric classifiers.
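A sketch of the filtering and normality check described above, assuming per-field pixel time series of DpRVI. The two-sigma rule is applied here per date relative to the field mean, which is one possible reading of the criterion, and scipy's shapiro implements the Shapiro–Wilk test; all data are synthetic.

```python
import numpy as np
from scipy.stats import shapiro

def filter_field_pixels(pixel_series, field_mean_series):
    """Keep only pixels whose DpRVI values stay within +/- 2 sigma of the field mean.

    pixel_series: (n_pixels, n_dates) array of per-pixel DpRVI time series for one field.
    field_mean_series: (n_dates,) field-averaged DpRVI time series.
    """
    sigma = pixel_series.std(axis=0, ddof=1)                     # per-date spread
    within = np.abs(pixel_series - field_mean_series) <= 2.0 * sigma
    return pixel_series[within.all(axis=1)]                      # drop abnormal pixels

rng = np.random.default_rng(0)
field_pixels = rng.normal(loc=0.5, scale=0.05, size=(100, 17))   # 100 pixels x 17 dates
kept = filter_field_pixels(field_pixels, field_pixels.mean(axis=0))
print(field_pixels.shape[0], "->", kept.shape[0], "pixels kept")

# Shapiro-Wilk normality check on an averaged crop series (illustrative values)
stat, p_value = shapiro(np.array([0.21, 0.25, 0.33, 0.45, 0.55, 0.61, 0.58, 0.44, 0.30]))
print(stat, p_value)   # p > 0.05 -> normality not rejected
```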

2.7. Construction of the Training and Test Datasets

After filtering, there was a total of 53,576 pixels. The full set of pixels was divided into training and test datasets at the level of individual fields. The training dataset contained 39,528 (74% of time series) pixels and covered 26 fields. The test dataset contained 14,048 (26% of time series) pixels and covered 17 fields. The distribution of pixels by crop type is presented in Table 3, which shows that the crop area ratio does not match the pixel ratio. This is due to the variation in DpRVI in different fields. For example, the smallest proportion of filtered pixels was observed in the fields with timothy grass, and the largest was observed in those with oat.
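A minimal sketch of such a field-level split, assuming a table with one row per pixel carrying a field identifier and crop label; the file name, column name, and field id sets are hypothetical (the actual split used 26 training and 17 test fields).

```python
import pandas as pd

# One row per filtered pixel: field id, crop label, and the DpRVI time-series features
pixels = pd.read_csv("dprvi_pixels.csv")          # hypothetical file
train_fields = {1, 3, 4, 7, 9}                    # hypothetical ids of training fields
test_fields = {2, 5, 8}                           # hypothetical ids of test fields

train = pixels[pixels["field_id"].isin(train_fields)]
test = pixels[pixels["field_id"].isin(test_fields)]
print(len(train), len(test))
```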

2.8. Crop Classification Methods

Classification was performed using the Matlab 2020b software with the Classification Learner package. This package contains different ML approaches for supervised classification, such as decision trees, discriminant analysis, naïve Bayes (NB) classifiers, the k nearest neighbors technique, and ensemble classifiers. The Matlab optimizable algorithm was applied to choose one particular method from each group and set optimal parameters (Table 4).
Six methods were selected for cropland classification: fine tree (FT), quadratic discriminant (QD), Gaussian naïve Bayes (GNB), fine k nearest neighbors (KNN), random under-sampling boosting (RUSBoost), and SVM. We also applied the RF classifier, which is the most common ML method for crop mapping [14,35,36].
The decision tree classifier (DTC) is one of the best classifiers for land cover classification [5]. Compared to other ML methods, which require considerable time to obtain the optimized parameters, DTC is much easier to interpret and can be built from a direct inspection of the variables. FT has high flexibility because many leaves are used to make fine distinctions between classes. The maximum number of splits (3321 splits in this case) controls the depth of the tree. The split criterion decides when to split nodes; for example, we used maximum deviance reduction (also known as cross-entropy), which is always applied for multiclass classification. A Bayesian optimizer was applied for the extremum search.
Discriminant Analysis is a popular classification algorithm because it is fast, accurate, and easy to interpret [37]. It is known to perform well for wide datasets. Quadratic Discriminant Analysis is the classic classifier with a quadratic decision surface. This classifier is useful because it has closed-form solutions that can be easily computed, is inherently multiclass, has been proven to work well in mapping tasks [38], and has no hyperparameters to tune. To train such a classifier, the fitting function is used to estimate the parameters of a Gaussian distribution for each class. We used the standard full covariance structure for training.
The naïve Bayes method (NB) is a supervised learning algorithm based on Bayes’ theorem with the “naïve” assumption of conditional independence between every pair of features given the value of the class variable [39]. NB uses classification rules and maximum a posteriori estimation to estimate class probabilities. It is easy to interpret, which is why it is widely used in multiclass classification [40]. The likelihood of the features in our case is assumed to be Gaussian. We specified the Triangle kernel smoother type and unbounded density support for the NB classifier.
KNN is one of the simplest and most intuitive techniques for classification. It uses the parameter k (number of neighbors), where a new observation is placed into the class of observations from the learning set that is closest to it with respect to the covariates used. The determination of this similarity is based on distance measures [41]. This method has been successfully applied in forest mapping [42,43]. Fine KNN assumes finely detailed distinctions between classes. In this study, the number of neighbors was set to 30, which allowed the model to learn faster. The squared inverse Euclidean distance was used as a distance metric, and additional standardization was not applied.
RUSBoost is designed to improve the performance of models trained on skewed data. It introduces data sampling into the AdaBoost algorithm and applies random undersampling (RUS), a technique that randomly removes samples from the majority class [44]. In this study, the maximum number of splits during training was 5455. To construct an ensemble with good predictive power, 72 learners were applied. The learning rate for shrinkage was standard (set to 0.1).
SVM is a supervised learning method based on statistical learning theory and the structural risk minimization principle [45]. Given a set of labeled training points associated with each class, the SVM optimization algorithm attempts to find the optimal decision surface (the hyperplane) that separates the training points and classifies all pixels based on whether the hyperplane distance to the marginal training points (or support vectors) of each class is maximized [46]. This classification method is widely used in remote sensing for crop type mapping [3,47,48]. In this study, we applied SVM with a radial basis kernel (C = 1, gamma = 0.07).
RF [49] is an ML algorithm that has been proven to be useful for crop mapping [18,50,51]. It is known for performing particularly well and efficiently on large datasets with many different features. Another advantage of RF is its high accuracy and robustness despite the presence of outliers and noise [47]. It is an ensemble of classification trees based on decision tree classifiers. The number of trees in our model was 100.
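The classifiers above were trained in the Matlab Classification Learner. As an analogous, non-authoritative sketch, the scikit-learn counterparts below mirror the listed settings where a direct equivalent exists: RUSBoost is omitted because it lives in the separate imbalanced-learn package, and the squared-inverse KNN weighting is approximated by inverse-distance weighting.

```python
from sklearn.discriminant_analysis import QuadraticDiscriminantAnalysis
from sklearn.ensemble import RandomForestClassifier
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import KNeighborsClassifier
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

classifiers = {
    "fine tree": DecisionTreeClassifier(criterion="entropy"),          # cross-entropy split criterion
    "quadratic discriminant": QuadraticDiscriminantAnalysis(),
    "Gaussian naive Bayes": GaussianNB(),
    "fine KNN": KNeighborsClassifier(n_neighbors=30, weights="distance"),
    "SVM (RBF)": SVC(kernel="rbf", C=1.0, gamma=0.07),
    "random forest": RandomForestClassifier(n_estimators=100),
}

# X_train/y_train and X_test/y_test are the hypothetical pixel features and crop labels
# produced by the field-level split of Section 2.7:
# for name, clf in classifiers.items():
#     clf.fit(X_train, y_train)
#     print(name, clf.score(X_test, y_test))
```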

2.9. Accuracy Assessment of the Classification Results

We performed an accuracy assessment of cropland classification to compare the predictive power of different methods using the kappa coefficient, producer accuracy (PA), user accuracy (UA), and overall accuracy (OA). The kappa coefficient measures the difference between the observed agreement and the probability of agreement occurring by chance. Its value ranges between 0 (no more agreement than expected by chance) and 1 (perfect agreement). It was introduced to the remote sensing community in the early 1980s and is now a widely used measure for classification accuracy. Bishop [52] has largely been credited with formulating Equation (11):
$$\kappa = \frac{N\sum_{i=1}^{r} X_{ii} - \sum_{i=1}^{r} X_{i+}X_{+i}}{N^{2} - \sum_{i=1}^{r} X_{i+}X_{+i}} \tag{11}$$
where N is the total number of pixels for all crop types, r is the number of crop types, Xii is the number of correctly classified pixels of crop type i, Xi+ is the total number of pixels of crop type i in the ground truth map, and X+i is the total number of pixels of crop type i in the classified map.
The following equations were used to calculate PA, UA, and OA:
$$PA_{i} = \frac{X_{ii}}{X_{i+}} \cdot 100\% \tag{12}$$
$$UA_{i} = \frac{X_{ii}}{X_{+i}} \cdot 100\% \tag{13}$$
$$OA = \frac{\sum_{i=1}^{r} X_{ii}}{N} \cdot 100\% \tag{14}$$
PA is the accuracy of the map from the producer’s point of view and quantifies the probability that feature class i on the ground is correctly classified on the map. UA is the accuracy from the user’s perspective and quantifies the reliability of the map, i.e., the probability that feature i on the map is actually present on the ground. OA quantifies the fraction of pixels correctly mapped by the crop classification method [53]. OA, PA, and UA are calculated using confusion matrices. Each row of such a matrix represents the instances in an actual class, and each column represents the instances in a predicted class. For example, in confusion matrix C, the element Ci,j represents the number of observations that are known to be in class i and predicted to be in group j. Accordingly, OA is calculated as the proportion of the sum of the diagonal elements in the matrix, PA for class i is calculated as the ratio of diagonal element Ci,i to the sum of elements in row i, and UA for class j is calculated as the ratio of diagonal element Cj,j to the sum of elements in column j. We performed the McNemar test [54] to compare the performance of classifiers.
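To make the metrics concrete, the sketch below computes OA, PA, UA, and κ from a confusion matrix following Equations (11)–(14), with rows as ground truth and columns as predictions; the matrix values are invented for illustration and do not correspond to Figure 5.

```python
import numpy as np

def accuracy_metrics(conf):
    """Overall accuracy, per-class producer/user accuracy, and kappa from a confusion
    matrix with rows = ground-truth classes and columns = predicted classes."""
    conf = np.asarray(conf, dtype=float)
    n = conf.sum()
    diag = np.diag(conf)
    row = conf.sum(axis=1)              # X_i+  (ground-truth totals)
    col = conf.sum(axis=0)              # X_+i  (predicted totals)
    oa = diag.sum() / n * 100.0
    pa = diag / row * 100.0             # producer accuracy per class
    ua = diag / col * 100.0             # user accuracy per class
    kappa = (n * diag.sum() - (row * col).sum()) / (n**2 - (row * col).sum())
    return oa, pa, ua, kappa

# Illustrative 4-class confusion matrix (soybean, oat, buckwheat, timothy grass)
conf = np.array([[50,  3,  5,  2],
                 [ 4, 40,  6,  5],
                 [ 7,  2, 45,  6],
                 [ 3,  8,  4, 35]])
print(accuracy_metrics(conf))
```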

2.10. Crop Identification at the Field Level

Undoubtedly, determining whether pixels belong to a particular class is very important. However, for agricultural purposes, it is more important to determine the crops grown in individual experimental fields. In our study, the size of the fields in the test set varied from 3.3 ha to 42.6 ha. Thus, after filtering, the number of image pixels from the Sentinel-1 satellite in the field coverage area ranged from 233 to 2078. Such a sample size is sufficient to identify the crop in a field. To achieve this, we counted the number of pixels of each class in a particular field and matched the pixels with the crop of the most common class. Similar identification algorithms have been applied by some researchers for parcel mapping [17]. In the Results section, we present the results of the classification of individual pixels and the crop identification results for individual fields in the test set.
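A minimal sketch of this majority-vote rule applied to the classified pixels of one field; the class labels and pixel counts are invented for illustration.

```python
import numpy as np

def identify_field_crop(pixel_labels):
    """Assign a field the crop class that the largest number of its pixels received."""
    classes, counts = np.unique(pixel_labels, return_counts=True)
    return classes[np.argmax(counts)]

# Illustrative field: a majority of oat pixels with some misclassified pixels
pixels = ["oat"] * 620 + ["timothy"] * 250 + ["soybean"] * 80 + ["buckwheat"] * 50
print(identify_field_crop(np.array(pixels)))   # -> 'oat'
```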

3. Results

3.1. Comparative Analysis of SAR Indices

Figure 3 shows the values of SAR vegetation indices for fields with soybean, oat, buckwheat, and timothy grass in the Khabarovsk Territory. For each class considered, the DpRVI, RVI, and VH/VV curves had a similar shape. The correlation coefficients calculated between the time series indices for the three SAR indices were in the range of 0.96–0.99 for the different field classes. The time series values of the seasonal course of soybean radar VI increased from the beginning of July to the last decade of September (Figure 3a). This is logical, considering that soybean sowing in the southern part of the Far East is usually carried out in late May and the first half of June and that the formation of grass cover on fields in May 2021 depended on the timing of plowing, herbicide treatment, and other measures. Starting in late September, the SAR VI values for fields with soybean decreased quite rapidly, which was associated with the wilting stage of plants at low temperatures. At that time, soybean is ready for harvesting, but the exact date varies for different regions and municipalities depending on weather conditions up to the beginning of the calendar winter.
Oat is sown in May. For the Khabarovsk Territory, an early increase in SAR VI values before the sowing of the oat crop was observed (Figure 3b). Sufficiently high values are also characteristic of soybean fields during the same period, which may be caused by the rapid growth of weeds and the peculiarities of crop rotation—oat seeds left on the field last year produced early sprouts. The maximum SAR VI for oat was reached in July. For the whole study area, oat harvesting was carried out from the last decade of July to the last decade of August. VI values continued to increase in September because of the development of the biomass of perennial grasses after oat harvesting, which is especially relevant for the Far East, where oat is often sown together with perennial grasses.
The curves for SAR VI before sowing buckwheat are characterized by an increase in values associated with the development of weed vegetation (Figure 3c). The sowing of buckwheat is carried out in the middle of July. Harvesting is carried out in the second half of September, which also corresponds to the courses of SAR VIs. Timothy grass is a perennial grass that is sown together with oat in the first year. In the second and third years, biomass recruitment takes place immediately after the snow cover. As Figure 3d shows, the maximums of DpRVI, RVI, and VH/VV are reached between late July and early August, after which the grass is harvested. In the Khabarovsk Territory, two mowings of timothy grass per season are possible.
An analysis of the variability of the VI time series for individual fields confirmed the highest stability of the DpRVI index. The variabilities of the three indices’ series during the vegetation cycle for all studied classes differed significantly (Figure 4). For example, for soybean, VAR values in the period from May to the first decade of July for DpRVI were in the range of 10–15%, those for RVI were in the range of 13–18%, and those for VH/VV were in the range of 18–23% (Figure 4a). From mid-July through the third decade of September, the variability of DpRVI, RVI, and VH/VV decreased to 5–8%, 8–12%, and 10–15%, respectively. Figure 4b shows the changes in the variability of VI values for oat. The coefficients of variation of DpRVI for individual fields over the entire observation period were lower than those for RVI and VH/VV by averages of 5–8% and 10–15%, respectively. The minimum values of variability of SAR indices in the Khabarovsk Territory in the period from early July to mid-August (from maturation to harvesting) were in the ranges of 3–10% for DpRVI, 10–13% for RVI, and 14–20% for VH/VV. Further increases in variability are associated with the peculiarities of oat sowing. In some places, sowing includes perennial grasses that contribute to the growth of index values; in other places, perennial grasses are not included. In the first half of August, the variability of buckwheat after sowing is directly related to the irregularity of crop development. When the values of VI reach maximum values in September, the coefficients of variation become minimal (Figure 4c). As with other crops, throughout the vegetation cycle, the variability of DpRVI is lower than the variabilities of RVI and VH/VV. As Figure 4d shows, the coefficients of variation of timothy grass from the beginning of May until the maximum VI are fairly stable, in the ranges of 7–10% for DpRVI, 9–15% for RVI, and 12–19% for VH/VV. Further increases in variability are due to different harvesting dates of perennial grasses for different fields up to the climatic winter.
The main characteristics of the VI time series for individual fields with soybean, oat, buckwheat, and timothy grass are presented in Table 5. The mean VI maximum, mean value of the maximum day, and variability of these SAR indices were calculated. For comparison, the mean VI maximum and mean day of maximum were also determined for the NDVI time series.
For soybean, the coefficients of variation of the mean maxima of the DpRVI, RVI, and VH/VV time series were 6.7%, 9.1%, and 11.7%, respectively. The variation in the day of the maximum for soybean was also minimal for DpRVI. The variabilities in the maximum NDVI and in the day of the maximum NDVI were slightly lower than the corresponding values for DpRVI.
The variability in the maximum VI day was higher for oat fields than for soybean fields. The $VAR_{DpRVI}$ was 6.2%, while the indicator values for RVI and VH/VV were higher by factors of about 1.5 to 3. The variability of the NDVI maximum for oat was comparable to the DpRVI variability at 8.1%. The calendar day maximums for oat were also characterized by greater variability in comparison with soybean, which is explained by the peculiarities of sowing oat with perennial grasses.
The variabilities of $\overline{DpRVI}_{max}$ for the buckwheat and timothy grass fields were also lower than those of $\overline{RVI}_{max}$ and $\overline{VH/VV}_{max}$, and slightly lower than the coefficient of variation of the NDVI maximum. In the average case, the day of the buckwheat DpRVI maximum is similar to the day of the soybean DpRVI maximum, with the buckwheat NDVI maximum occurring on day of year (DOY) 254 and the soybean NDVI maximum occurring on DOY 238. The maximum of the SAR time series for timothy grass corresponds to DOY 214 in the average case, and the maximum NDVI of timothy grass corresponds to DOY 180. The higher stability of the DpRVI time series characteristics in comparison with other SAR indices and the differences in seasonal series curves for different crops contributed to the choice of this index for classification.

3.2. Accuracy Estimation for ML Methods

Table 6 shows the learning time, OA for the test dataset, and κ for the applied ML methods. The accuracy and κ for the test dataset indicate the applicability of the trained model for data that are not used in training, and, consequently, for practical application. The QD model shows the highest OA for the test dataset, with 81.9% of pixels being recognized correctly (κ = 0.67). This value did not exceed 80% for the other algorithms. Although the KNN and RUSBoost methods yielded the best-fitting results, i.e., the best overall training accuracy, they were both prone to overfitting. The application of simple methods such as FT did not yield any significant advantages. The SVM and RF methods predicted 79% and 77% of the data correctly, respectively. The McNemar test [54] revealed that the performance of QD is significantly higher than that of other classifiers. The results show that QD was significantly more accurate (p < 0.05). This method permitted a decrease in the learning time to 30.3 s. This value was a third of that for RUSBoost, a fourteenth of that for RF, less than a fifteenth of that for fine KNN, and a forty-fourth and a hundredth of those of the computationally difficult SVM and GNB, respectively (learning on CPU took more than an hour).
It must be noted that the OA and κ coefficient do not fully reflect the quality of classification. Therefore, we built confusion matrices for the most productive method (QD) to assess UA and PA. These metrics helped us to interpret some features of the classifiers in relation to the separation of classes. Figure 5 shows the confusion matrix for QD.
The classification of soybean was performed successfully (UA = 95.7%, PA = 90.0%). This was facilitated by the significant acreage of soybean and its specific growing season. The PA for buckwheat was quite good (68.9%), although the UA was lower (56%). These results suggest that the number of pixels classified as buckwheat is overestimated. This could be a consequence of the uneven distribution of pixels among the original classes caused by the dominance of one crop in the region. The problem of the separation of oat and timothy grass was the most substantial. It was caused by the joint use of these crops during long-term crop rotation in fields designated as oat fields, where timothy grass is often sown to improve the quality of the growth of the main crop. In addition, in timothy grass fields, there are residual traces of harvested oat or oat seedlings. Nevertheless, 74.1% of the oat pixels and 57.6% of the timothy grass pixels were classified correctly. Finally, the PA and UA for all classes were above 50%. Therefore, we conclude that QD is the most suitable ML method for cropland classification.

3.3. Crop Identification

The pixel-level classification results were not linked to agricultural fields. Subsequently, an OA assessment was performed for each studied field. We applied the most productive method, the QD algorithm. Crops in 16 of the 17 fields were identified correctly. Of these 17 test fields, 10 were occupied by soybean. As mentioned previously, the PA for soybean was 90%, which explains why these fields were identified easily. Seven of the ten soybean fields contained more than 80% of the soybean-classified pixels. For the remaining soybean fields, the percentage of correctly classified pixels was lower but exceeded 50%. This decline in accuracy may be related to the small acreage of several fields and the ambiguity of their borders due to uneven sowing. However, this process ensured that all the soybean fields were correctly identified. False “buckwheat” pixels were present in some soybean fields but did not influence the correct identification. Figure 6 shows examples of soybean identification based on pixel distribution.
Three oat fields were identified successfully. Two of these contained over 95% of the pixels that were classified as oat. Field 51 contained 73% of the oat pixels and 21% of the timothy grass pixels, which were largely located near the field borders. These results suggest that field borders should be investigated further. Figure 7 shows some classification results for oat fields. Figure 7a shows the classified map of Field 51 and Figure 7b shows the successful classification of Field 162.
Two fields with buckwheat were identified correctly. However, the PA values for these fields were not very high. Field 93b included 79% buckwheat pixels and 20% timothy grass pixels. Almost all the grass pixels were located in the northern part of the field, which may indicate the absence of buckwheat plants at this site due to uneven sowing or the non-germination of buckwheat. Figure 8 shows the distribution of the pixel classes in Field 93b.
As discussed above, the separation of oat and timothy grass is complicated by the joint growth of these crops. The only mismatch at the field level was detected for Field 92, which was labeled as an oat field in 2021. Figure 9 shows the distribution of pixel classes for this field. The classification showed similar proportions of timothy grass and oat (42% and 41%, respectively). The pixels of these two classes were spatially interspersed, which decreased the probability of reliable crop identification. In addition, the timothy grass field contained 42% of randomly distributed oat pixels throughout. These results necessitate the introduction of a methodology for distinguishing crops that grow together when entering data into the database.
Field 90 was the only field labeled as timothy grass that was identified correctly. However, the classifier faced certain challenges: 42% of the pixels were classified incorrectly due to oat residuals in the field. Figure 10 illustrates the classification results for this field. Despite some challenges, the crops in 94% of the studied fields were identified correctly.

4. Discussion

Many researchers have reported on the high accuracy of cropland classification when using satellite data. For example, Ouzemou et al. [47] reported an OA of 89% at the level of individual pixels. They used NDVI time series and an RF classifier to separate sugar beet, tree crops, cereals, and alfalfa in Morocco. Landsat 8 pixels were resampled to 15 m using the Gram–Schmidt algorithm due to the use of small fields. The classification of planted and ratoon sugarcane in the state of Uttar Pradesh in India based on smoothed NDVI time series using decision trees yielded an OA of 84.5% [55]. These results can only be reproduced if a sufficient number of cloud-free images are available. In our case, there were a limited number of these, which led us to use SAR data.
The accuracy obtained in our study indicates the possibility of the successful application of this method for crop identification at the field scale. Several studies have been carried out to identify crops at the field level using remote sensing data. Hao et al. [4] used 15-day NDVI and enhanced vegetation index (EVI) composites, calculated using data from three optical satellites, and the artificial immune network as the ML method, for classification at the field level. The classification accuracy was reported to be 97%. However, the classified crops had very different phenologies. Distinguishing between winter and spring crops is much easier than distinguishing among multiple spring crops. In addition, the computation of composites requires the reprojection and calibration of images with different resolutions and shooting paths, as well as data approximation, which is a separate, complex, and computationally difficult task. Finally, identification using only the centroids of each field does not allow for an assessment of the real outlines of the cropland in cases of uneven sowing or the partial use of areas to determine overseeding or the joint growing of crops.
Arias et al. [56] identified 14 crops at the field scale in Spain using a time series based on Sentinel-1 data. The classification results varied depending on the set of input features considered. An OA of approximately 70–75% was obtained for separate regions when the three channels (VH, VV, and VH/VV) were used as the input. The warm climate of Spain also allows for the sowing of both winter and summer crops, which have very different sowing dates and peaks of vegetation, which simplifies the task of classifying crops.
Robertson et al. [16] investigated three crops (maize, soybean, and wheat), and calculated the OA of two classification methods, DT and RF. Based on SAR-only stacks of RADARSAT-2 and Sentinel-1 data for ten global agricultural polygons, the OAs of six of the ten polygons obtained using the RF method were reported to be 85%. For the soybean class, UA and PA were 87% and 79%, respectively, which are both substantially lower than the corresponding metrics in our study. In a study by van Tricht et al. [18], the RF method was used to identify eight types of crops and obtained an OA of 82%. It should be noted that the classification accuracy achieved when using SAR and optical images together was the same as that obtained with the QD algorithm in our study, although we used only SAR data. Robertson et al. [16] and van Tricht et al. [18] used Sentinel-1 SAR images from Level-1 Ground Range Detected (GRD) products. These are stored as radar cross-sections and can be transformed to the area normalization factor and calibration constants, such as σ0, β0, or γ0. The SAR data of the SLC were used to calculate the DpRVI index. From these data, one can also extract information in terms of the degree of polarization (m) and the measure of the dominant scattering mechanism (β). For example, the average values of m and β decrease as soybean grows until it reaches its full vegetative growth. This may be because the order of scattering increases as the crop canopy develops. At an early stage, scattering from the soil surface predominates. At later stages of the growing season, there is scattering from the canopies of multiple crops. In addition, as observed by Mandal et al. [30], the DpRVI values correlate better with biophysical parameters than the values of σ0, the modified RVI, and the DPSVI (based on σ0). For different crops, m and β have different sensitivities at all stages of growth. This contributes significantly to the efficient use of information about scattered waves and makes the DpRVI more robust and stable in classification problems.
NDVI data do provide higher accuracy for automated crop classification. However, there are situations when, for a number of reasons, we cannot obtain NDVI data in the required period of time, so we consider the capabilities of DpRVI as an alternative, or as an addition, to NDVI data. In comparing the results obtained in this study with those of previous studies, we note that the stability and low variability of the DpRVI can contribute to a reasonably successful application of the proposed classifier for field-level crop identification in cloudy conditions. In the future, we plan to extend our research to include data from other SAR satellites and other study regions and evaluate the corresponding classification accuracy. It is also planned to investigate the possibility of the joint use of multispectral optical data and SAR data for crop identification. Adding optical data to the model will increase the number of recognizable crops. Highly accurate models can be used to produce detailed crop maps, estimate crop acreage and identify unused areas of cropland. The construction of time series of not only multispectral, but also SAR indices, along with an assessment of their characteristics makes it possible to monitor the development of crops in the phenological period and obtain more accurate forecasts.

5. Conclusions

The study results show that time series of DpRVI, RVI, and VH/VV for the main crops of the Khabarovsk Territory have distinctive features and can be used in classification tasks, yield modeling, crop recognition, and other tasks of digital farming. The annual DpRVI, RVI, and VH/VV curves for soybean, oat, buckwheat, and timothy grass were shown to have characteristic shapes that make it possible to identify each crop and to evaluate crop development. An approach to the assessment of the spatial variability and stability of three SAR indices and NDVI for crops at the site level was applied to a single region for the first time in this study. The coefficients of variation of DpRVI for individual fields with the same crops ranged from a half to a third of those of RVI and VH/VV. The main characteristics of the SAR and NDVI time series (the values of VI maximum, the maximum DOY, and the coefficient of variation) were calculated. The variabilities of the maximum and maximum DOY for DpRVI were lower than those of RVI and VH/VV, and the variabilities of the maximum and maximum DOY for DpRVI were comparable to those for NDVI.
In this study, it was established that the DpRVI dataset obtained by processing Sentinel-1 radar imagery for the period between the end of April and the beginning of November was suitable for cropland classification in the southern part of the Russian Far East. The QD algorithm achieved the best results among the seven ML methods considered. The pixel classification accuracy was approximately 82%, and the value of κ was 0.67, while the training time was significantly shorter than that of the other methods. This method correctly identified 90% of soybean pixels, 74.1% of oat pixels, 68.9% of buckwheat pixels, and 57.6% of timothy grass pixels. Furthermore, 94% of the sites were correctly classified. The incorrectly classified pixels can be attributed to a disparity between the actual field boundaries and the declared borders, the overgrowth of weeds near field boundaries, or the features of crop rotation, such as the joint cultivation of oat and timothy grass. In general, the use of DpRVI can be a good alternative or complement to the indices calculated from multispectral images in cases where such images are unavailable due to cloudiness or other reasons. The proposed SAR-based approach can be used to identify crops in the Russian Far East. Further research is planned to investigate the possibility of the joint use of optical and SAR data, to increase the number of identifiable crops, and to create more detailed and accurate cropland maps.

Author Contributions

Conceptualization: A.S. (Alexey Stepanov) and A.S. (Aleksei Sorokin); methodology: A.S. (Alexey Stepanov) and A.S. (Aleksei Sorokin); formal analysis and investigation: K.D. and A.V.; writing—original draft preparation: K.D. and A.V.; writing—review and editing: A.S. (Alexey Stepanov) and A.S. (Aleksei Sorokin), resources: A.S. (Aleksei Sorokin); Supervision: A.S. (Alexey Stepanov). All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by the Russian Science Foundation, project number 24-11-20030, https://rscf.ru/en/project/24-11-20030/.

Data Availability Statement

The raw data supporting the conclusions of this article will be made available by the authors on request.

Conflicts of Interest

The authors declare no conflicts of interest.

References

1. Gumma, M.K.; Kadiyala, M.D.M.; Panjala, P.; Ray, S.S.; Radha, A.V.; Dubey, S.K.; Smith, A.P.; Das, R.; Whitbread, A.M. Assimilation of Remote Sensing Data into Crop Growth Model for Yield Estimation: A Case Study from India. J. Indian Soc. Remote Sens. 2021, 50, 257–270.
2. Pasha, S.V.; Behera, M.D.; Mahawar, S.K.; Barik, S.K.; Joshi, S.R. Assessment of shifting cultivation fallows in Northeastern India using Landsat imageries. Trop. Ecol. 2020, 61, 65–75.
3. Zhang, H.; Kang, J.; Xu, X.; Zhang, L. Accessing the temporal and spectral features in crop type mapping using multi-temporal Sentinel-2 imagery: A case study of Yi’an County, Heilongjiang province, China. Comput. Electron. Agric. 2020, 176, 105618.
4. Hao, P.; Tang, H.; Chen, Z.; Meng, Q.; Kang, Y. Early-season crop type mapping using 30-m reference time series. J. Integr. Agric. 2020, 19, 1897–1911.
5. Chen, Y.; Lu, D.; Moran, E.; Batistella, M.; Dutra, L.V.; Sanches, I.D.; da Silva, R.F.B.; Huang, J.; Luiz, A.J.B. Mapping croplands, cropping patterns, and crop types using MODIS time-series data. Int. J. Appl. Earth Obs. Geoinf. 2018, 69, 133–147.
6. Bellón, B.; Bégué, A.; Lo Seen, D.; Lebourgeois, V.; Evangelista, B.A.; Simões, M.; Ferraz, R.P.D. Improved regional-scale Brazilian cropping systems’ mapping based on a semi-automatic object-based clustering approach. Int. J. Appl. Earth Obs. Geoinf. 2018, 68, 127–138.
7. Griffiths, P.; Nendel, C.; Hostert, P. Intra-annual reflectance composites from Sentinel-2 and Landsat for national-scale crop and land cover mapping. Remote Sens. Environ. 2019, 220, 135–151.
8. Boiarskii, B.; Hasegawa, H.; Muratov, A.; Sudeykin, V. Application of UAV-Derived Digital Elevation Model in Agricultural Field to Determine Waterlogged Soil Areas in Amur Region, Russia. Int. J. Eng. Adv. Technol. 2019, 8, 520–523.
9. Su, Z.; Wang, Y.; Xu, Q.; Gao, R.; Kong, Q. LodgeNet: Improved rice lodging recognition using semantic segmentation of UAV high-resolution remote sensing images. Comput. Electron. Agric. 2022, 196, e106873.
10. Wang, F.; Yi, Q.; Hu, J.; Xie, L.; Yao, X.; Xu, T.; Zheng, J. Combining spectral and textural information in UAV hyperspectral images to estimate rice grain yield. Int. J. Appl. Earth Obs. Geoinf. 2021, 102, 102397.
11. Pandey, A.; Jain, K. An intelligent system for crop identification and classification from UAV images using conjugated dense convolutional neural network. Comput. Electron. Agric. 2022, 192, e106543.
12. Blaes, X.; Defourny, P.; Wegmuller, U.; Della Vecchia, A.; Guerriero, L.; Ferrazzoli, P. C-band polarimetric indexes for maize monitoring based on a validated radiative transfer model. IEEE Trans. Geosci. Remote Sens. 2006, 44, 791–800.
13. McNairn, H.; Champagne, C.; Shang, J.; Holmstrom, D.; Reichert, G. Integration of Optical and Synthetic Aperture Radar (SAR) Imagery for Delivering Operational Annual Crop Inventories. ISPRS J. Photogramm. Remote Sens. 2009, 64, 434–449.
14. Inglada, J.; Vincent, A.; Arias, M.; Marais-Sicre, C. Improved Early Crop Type Identification by Joint Use of High Temporal Resolution SAR And Optical Image Time Series. Remote Sens. 2016, 8, 362.
15. Larranaga, A.; Alvarez-Mozos, J.; Albizua, L. Crop Classification in Rain-fed and Irrigated Agricultural Areas Using Landsat TM and ALOS/PALSAR Data. Can. J. Remote Sens. 2011, 37, 157–170.
16. Robertson, L.D.; Davidson, A.M.; McNairn, H.; Hosseini, M.; Mitchell, S.; de Abelleyra, D.; Verón, S.; le Maire, G.; Plannells, M.; Valero, S.; et al. C-band synthetic aperture radar (SAR) imagery for the classification of diverse cropping systems. Int. J. Remote Sens. 2020, 41, 9628–9649.
17. Skakun, S.; Kussul, N.; Shelestov, A.Y.; Lavreniuk, M.; Kussul, O. Efficiency Assessment of Multitemporal C-Band Radarsat-2 Intensity and Landsat-8 Surface Reflectance Satellite Imagery for Crop Classification in Ukraine. IEEE J. Sel. Top. Appl. Earth Obs. Remote Sens. 2016, 9, 3712–3719.
18. van Tricht, K.; Gobin, A.; Gilliams, S.; Piccrad, I. Synergistic Use of Radar Sentinel-1 and Optical Sentinel-2 Imagery for Crop Mapping: A Case Study for Belgium. Remote Sens. 2018, 10, 1642.
19. Wozniak, E.; Rybicki, M.; Kofman, W.; Aleksandrowicz, S.; Wojtkowski, C.; Lewiński, S.; Bojanowski, J.; Musiał, J.; Milewski, T.; Slesiński, P.; et al. Multi-temporal phenological indices derived from time series Sentinel-1 images to country-wide crop classification. Int. J. Appl. Earth Obs. Geoinf. 2022, 107, 102683.
20. Mercier, A.; Betbeder, J.; Baudry, V.; Le Roux, V.; Spicher, F.; Lacoux, J.; Roger, D.; Hubert-Moy, L. Evaluation of Sentinel-1 & 2 time series for predicting wheat and rapeseed phenological stages. ISPRS J. Photogramm. Remote Sens. 2020, 163, 231–256.
21. Mengmeng, L.; Bijker, W. Potential of Multi-Temporal Sentinel-1A Dual Polarization SAR Images for Vegetable Classification in Indonesia. In Proceedings of the IEEE International Geoscience and Remote Sensing Symposium, Valencia, Spain, 22–27 July 2018; pp. 3820–3823.
22. Harfenmeister, K.; Itzerott, S.; Weltzien, C.; Spengler, D. Detecting Phenological Development of Winter Wheat and Winter Barley Using Time Series of Sentinel-1 and Sentinel-2. Remote Sens. 2021, 13, 5036.
23. Löw, J.; Ullmann, T.; Conrad, C. The Impact of Phenological Developments on Interferometric and Polarimetric Crop Signatures Derived from Sentinel-1: Examples from the DEMMIN Study Site (Germany). Remote Sens. 2021, 13, 2951.
24. Gururaj, P.; Umesh, P.; Shetty, A. Assessment of spatial variation of soil moisture during maize growth cycle using SAR observations. In Remote Sensing for Agriculture, Ecosystems, and Hydrology XXI; SPIE: Strasbourg, France, 2019; Volume 11149, p. 8.
25. Nasirzadehdizaji, R.; Balik Sanli, F.; Abdikan, S.; Cakir, Z.; Sekertekin, A.; Ustuner, M. Sensitivity analysis of multi-temporal Sentinel-1 SAR parameters to crop height and canopy coverage. Appl. Sci. 2019, 9, 655.
26. Periasamy, S. Significance of dual polarimetric synthetic aperture radar in biomass retrieval: An attempt on Sentinel-1. Remote Sens. Environ. 2018, 217, 537–549.
27. Mandal, D.; Ratha, D.; Bhattacharya, A.; Kumar, V.; McNairn, H.; Rao, Y.S.; Flery, A.C. A Radar Vegetation Index for Crop Monitoring Using Compact Polarimetric SAR Data. IEEE Trans. Geosci. Remote Sens. 2020, 58, 6321–6335.
28. Ratha, D.; Mandal, D.; Kumar, V.; Mcnairn, H.; Bhattacharya, A.; Frery, A.C. A Generalized Volume Scattering Model-Based Vegetation Index from Polarimetric SAR Data. IEEE Geosci. Remote Sens. Lett. 2019, 16, 1791–1795.
29. Jiao, X.; McNairn, H.; Yekkehkhany, B.; Robertson, L.D.; Ihuoma, S. Integrating Sentinel-1 SAR and Sentinel-2 optical imagery with a crop structure dynamics model to track crop condition. Int. J. Remote Sens. 2022, 43, 6509–6537.
30. Mandal, D.; Kumar, V.; Ratha, D.; Dey, S.; Bhattacharya, A.; Lopez-Sanchez, J.M.; McNairn, H.; Rao, Y.S. Dual polarimetric radar vegetation index for crop growth monitoring using Sentinel-1 SAR data. Remote Sens. Environ. 2020, 247, e111954.
31. Stepanov, A.; Dubrovin, K.; Sorokin, A.; Aseeva, T. Predicting Soybean Yield at the Regional Scale Using Remote Sensing and Climatic Data. Remote Sens. 2020, 12, 1936.
  32. Lee, J.S.; Pottier, E. Polarimetric SAR Radar Imaging: From Basic to Applications; CRC Press: Boca Raton, FL, USA, 2009; p. 438. [Google Scholar]
  33. Barakat, R. Degree of polarization and the principal idempotents of the coherency matrix. Opt. Commun. 1977, 23, 147–150. [Google Scholar] [CrossRef]
  34. Shapiro, S.S.; Wilk, M.B. An analysis of variance test for normality (complete samples). Biometrika 1965, 52, 591–611. [Google Scholar] [CrossRef]
  35. Prudente, V.H.R.; Skakun, S.; Oldoni, L.V.; Xaud, H.A.M.; Xaud, M.R.; Adami, M.; Sanches, I.D. Multisensor approach to land use and land cover mapping in Brazilian Amazon. ISPRS J. Photogramm. Remote Sens. 2022, 189, 95–109. [Google Scholar] [CrossRef]
  36. Tufail, R.; Ahmad, A.; Javed, M.A.; Ahmad, S.R. A machine learning approach for accurate crop type mapping using combined SAR and optical time series data. Adv. Space Res. 2022, 69, 331–346. [Google Scholar] [CrossRef]
  37. Ghojogh, B.; Crowley, M. Linear and Quadratic Discriminant Analysis: Tutorial. arXiv 2019, arXiv:1906.02590. [Google Scholar]
  38. Choubin, B.; Moradi, E.; Golshan, M.; Adamowski, J.; Sajedi-Hosseini, F.; Mosavi, A. An ensemble prediction of flood susceptibility using multivariate discriminant analysis, classification and regression trees, and support vector machines. Sci. Total Environ. 2019, 651, 2087–2096. [Google Scholar] [CrossRef] [PubMed]
  39. Zhang, H. The Optimality of Naive Bayes. In Proceedings of the Seventeenth International Florida Artificial Intelligence Research Society Conference; FLAIRS: Miami Beach, FL, USA, 2004. [Google Scholar]
  40. Castro, W.; De-la-Torre, M.; Avila-George, H.; Torres-Jimenez, J.; Guivin, A.; Acevedo-Juárez, B. Amazonian cacao-clone nibs discrimination using NIR spectroscopy coupled to naïve Bayes classifier and a new waveband selection approach. Spectrochim. Acta Part A Mol. Biomol. Spectrosc. 2022, 270, 120815. [Google Scholar] [CrossRef] [PubMed]
  41. Hechenbichler, K.; Schliep, K. Weighted k-Nearest-Neighbor Techniques and Ordinal Classification. Master’s Thesis, Massey University, Palmerston North, New Zealand, 2004. [Google Scholar] [CrossRef]
  42. Ohmann, J.L.; Gregory, M.J.; Roberts, H.M.; Cohen, W.B.; Kennedy, R.E.; Yang, Z. Mapping change of older forest with nearest-neighbor imputation and Landsat time-series. For. Ecol. Manag. 2012, 272, 13–25. [Google Scholar] [CrossRef]
  43. Wilson, B.T.; Lister, A.J.; Riemann, R.I. A nearest-neighbor imputation approach to mapping tree species over large areas using forest inventory plots and moderate resolution raster data. For. Ecol. Manag. 2012, 271, 182–198. [Google Scholar] [CrossRef]
  44. Seiffert, C.; Khoshgoftaar, T.; Van Hulse, J.; Napolitano, A. RUSBoost: Improving Classification Performance when Training Data is Skewed. In Proceedings of the 19th IEEE International Conference on Pattern Recognition, Tampa, FL, USA, 8–11 December 2008. [Google Scholar]
  45. Piiroinen, R.; Heiskanen, J.; Mottus, M.; Pellikka, P. Classification of crops across heterogeneous agricultural landscape in Kenya using AisaEAGLE imaging spectroscopy data. Int. J. Appl. Earth Obs. Geoinf. 2015, 39, 1–8. [Google Scholar] [CrossRef]
  46. Ben-Hur, A.; Weston, J. A User’s Guide to Support Vector Machines; Data Mining Techniques for the Life Sciences; Springer: Berlin/Heidelberg, Germany, 2010; pp. 223–239. [Google Scholar]
  47. Asgarian, A.; Soffianian, A.; Pourmanafi, S. Crop type mapping in a highly fragmented and heterogeneous agricultural landscape: A case of central Iran using multi-temporal Landsat 8 imagery. Comput. Electron. Agric. 2016, 127, 531–540. [Google Scholar] [CrossRef]
  48. Ouzemou, J.-E.; El Harti, A.; Lhissou, R.; El Moujahid, A.; Bouch, N.; El Ouazzani, R.; Bachaoui, E.M.; El Ghmari, A. Crop type mapping from pansharpened Landsat 8 NDVI data: A case of a highly fragmented and intensive agricultural system. Remote Sens. Appl. Soc. Environ. 2018, 11, 94–103. [Google Scholar] [CrossRef]
  49. Breiman, L. Random forests. Mach. Learn. 2001, 45, 5–32. [Google Scholar] [CrossRef]
  50. Blickensdörfer, L.; Schwieder, M.; Pflugmacher, D.; Nendel, C.; Erasmi, S.; Hostert, P. Mapping of crop types and crop sequences with combined time series of Sentinel-1, Sentinel-2 and Landsat 8 data for Germany. Remote Sens. Environ. 2022, 269, e112831. [Google Scholar] [CrossRef]
  51. Tran, K.H.; Zhang, H.K.; McMaine, J.T.; Zhang, X.; Luo, D. 10 m crop type mapping using Sentinel-2 reflectance and 30 m cropland data layer product. Int. J. Appl. Earth Obs. Geoinf. 2022, 107, 102692. [Google Scholar] [CrossRef]
  52. Bishop, Y.M.; Fienberg, S.E.; Holland, P.W. Discrete Multivariate Analysis: Theory and Practice; Springer: New York, NY, USA, 1975; p. 568. [Google Scholar]
  53. Konduri, V.S.; Kumar, J.; Hargrove, W.W.; Hoffman, F.M.; Ganguly, A.R. Mapping crops within the growing season across the United States. Remote Sens. Environ. 2020, 251, e112048. [Google Scholar] [CrossRef]
  54. Dietterich, T.G. Approximate Statistical Tests for Comparing Supervised Classification Learning Algorithms. Neural. Comput. 1998, 10, 1895–1923. [Google Scholar] [CrossRef] [PubMed]
  55. Singh, R.; Patel, N.R.; Danodia, A. Deriving Phenological Metrics from Landsat-OLI for Sugarcane Crop Type Mapping: A Case Study in North India. J. Indian Soc. Remote Sens. 2022, 50, 1021–1030. [Google Scholar] [CrossRef]
  56. Arias, M.; Campo-Bescós, M.Á.; Álvarez-Mozos, J. Crop Classification Based on Temporal Signatures of Sentinel-1 Observations over Navarre Province, Spain. Remote Sens. 2020, 12, 278. [Google Scholar] [CrossRef]
Figure 1. Study area: (a) location of study area and Sentinel-1 track; (b) the experimental fields of the FEARI used in the study.
Figure 2. Flowchart of the study. The red dashed line indicates processing in batch mode, applied to each interferometric wide swath (IW1, IW2, IW3) separately.
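The red dashed line in Figure 2 marks processing applied in batch mode to each interferometric wide swath (IW1, IW2, IW3) separately. A minimal Python sketch of such a batch loop is given below; the SNAP graph file name, its parameter placeholders, and the directory layout are illustrative assumptions, not the authors' actual pipeline.

# Minimal sketch of batch processing each Sentinel-1 IW sub-swath separately,
# as indicated by the dashed block in Figure 2. The graph file
# (s1_dual_pol_graph.xml), its ${swath}/${input}/${output} placeholders, and the
# directory layout are assumptions for illustration only.
import pathlib
import subprocess

SCENES = sorted(pathlib.Path("s1_slc").glob("S1B_IW_SLC__1SDV_*.zip"))
GRAPH = "s1_dual_pol_graph.xml"  # assumed SNAP graph exposing ${swath}, ${input}, ${output}

for scene in SCENES:
    for swath in ("IW1", "IW2", "IW3"):
        out = pathlib.Path("processed") / f"{scene.stem}_{swath}.dim"
        # SNAP's gpt substitutes -Pkey=value pairs into ${...} placeholders of the graph.
        subprocess.run(
            ["gpt", GRAPH, f"-Pswath={swath}", f"-Pinput={scene}", f"-Poutput={out}"],
            check=True,
        )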
Figure 3. Seasonal course of VIs for the studied crops: (a) soybean, (b) oat, (c) buckwheat, and (d) timothy grass.
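The SAR indices plotted in Figure 3 can be derived per pixel from the dual-polarized (VV, VH) data. The sketch below follows commonly used dual-pol formulations (the VH/VV ratio, RVI = 4σVH/(σVV + σVH), and DpRVI = 1 − mβ, with m the degree of polarization and β the dominant-eigenvalue share of the 2 × 2 covariance matrix); the complex SLC inputs and the 5 × 5 averaging window are assumptions, not the exact processing chain of this study.

# Illustrative per-pixel computation of the three SAR indices shown in Figure 3
# from dual-pol (VV, VH) data; the averaging window size and input arrays are
# assumptions. DpRVI follows the common dual-pol formulation DpRVI = 1 - m*beta.
import numpy as np
from scipy.ndimage import uniform_filter

def sar_indices(s_vv, s_vh, win=5):
    """s_vv, s_vh: co-registered complex SLC arrays for the VV and VH channels."""
    c11 = uniform_filter(np.abs(s_vv) ** 2, win)           # <|S_VV|^2>
    c22 = uniform_filter(np.abs(s_vh) ** 2, win)           # <|S_VH|^2>
    cross = s_vv * np.conj(s_vh)
    c12 = uniform_filter(cross.real, win) + 1j * uniform_filter(cross.imag, win)  # <S_VV S_VH*>

    span = c11 + c22
    ratio = c22 / c11                                      # VH/VV
    rvi = 4.0 * c22 / span                                 # dual-pol RVI

    det_c2 = c11 * c22 - np.abs(c12) ** 2
    m = np.sqrt(1.0 - 4.0 * det_c2 / span ** 2)            # degree of polarization
    lam1 = 0.5 * (span + np.sqrt(span ** 2 - 4.0 * det_c2))  # dominant eigenvalue of C2
    beta = lam1 / span
    dprvi = 1.0 - m * beta                                 # DpRVI
    return ratio, rvi, dprvi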
Figure 4. Coefficients of variation for SAR VI time series for different crops in the Khabarovsk Territory, 2021: (a) soybean, (b) oat, (c) buckwheat, and (d) timothy grass.
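Figure 4 reports coefficients of variation (CV) of the SAR VI time series. One plausible way to obtain such curves is to compute, for each acquisition date, the CV of the index across all pixels of a given crop; a minimal sketch, assuming a pixels × dates array, follows.

# Minimal sketch: per-date coefficient of variation (CV, %) of a vegetation
# index across all pixels of one crop; `vi` is an assumed (n_pixels, n_dates) array.
import numpy as np

def cv_per_date(vi):
    mean = np.nanmean(vi, axis=0)
    std = np.nanstd(vi, axis=0)
    return 100.0 * std / mean  # CV in percent for each acquisition date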
Figure 5. Confusion matrix for the classification based on quadratic discriminant (QD) analysis.
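From the confusion matrix in Figure 5, the share of correctly classified pixels for each crop is the diagonal element divided by the corresponding row total. A minimal sketch; the reference and predicted label arrays are assumed inputs.

# Illustrative computation of the confusion matrix in Figure 5 and the share of
# correctly classified pixels per crop.
import numpy as np
from sklearn.metrics import confusion_matrix

def per_class_accuracy(y_true, y_pred, classes):
    """Rows of cm: reference classes; columns: predicted classes."""
    cm = confusion_matrix(y_true, y_pred, labels=classes)
    return cm, np.diag(cm) / cm.sum(axis=1)   # per-class share of correct pixels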
Figure 6. Classification results for soybean fields: (a) Field 103; (b) Field 102.
Figure 7. Classification results for oat fields: (a) Field 51; (b) Field 162.
Figure 8. Classification results for buckwheat Field 93b.
Figure 9. Classification results for Field 92.
Figure 10. Classification results for timothy grass on Field 90.
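Figures 6–10 show per-pixel classification results for individual fields. A field-level label can be obtained, for example, by majority vote over the pixels of a field; the sketch below illustrates this rule under the assumption of co-registered prediction and field-identifier rasters, and it is not asserted that exactly this aggregation was used in the study.

# Minimal sketch: assigning each field the majority class of its pixels.
# `pred` (predicted class codes) and `field_id` are assumed co-registered arrays.
import numpy as np

def field_majority(pred, field_id):
    labels = {}
    for fid in np.unique(field_id):
        if fid == 0:                     # assume 0 marks background outside fields
            continue
        vals, counts = np.unique(pred[field_id == fid], return_counts=True)
        labels[int(fid)] = vals[np.argmax(counts)]
    return labels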
Table 1. Number of fields and total areas of crops by crop type, ha.
              Oat      Soybean   Buckwheat   Timothy Grass   Total
Fields        9        18        12          4               43
Area, ha      181.4    460.8     243.1       124.5           1009.8
Table 2. Shapiro–Wilk’s W test statistic and p-values for averaged crop time series (n = 13).
Crop            W          p-Value
Oat             0.91885    0.24 *
Soybean         0.93707    0.42 *
Buckwheat       0.88407    0.08 *
Timothy grass   0.90323    0.15 *
* p > 0.05; the null hypothesis that the data came from a normally distributed population cannot be rejected at the 0.05 significance level.
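The W statistics and p-values in Table 2 correspond to the Shapiro–Wilk test applied to each 13-point averaged crop time series. A minimal sketch follows; the series values are placeholders, not data from this study.

# Minimal sketch of the normality check in Table 2: Shapiro-Wilk test on an
# averaged 13-date crop time series (placeholder values, for illustration only).
from scipy.stats import shapiro

mean_series = [0.21, 0.25, 0.31, 0.40, 0.48, 0.55, 0.61,
               0.66, 0.68, 0.65, 0.58, 0.47, 0.35]      # n = 13
w_stat, p_value = shapiro(mean_series)
print(f"W = {w_stat:.5f}, p = {p_value:.2f}")           # p > 0.05: normality not rejected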
Table 3. Distribution of crop pixels in training and test data.
Dataset     Metric      Oat      Soybean    Buckwheat   Timothy Grass   Total
Training    Number      9380     16,459     8783        4906            39,528
            Share, %    24       42         22          12              100
Test        Number      1775     9141       1305        1827            14,048
            Share, %    13       65         9           13              100
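The counts and shares in Table 3 can be tabulated directly from the class labels of the training and test pixels; a minimal sketch, assuming a one-dimensional array of crop labels per split:

# Illustrative tabulation of per-class pixel counts and shares, as in Table 3.
import numpy as np

def class_shares(labels):
    """labels: 1-D array of crop names for one dataset split (training or test)."""
    vals, counts = np.unique(labels, return_counts=True)
    shares = 100.0 * counts / counts.sum()
    return {v: (int(c), round(float(s))) for v, c, s in zip(vals, counts, shares)}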
Table 4. Classifiers’ parameters.
Method                            Parameters
Fine Tree                         Maximum number of splits: 3321; Split criterion: Maximum Deviance Reduction; Optimizer: Bayesian optimization
Quadratic Discriminant Analysis   Covariance structure: Full
Gaussian Naïve Bayes              Distribution names: Kernel; Kernel type: Triangle; Support: Unbounded
K Nearest Neighbors               Number of neighbors: 30; Distance metric: Euclidean; Distance weight: Squared inverse
RUSBoost                          Maximum number of splits: 5455; Number of learners: 72; Learning rate: 0.1
Support Vector Machine            Kernel: Radial Basis Function; C (regularization): 1; Gamma (kernel coefficient): 0.07
Random Forest                     Estimators: 100; Criterion: Gini; Min_samples_split: 2
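For readers working outside the authors' software environment, the settings in Table 4 map approximately onto standard open-source implementations. The sketch below uses scikit-learn and imbalanced-learn; the parameter mapping is approximate (for example, the maximum number of splits is emulated via leaf count, and a kernel-density naïve Bayes variant is not available), and it is not the implementation used in this study.

# Hedged scikit-learn / imbalanced-learn counterparts of the classifier settings
# in Table 4; the mapping is approximate and illustrative only.
from sklearn.tree import DecisionTreeClassifier
from sklearn.discriminant_analysis import QuadraticDiscriminantAnalysis
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import KNeighborsClassifier
from sklearn.svm import SVC
from sklearn.ensemble import RandomForestClassifier
from imblearn.ensemble import RUSBoostClassifier   # RUSBoost lives in imbalanced-learn

classifiers = {
    "Fine Tree": DecisionTreeClassifier(max_leaf_nodes=3322),   # ~3321 splits
    "Quadratic Discriminant": QuadraticDiscriminantAnalysis(),  # full covariance by default
    "Gaussian NB": GaussianNB(),                                # kernel densities not supported here
    "Fine KNN": KNeighborsClassifier(
        n_neighbors=30, metric="euclidean",
        weights=lambda d: 1.0 / (d ** 2 + 1e-12),               # squared-inverse distance weight
    ),
    "RUSBoost": RUSBoostClassifier(n_estimators=72, learning_rate=0.1),
    "SVM": SVC(kernel="rbf", C=1.0, gamma=0.07),
    "Random Forest": RandomForestClassifier(
        n_estimators=100, criterion="gini", min_samples_split=2
    ),
}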
Table 5. Characteristics of VI time series in the Khabarovsk Territory in 2021.
Crop        Indicator                 DpRVI           RVI             VH/VV           NDVI
Soybean     Mean VImax ± ΔVImax       0.59 ± 0.08     0.92 ± 0.16     0.31 ± 0.07     0.80 ± 0.11
            VAR_VI                    6.7             9.1             11.7            6.8
            Mean DOYmax ± ΔDOYmax     263.4 ± 8.2     261.0 ± 21.3    260.7 ± 21.4    238.2 ± 15.0
            VAR_DOY                   1.5             3.8             4.4             3.4
Oat         Mean VImax ± ΔVImax       0.69 ± 0.11     0.96 ± 0.34     0.44 ± 0.20     0.69 ± 0.24
            VAR_VI                    6.2             11.3            17.8            8.1
            Mean DOYmax ± ΔDOYmax     193.3 ± 33.8    203.0 ± 35.4    200.7 ± 42.4    200.3 ± 34.8
            VAR_DOY                   6.8             7.8             8.7             5.9
Buckwheat   Mean VImax ± ΔVImax       0.63 ± 0.08     0.94 ± 0.15     0.40 ± 0.09     0.71 ± 0.11
            VAR_VI                    5.9             8.7             9.9             6.9
            Mean DOYmax ± ΔDOYmax     262.4 ± 15.0    257.1 ± 16.6    259.7 ± 18.5    254.0 ± 23.7
            VAR_DOY                   2.6             3.7             4.5             4.3
Timothy     Mean VImax ± ΔVImax       0.68 ± 0.08     0.94 ± 0.13     0.42 ± 0.08     0.83 ± 0.09
            VAR_VI                    4.9             6.7             8.0             5.4
            Mean DOYmax ± ΔDOYmax     214.2 ± 16.4    212.7 ± 17.9    216.7 ± 19.3    180.1 ± 24.3
            VAR_DOY                   3.6             4.4             5.5             5.2
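The quantities in Table 5 summarize, for each crop and index, the maximum of the field-averaged time series, the day of year (DOY) on which it occurs, and their variability across fields. A minimal sketch, assuming a fields × dates array and treating the VAR columns as coefficients of variation across fields (an assumption):

# Minimal sketch of the Table 5 statistics: per-field VI maximum and its day of
# year, then mean ± std and CV across fields; `vi` and `doy` are assumed inputs.
import numpy as np

def max_stats(vi, doy):
    """vi: (n_fields, n_dates) field-averaged index values; doy: acquisition days of year."""
    doy = np.asarray(doy)
    vi_max = vi.max(axis=1)              # maximum index value per field
    doy_max = doy[vi.argmax(axis=1)]     # day of year of the maximum per field
    return {
        "VImax_mean_std": (vi_max.mean(), vi_max.std()),
        "DOYmax_mean_std": (doy_max.mean(), doy_max.std()),
        "VAR_VI_percent": 100.0 * vi_max.std() / vi_max.mean(),
        "VAR_DOY_percent": 100.0 * doy_max.std() / doy_max.mean(),
    }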
Table 6. Accuracy assessment and learning times for ML algorithms.
Method              FT      QD      Gaussian NB   Fine KNN   RUSBoost   RF      SVM
Time, seconds       61.8    30.3    3612.5        526.4      95.9       422.1   1325.6
Test accuracy, %    71.1    81.9    73.4          72.4       74.4       76.7    79.1
κ                   0.49    0.67    0.52          0.50       0.54       0.58    0.62
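The test accuracy and κ values in Table 6 follow from each classifier's confusion matrix on the test pixels: overall accuracy is the observed agreement, and Cohen's kappa corrects it for chance agreement. A minimal sketch:

# Minimal sketch: overall accuracy and Cohen's kappa from a confusion matrix
# (rows: reference classes, columns: predicted classes), as reported in Table 6.
import numpy as np

def accuracy_and_kappa(cm):
    cm = np.asarray(cm, dtype=float)
    n = cm.sum()
    p_o = np.trace(cm) / n                                    # observed agreement (accuracy)
    p_e = (cm.sum(axis=0) * cm.sum(axis=1)).sum() / n ** 2    # chance agreement
    return p_o, (p_o - p_e) / (1.0 - p_e)                     # accuracy, Cohen's kappa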