Article

Land-Use Mapping with Multi-Temporal Sentinel Images Based on Google Earth Engine in Southern Xinjiang Uygur Autonomous Region, China

1 School of Information Science and Technology, Beijing Forestry University, Beijing 100083, China
2 Key Laboratory of Quantitative Remote Sensing in Agriculture of Ministry of Agriculture and Rural Affairs, Information Technology Research Center, Beijing Academy of Agriculture and Forestry Sciences, Beijing 100097, China
3 Engineering Research Center for Forestry-Oriented Intelligent Information Processing of National Forestry and Grassland Administration, Beijing 100083, China
4 Key Lab of Smart Agriculture System, Ministry of Education, China Agricultural University, Beijing 100083, China
5 College of Agriculture, Nanjing Agricultural University, Nanjing 210095, China
* Author to whom correspondence should be addressed.
Remote Sens. 2023, 15(16), 3958; https://doi.org/10.3390/rs15163958
Submission received: 24 July 2023 / Revised: 2 August 2023 / Accepted: 7 August 2023 / Published: 10 August 2023

Abstract

Land-use maps are thematic materials reflecting the current situation, geographical diversity, and classification of land use and are an important scientific foundation that can assist decision-makers in adjusting land-use structures, agricultural zoning, regional planning, and territorial improvement according to local conditions. Spectral reflectance and radar signatures of time series are important in distinguishing land-use types. However, their impact on the accuracy of land-use mapping and decision making remains unclear. Also, the spatially and temporally heterogeneous landscapes of southern Xinjiang limit the accuracy of existing land-use classification products. Therefore, our objective herein is to develop reliable land-use products for the highly heterogeneous environment of the southern Xinjiang Uygur Autonomous Region using the freely available public Sentinel image datasets. Specifically, to determine the effect of temporal features on classification, several classification scenarios with different temporal features were developed using multi-temporal Sentinel-1, Sentinel-2, and terrain data in order to assess the importance, contribution, and impact of different temporal features (spectral and radar) on land-use classification models and determine the optimal time for land-use classification. Furthermore, to determine the optimal method and parameters suitable for local land-use classification research, we evaluated and compared the performance of three decision-tree-related classifiers (classification and regression tree, random forest, and gradient tree boost) with respect to classifying land use. The gradient tree boost model yielded the highest average overall accuracy (95%), kappa (95%), and F1 score (98%) and was therefore determined to be the most suitable for land-use classification. Of the four individual periods, the image features in autumn (25 September to 5 November) were the most accurate for all three classifiers in relation to identifying land-use classes. The results also show that the inclusion of multi-temporal image features consistently improves the classification of land-use products, with pre-summer (28 May–20 June) images providing the most significant improvement (the average OA, kappa, and F1 score of all the classifiers were improved by 6%, 7%, and 3%, respectively) and autumn images the least (the average OA, kappa, and F1 score of all the classifiers were improved by 2%, 3%, and 2%, respectively). Overall, these analyses of how classifiers and image features affect land-use maps provide a reference for similar land-use classifications in highly heterogeneous areas. Moreover, these products are designed to describe the highly heterogeneous environments in the study area, for example, identifying pear trees that affect local economic development, and allow for the accurate mapping of alpine wetlands in the northwest.

1. Introduction

A land-use map is a thematic map that shows the current status, geographical differences, and classification of land resources. These maps are essential for the dynamic prediction of land-use time series [1], land-use planning [2], natural disaster prevention and control, land management [3], and ecological and environmental protection [4]. For example, land-use mapping can provide data support for local cropland use and conservation management, the assessment of local food cultivation areas, and the prevention of the conversion of cropland away from grain production [5]. Therefore, the accuracy of land-use maps significantly affects many aspects of land-use planning, including spatial planning.
Many researchers have recently released a number of well-recognized land-use products, such as the 10 m ESA World Cover 2020 (ESA 2020) [6], the ESRI Global LULC 2020 (ESRI 2020) (with 10 m resolution) [7], and the Google Dynamic World LULC (10 m) products [8], but the scales of these products are global and thus offer relatively poor mapping accuracy and contrasting results that are not specific to locally distinctive land-use types. Meanwhile, the accuracy in highly heterogeneous areas of global-scale products tends to be insufficient for satisfying the needs of local governments and society. For example, the Xinjiang Uygur Autonomous Region is the largest provincial-level administrative region in China in terms of land area. One of the focal spots in this region is the southern region of Xinjiang, near Bayingoleng Mongol Autonomous Prefecture, which is bordered by the Tianshan Mountains to the north. This region has high mountains and abundant biodiversity; the central part is a plain oasis, encompassing a large cotton-growing area and the famous Kolar pear plantations [9]. It contains the largest inland lake in China, Lake Bosten, while the southern part hosts the Taklamakan Desert, the second largest desert in the world, and the Tarim River, the longest inland river in China, which flows from west to east through the desert. Thus, this landscape is spatially and temporally heterogeneous, which helps explain the relatively poor accuracy of current land-use classification products [10]. At present, aside from the works that describe the region using global-scale land-use products, many researchers have depicted the spatial distribution of specific land-use classes in local regions, such as the spatial distribution of crops in the Baja Oasis agricultural area, the most characteristic of which is the spatial distribution of pear orchards [9]; however, the distribution is incomplete. As a result, reliable, targeted, large-scale land-use products are lacking in the Bayingoleng Mongol Autonomous Region.
At present, although satellite remote-sensing imaging technology is extensively used in land-use mapping on different scales [11,12], it remains challenging to use this technology for highly heterogeneous land-use mapping [10]. However, the availability and free use of Sentinel imagery have enabled the present research because Sentinel imagery not only actively acquires radar features (Sentinel 1, operating in the C-band) but also passively acquires optical features (Sentinel 2, from visible to short-wave infrared bands). Sentinel imagery has already been used for the fine mapping of crops [13,14], yield prediction [15], flood hazard monitoring [16,17], ecological monitoring and protection [18], etc. With larger study scales and the emergence of high spatial- and temporal-resolution images, the work related to remote-sensing images has become more demanding in terms of data pre-processing and computer capabilities [19,20]. Many previous studies have been limited by computational power and usually focus on small areas to reduce image-processing time [21,22]. The Google Earth Engine (GEE) is a cloud-based, planetary-level geospatial analysis platform that brings Google’s enormous computing power to bear on a variety of complex and far-reaching problems [23], including forest monitoring [24], surface water extraction [25], the fine mapping of crops [26], yield prediction [27], disaster monitoring [28], and urban-distribution mapping [29]. By using the GEE platform, a large number of publicly available remote-sensing image datasets can be accessed and processed, providing a stable data source for research that involves remote-sensing imagery [23].
The spectral reflectance of different growth stages is particularly useful for identifying heterogeneous vegetation, which has been well documented [30,31]. The spectral reflectance of overwintered crops in the winter differs from that of non-overwintered crops, and similar patterns exist between evergreen and deciduous trees [32]. At the same time, overwintered crops break dormancy and grow rapidly in the spring, whereas spring-sown crops are in the seedling stage, so the spectral reflectance of the two differs significantly [33,34]. The spectral reflectance of crops also varies within the same season due to differences in crop growth (e.g., phenological periods and growth rates) [34]. Also, the radar features of crops, such as those captured in Sentinel-1 imagery, can be used to differentiate garlic from winter wheat [35]. Nevertheless, compared with traditional mapping based on full-growing-season imagery, it is unclear how different features at different periods affect mapping, and knowledge in this area would help users to understand how missing imagery affects the accuracy of land-use mapping, which would ameliorate land-use decisions. In addition, spectral images covering the full growing season are sometimes difficult to obtain due to weather conditions (e.g., clouds, rain, etc.).
Several machine-learning classifiers have successfully been used for land-use classification, including maximum likelihood [10], classification and regression tree (CART) [36], support vector machines (SVM) [11], random forest (RF) [10], artificial neural networks [37], and convolutional neural networks (CNNs) [38]. Abdi et al. [11] evaluated the performance of multiple machine-learning algorithms when applied to complex northern landscapes and found that the overall accuracy of SVM (75.8%) exceeded that of RF (73.9%). However, a study of land-use classification in the Sahel obtained more accurate results using RF (73.3%) than an integrated classifier (72.0%) or SVM (60.8%) [10]. RF is an ensemble learning model that integrates decision trees based on the idea of “bagging”. Another common and efficient ensemble learning strategy is “boosting,” whereby each tree learns from the residuals of all previous trees to improve the classification or regression results. Gradient tree boosting (GTB) is a commonly used boosting algorithm, and its performance in classification and regression has long been established [39]. These classifiers perform differently due to differences in image features, target classes, and classifier parameters; the most suitable classifiers, and even the best parameters, vary across geographic regions.
In conclusion, the spectral reflectance of different periods and radar features play a crucial role in identifying heterogeneous vegetation and distinguishing various land-use classes. However, the impact of the spectral and radar features of different periods on the accuracy of land-use mapping and decision making remains unclear. To address this issue, the present study aims to utilize multi-temporal Sentinel imagery data to achieve the following: (1) Assess the importance and contribution of spectral and radar features from different periods to the classification model and determine the optimal time for land-use classification. Furthermore, to determine the optimal method and parameters suitable for local land-use classification research, we plan to (2) evaluate and compare the performance of three common classifiers (CART, RF, and GTB) related to decision trees for land-use classification. Finally, we (3) generate a reliable and targeted land-use product that is suitable for the highly heterogeneous environment in the southern Xinjiang Uygur Autonomous Region. The product not only depicts common land use categories but also enables the targeted mapping of pear orchards that affect the local economy, thereby providing data support for agricultural planning conducted by local governments.

2. Study Area and Data

2.1. Study Area

The study area is located in the northern part of the Bayingoleng Mongol Autonomous Region, Xinjiang Uygur Autonomous Region, and covers an area of 150,440 km2, as shown in Figure 1. This study area was selected because its landscape features are prevalent in the entirety of Xinjiang, making our findings relevant to a region larger than the study area (but still within Xinjiang). The study area lies in an arid and semi-arid zone with diverse ecosystem types and various patterns of land use. Within this area, bare soil and rocks, which account for the largest percentage of the area, are concentrated in the central and southern parts of the study area. The next largest area, that of green vegetation, mainly includes forests and grasslands in the north, and cultivated crops, such as wheat, maize, cotton, vegetables, and fruit trees, in the central part of the study area. The harvesting of the Kolar pear is one of the local specialty industries that promotes the local economy, but the existing land-use products are incomplete in their portrayal of the areas in which this pear is grown [9]. There are also bodies of water (rivers and lakes), cities, permanent snow and ice in the north, and wetlands; however, existing products underestimate the acreage of wetlands. Furthermore, the ecosystems in the region are also fragile [40]. Climate change and the drive to improve local productivity have caused local land-use changes and ecological degradation [41]. For example, despite the lack of water resources, agriculture remains the primary economic sector and is responsible for the conversion of other land types (such as shrubland or even forest land) into agricultural land to increase yields and for the conversion of agricultural land into plantations to increase economic efficiency. At the same time, irrigation has caused ecological problems such as groundwater reduction, the degradation of the ecological functions of local rivers and lakes, a reduction in forested area, and the degradation of grassland [42]. Thus, this region lacks a reliable, well-targeted, and more accurate land-use product. The average elevation of the study area is 1640 m (772–4801 m) (data source: NASADEM), and the climate is moderate- and warm-temperate continental, which is mainly characterized by aridity and low rainfall, with an average annual precipitation over 100 mm, a high rate of evaporation, long sunshine hours, and an average annual temperature of 11.8 °C. The topography of the study area is complex, with high mountain areas coexisting with plains; the high mountain areas have spring and autumn seasons with no summer, whereas the plains have four distinct seasons.

2.2. Dataset and Preprocessing

2.2.1. Remote-Sensing Data

Sentinel-1 and Sentinel-2 are Earth observation satellites launched by the Copernicus program of the European Space Agency, and the constellation of the two satellites satisfies the revisit and coverage requirements [43]. Sentinel-1 is equipped with a C-band (5.405 GHz) synthetic aperture radar imager sensor with interferometric wide swath mode as the primary mode of operation (250 km swath), a dual-satellite revisit period of 6 days, and a resolution of 10 m. The Sentinel-1 ground-range-detected product (“COPERNICUS/S1_GRD” in GEE) is a data source that provides the radar signatures of different land-use types, and two main polarizations are available: vertical–vertical (VV) and vertical–horizontal (VH). The GEE platform preprocesses each image using the Sentinel-1 Toolbox, performing processes including thermal noise removal, radiometric calibration, and terrain correction using the SRTM 30 digital elevation model (DEM) (or the ASTER DEM for areas at latitudes greater than 60°), with the final, terrain-corrected values converted to decibels via logarithmic scaling [44]. The Sentinel-2 satellite carries a multispectral imager sensor with 13 bands with a spectral range of 0.4 to 2.4 µm and a spatial resolution of 10 to 60 m. The dual-satellite revisit period is 5 days. Sentinel-2 Level-2A products (“COPERNICUS/S2_SR” in GEE) yielded the spectral characteristics of different land types for this study, and the QA60 band with cloud-masking information served to remove cloud shadow pixels. In this method, bitwise operations are performed on the QA60 quality band to filter the pixel values and mask clouds, cirrus, rain, and snow [45]. All Sentinel and DEM images were resampled to a 10 m grid using the nearest-neighbor sampling method.
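The following is a minimal Google Earth Engine Python API sketch of this preprocessing step, assuming a rectangular placeholder geometry for the study area and the autumn window (Time4) as an example; the QA60 bit masking (bits 10 and 11) and the median compositing follow the description above, while the variable names and the exact geometry are illustrative, not taken from the paper.

```python
import ee

ee.Initialize()

# Hypothetical study-area geometry (placeholder coordinates, not the exact boundary).
roi = ee.Geometry.Rectangle([82.0, 39.5, 90.0, 43.5])

def mask_s2_clouds(image):
    """Mask clouds (bit 10) and cirrus (bit 11) using the QA60 band, then scale reflectance."""
    qa = image.select('QA60')
    cloud_free = qa.bitwiseAnd(1 << 10).eq(0).And(qa.bitwiseAnd(1 << 11).eq(0))
    return image.updateMask(cloud_free).select('B.*').divide(10000)

# Median Sentinel-2 composite for the autumn window (Time4).
s2_autumn = (ee.ImageCollection('COPERNICUS/S2_SR')
             .filterBounds(roi)
             .filterDate('2021-09-25', '2021-11-06')
             .map(mask_s2_clouds)
             .median())

# Median Sentinel-1 (IW mode) backscatter composite for the same window.
s1_autumn = (ee.ImageCollection('COPERNICUS/S1_GRD')
             .filterBounds(roi)
             .filterDate('2021-09-25', '2021-11-06')
             .filter(ee.Filter.eq('instrumentMode', 'IW'))
             .select(['VV', 'VH'])
             .median())

# NASADEM elevation band used as the terrain feature.
elevation = ee.Image('NASA/NASADEM_HGT/001').select('elevation')
```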
The MODIS data used in this paper are the MOD09Q1 product (“MODIS/006/MOD09Q1” in GEE), which provides pixel-level spectral reflectance data in the red and near-infrared bands at 250 m resolution. The best value is selected from the 8-day synthesis period based on pixel quality (the reference metrics include observed coverage, viewing angle, cloud quality, aerosol, etc.) [46].
The NASA NASADEM Digital Elevation 30 m product (“NASA/NASADEM_HGT/001” in GEE) was used to reflect topographic features in the study area. This dataset is a reprocessing of SRTM data that improves accuracy by combining ancillary data from the ASTER GDEM, ICESat GLAS, and PRISM datasets, with the most notable processing improvements being void reduction through improved phase unwrapping and the use of ICESat GLAS data for control [47]. The primary band of this dataset is the elevation band, which has a resolution of 30 m.

2.2.2. Reference Data

For this study, we collected 538 ground-truth sample points of land use (i.e., GPS points, accuracy: ±1 m) during a field survey of the study area in August 2021. The sample points were mainly concentrated near roads and were used to define land-use classes. The photos of the reference GPS points were taken near roads. However, because the study area is relatively large and some areas are inaccessible, for safety reasons, we supplemented these with 989 sample points obtained via manual visual interpretation using Google Earth software. Therefore, the total number of sample points in this study is 1527, covering nine classes: cropland, forest, Kolar pear, grassland, water, snow and ice, wetland, impervious area and bare soil, and rocky terrain. Section 3.1 describes these categories in detail, and the number of sample points in each category is shown in Table 1. A total of 70% of the sample points were randomly selected for training, while the remaining 30% were used for validation.
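As a sketch of the 70/30 split, assuming the reference points are stored in a FeatureCollection whose class labels are held in a property named 'landcover' (both the asset path and the property name are illustrative, not taken from the paper), the split can be reproduced on GEE as follows.

```python
# Hypothetical asset containing the 1527 reference points.
samples = ee.FeatureCollection('users/your_account/reference_points')

# Attach a uniform random number to each point, then split 70/30.
samples = samples.randomColumn('random', 42)
training = samples.filter(ee.Filter.lt('random', 0.7))
validation = samples.filter(ee.Filter.gte('random', 0.7))
```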

3. Methods

Figure 2 shows a technical flowchart of this work with the following detailed steps:
Step 1: Determination of the time window: Based on the measured sample points (Section 2.2.2) and MODIS images (Section 2.2.1), the time-series SG-NDVI change curve is constructed to determine the time window that can be used for land-use classification (Section 3.2.1).
Step 2: Aggregation of multi-temporal images: The identified time windows are combined with the Sentinel image data, the spectral and radar features that can be used for classification are analyzed, seven schemes for integrating the multi-temporal features for classification are proposed (Section 3.2.2), and, finally, the multi-temporal multispectral imagery is aggregated.
Step 3: Classification of Multiple Schemes: Based on the seven aggregated multi-temporal images, three commonly used classification algorithms (CART, RF, and GTB) are used to perform the classification task (Section 3.3), and a total of 21 classification scenarios are generated.
Step 4: Validation and comparison: Multiple classification scenarios are validated by comparing the measured and classified classes of sample points (Section 3.4). In addition, the best classification results are compared with existing land-use products (Section 3.5).

3.1. Land-Use Classes

Our land-use mapping products were developed to ascertain the local characteristics of arid biota in Northwest China and can assist decision makers in spatial planning, such as in the identification of vegetation spatial structures (crops, grasslands, and forests with economic value and poplar trees and shrubs with ecological value), water management, and desertification management. Based on field visits and the opinions of local experts, we classified land use into the following nine classes (Figure 3):
  • Cropland: This class consists of land covered with cropland used mainly for maize, cotton, wheat, pepper, and beet farming.
  • Forest: This class consists of land planted with trees for ecological conservation, mainly consisting of evergreen coniferous forests in the northern part of the study area, Populus euphratica trees near the Tarim River in the northern part, and drought-tolerant shrubs such as Tamarix ramosissima Ledeb. and Haloxylon ammodendron.
  • Kolar pear: This class includes the geographic areas of Kolar pear plantations for economic purposes.
  • Grassland: This class mainly includes geographic areas dominated by natural forbs (plants without stems or branches on the ground and lacking a solid structure), mainly encompassing alpine grasslands in the northern part of the study area, with the dominant grass species being Carex stipitiutriculata, Stipa purpurea, and Kobresia capillifolia.
  • Water: This class mainly includes geographic areas covered by a water body, including lakes such as Bosten Lake, the largest inland freshwater lake in China; rivers such as the Tarim River, the longest inland river in China, in its Bayingoleng Mongol Autonomous Region section; and reservoirs.
  • Snow and ice: This class includes areas that are permanently covered by snow or glaciers.
  • Wetland: This class includes geographic areas dominated by natural herbaceous vegetation that is permanently or periodically inundated by water bodies, which mainly includes the reed wetlands around Bosten Lake in the eastern part of the study area and the grassland wetlands in the northern part of the study area.
  • Impervious area and bare soil: This class consists of land covered by buildings, roads, and bare sandy soil. Sandy soils are mainly distributed along the margins of the Taklamakan Desert in the southern part of the study area, and urban buildings are mainly distributed in the central part of the study area.
  • Rocky: This class refers to geographical areas covered by rocks, where very few grasses are present.

3.2. Image-Aggregation Scheme

3.2.1. Determination of the Time Window

To understand the phenological characteristics of the different types of vegetation in the study area, MODIS curves for 2021 were constructed by using MODIS products with a resolution of 250 m. MODIS products have been widely used for phenological window determination [35]. The MODIS-NDVI images were calculated on the GEE platform using the following formula:
NDVI = (sur_refl_b02 − sur_refl_b01) / (sur_refl_b02 + sur_refl_b01)
To eliminate noise from the MODIS-NDVI time series curves while ensuring that the curves characterize the vegetation growth process, we used a Savitzky–Golay filter (SG) to smooth the MODIS-NDVI time series curves; this process effectively eliminates the effect of noise in NDVI time series data [48]. A sample curve of 572 pixels comprising cropland (183), forest (155), Kolar pear plantations (91), and grassland (143) pixels was selected to construct an NDVI mean curve, which is shown in Figure 4. The curve provides the mean values at different times. The different temporal reflectance patterns resulting from the phenological periods of different vegetation types contain important classification information and possess a high capacity for discrimination.
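As an illustration of this step, the sketch below extracts an 8-day MOD09Q1 NDVI series for a single hypothetical sample location and smooths it with a Savitzky–Golay filter; the point coordinates, window length, and polynomial order are illustrative choices, not values reported in the paper.

```python
import numpy as np
from scipy.signal import savgol_filter

point = ee.Geometry.Point([86.15, 41.76])  # hypothetical sample location

# Per-image NDVI from the MOD09Q1 red and NIR surface-reflectance bands.
modis_ndvi = (ee.ImageCollection('MODIS/006/MOD09Q1')
              .filterDate('2021-01-01', '2022-01-01')
              .map(lambda img: img.normalizedDifference(['sur_refl_b02', 'sur_refl_b01'])
                                  .rename('NDVI')
                                  .set('system:time_start', img.get('system:time_start'))))

# Pull the NDVI column of the point time series to the client.
table = modis_ndvi.getRegion(point, 250).getInfo()
header, rows = table[0], table[1:]
ndvi = np.array([row[header.index('NDVI')] for row in rows], dtype=float)

# Masked dates appear as NaN and should be interpolated before smoothing.
ndvi_smooth = savgol_filter(ndvi, window_length=7, polyorder=2)
```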
As seen in Figure 4, the NDVI time series curve is almost a bell-shaped curve, with the NDVI increasing from low to high and then gradually decreasing with time, constituting a typical vegetation time-series spectral feature. The importance of using temporal phenological information to distinguish vegetation types is illustrated by the different trends in the NDVI values for multiple vegetation classes at different time points during the annual growing season (from May to October) of the major vegetation types in the study area.
From 5 May to 10 September, the NDVI values of pear orchards exceeded those of grasslands, and the NDVI values of grasslands exceeded those of forested lands because the poplar forests with low NDVI values reduced the overall NDVI in the forested land [49]. However, the NDVI values in the cropland varied in different periods. From 5 May to 28 May, the crops were all in the seedling stage with low cover, and the spectral response of the bare soil produced the lowest NDVI values for the crops. From 28 May to 20 June, as the crops grew (corn entered the jointing stage and cotton entered the bud stage), the crop volume and the cover gradually increased, and the influence of the soil gradually decreased, causing the NDVI of crops at this time to exceed that of forested land, although it remained less than that of orchards and grasslands. From 28 June to 10 September, the crops continued to grow rapidly (maize entered the tasseling stage and cotton entered the boll stage), chlorophyll accumulated rapidly, and plant height, volume, and cover reached a maximum, so the NDVI value reached a maximum on 28 July. Thereafter, the crops gradually entered maturity (corn entered the milk stage and cotton entered the boll-opening stage), chlorophyll levels began to decrease, lutein levels increased, and leaves even began to fall off. From 25 September to 5 November, the vegetation NDVI decreased rapidly, the crops entered the harvesting period, the orchards presented the largest NDVI because the fruit trees still had some greenness in their leaves after harvesting, the grasses entered the wilting period, and the leaves turned yellow, so the NDVI values were at a minimum. The influence of evergreen coniferous forests kept the forested land NDVI at a higher level.
This scenario illustrates the importance of using temporal information to distinguish vegetation types. Therefore, in conjunction with the above analysis, we first identified four time windows that can be used for land-use classification: (1) the first window covers the period from 5 May to 27 May (Time1), (2) the second window covers the period from 28 May to 20 June (Time2), (3) the third window covers the period from 28 June to 10 September (Time3), and (4) the fourth window covers the period from 25 September to 5 November (Time4). Then, the nearest neighbor interpolation method was used to resample the three image datasets (Sentinel-1/-2 and elevation data) into images with a resolution of 10 m. Finally, multi-temporal cube datasets were generated using the median synthesis method based on the time windows (Time1~Time4) determined from the NDVI time series. To investigate the contribution and importance of the temporal phenological features to the classification model, seven experimental schemes (Table 2) were tested: experimental Schemes 1–4 compare the abilities of each temporal feature, and Schemes 5–7 analyze the contribution and importance of different phenological periods.
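A sketch of this aggregation, reusing the roi, mask_s2_clouds, and elevation objects from the earlier snippets: one median composite of selected Sentinel-1/-2 features is built per time window, and the window composites are stacked into a single multi-band input image (Scheme 7 uses all four windows plus elevation). The band selection and the window labels are illustrative.

```python
# End dates are exclusive in filterDate, so each window runs through its last listed day.
time_windows = {
    'T1': ('2021-05-05', '2021-05-28'),
    'T2': ('2021-05-28', '2021-06-21'),
    'T3': ('2021-06-28', '2021-09-11'),
    'T4': ('2021-09-25', '2021-11-06'),
}

def window_composite(start, end, label):
    """Median Sentinel-2/-1 composite for one time window, bands suffixed with the window label."""
    s2 = (ee.ImageCollection('COPERNICUS/S2_SR')
          .filterBounds(roi).filterDate(start, end)
          .map(mask_s2_clouds).median())
    ndvi = s2.normalizedDifference(['B8', 'B4']).rename('NDVI')
    s1 = (ee.ImageCollection('COPERNICUS/S1_GRD')
          .filterBounds(roi).filterDate(start, end)
          .filter(ee.Filter.eq('instrumentMode', 'IW'))
          .select(['VV', 'VH']).median())
    composite = (s2.select(['B2', 'B3', 'B4', 'B8', 'B11', 'B12'])
                 .addBands(ndvi).addBands(s1))
    return composite.rename(composite.bandNames().map(
        lambda b: ee.String(b).cat('_').cat(label)))

# Scheme 7: stack all four window composites plus the terrain feature.
stack = ee.Image.cat([window_composite(s, e, k) for k, (s, e) in time_windows.items()])
stack = stack.addBands(elevation)
```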

3.2.2. Temporal Differences in Characteristic Variables

A series of features was computed from the remote-sensing image data for classification, including radar features for Sentinel-1, spectral features for Sentinel-2, and elevation features, as detailed in Table 3. These features were composited using the median value within each of the given periods (Time1~Time4). One concern was that the median composite images for Time1 and Time2 had localized but extremely small missing areas (0.07% and 0.25%, respectively) due to the short durations of these windows. Therefore, to avoid affecting the overall results, the corresponding composite images from 2019 and 2020 were used for gap filling.
To illustrate the capability of these features to classify land use, we also compared the differences between these features across phenological periods. Figure 5 shows the time series of the average reflectance bands and the vegetation indices for all sample points, illustrating that the NDWI discriminates between water bodies and other classes, with multi-period NDWI values greater than 0.2 for water bodies, while snow and ice classes are close to 0 and all other classes have NDWI values less than 0. The TC_Wetness values of the snow and ice class differ significantly from those of the other classes, with multi-period TC_Wetness values greater than 0.5, whereas all the other classes presented values less than 0.1. The B11 and B12 bands of both the impervious surface and bare soil classes are higher than those of other classes, with the difference in the B12 band being more significant. The B12 band is greater than 0.3 over multiple periods for both the impervious surface and bare soil classes but is less than 0.3 for all other classes. In addition, the BRISI values at Time3 and Time4 have the advantage of distinguishing impervious surface and bare soil from other classes, with the difference exceeding 0.4 at Time3 and 0.8 at Time4. The SD_VV and SD_VH of the rocky terrain and snow and ice classes are significantly different from those of the other classes (Figure 5f,g); thus, these radar features can be used as classification features to accomplish land-use classification.
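To make the feature construction concrete, the sketch below derives two of the feature types mentioned above for the autumn window, assuming that NDWI follows the green/NIR (McFeeters) form and that SD_VV and SD_VH denote the per-window temporal standard deviation of backscatter; the paper's exact formulations (e.g., for TC_Wetness and BRISI) are not reproduced here.

```python
# NDWI from the autumn Sentinel-2 composite (green and NIR bands).
ndwi = s2_autumn.normalizedDifference(['B3', 'B8']).rename('NDWI')

# Temporal standard deviation of VV and VH backscatter within the window.
s1_window = (ee.ImageCollection('COPERNICUS/S1_GRD')
             .filterBounds(roi)
             .filterDate('2021-09-25', '2021-11-06')
             .filter(ee.Filter.eq('instrumentMode', 'IW')))

sd_vv = s1_window.select('VV').reduce(ee.Reducer.stdDev()).rename('SD_VV')
sd_vh = s1_window.select('VH').reduce(ee.Reducer.stdDev()).rename('SD_VH')
```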

3.3. Classification Algorithm

To assess the importance of different temporal features for classification and determine the optimal time for land-use classification, we used the aggregated multi-temporal multispectral images as input, and three classifiers were used to perform the land-use classification task, which generated a total of 21 classification scenarios (7 schemes × 3 classifiers). These classifiers were selected because, although they have some performance differences, they are, in general, efficient and stable, often used in classification tasks [10,36,39], and suitable for studies such as those evaluating the importance of different temporal features for land-use classification.

3.3.1. Classification and Regression Trees

The CART model is a classification or regression algorithm for generating decision trees based on fuzzy mathematical ideas and is one of the most important and popular tools in the field of modern data mining. It has a wide range of applications [52], such as the fine mapping of crops [53], disease diagnosis [54], digital soil mapping [55], solar radiation prediction [56], and suspended sediment load prediction [57]. This algorithm uses the dichotomous recursive concept to divide a sample into two subsamples layer by layer so that each non-leaf node has two branches; subsequently, it employs a membership value between 0 and 1 as a classification indicator to yield the probability that a leaf node belongs to a given class. Finally, the classification is determined according to the leaf node's membership value. The CART model has two important parameters on the GEE platform: Max Nodes and Min Leaf Population. The maximum number of leaf nodes in each tree was left at its default (unlimited) due to the small number of features in this study. The minimum leaf population was determined through controlled-variable experiments.

3.3.2. Random Forest

RF is a machine-learning algorithm that integrates multiple decision trees to train, classify, and regress samples [58]. It is an ensemble built by bagging; that is, the classification results are voted on by multiple trees (for regression, the predictions are averaged). The model has successfully been used in crop distribution mapping [59], land-use production [10], PM2.5 prediction [60], groundwater-level mapping [61], etc. The advantages of this model are its simple implementation and fast training, good generalizability, superior ability to learn the interactions between features compared with a single decision tree, reduced tendency to fall victim to overfitting, and ability to handle high-dimensional data and unbalanced datasets [62]. The main parameters of RF include Number of Trees and the Min Leaf Population, and the optimal parameters are obtained from subsequent experiments.

3.3.3. Gradient Tree Boost

GTB is an iterative classification or regression algorithm that uses decision trees as base learners within a boosting ensemble learning framework [39]. The main idea behind GTB is that each new decision tree learns from the residuals of all previous trees, using the negative gradient of the loss function in the current model to approximate the residuals in the boosting tree algorithm and thereby fit a new regression or classification tree [63]. Such a serial decision tree approach reduces overfitting, allows for the handling of different types of data, and makes accurate predictions. This method has been applied in many scenarios, including the safety evaluation of steel trusses [64], document classification [65], and the classification of algal bloom species [66]. The main parameters of the GTB are (i) Number of Trees and (ii) Learning Rate (shrinkage). The optimal parameters are obtained from subsequent experiments.
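The three classifiers as exposed on GEE are sketched below, trained on the Scheme-7 stack from the earlier snippets; the GTB settings (140 trees, shrinkage 0.026) correspond to the Scheme-7 optima reported in Section 4.1.2, while the CART and RF values shown are merely illustrative, and the 'landcover' property name is the same assumption as before.

```python
bands = stack.bandNames()

# Sample the multi-temporal stack at the training points.
training_samples = stack.sampleRegions(collection=training,
                                       properties=['landcover'],
                                       scale=10)

cart = ee.Classifier.smileCart(minLeafPopulation=5)
rf = ee.Classifier.smileRandomForest(numberOfTrees=250, minLeafPopulation=1)
gtb = ee.Classifier.smileGradientTreeBoost(numberOfTrees=140, shrinkage=0.026)

# Train the GTB classifier (the same pattern applies to CART and RF).
trained_gtb = gtb.train(features=training_samples,
                        classProperty='landcover',
                        inputProperties=bands)

classified = stack.classify(trained_gtb)
```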

3.4. Accuracy Assessment

The confusion matrix in the GEE serves to assess and validate the accuracy of land-use classifications, specifically by statistically comparing the classification results at the validation points against the true survey results. For each class, the confusion matrix yields the number of correctly classified and misclassified samples (omission errors and commission errors are quantified via Producer Accuracy (PA) and User Accuracy (UA), respectively) [10]. As a result, the Overall Accuracy (OA), Kappa Coefficient (Kappa), and F1 score of the classification results can be calculated, among which Overall Accuracy is the ratio of the number of correctly classified sample points to the number of overall sample points, indicating how close the classification results are to the ground truth. The kappa coefficient is a ratio that represents the fraction of error reduction generated by a classification versus a completely random classification. The F1 score is calculated to evaluate the recognition accuracy and recall of each class and the whole model [59].
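A sketch of this assessment using the objects defined above: the held-out validation points are classified, and the GEE confusion matrix then yields OA, kappa, per-class producer's/user's accuracy, and per-class F1.

```python
validation_samples = stack.sampleRegions(collection=validation,
                                         properties=['landcover'],
                                         scale=10)
validated = validation_samples.classify(trained_gtb)

matrix = validated.errorMatrix('landcover', 'classification')
print('OA:', matrix.accuracy().getInfo())
print('Kappa:', matrix.kappa().getInfo())
print('Producer accuracy per class:', matrix.producersAccuracy().getInfo())
print('User accuracy per class:', matrix.consumersAccuracy().getInfo())
print('F1 per class:', matrix.fscore(1).getInfo())
```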

3.5. Comparison with Existing Land-Use Products

To emphasize the importance and uniqueness of our products, three existing global-scale land-use products were collected for comparison with our products. These include the ESA 2020, ESRI 2020, and Google Dynamic World LULC products generated from 1 January to 31 December 2021 (Google 2021). Although the validity periods and land-use classes of these products differ from those of our product, some of the land-use classes are comparable with each other because they do not change significantly during this period and some of their definitions are similar. First, the spatial distribution of land use is visually compared across the products. Second, we also compare the discrimination accuracy (F1 score) of similarly defined partial classes to evaluate the similarities and differences between the products.
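Two of these products can be loaded directly from the GEE catalog, as sketched below (the ESA WorldCover v100 and Dynamic World asset IDs are the standard catalog IDs; the ESRI 2020 layer is distributed as a community asset and is not shown). Aggregating Dynamic World to an annual map by the per-pixel modal label is one simple option and is an assumption here, not necessarily the procedure used for the published Google 2021 comparison.

```python
# ESA WorldCover 2020 (single global image, class band 'Map').
esa_2020 = ee.ImageCollection('ESA/WorldCover/v100').first().select('Map')

# Dynamic World over 2021, reduced to the most frequent class label per pixel.
dw_2021 = (ee.ImageCollection('GOOGLE/DYNAMICWORLD/V1')
           .filterBounds(roi)
           .filterDate('2021-01-01', '2022-01-01')
           .select('label')
           .mode())
```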

4. Results

4.1. Performance of Experimental Schemes

4.1.1. Performance of Scenarios

Table 4 and Figure 6 present the overall accuracy and kappa results of the multiple classifiers in the multiple experimental scenarios. The table shows that the classification results of all the experimental scenarios are positive and acceptable and that, overall, the classification results for the validation dataset are highly consistent with the real scenarios.
(1) The best accuracy (presenting OA, kappa, and F1 scores of 95%, 95%, and 98%) was achieved with experimental Scheme 7 using the GTB classifier, and the worst classification result (S1-CART) exceeded 70% (presenting OA, kappa, and F1 scores of 76%, 72%, and 86%, respectively).
(2) In each scheme, the GTB classifier outperformed RF, which, in turn, outperformed CART. The average accuracy of GTB for all schemes exceeded 90% (the average OA, kappa, and F1 scores were 92%, 91%, and 96%, respectively). The average OA, kappa, and F1 scores for RF for all the schemes were 90%, 89%, and 95%, respectively. The CART classifier performed the worst, with OA and kappa values below 90%, and the average OA, kappa, and F1 scores were 84%, 82%, and 91%, respectively. Differences are evident in the accuracy increase for the three models: the GTB model provides greater accuracy than the RF model, with an increase of 0–3% in terms of OA and kappa. The percentage increase is more significant when compared with that of the CART model (5–10% increase in OA, 6–12% increase in kappa, and 3–7% increase in F1 score). Compared with the CART model, the RF model yielded a 5–7% increase in OA, a 6–9% increase in kappa, and a 3–5% increase in F1 score. These results show that the difference in accuracy between the decision-tree-based ensemble classifiers (GTB and RF) is less significant than that between the ensemble classifiers and the base classifier.
(3) In the four schemes using images from individual growth periods (Schemes 1–4), the results of the three classifiers reveal that Scheme 4 discriminates the best between land classes, presenting the highest accuracy (OA values of 86%, 93%, and 94% using CART, RF, and GTB, respectively; kappa values of 84%, 91%, and 93%, respectively; and F1 scores of 93%, 96%, and 97%, respectively). The average OA, kappa, and F1 score of all the classifiers are 91%, 90%, and 95%, respectively.
(4) Schemes 5–7 were designed to investigate the role of multi-phenological information in land-use mapping. When only the Time1 features were used (S1), nine image features were used to obtain land-use maps with both OA and kappa values below 90% and the lowest F1 scores. Upon adding the Time2, Time3, and Time4 features, the classification images gradually fused into multi-temporal images with band numbers of 18, 25, and 33. The accuracy increased correspondingly, with an average increase of 4% in OA for all classifiers, a minimum increase of 2% (S6 vs. S7), and a maximum increase of 6% (S1 vs. S5). Regarding the kappa value, the average increase is 5% for all classifiers, the minimum increase is 3% (S6 vs. S7), and the maximum increase is 7% (S1 vs. S5). For the F1 score, the average increase is 2% for all classifiers, the minimum increase is 2% (S5 vs. S6 or S6 vs. S7), and the maximum increase is 3% (S1 vs. S5). To be specific, the inclusion of the Time2 features (S5 vs. S1) improved the average OA, kappa, and F1 score of all the classifiers by 6%, 7%, and 3%, respectively. The addition of the Time3 features (S6 vs. S5) increased the average OA, kappa, and F1 score of all the classifiers by 4%, 4%, and 2%, respectively. Likewise, the Time4 features improved accuracy (S7 vs. S6: OA, kappa, and F1 score were improved by 2%, 3%, and 2%, respectively).

4.1.2. Sensitivity of Parameters

To analyze the sensitivity of the CART, RF, and GTB classifiers and specify the optimal parameters for land-use classification in this study, the main parameters (CART: min leaf population; RF: number of trees and min leaf population; GTB: number of trees and learning rate) of the classifiers were iterated individually by using a grid search method, and we analyzed how the variation of these parameters affects the accuracy of the algorithms (Figure 7, Figure 8 and Figure 9). The kappa of the CART model generally increases and then decreases as the minimum leaf population increases, and this phenomenon was strongest in Scheme 7. The optimal parameters for all the schemes vary between 3 and 23, with most of the optimal parameters being no greater than 11, except for that of Scheme 2 (optimal min leaf population of 23).
Figure 8 shows that the RF model is sensitive to the parameters, with the difference in the kappa value varying from 20% to 32% across all schemes. The min leaf population parameter affects the model more significantly than the number of trees because the accuracy changes dramatically as the minimum leaf population increases, whereas the accuracy changes very slightly as the number of trees increases. In this case, the optimal value of the minimum leaf population is 1 in all schemes in this study except for Scheme 6 (optimal min leaf population of 3), whereas the optimal numbers of trees in different experiments vary widely (the optimal parameters of S1 to S7 are 50, 90, 270, 110, 80, 470, and 250, respectively).
Figure 9 shows that the sensitivity of the GTB model to the parameters is relatively low, with no variation in kappa exceeding 10% in all schemes. When both parameters are low (i.e., fewer than 100 trees or a learning rate below 0.01), the other parameter has less of an impact on the model, with only a slight change in kappa. Conversely (i.e., with more than 100 trees or a learning rate above 0.01), the accuracy changes substantially upon varying the other parameter. For the tree parameters, GTB and RF are similar in that the optimal tree parameters vary significantly across schemes (those of S1 to S7 are 240, 270, 500, 440, 480, 410, and 140, respectively), and the number of trees is always greater than 100. The differences in the optimal learning rates across multiple schemes are relatively small (those of S1 to S7 are 0.027, 0.019, 0.015, 0.027, 0.014, 0.027, and 0.026, respectively), with a variation of less than 1%.
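A sketch of this grid-search-style sensitivity analysis for the GTB model, using the training and validation samples defined earlier; the grids over the number of trees and the learning rate are illustrative, and each combination is scored by its validation kappa.

```python
results = []
for n_trees in range(50, 501, 50):
    for lr in [0.005, 0.01, 0.02, 0.03]:
        clf = (ee.Classifier.smileGradientTreeBoost(numberOfTrees=n_trees,
                                                    shrinkage=lr)
               .train(features=training_samples,
                      classProperty='landcover',
                      inputProperties=bands))
        kappa = (validation_samples.classify(clf)
                 .errorMatrix('landcover', 'classification')
                 .kappa()
                 .getInfo())
        results.append((n_trees, lr, kappa))

best_trees, best_lr, best_kappa = max(results, key=lambda r: r[2])
print('Best parameters:', best_trees, 'trees, learning rate', best_lr, ', kappa', best_kappa)
```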

4.2. Mapping Results

4.2.1. Visualization of Mapping Results

Land-use maps of all the scenarios were prepared, and all the results can be seen in Figure S1 of the Supplementary Materials. Most of the maps replicate the general landscape features, but there are spatial differences in certain categories. Specifically, the S2-CART scenario erroneously identified some agricultural fields in the eastern part and some grasslands in the western part as wetlands (Figure S1(C1)). There are significant spatial variations in the distribution of water bodies in the southeast corner of the study area, most notably with respect to Figure S1(G3) generated using S6-GTB.
The best land-use map for the study area was obtained using the fused multi-time image (Scheme 7) and the GTB algorithm, yielding the highest OA of 95% and a kappa value of 95%. Table 5 shows the confusion matrix for Scheme 7 and the F1 scores for the land-use classes. The results show that our land-use products not only yield high OA but also produce an F1 score above 90% for all classes, where the class with the highest accuracy (100%) is cropland and the class with the lowest accuracy is rocky terrain (91%). Notably, cropland was successfully identified with distinct boundaries from other land cover classes (Figure 10d,g,h), which was anticipated since cropland exhibits noticeable differences in NDVI values during the summer (Time3) compared to other vegetation types (Figure 4). All the Kolar pear samples (27 points) were accurately classified, with a producer's accuracy (PA) of 100%. However, some samples from forest and other land cover classes were misclassified as Kolar pear areas. Our mapping results revealed an omission error of over 10% for the grassland and rocky terrain classes (PA values of 89% and 87%, respectively). This finding can be attributed to the fact that certain grassland pixels were not recognized and were misinterpreted as forest and bare soil due to low grassland coverage during early growth stages (Time1) and spectral similarities between withered grassland and bare soil at later stages (Time4), as well as sparse, small shrubs within forest cover that resembled dense grassland. Additionally, 7% of the rock samples (four points) were not identified and were misclassified as bare soil. However, considering the similar composition between rocks and bare soil, this misclassification can be deemed acceptable, especially given the absence of existing land-use products depicting rock distribution (see Section 4.3). Another 4% of the rocky samples were erroneously classified as snow and ice, which was likely due to temporal changes in land-use types (rock being covered by ice/snow during certain periods).
Figure 10 shows the land-use distribution in the study area, revealing that our land-use products replicate most of the landscape characteristics of the study area. The northern part of the study area is alpine, lying south of the Tianshan Mountains, with land types dominated by grasslands, forest, snow and ice, wetlands, and rocks. The most notable area consisted of the largest alpine grasslands in China in the northwest, the Bayanbulak grasslands and grass wetlands (Figure 10b), and the permanent snow and ice near Daleng Darshan in the north-central region. The central part of the study area is a desert and oasis coexistence area dominated by cropland, Kolar pear trees, water, impervious area and bare soil, and rocks, the most notable of which are the eastern part of the study area, the large area of cropland near the Kolar Oasis Plain, the famous Kolar pear plantations (Figure 10e), and Bosten Lake. The southern part of the study area is located at the northern edge of the Taklamakan Desert (Figure 10i), the largest desert in China, with bare soil, forest, cropland, and water serving as the dominant land types. Finally, the Tarim River, the longest inland river in China, crosses the entire study area from west to east (Figure 10g), and China’s largest euphratica forest protection area is near the Tarim River.

4.2.2. Results of Area Statistics for Different Classes

We calculated the area of the land-use classes based on the multiple mapping results (Section 4.1.1), and the results are shown in Table S1. There is some variation in the area statistics of the different classes for all the mapping results, as indicated by the variation rates (the ratio of different pixels in the two images to the total pixels) between the mapping results (Figure S2). The average difference among all scenarios was 18% (27,079 km2), with the maximum difference being 33% (49,645 km2, CART-S1 vs. CART-S4) and the minimum difference being 6% (9026 km2, RF-S1 vs. GTB-S3), and the mapping differences are also consistent with the mapping accuracies of the different scenarios (Section 4.1.1). The numbers of mapping variances less than 10%, 20%, and 30% were 21, 135, and 208, respectively. Only two mapping differences (CART-S4 vs. CART-S1/CART-S2) exhibited differences greater than 30%.
Based on the S7-GTB map (Figure 10), the area of the different land-use categories in the study area is estimated to be 75,585 km2 of impervious area and bare soil (50.24%), 26,991 km2 of rocky area (17.94%), 25,166 km2 of grassland (16.73%), 7514 km2 of cropland (4.99%), 6736 km2 of forest (4.48%), 2509 km2 of water (1.67%), 2432 km2 of snow and ice (1.62%), 2266 km2 of wetland (1.51%), and 1240 km2 of Kolar pear plantations (0.82%). We also quantified the change in area of the different classes of the other scenarios compared to S7-GTB (the best scenario). As illustrated in Figure 11, upon comparing all classes to the optimal scenario (S7-GTB), 86 (48%) scenarios demonstrated a change in area of less than 10%, while 129 (72%) scenarios exhibited an area change below 20%. Furthermore, 154 (86%) scenarios experienced an area change below 30%. The number of differences exceeding 50% was 17 (9%), most of which (12) were related to the CART model. Notably, the area difference between RF-S5 and S7-GTB is minimal when all classes are considered, and the CART-S1 model is the model with the largest difference.
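The per-class areas can be derived from a classified image with a grouped sum of pixel areas, as sketched below for the classified image from the earlier snippets; dividing by 1e6 converts square metres to square kilometres.

```python
area_stats = (ee.Image.pixelArea().addBands(classified)
              .reduceRegion(reducer=ee.Reducer.sum().group(groupField=1,
                                                           groupName='class'),
                            geometry=roi,
                            scale=10,
                            maxPixels=1e13))

groups = ee.List(ee.Dictionary(area_stats).get('groups')).getInfo()
for g in groups:
    print('class', g['class'], ':', round(g['sum'] / 1e6), 'km2')
```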

4.3. Comparison with Existing Land-Use Products

Table 6, Figure 12 and Figure 13 compare the F1 scores and visualization of certain classes of our product with those of three global land-use products. Since the product classes are not identical, this comparison must be made carefully. Four land-use classes appear to be reasonably comparable (with similar definitions) across all products: grassland, water, snow and ice, and wetland. In contrast, cropland, forest, Kolar pear, impervious area and bare soil, and rocky terrain are much less comparable, so we do not discuss these classes in more detail.
The first notable difference is that the three land-use products, especially the ESA 2020 and Google 2021 products (both of which have F1 scores below 70%; see Table 6), significantly underestimate wetland area (F1 scores below 85%), with the Bayanbulak grassland/wetland area in the northwestern part of the study area being the most significant mistake (Figure 13). In contrast, our product accurately captures the distribution of this wetland (F1 score of 97%), and the wetland distribution is consistent with that reported in Xu's study [67]. In addition, Table 6 demonstrates that the three analyzed publicly available products produce relatively low F1 scores (56–73%) for grasslands when using our validation sample points; these scores are much lower than those output by our product (92%). The grassland area identified by our product is larger than that identified by the ESRI 2020 and Google 2021 products and smaller than that identified by the ESA product (Figure 13). Our product and the Google product can accurately identify snow and ice (both with an F1 score of 97%), whereas the other two products underestimate snow and ice area, especially the ESA2020 product (F1 score of 68%). The geographical locations of water were accurately mapped by all products, with high F1 scores (86~98%, 98% for our product), especially with respect to Bosten Lake, which lies in the eastern part of the study area. In this work, we merged the “tree” and “shrub” classes from the other products into the forest class (F1 score of 95%), which presented a smaller estimated area than the other products (particularly ESRI2020) due to different reference definitions. Conversely, our product features targeted depictions of local agricultural specialty plantations, Kolar pear plantations, and rocks for hydrological analysis, whereas Kolar pear plantations are identified as cropland by the other three products.

5. Discussion

5.1. Advantages of Multi-Temporal Image Integration

For remote-sensing crop mapping or land-use classification mapping, reasonable image integration methods can improve mapping accuracy and efficiency. Image integration methods include the following: (1) The fusion of equal-time-interval images, incorporating high-temporal-resolution (10-day) time series data, is often used for the fine mapping of multiple crops [68]. For example, You [69] compared the performance of Sentinel time series images at 10-, 15-, 20-, and 30-day resolutions for multiple crop mapping and found that the 10-day synthetic time series dataset allowed for more accurate early identification of multiple crops. (2) The fusion of images within a single growth period or over the full growing season is often suitable for mapping single crops (e.g., rice, ginger, and garlic) because the growth periods of different crops differ. In this work, our target land-use types consisted of four vegetation types: cropland, forest, pear orchard, and grassland. To obtain high-accuracy maps, a MODIS product with 8-day resolution was used to generate NDVI time series curves and thereby determine four key image-aggregation periods for these vegetation types. The subsequent classification results prove the reliability of image aggregation based on NDVI time series curves: the average OA of the three classifiers for the individual periods exceeds 80%, with OAs of 82%, 87%, 89%, and 91% and kappa values of 79%, 86%, 87%, and 90%, respectively. Tian and Schulz also claim that NDVI time series curves are useful for determining the critical effective time window [10,35].
In this study, the best classification accuracy was obtained using Scheme 7. The input classification features include the integrated images of all key periods, indicating that the use of integrated images of different periods boosts classification accuracy. The most helpful images were the Time2 fusion images, which improved the average OA and kappa values of all three classifiers by over 5%. The addition of the Time4 images had the least impact on classification accuracy, increasing the average OA and kappa values of all three classifiers by only about 2%. This analysis of the relative importance of fused images from different periods can provide a reference for similar work on land-use classification using multi-period images.

5.2. Importance of Feature Variables

To understand the relative importance of the input features of the classification model, we now discuss the importance of the feature variables in Scheme 7. The “Explain” function on the GEE platform can be used to calculate the importance scores of the features. During the construction of the classification algorithm, the bootstrap sampling technique is used to select a training subset from the full dataset, with the unselected observations forming the out-of-bag data. The out-of-bag data are used to estimate the weights of the feature variables, and the weighting of each feature variable is then obtained in a simplified averaged manner [59].
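A sketch of retrieving these scores from the trained GTB model via the classifier's explain() output (tree ensembles on GEE expose an 'importance' entry); normalizing to relative shares mirrors the percentage scores discussed below.

```python
explained = trained_gtb.explain()
importance = ee.Dictionary(ee.Dictionary(explained).get('importance'))

# Normalize the raw importance values to relative shares that sum to 1.
total = importance.values().reduce(ee.Reducer.sum())
relative = importance.map(lambda key, value: ee.Number(value).divide(total))
print(relative.getInfo())
```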
Figure 14 shows the importance of the feature variables in Scheme 7. The total score of each feature variable varies somewhat, with a maximum score of 16% (NDVI) and a minimum score of 8% (VV_Std/B12), which indicates that the features contribute differently to the model. The NDVI is the most important feature of the model, which is exemplified by the fact that five classes related to vegetation (cropland, forest, Kolar pear, grassland, and wetlands dominated by grass) were identified in this work. Notably, the highest importance score of 8% for the NDVI occurred at Time1. Elevation plays a role in classification, presenting an importance score of 9%, which is due to the wide north–south topographic variation in the study area, where almost all grasslands are on the plateau, whereas poplars in the forest appear mainly near the desert at lower elevations. Figure 14 also shows the importance scores for all features at different times, with total scores of 25% for Time1, 19% for Time2, 25% for Time3, and 22% for Time4. This result indicates that the features of Time1 and Time3 are relatively important for classification in Scheme 7.

5.3. Limitations and Prospects

This work maps the distribution of the main land-use types in the northern region of the Bayingoleng Mongol Autonomous Region, which contains vegetation, sandy soils, waterbodies, etc. For specific purposes such as hydrologic modeling and drought assessment, the role of sandy soils is similar to that of built-up areas [10], so we have not described in detail the distribution of built-up areas in the study area. The Kolar pear is one of the main cash crops in the study area, with a unique position and an important impact on local economic development and social life, and information regarding its distribution is highly valuable to local decision makers for planning, monitoring, and evaluating agricultural activities, such as in pre-harvest preparation. Unfortunately, the depiction of pear orchards is often neglected. In addition, compared with the other three products (ESA 2020, ESRI 2020, and Google 2021), we consider herein the importance of rocky areas in hydrological analysis; for example, rocks can directly affect water infiltration and evaporation and thus runoff [70]. Therefore, we mapped the distribution of local rock patches, which is of interest because the study area covers an arid and semi-arid zone where water management and planning are of paramount importance.
The results of previous studies have shown that the parameters of machine learning strongly impact the accuracy of a model. For example, the splitting rule in the RF model also affects the number of nodes in a single tree and the final classification result [71], and these parameters may seriously affect the accuracy of the model. In this study, we also analyzed the sensitivity of different classifiers to the input parameters. The optimal value of the minimum leaf population in the CART model was less than 10 in several schemes, and the accuracy of many models decreased when this parameter exceeded 10. Thus, we recommend that the minimum leaf population of the CART model for land-use classification be set to less than 10. In contrast, in the RF model, the optimal minimum leaf population is 1, whereas the optimal value of the tree parameter varies across schemes, so this parameter needs to be adjusted to local conditions. The GTB model is less sensitive to its parameters than the RF and CART models, and the best classification results were obtained using this model in Scheme 7, which also required the fewest trees (140) among the schemes. This may be related to the number of input features, as more effective features tend to require fewer decision trees. The parameter sensitivity analysis of the classifiers not only determined the optimal parameters for this work but also provides a reference for the land-use classification of other highly heterogeneous areas.
The emergence of the GEE cloud-computing platform has made large-scale land-use mapping quicker and more convenient [27,28]. The impressive computing power of Google's servers supports this study, and the freely and publicly available Sentinel images spanning many years allow us to produce multi-year time-series land-use products [68]. Multi-year land-use data can reveal the characteristics of and variations in land cover under different states and can also support analyses of the correlation between land use and its driving forces, thereby facilitating the prediction of future land-use trends in a given study area. In addition, although this study uses numerous sample points (from both field surveys and visual interpretation), the validity of these sample points for other land-use products is uncertain because land use is dynamic. Supervised classification places strong requirements on sample points, but multi-year sample points are often difficult to obtain for economic and security reasons. Therefore, studies of sample migration or of classification with limited samples would be worth conducting [72,73].

6. Conclusions

To develop reliable land-use products for the highly heterogeneous environment of the southern Xinjiang Uygur Autonomous Region, this study not only assessed and compared the performance of multiple classifiers for highly heterogeneous land-use classification using multi-temporal Sentinel images but also explored how multi-temporal features affect land-use classification models. The following conclusions can be drawn from the results: (1) The GTB classifier demonstrates significant potential for land-use classification, exhibiting superior mapping accuracy compared to RF and CART across scenarios. Specifically, in Scheme 7, the GTB model achieved the highest overall accuracy (95%), kappa coefficient (95%), and F1 score (98%). This performance highlights the model's ability to accurately classify and map land use and underscores its advantage over the alternative methods. (2) The autumn image features exhibit strong discriminability for land-use types. When using the autumn features (Time4) alone, all three classifiers yielded their highest single-period accuracies (OA values of 86%, 93%, and 94% for CART, RF, and GTB, respectively, and kappa values of 84%, 91%, and 92%, respectively). This finding underscores the significance of autumn imagery in accurately differentiating land-use types. (3) The inclusion of multi-temporal image features consistently improved classification performance. With the integration of multi-temporal features, the classifiers achieved an average increase of 4% in overall accuracy, with a minimum increase of 2% and a maximum increase of 6%. In terms of the kappa coefficient, an average improvement of 5% was observed, with the maximum improvement reaching 7%. These findings highlight the significant positive impact of incorporating multi-temporal information on the accuracy and reliability of land-use classification. The analysis of how these classifiers and image features affect land-use maps offers valuable insights for conducting similar land-use classifications in highly heterogeneous regions. Moreover, our findings contribute to the production of precise land-use maps for the diverse environments within the study area. In particular, we achieved the accurate mapping of significant features such as Korla pear plantations, which exert a substantial influence on local economic development, as well as the alpine wetlands located in the northwest of the region. These outcomes have significant implications for understanding and managing complex landscapes with diverse land-use patterns.

Supplementary Materials

The following supporting information can be downloaded at: https://www.mdpi.com/article/10.3390/rs15163958/s1, Figure S1. Mapping result for Schemes 1~7 using Classification and Regression Trees (CART), Random Forest (RF) and Gradient Tree Boost (GTB); S1: using Time1 Features, S2: using Time2 Features, S3: using Time3 Features, S4: using Time4 Features, S5: using Time1 and Time2 Features, S6: using Time1, Time2 and Time3 Features, S7: using Time1, Time2, Time3 and Time4 Features; Figure S2. Rates of variation in all classes between results of multiple classifications. (a) Heat map and (b) statistical charts of difference rates. Variation rates: Ratio of different pixels in the two images to the total pixels. Table S1. Area statistics for different classes of multiple classification results.

Author Contributions

Conceptualization, R.C. and H.F.; investigation, R.C. and H.Y.; methodology, R.C. and Y.L.; data curation, R.C., H.X. and Y.M.; software, R.C., Y.L. and C.Z.; validation and visualization, H.Y. and Y.L.; formal analysis, C.Z. and H.L.; supervision and resources, Y.L. and H.X.; writing—original draft, R.C. and H.F.; funding acquisition, G.Y. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by the Key Scientific and Technological Projects of Heilongjiang Province (2021ZXJ05A05) and the National Natural Science Foundation of China (42171303).

Data Availability Statement

MODIS, Sentinel-1 and Sentinel-2A/B data are openly available via the Google Earth Engine (https://earthengine.google.com, accessed on 6 August 2023). Additional data supporting the results of this study can be obtained from the first author (Riqiang Chen, [email protected]) upon reasonable request.

Conflicts of Interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

References

  1. Anand, J.; Gosain, A.K.; Khosa, R. Prediction of land use changes based on Land Change Modeler and attribution of changes in the water balance of Ganga basin to land use change using the SWAT model. Sci. Total Environ. 2018, 644, 503–519. [Google Scholar] [CrossRef] [PubMed]
  2. Teferi, H.H.; Giday, K.; Orshoven, J.V.; Muys, B.; Witlox, F. Analysis of Land Use Land Cover Dynamics and Driving Factors in Desa’a Forest in Northern Ethiopia. Land Use Policy 2020, 101, 105039. [Google Scholar] [CrossRef]
  3. Mlba, B.; At, C.; Nh, D.; Dtm, E.; Ea, E.; Mt, C.; Tm, F.; Aaf, C.; Ds, B.; Mya, B. Exploring land use/land cover changes, drivers and their implications in contrasting agro-ecological environments of Ethiopia. Land Use Policy 2019, 87, 104052. [Google Scholar] [CrossRef]
  4. Dadashpoor, H.; Azizi, P.; Moghadasi, M. Land use change, urbanization, and change in landscape pattern in a metropolitan area. Sci. Total Environ. 2019, 655, 707–719. [Google Scholar] [CrossRef] [PubMed]
  5. Saputra, M.H.; Lee, H.S. Prediction of Land Use and Land Cover Changes for North Sumatra, Indonesia, Using an Artificial-Neural-Network-Based Cellular Automaton. Sustainability 2019, 11, 3024. [Google Scholar] [CrossRef] [Green Version]
  6. Zanaga, D.; De Kerchove, R.V.; Keersmaecker, W.D.; Souverijns, N.; Brockmann, C.; Quast, R.; Wevers, J.; Grosu, A.; Paccini, A.; Vergnaud, S.; et al. ESA WorldCover 10 m 2020 v100; VITO: Genève, Switzerland, 2021. [Google Scholar] [CrossRef]
  7. Karra, K.; Kontgis, C.; Statman-Weil, Z.; Mazzariello, J.C.; Mathis, M.; Brumby, S.P. Global land use/land cover with Sentinel 2 and deep learning. In Proceedings of 2021 IEEE International Geoscience and Remote Sensing Symposium IGARSS, Brussels, Belgium, 11–16 July 2021; pp. 4704–4707. [Google Scholar]
  8. Brown, C.F.; Brumby, S.P.; Guzder-Williams, B.; Birch, T.; Hyde, S.B.; Mazzariello, J.; Czerwinski, W.; Pasquarella, V.J.; Haertel, R.; Ilyushchenko, S.; et al. Dynamic World, Near real-time global 10 m land use land cover mapping. Sci. Data 2022, 9, 251. [Google Scholar] [CrossRef]
  9. Sun, L.; Chen, J.; Guo, S.; Deng, X.; Han, Y. Integration of time series sentinel-1 and sentinel-2 imagery for crop type mapping over oasis agricultural areas. Remote Sens. 2020, 12, 158. [Google Scholar] [CrossRef] [Green Version]
  10. Schulz, D.; Yin, H.; Tischbein, B.; Verleysdonk, S.; Adamou, R.; Kumar, N. Land use mapping using Sentinel-1 and Sentinel-2 time series in a heterogeneous landscape in Niger, Sahel. ISPRS J. Photogramm. Remote Sens. 2021, 178, 97–111. [Google Scholar] [CrossRef]
  11. Abdi, A.M. Land cover and land use classification performance of machine learning algorithms in a boreal landscape using Sentinel-2 data. GISci. Remote Sens. 2020, 57, 1–20. [Google Scholar] [CrossRef] [Green Version]
  12. Xie, G.; Niculescu, S. Mapping and monitoring of land cover/land use (LCLU) changes in the crozon peninsula (Brittany, France) from 2007 to 2018 by machine learning algorithms (support vector machine, random forest, and convolutional neural network) and by post-classification comparison (PCC). Remote Sens. 2021, 13, 3899. [Google Scholar]
  13. Wang, Z.; Zhang, H.; He, W.; Zhang, L. Cross-phenological-region crop mapping framework using Sentinel-2 time series Imagery: A new perspective for winter crops in China. ISPRS J. Photogramm. Remote Sens. 2022, 193, 200–215. [Google Scholar] [CrossRef]
  14. Xie, G.; Niculescu, S. Mapping crop types using sentinel-2 data machine learning and monitoring crop phenology with sentinel-1 backscatter time series in pays de Brest, Brittany, France. Remote Sens. 2022, 14, 4437. [Google Scholar] [CrossRef]
  15. Hunt, M.L.; Blackburn, G.A.; Carrasco, L.; Redhead, J.W.; Rowland, C.S. High resolution wheat yield mapping using Sentinel-2. Remote Sens. Environ. 2019, 233, 111410. [Google Scholar] [CrossRef]
  16. Perrou, T.; Garioud, A.; Parcharidis, I. Use of Sentinel-1 imagery for flood management in a reservoir-regulated river basin. Front. Earth Sci. 2018, 12, 506–520. [Google Scholar] [CrossRef]
  17. Lam, C.-N.; Niculescu, S.; Bengoufa, S. Monitoring and mapping floods and floodable areas in the Mekong Delta (Vietnam) using time-series sentinel-1 images, convolutional neural Network, multi-layer perceptron, and random forest. Remote Sens. 2023, 15, 2001. [Google Scholar] [CrossRef]
  18. Xu, H.; Wang, M.; Shi, T.; Guan, H.; Fang, C.; Lin, Z. Prediction of ecological effects of potential population and impervious surface increases using a remote sensing based ecological index (RSEI). Ecol. Indic. 2018, 93, 730–740. [Google Scholar] [CrossRef]
  19. Ma, Y.; Wang, L.; Liu, P.; Ranjan, R. Towards building a data-intensive index for big data computing—A case study of Remote Sensing data processing. Inf. Sci. 2015, 319, 171–188. [Google Scholar] [CrossRef]
  20. Wang, L.; Ma, Y.; Yan, J.; Chang, V.; Zomaya, A.Y. pipsCloud: High performance cloud computing for remote sensing big data management and processing. Future Gener. Comput. Syst. 2018, 78, 353–368. [Google Scholar] [CrossRef] [Green Version]
  21. Zhang, R.; Wang, S.a.; Gao, W.; Sun, W.; Wang, J.; Niu, L. Remote-sensing classification method of county-level agricultural crops using time-series NDVI. Trans. Chin. Soc. Agric. Mach. 2015, 46, 246–252. [Google Scholar]
  22. Wu, J.; Lu, Y.; Li, C.; Li, Q. Fine classification of county crops based on multi-temporal images of Sentinel-2A. Trans. Chin. Soc. Agric. Mach. 2019, 50, 194–200. [Google Scholar] [CrossRef]
  23. Gorelick, N.; Hancher, M.; Dixon, M.; Ilyushchenko, S.; Thau, D.; Moore, R. Google Earth Engine: Planetary-scale geospatial analysis for everyone. Remote Sens. Environ. 2017, 202, 18–27. [Google Scholar] [CrossRef]
  24. Chen, S.; Woodcock, C.E.; Bullock, E.L.; Arévalo, P.; Torchinava, P.; Peng, S.; Olofsson, P. Monitoring temperate forest degradation on Google Earth Engine using Landsat time series analysis. Remote Sens. Environ. 2021, 265, 112648. [Google Scholar] [CrossRef]
  25. Wang, Y.; Li, Z.; Zeng, C.; Xia, G.-S.; Shen, H. An urban water extraction method combining deep learning and Google Earth engine. IEEE J. Sel. Top. Appl. Earth Obs. Remote Sens. 2020, 13, 769–782. [Google Scholar] [CrossRef]
  26. Xiong, J.; Thenkabail, P.S.; Gumma, M.K.; Teluguntla, P.; Poehnelt, J.; Congalton, R.G.; Yadav, K.; Thau, D. Automated cropland mapping of continental Africa using Google Earth Engine cloud computing. ISPRS J. Photogramm. Remote Sens. 2017, 126, 225–244. [Google Scholar] [CrossRef] [Green Version]
  27. Jin, Z.; Azzari, G.; You, C.; Di Tommaso, S.; Aston, S.; Burke, M.; Lobell, D.B. Smallholder maize area and yield mapping at national scales with Google Earth Engine. Remote Sens. Environ. 2019, 228, 115–128. [Google Scholar] [CrossRef]
  28. DeVries, B.; Huang, C.; Armston, J.; Huang, W.; Jones, J.W.; Lang, M.W. Rapid and robust monitoring of flood events using Sentinel-1 and Landsat data on the Google Earth Engine. Remote Sens. Environ. 2020, 240, 111664. [Google Scholar] [CrossRef]
  29. Liu, X.; Hu, G.; Chen, Y.; Li, X.; Xu, X.; Li, S.; Pei, F.; Wang, S. High-resolution multi-temporal mapping of global urban land using Landsat images based on the Google Earth Engine Platform. Remote Sens. Environ. 2018, 209, 227–239. [Google Scholar] [CrossRef]
  30. Blickensdörfer, L.; Schwieder, M.; Pflugmacher, D.; Nendel, C.; Erasmi, S.; Hostert, P. Mapping of crop types and crop sequences with combined time series of Sentinel-1, Sentinel-2 and Landsat 8 data for Germany. Remote Sens. Environ. 2022, 269, 112831. [Google Scholar] [CrossRef]
  31. Simonetti, D.; Simonetti, E.; Szantoi, Z.; Lupi, A.; Eva, H.D. First Results From the Phenology-Based Synthesis Classifier Using Landsat 8 Imagery. IEEE Geosci. Remote Sens. Lett. 2015, 12, 1496–1500. [Google Scholar] [CrossRef]
  32. Zhu, Y.; Yang, G.; Yang, H.; Wu, J.; Lei, L.; Zhao, F.; Fan, L.; Zhao, C. Identification of apple orchard planting year based on spatiotemporally fused satellite images and clustering analysis of foliage phenophase. Remote Sens. 2020, 12, 1199. [Google Scholar] [CrossRef] [Green Version]
  33. Dong, J.; Fu, Y.; Wang, J.; Tian, H.; Fu, S.; Niu, Z.; Han, W.; Zheng, Y.; Huang, J.; Yuan, W. Early-season mapping of winter wheat in China based on Landsat and Sentinel images. Earth Syst. Sci. Data 2020, 12, 3081–3095. [Google Scholar] [CrossRef]
  34. Dong, C.; Zhao, G.; Qin, Y.; Wan, H. Area extraction and spatiotemporal characteristics of winter wheat–summer maize in Shandong Province using NDVI time series. PLoS ONE 2019, 14, e0226508. [Google Scholar] [CrossRef] [PubMed] [Green Version]
  35. Tian, H.; Pei, J.; Huang, J.; Li, X.; Wang, J.; Zhou, B.; Qin, Y.; Wang, L. Garlic and winter wheat identification based on active and passive satellite imagery and the google earth engine in northern china. Remote Sens. 2020, 12, 3539. [Google Scholar] [CrossRef]
  36. García, V.J.; Márquez, C.O.; Isenhart, T.M.; Rodríguez, M.; Crespo, S.D.; Cifuentes, A.G. Evaluating the conservation state of the páramo ecosystem: An object-based image analysis and CART algorithm approach for central Ecuador. Heliyon 2019, 5, e02701. [Google Scholar] [CrossRef] [Green Version]
  37. Islam, K.; Rahman, M.F.; Jashimuddin, M. Modeling land use change using cellular automata and artificial neural network: The case of Chunati Wildlife Sanctuary, Bangladesh. Ecol. Indic. 2018, 88, 439–453. [Google Scholar] [CrossRef]
  38. Zaabar, N.; Niculescu, S.; Kamel, M.M. Application of convolutional neural networks with object-based image analysis for land cover and land use mapping in coastal areas: A case study in Ain Témouchent, Algeria. IEEE J. Sel. Top. Appl. Earth Obs. Remote Sens. Appl. Soc. Environ. 2022, 15, 5177–5189. [Google Scholar] [CrossRef]
  39. Orieschnig, C.A.; Belaud, G.; Venot, J.-P.; Massuel, S.; Ogilvie, A. Input imagery, classifiers, and cloud computing: Insights from multi-temporal LULC mapping in the Cambodian Mekong Delta. Eur. J. Remote Sens. 2021, 54, 398–416. [Google Scholar] [CrossRef]
  40. Sun, G.; Lu, H.; Zheng, J. Spatio-temporal variation of ecological vulnerability in Xinjiang and driving force analysis. Arid Zone Res. 2022, 39, 258–269. [Google Scholar] [CrossRef]
  41. Feng, Y.; Luo, G.; Zhou, D. Effects of land use change on landscape pattern of a typical arid watershed in the recent 50 years: A case study on Manas River Watershed in Xinjiang. Acta Ecol. Sin. 2010, 30, 4295–4305. [Google Scholar]
  42. Long, A.; Zhang, P.; Hai, Y.; Deng, X.; Li, J.; Wang, J. Spatio-Temporal Variations of Crop Water Footprint and Its Influencing Factors in Xinjiang, China during 1988–2017. Sustainability 2020, 12, 9678. [Google Scholar] [CrossRef]
  43. Wang, Q.; Shi, W.; Li, Z.; Atkinson, P.M. Fusion of Sentinel-2 images. Remote Sens. Environ. 2016, 187, 241–252. [Google Scholar] [CrossRef] [Green Version]
  44. Mullissa, A.; Vollrath, A.; Odongo-Braun, C.; Slagter, B.; Balling, J.; Gou, Y.; Gorelick, N.; Reiche, J. Sentinel-1 SAR Backscatter Analysis Ready Data Preparation in Google Earth Engine. Remote Sens. 2021, 13, 1954. [Google Scholar] [CrossRef]
  45. Li, J.; Wang, L.; Liu, S.; Peng, B.; Ye, H. An automatic cloud detection model for Sentinel-2 imagery based on Google Earth Engine. Remote Sens. Lett. 2022, 13, 196–206. [Google Scholar] [CrossRef]
  46. Lu, S.; Jia, L.; Zhang, L.; Wei, Y.; Baig, M.H.A.; Zhai, Z.; Meng, J.; Li, X.; Zhang, G. Lake water surface mapping in the Tibetan Plateau using the MODIS MOD09Q1 product. Remote Sens. Lett. 2017, 8, 224–233. [Google Scholar] [CrossRef]
  47. NASA-JPL. NASADEM Merged DEM Global 1 Arc Second V001. 2020, Distributed by NASA EOSDIS Land Processes DAAC. 2020. Available online: https://doi.org/10.5067/MEaSUREs/NASADEM/NASADEM_HGT.001 (accessed on 21 November 2022).
  48. Cao, R.; Chen, Y.; Shen, M.; Chen, J.; Zhou, J.; Wang, C.; Yang, W. A simple method to improve the quality of NDVI time-series data by integrating spatiotemporal information with the Savitzky-Golay filter. Remote Sens. Environ. 2018, 217, 244–257. [Google Scholar] [CrossRef]
  49. Nie, Y.; He, X.; Tan, Y.; Jiang, J.; Liu, X. Spatio-temporal Evolution of Natural Vegetation in Aksu River Basin and Its Response to Ecological Water Transport. J. Yangtze River Sci. Res. Inst. 2022, 39, 61–67. [Google Scholar] [CrossRef]
  50. Teng, J.; Xia, S.; Liu, Y.; Yu, X.; Duan, H.; Xiao, H.; Zhao, C. Assessing habitat suitability for wintering geese by using Normalized Difference Water Index (NDWI) in a large floodplain wetland, China. Ecol. Indic. 2021, 122, 107260. [Google Scholar] [CrossRef]
  51. Cao, Y.; Tao, Y.; Deng, L. An impervious surface index construction for restraining bare land. Remote Sens. Land Resour. 2020, 32, 71–79. [Google Scholar] [CrossRef]
  52. Lee, J.S.H.; Wich, S.; Widayati, A.; Koh, L.P. Detecting industrial oil palm plantations on Landsat images with Google Earth Engine. Remote Sens. Appl. Soc. Environ. 2016, 4, 219–224. [Google Scholar] [CrossRef] [Green Version]
  53. Chen, Y.; Song, X.; Wang, S.; Huang, J.; Mansaray, L.R. Impacts of spatial heterogeneity on crop area mapping in Canada using MODIS data. ISPRS J. Photogramm. Remote Sens. 2016, 119, 451–461. [Google Scholar] [CrossRef]
  54. Ghiasi, M.M.; Zendehboudi, S.; Mohsenipour, A.A. Decision tree-based diagnosis of coronary artery disease: CART model. Comput. Methods Programs Biomed. 2020, 192, 105400. [Google Scholar] [CrossRef]
  55. Heung, B.; Ho, H.C.; Zhang, J.; Knudby, A.; Bulmer, C.E.; Schmidt, M.G. An overview and comparison of machine-learning techniques for classification purposes in digital soil mapping. Geoderma 2016, 265, 62–77. [Google Scholar] [CrossRef]
  56. Srivastava, R.; Tiwari, A.; Giri, V. Solar radiation forecasting using MARS, CART, M5, and random forest model: A case study for India. Heliyon 2019, 5, e02692. [Google Scholar] [CrossRef] [PubMed] [Green Version]
  57. Choubin, B.; Darabi, H.; Rahmati, O.; Sajedi-Hosseini, F.; Kløve, B. River suspended sediment modelling using the CART model: A comparative study of machine learning techniques. Sci. Total Environ. 2018, 615, 272–281. [Google Scholar] [CrossRef] [PubMed]
  58. Schonlau, M.; Zou, R.Y. The random forest algorithm for statistical learning. Stata J. 2020, 20, 3–29. [Google Scholar] [CrossRef]
  59. Li, C.; Chen, W.; Wang, Y.; Wang, Y.; Ma, C.; Li, Y.; Li, J.; Zhai, W. Mapping Winter Wheat with Optical and SAR Images Based on Google Earth Engine in Henan Province, China. Remote Sens. 2022, 14, 284. [Google Scholar] [CrossRef]
  60. Zamani Joharestani, M.; Cao, C.; Ni, X.; Bashir, B.; Talebiesfandarani, S. PM2. 5 prediction based on random forest, XGBoost, and deep learning using multisource remote sensing data. Atmosphere 2019, 10, 373. [Google Scholar] [CrossRef] [Green Version]
  61. Naghibi, S.A.; Ahmadi, K.; Daneshi, A. Application of support vector machine, random forest, and genetic algorithm optimized random forest models in groundwater potential mapping. Water Resour. Manag. 2017, 31, 2761–2775. [Google Scholar] [CrossRef]
  62. Talukdar, S.; Singha, P.; Mahato, S.; Shahfahad; Pal, S.; Liou, Y.-A.; Rahman, A. Land-Use Land-Cover Classification by Machine Learning Classifiers for Satellite Observations—A Review. Remote Sens. 2020, 12, 1135. [Google Scholar] [CrossRef] [Green Version]
  63. Ouma, Y.; Nkwae, B.; Moalafhi, D.; Odirile, P.; Parida, B.; Anderson, G.; Qi, J. Comparison of machine learning classifiers for multitemporal and multisensor mapping of urban lulc features. Int. Arch. Photogramm. Remote Sens. 2022, 43, 681–689. [Google Scholar] [CrossRef]
  64. Truong, V.-H.; Vu, Q.-V.; Thai, H.-T.; Ha, M.-H. A robust method for safety evaluation of steel trusses using Gradient Tree Boosting algorithm. Adv. Eng. Softw. 2020, 147, 102825. [Google Scholar] [CrossRef]
  65. Ehrentraut, C.; Ekholm, M.; Tanushi, H.; Tiedemann, J.; Dalianis, H. Detecting hospital-acquired infections: A document classification approach using support vector machines and gradient tree boosting. Health Inf. J. 2018, 24, 24–42. [Google Scholar] [CrossRef] [PubMed] [Green Version]
  66. Ghatkar, J.G.; Singh, R.K.; Shanmugam, P. Classification of algal bloom species from remote sensing data using an extreme gradient boosted decision tree model. Int. J. Remote Sens. 2019, 40, 9412–9438. [Google Scholar] [CrossRef]
  67. Xu, X.; Wang, X.; Zhu, X.; Jia, H.T.; Han, D.l. Landscape Pattern Changes in Alpine Wetland of Bayanbulak Swan Lake during 1996-2015. J. Nat. Resour. 2018, 33, 1897–1911. [Google Scholar] [CrossRef]
  68. You, N.; Dong, J.; Huang, J.; Du, G.; Zhang, G.; He, Y.; Yang, T.; Di, Y.; Xiao, X. The 10-m crop type maps in Northeast China during 2017–2019. Sci. Data 2021, 8, 1–11. [Google Scholar] [CrossRef] [PubMed]
  69. You, N.; Dong, J. Examining earliest identifiable timing of crops using all available Sentinel 1/2 imagery and Google Earth Engine. ISPRS J. Photogramm. Remote Sens. 2020, 161, 109–123. [Google Scholar] [CrossRef]
  70. Zhang, Y.; Zhang, M.; Niu, J.; Li, H.; Xiao, R.; Zheng, H.; Bech, J. Rock fragments and soil hydrological processes: Significance and progress. Catena 2016, 147, 153–166. [Google Scholar] [CrossRef]
  71. Fang, P.; Zhang, X.; Wei, P.; Wang, Y.; Zhao, J. The Classification Performance and Mechanism of Machine Learning Algorithms in Winter Wheat Mapping Using Sentinel-2 10 m Resolution Imagery. Appl. Sci. 2020, 10, 5075. [Google Scholar] [CrossRef]
  72. Hu, Y.; Zeng, H.; Tian, F.; Zhang, M.; Wu, B.; Gilliams, S.; Li, S.; Li, Y.; Lu, Y.; Yang, H. An interannual transfer learning approach for crop classification in the Hetao Irrigation district, China. Remote Sens. 2022, 14, 1208. [Google Scholar] [CrossRef]
  73. Wen, Y.; Li, X.; Mu, H.; Zhong, L.; Chen, H.; Zeng, Y.; Miao, S.; Su, W.; Gong, P.; Li, B. Mapping corn dynamics using limited but representative samples with adaptive strategies. ISPRS J. Photogramm. Remote Sens. 2022, 190, 252–266. [Google Scholar] [CrossRef]
Figure 1. (a) Location of the study area. (b) Topographic map of the study area (NASA Digital Elevation Model imagery). (c) Distribution of the survey sample sites and Sentinel-2 satellite imagery, 1 June to 30 August 2021, median RGB composite (R: band 4, G: band 3, B: band 2).
Figure 2. Technical flowchart of this work.
Figure 3. Photographs of different land-use classes.
Figure 4. MODIS-SG-NDVI time-series curves for different vegetation types (cropland, forest, Korla pear, and grassland) in the study area in 2021. The curves provide the mean NDVI values of the sample points.
Figure 5. Time-series variation of different spectral properties for each class. (a–g) Time-series variation curves for different land-use categories for the B11, B12, TC_Wetness, NDWI, BRISI, SD_VV, and SD_VH features, respectively.
Figure 6. Accuracy of multiple classifiers in multiple experimental schemes.
Figure 7. Accuracy metrics of CART for various algorithm parameters.
Figure 8. Accuracy metrics of RF for various algorithm parameters: (a–g) Scheme 1 to Scheme 7, respectively.
Figure 9. Accuracy metrics of GTB for various algorithm parameters: (a–g) Scheme 1 to Scheme 7, respectively.
Figure 10. Land-use map created using the fused multi-temporal image (Scheme 7) and the GTB algorithm. (a) Distribution over the whole study area. (b–i) Detailed distribution of the seven regions in the study area; the Sentinel RGB image is on the left, and the land-use map is on the right.
Figure 11. Changes in area coverage for the other scenarios compared to the S7-GTB map (the best map). Positive values represent greater area than in the S7-GTB map, and negative values represent less area. (a) Change in area coverage compared to S7-GTB; (b) statistics on change rates.
Figure 12. F1 scores yielded by our products compared with those of the three free products for certain classes.
Figure 13. Comparison of our product with three other free and publicly available land-use products. (a) Sentinel-2 satellite imagery, 1 June to 30 August 2021, median RGB composite (R: band 4, G: band 3, B: band 2); (b) our land-use map using Scheme 7; (c) ESA 2020; (d) ESRI 2020; (e) Google 2021.
Figure 14. Importance of the feature variables in Scheme 7.
Table 1. The sample data used in this study.

Land-Use Classes | Cro. | For. | Kor. | Gra. | Wat. | Sno. | Wet. | Imp. | Roc. | Total
Field survey | 91 | 9 | 37 | 48 | 39 | 2 | 73 | 195 | 44 | 538
Visual interpretation | 92 | 146 | 54 | 95 | 122 | 135 | 78 | 117 | 150 | 989
Total | 183 | 155 | 91 | 143 | 161 | 137 | 151 | 312 | 194 | 1527
Percentage | 12% | 10% | 6% | 9% | 11% | 9% | 10% | 20% | 13% | 100%
Training | 125 | 108 | 64 | 98 | 112 | 103 | 98 | 224 | 140 | 1069
Validation | 58 | 47 | 27 | 45 | 49 | 34 | 56 | 88 | 54 | 458
Note: Cro., For., Kor., Gra., Wat., Sno., Wet., Imp., and Roc. stand for cropland, forest, Korla pear, grassland, water, snow and ice, wetland, impervious area and bare soil, and rocky terrain, respectively.
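The roughly 70/30 partition between training (1069) and validation (458) samples in Table 1 can be produced with a simple random split on GEE. The sketch below is a minimal illustration assuming a merged sample collection stored as an asset (placeholder path and label property); it is not necessarily the exact splitting procedure used by the authors.

import ee

ee.Initialize()

# Placeholder: merged field-survey and visual-interpretation samples.
samples = ee.FeatureCollection('users/example/landuse_samples')

# Add a uniform [0, 1) random column and split roughly 70/30.
samples = samples.randomColumn('random', seed=42)
training = samples.filter(ee.Filter.lt('random', 0.7))
validation = samples.filter(ee.Filter.gte('random', 0.7))

print(training.size().getInfo(), validation.size().getInfo())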
Table 2. Experimental schemes used in this work.

Scheme | Time1 | Time2 | Time3 | Time4
1 | ☑ |  |  |
2 |  | ☑ |  |
3 |  |  | ☑ |
4 |  |  |  | ☑
5 | ☑ | ☑ |  |
6 | ☑ | ☑ | ☑ |
7 | ☑ | ☑ | ☑ | ☑
Note: ☑ indicates that the imagery of the corresponding period is used.
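One convenient way to realise the schemes in Table 2 on GEE is to build one feature composite per period and concatenate the required periods into a single multi-band stack, as sketched below; time1_img through time4_img stand for per-period composites of the Table 3 features and are placeholder assets, not the authors' exact data.

import ee

ee.Initialize()

# Placeholder per-period feature composites (e.g., median composites of the
# Table 3 bands and indices for Time1..Time4).
time1_img = ee.Image('users/example/features_time1')
time2_img = ee.Image('users/example/features_time2')
time3_img = ee.Image('users/example/features_time3')
time4_img = ee.Image('users/example/features_time4')

def suffix(img, tag):
    """Append a period tag to every band name so stacked bands stay unique."""
    return img.rename(img.bandNames().map(lambda b: ee.String(b).cat('_').cat(tag)))

scheme1 = suffix(time1_img, 'T1')                    # single-period scheme
scheme5 = scheme1.addBands(suffix(time2_img, 'T2'))  # Time1 + Time2
scheme6 = scheme5.addBands(suffix(time3_img, 'T3'))  # Time1 + Time2 + Time3
scheme7 = scheme6.addBands(suffix(time4_img, 'T4'))  # all four periods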
Table 3. Features used for classification and their descriptions.

Platform | Band or Index | Wavelength or Formula | Reference
Sentinel-1 | SD_VV | Standard deviation of VV | [10]
Sentinel-1 | SD_VH | Standard deviation of VH | [10]
Sentinel-2 | SWIR 1 (B11) | Short-wave infrared | [27]
Sentinel-2 | SWIR 2 (B12) | Short-wave infrared | [27]
Sentinel-2 | NDVI | (NIR − Red)/(NIR + Red) | [27]
Sentinel-2 | NDWI | (Green − NIR)/(Green + NIR) | [50]
Sentinel-2 | TC_Wetness | 0.1509 Blue + 0.1973 Green + 0.3279 Red + 0.3406 NIR − 0.7112 SWIR1 − 0.4572 SWIR2 | [10]
Sentinel-2 | BRISI | (ISBAI − BAI)/(ISBAI + BAI), where ISBAI = (2 SWIR1 − (Red + NIR)/2)/(2 SWIR1 + (Red + NIR)/2) and BAI = ((Red + SWIR1) − (Blue + Green + SWIR2))/((Red + SWIR1) + (Blue + Green + SWIR2)) | [51]
NASADEM | Elevation | NASADEM digital elevation (30 m) | [47]
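As an example of how the optical features in Table 3 can be derived, the sketch below computes NDVI, NDWI, TC_Wetness, and BRISI from a Sentinel-2 surface-reflectance composite in GEE; the collection ID and band names follow the public COPERNICUS/S2_SR product, while the date range, location point, and reflectance scaling are illustrative placeholders rather than the authors' exact processing chain.

import ee

ee.Initialize()

# Illustrative autumn (Time4) median composite near the study area
# (placeholder point and dates), scaled to surface reflectance.
s2 = (ee.ImageCollection('COPERNICUS/S2_SR')
      .filterDate('2021-09-25', '2021-11-05')
      .filterBounds(ee.Geometry.Point(86.15, 41.76))
      .median()
      .divide(10000))

ndvi = s2.normalizedDifference(['B8', 'B4']).rename('NDVI')
ndwi = s2.normalizedDifference(['B3', 'B8']).rename('NDWI')

bands = {'BLUE': s2.select('B2'), 'GREEN': s2.select('B3'), 'RED': s2.select('B4'),
         'NIR': s2.select('B8'), 'SWIR1': s2.select('B11'), 'SWIR2': s2.select('B12')}

# Tasseled-cap wetness with the coefficients listed in Table 3.
tc_wetness = s2.expression(
    '0.1509*BLUE + 0.1973*GREEN + 0.3279*RED + 0.3406*NIR - 0.7112*SWIR1 - 0.4572*SWIR2',
    bands).rename('TC_Wetness')

# BRISI built from the ISBAI and BAI terms in Table 3.
isbai = s2.expression('(2*SWIR1 - (RED + NIR)/2) / (2*SWIR1 + (RED + NIR)/2)', bands)
bai = s2.expression(
    '((RED + SWIR1) - (BLUE + GREEN + SWIR2)) / ((RED + SWIR1) + (BLUE + GREEN + SWIR2))',
    bands)
brisi = isbai.subtract(bai).divide(isbai.add(bai)).rename('BRISI')

features = ee.Image.cat([ndvi, ndwi, tc_wetness, brisi])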
Table 4. Accuracy of multiple classifiers in multiple experimental schemes.

Scheme | CART OA | CART Kappa | CART F1 Score | RF OA | RF Kappa | RF F1 Score | GTB OA | GTB Kappa | GTB F1 Score
S1 | 76% | 72% | 86% | 83% | 81% | 91% | 86% | 84% | 93%
S2 | 81% | 79% | 90% | 87% | 85% | 93% | 89% | 87% | 94%
S3 | 86% | 84% | 92% | 92% | 91% | 96% | 93% | 92% | 96%
S4 | 86% | 84% | 93% | 93% | 92% | 96% | 94% | 93% | 97%
S5 | 82% | 80% | 90% | 89% | 88% | 94% | 91% | 90% | 95%
S6 | 88% | 86% | 93% | 93% | 92% | 96% | 94% | 93% | 97%
S7 | 90% | 89% | 95% | 95% | 95% | 98% | 95% | 95% | 98%
Table 5. Confusion matrix of Scheme 7 using GTB (rows: reference validation samples; columns: classification result of the land-use map for S7-GTB).

Reference | Cro. | For. | Kor. | Gra. | Wat. | Sno. | Wet. | Imp. | Roc. | Sum | PA | F1
Cro. | 58 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 58 | 100% | 100%
For. | 0 | 46 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 47 | 98% | 95%
Kor. | 0 | 0 | 27 | 0 | 0 | 0 | 0 | 0 | 0 | 27 | 100% | 95%
Gra. | 0 | 2 | 0 | 40 | 0 | 0 | 0 | 3 | 0 | 45 | 89% | 92%
Wat. | 0 | 0 | 0 | 0 | 48 | 0 | 0 | 0 | 1 | 49 | 98% | 98%
Sno. | 0 | 0 | 0 | 0 | 0 | 34 | 0 | 0 | 0 | 34 | 100% | 97%
Wet. | 0 | 1 | 1 | 0 | 1 | 0 | 53 | 0 | 0 | 56 | 95% | 97%
Imp. | 0 | 1 | 1 | 1 | 0 | 0 | 0 | 84 | 1 | 88 | 95% | 93%
Roc. | 0 | 0 | 0 | 1 | 0 | 2 | 0 | 4 | 47 | 54 | 87% | 91%
Sum | 58 | 50 | 30 | 42 | 49 | 36 | 53 | 91 | 49 |  | OA = 95%, Kappa = 95%
UA | 100% | 92% | 90% | 95% | 98% | 94% | 100% | 92% | 96% |  | F1 score = 98%
Note: Cro., For., Kor., Gra., Wat., Sno., Wet., Imp., and Roc. stand for cropland, forest, Korla pear, grassland, water, snow and ice, wetland, impervious area and bare soil, and rocks, respectively.
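The summary statistics in Table 5 can be reproduced directly from the confusion matrix. The short NumPy sketch below transcribes the matrix (rows = reference samples, columns = classified samples, classes ordered Cro. to Roc.) and recomputes the overall accuracy, kappa coefficient, and per-class F1 scores using their standard definitions.

import numpy as np

# Confusion matrix transcribed from Table 5 (rows: reference, columns: classified).
cm = np.array([
    [58,  0,  0,  0,  0,  0,  0,  0,  0],   # Cro.
    [ 0, 46,  1,  0,  0,  0,  0,  0,  0],   # For.
    [ 0,  0, 27,  0,  0,  0,  0,  0,  0],   # Kor.
    [ 0,  2,  0, 40,  0,  0,  0,  3,  0],   # Gra.
    [ 0,  0,  0,  0, 48,  0,  0,  0,  1],   # Wat.
    [ 0,  0,  0,  0,  0, 34,  0,  0,  0],   # Sno.
    [ 0,  1,  1,  0,  1,  0, 53,  0,  0],   # Wet.
    [ 0,  1,  1,  1,  0,  0,  0, 84,  1],   # Imp.
    [ 0,  0,  0,  1,  0,  2,  0,  4, 47],   # Roc.
])

n = cm.sum()
oa = np.trace(cm) / n                                   # overall accuracy (~0.95)
pe = (cm.sum(axis=0) * cm.sum(axis=1)).sum() / n**2     # chance agreement
kappa = (oa - pe) / (1 - pe)                            # kappa coefficient (~0.95)

pa = np.diag(cm) / cm.sum(axis=1)        # producer's accuracy (recall)
ua = np.diag(cm) / cm.sum(axis=0)        # user's accuracy (precision)
f1 = 2 * pa * ua / (pa + ua)             # per-class F1 scores

print('OA = %.3f, kappa = %.3f' % (oa, kappa))
print('per-class F1:', np.round(f1, 3))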
Table 6. Comparison of F1 scores yielded by our product with those of the three free products for certain classes.

Class | ESA 2020 | ESRI 2021 | Google 2021 | Our Product
Grassland | 69% | 56% | 73% | 92%
Water | 81% | 86% | 93% | 98%
Snow and Ice | 61% | 89% | 97% | 97%
Wetland | 68% | 84% | 56% | 97%
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

