Identification and Monitoring of Irrigated Areas in Arid Areas Based on Sentinel-2 Time-Series Data and a Machine Learning Algorithm

Yu, Lixiran; Xie, Hong; Xu, Yan; Li, Qiao; Jiang, Youwei; Tao, Hongfei; Aihemaiti, Mahemujiang

doi:10.3390/agriculture14101693


by Lixiran Yu ¹,², Hong Xie ³, Yan Xu ⁴, Qiao Li ¹,², Youwei Jiang ¹,², Hongfei Tao ¹,²,* and Mahemujiang Aihemaiti ¹,²

¹ College of Hydraulic and Civil Engineering, Xinjiang Agricultural University, Urumqi 830052, China

² Xinjiang Key Laboratory of Hydraulic Engineering Security and Water Disasters Prevention, Urumqi 830052, China

³ Changji Water Conservancy Management Station (Santunhe River Basin Management Office), Changji 831100, China

⁴ Xinjiang Cold and Arid Zone Water Resources and Ecological Hydraulic Engineering Research Center (Academician Expert Workstation), Urumqi 830052, China

* Author to whom correspondence should be addressed.

Agriculture 2024, 14(10), 1693; https://doi.org/10.3390/agriculture14101693

Submission received: 5 August 2024 / Revised: 25 September 2024 / Accepted: 26 September 2024 / Published: 27 September 2024

(This article belongs to the Section Agricultural Water Management)


Abstract:

Accurate monitoring of irrigated areas is of great significance for ensuring national food security and the rational use of water resources. The low resolution of Moderate Resolution Imaging Spectroradiometer (MODIS) and Landsat data makes their monitoring accuracy insufficient for practical needs. This paper therefore proposes a method for extracting the irrigated area in arid regions from long Sentinel-2 time-series imagery. The Santun River irrigation district in Xinjiang, a typical irrigation district in the arid region of Northwest China, is selected as the study area. Long time-series Sentinel-2 remote sensing data are used to classify land use in the irrigation district with the random forest, CART decision tree, and support vector machine algorithms, combined with field-collected samples of typical irrigated and non-irrigated points. The irrigated area is then extracted using time series of the Normalized Difference Vegetation Index (NDVI), Soil-Adjusted Vegetation Index (SAVI), and Optimized Soil-Adjusted Vegetation Index (OSAVI) as classification parameters. The results show that (1) the irrigated area of the arid irrigation district can be effectively extracted using the SAVI time-series data through an object-oriented approach combined with the random forest algorithm; (2) the extracted irrigated areas were 44,417, 42,915, 43,411, 48,908, and 47,900 hm² from 2019 to 2023, and the overall accuracies of the confusion matrix validation were 94.34%, 90.22%, 92.03%, 93.23%, and 94.63%, with kappa coefficients of 0.9011, 0.8887, 0.8967, 0.9009, and 0.9265, respectively. The errors of the irrigated area compared with the statistical data were all within 5%, which demonstrates the effectiveness of the method. This method provides a reference for extracting irrigated areas in arid zones.

Keywords:

Sentinel-2A; random forest; land use classification; object-oriented; irrigated area

1. Introduction

Freshwater is one of the most precious and threatened resources required by humans and terrestrial organisms. With economic growth, urbanization, and agricultural expansion, global demand for freshwater continues to rise. However, the quality and quantity of global freshwater resources are degrading, especially in developing countries [1]. China, with only 6% of the world’s freshwater resources and 9% of the earth’s arable land, supports 21% of the global population, highlighting the crucial role of agricultural irrigation. With approximately 1.13 × 10⁷ hm² of irrigated farmland, China serves as a significant reserve base for food production and arable land resources. Therefore, enhancing agricultural modernization and promoting high-quality development in irrigation areas are of great significance for achieving efficient water use and ensuring food security in China [2,3]. Xinjiang Autonomous Region is located in the arid inland region of China, with agricultural water consumption accounting for the vast majority of the region’s total water usage [4]. As a result, accurately identifying the actively irrigated areas in irrigation districts is crucial for achieving efficient agriculture, rational water resource management, scientific policy-making, and climate change adaptation. However, traditional methods for estimating the irrigated area in irrigation districts often rely on intensive manual ground surveys and data collection, which not only consumes a significant amount of time but also incurs high economic costs [5]. The statistics related to irrigated areas also sometimes rely on historical irrigation experience based on rough estimation rather than precise data analysis. Moreover, due to the fixed cycle of statistical analysis, the irrigation area data obtained may not be updated in real time, thus failing to timely and accurately reflect the actual situation amidst rapid changes in the agricultural environment. 
Consequently, this method hampers the timeliness and effectiveness of irrigation management.

In recent years, remote sensing, with its advantages of large coverage, short detection cycles, and low cost, has been increasingly used in farmland object recognition and irrigated area extraction, although many challenges remain [6,7,8]. In 2006, the International Water Management Institute developed the world’s first Global Irrigation Area Distribution Map with a resolution of 10 km, which provided a scientific method for the use of satellite remote sensing technology. Researchers have since gained rich experience in producing maps of the distribution of irrigated areas at different scales and with different levels of accuracy [9]. With the continuous progress of remote sensing technology, researchers can now obtain more kinds of remote sensing data. For example, the Landsat series satellites, the Moderate Resolution Imaging Spectroradiometer (MODIS), and the Sentinel series satellites have significantly improved the temporal, spatial, and spectral resolution of these data, which provides solid data support for accurate and efficient mapping of irrigated areas. Many scholars have carried out studies based on MODIS data and Normalized Difference Vegetation Index (NDVI) data sets. Ambika et al. [10] used MODIS 250 m resolution NDVI data and land use/land cover data to produce high-resolution irrigation area maps of agro-ecological zones in India. Md et al. [11] combined irrigated agricultural acreage statistics and 250 m annual NDVI data from MODIS to create a spatial irrigation map of U.S. counties. Liu et al. [12] developed a spatial distribution map of irrigated cultivated land in China in 2010 based on time-series NDVI data. Wang et al. [13] estimated the irrigated area within the Loop irrigation area using daily MODIS data and field-measured thresholds.
Although MODIS data have the advantage of high temporal resolution, their 250 m spatial resolution may not be fine enough for irrigation monitoring of small farmland plots or complex terrain. At the same time, land use/land cover data are usually used as an important input for irrigated area extraction, and errors or changes in these data may mislead the identification of irrigated areas and ultimately affect the accuracy of the results. Therefore, small farmland areas or plots place higher requirements on image resolution. Ketchum et al. [14] used Landsat satellite images and a variety of auxiliary data to train a random forest (RF) classifier and produce a 30 m resolution irrigation map. Simões et al. [15] used multi-temporal Landsat Thematic Mapper data and supervised classification to estimate the irrigated area of rice in southeastern Brazil. Wang et al. [16] proposed a plot-scale irrigated farmland extraction method based on Sentinel-2 time-series images and evaluated the generalization ability of their model. However, the above scholars mainly identified irrigated areas based on the NDVI. Vegetation indices such as the NDVI are widely used in irrigation monitoring, but they are affected by many factors, such as soil type, climatic conditions, and crop type. To address this problem, Deines et al. [17] proposed the Water Regulation Green Index (WGI), which showed higher sensitivity in irrigation monitoring. Xu et al. [18] extracted the irrigated area of the Baoji Gorge irrigation area by calculating indices such as the NDVI, the Enhanced Vegetation Index, a green chlorophyll vegetation index, and the WGI and found that the WGI had higher recognition accuracy.

The above studies mainly focused on semi-arid areas without fully considering the geographic characteristics of arid areas. In contrast, this study focuses on irrigated areas in arid regions, where rain-fed crops do not need to be considered and crops are sparsely distributed, so that fine monitoring can be targeted directly at the irrigated areas. Secondly, this study not only uses high-resolution Sentinel-2 multi-spectral images as data sources but also combines three machine learning algorithms, namely classification and regression tree (CART), random forest (RF), and support vector machine (SVM), to optimize the land use classification results. This improves the accuracy and efficiency of land use classification and provides technical support for the subsequent monitoring and management of irrigated areas in arid regions. Finally, following the results of Liang et al. [19] on the extraction of vegetation biomass in desert areas with different vegetation indices, this study adopts the Soil-Adjusted Vegetation Index (SAVI), which was selected by comparison with the commonly used Normalized Difference Vegetation Index (NDVI) and the Optimized Soil-Adjusted Vegetation Index (OSAVI).

In summary, this study focuses on irrigated areas in arid regions to fill the gap in remote sensing monitoring for agricultural management and water resource planning in such regions. Sentinel-2 imagery is combined with the CART, RF, and SVM algorithms to optimize land use classification and improve accuracy and efficiency. The SAVI is used to reduce soil background interference and accurately monitor the irrigated area, and the method is applied to the Santun River irrigation area, a typical irrigation area in the arid zone of Xinjiang, China. The study provides new technical support for the sustainable development of agriculture and water resource management in arid zones and enriches the ecological environment monitoring system in these zones.

2. Materials and Methods

2.1. Study Area Overview

The Santun River Irrigation District is located in the middle section of the northern foot of the Tianshan Mountain and the southern margin of the Junggar Basin in Xinjiang, China. Its geographical position is between 43°6′30″–45°20′ N and 86°24′33″–87°37′ E. The area covers 260 km from north to south and 31 km from east to west, with a watershed area of 4466 hm². The Santun River Basin is located in the hinterland of the Eurasian continent on the southern edge of the Junggar Basin, far from the ocean, and experiences a mid-temperate continental arid climate. The total irrigation area of the Santun River Irrigation District covers 73,000 hm². The study area is located in the Santun River irrigation area and experiences a relatively high rate of land use. The productive land is mainly agricultural and forest lands, with agricultural land mainly planted with wheat, corn, cotton, and other crops. Most of the non-productive land is inhabited and supports a variety of industrial and mining enterprises, government institutions and schools, and transportation facilities. Therefore, the abundance of land cover types in the study area is conducive to research. The geographical location of the study area is shown in Figure 1.

2.2. Data Sources

2.2.1. Sentinel-2 Imagery

Sentinel-2 is a satellite mission developed by the European Space Agency (ESA) for the Global Monitoring for Environment and Security programme (now Copernicus). Sentinel-2A, a medium-resolution multispectral imaging satellite, was launched by the ESA on 23 June 2015. The satellite carries a multi-spectral imager, flies at an altitude of 786 km, and provides 13 spectral bands with a spatial resolution of 10–60 m and a swath width of 290 km [20]. The data used in the present study were downloaded from https://dataspace.copernicus.eu/explore-data/data-collections/sentinel-data/sentinel-2 (accessed on 23 September 2024). The Sentinel-2 imagery used in this study was acquired on 45 dates between 2019 and 2023; because three scenes had to be mosaicked to cover the study area on each date, the time series comprises 135 scenes in total. Only images with less than 5% cloud cover were selected, and the image data were then cropped using the vector boundary of the study area. The data are Level-2A (L2A) products of good quality, which have been preprocessed by radiometric calibration, orthorectification, and atmospheric correction. All of the bands were resampled to a 10 m × 10 m resolution in the GCS-WGS-84 spatial coordinate system, and all Sentinel-2 bands were used for band fusion. The image acquisition dates are shown in Table 1.

2.2.2. Sample Data

In this study, a systematic field survey was conducted in the study area during the crop ripening season from May to November 2023, including the use of a global positioning system (GPS) to accurately locate sample points and acquire ground information through unmanned aerial vehicle (DJI Mavic 3 Classic) images to collect sample data in 2023. The sample points of field sampling in 2023 were mainly cultivated land, including both irrigated and non-irrigated areas, with the coordinates of typical irrigation and non-irrigation points shown in Table 2. At the same time, high-resolution Google Earth images and Amap were used to select samples based on stratified random sampling; the sample points were evenly distributed in each entire image as far as possible by considering the number of each ground object type. Based on the actual situation of crop distribution and phenology in the study area, this study analyzed the spectral curves and index characteristics of the field sample points after data collection was completed in 2023 and subsequently randomly added visual interpretation points. The distribution of sample points and the number of samples from 2019 to 2023 are shown in Figure 2. In this study, the sample data were randomly divided into training and validation samples using a ratio of 7:3.
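The random 7:3 division of samples can be illustrated with a minimal sketch. The function name `split_samples`, the fixed seed, and the sample identifiers are hypothetical and serve only to make the example reproducible; the study does not describe how the split was implemented.

```python
import random


def split_samples(samples, train_fraction=0.7, seed=42):
    """Shuffle samples reproducibly and split them into training and validation sets."""
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)
    n_train = round(len(shuffled) * train_fraction)
    return shuffled[:n_train], shuffled[n_train:]


# Hypothetical sample identifiers standing in for field and visually interpreted points.
points = [f"pt{i:03d}" for i in range(100)]
train, valid = split_samples(points)
print(len(train), len(valid))
```

Fixing the seed keeps the training and validation sets identical between runs, which makes accuracy figures comparable across algorithms.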

2.2.3. Land Use Classification System

Based on the Classification of Land Use Status of China, combined with Sentinel-2 images, Google Earth, and Amap data, the classification system of the study area was determined using six types of land: cultivated land; forest land; transportation land; water and water conservation facility land; urban, village, industrial, and mining land; and other land (Table 3).

2.2.4. Statistics

Statistical data for the irrigated area of the Santun River Irrigation District for the five years investigated (2019, 2020, 2021, 2022, and 2023) were obtained from the survey collection from the Santun River Irrigation District Basin Management Office (www.xjsth.cn, accessed on 23 September 2024) in the city of Changji. In addition, in order to ensure the comprehensiveness and accuracy of the data, reference was made to the data provided by the Changji Prefecture Bureau of Statistics (www.cj.gov.cn, accessed on 23 September 2024) on the major Xinjiang regional sown area of crops and the cropping structure of crops in previous years (https://tjj.xinjiang.gov.cn/tjj/nyypu/list_nj1.shtml, accessed on 23 September 2024). In addition, the annual report on government information disclosure work (https://www.cj.gov.cn/p1/zfxxgknb.html, accessed on 23 September 2024) published on the official website of the Changji Municipal People’s Government (www.cjs.gov.cn, accessed on 23 September 2024) was consulted as a policy guide for the analysis of the causes of the irrigated area, as well as for data verification.

2.3. Software Used in This Study

In this study, the Sentinel-2 remote sensing images were preprocessed using SNAP software (https://step.esa.int/main/download/snap-download/, accessed on 23 September 2024), a Sentinel series image processing platform provided by the ESA; preprocessing included band fusion, cropping, and resampling. Subsequently, multi-scale segmentation and object-oriented classification were performed in eCognition (https://geospatial.trimble.com/en/products/software/trimble-ecognition, accessed on 23 September 2024) using the RF, SVM, and CART algorithms. eCognition’s object-oriented approach focuses on image objects, utilizes object and interclass information to enhance classification accuracy, and supports a variety of classification algorithms and powerful post-processing features.

3. Research Methods

Based on the object-oriented method, three time-series data sets based on NDVI, SAVI, and OSAVI data were constructed using 2019–2023 time-series Sentinel-2A images. Multi-scale segmentation was carried out by combining the optimal segmentation scale. Next, RF, CART decision tree, and SVM algorithms were used to extract the features of the study area with high precision. On this basis, the accuracy of the three classification algorithms was evaluated according to a confusion matrix, and the best classification algorithm was selected. The irrigation area of the study area was verified and extracted according to the ground survey data. The technology roadmap is shown in Figure 3.

3.1. Multi-Scale and Optimal Segmentation Scale Selections

Image multi-scale segmentation is very important in object-oriented classification. However, the results are affected by many factors such as the number of bands, shape parameters, and scale parameters, which leads to an excessive or insufficient segmentation scale. To determine the optimal segmentation scale, it is usually necessary to consider the features of ground objects and image characteristics in the study area and optimize the results through repeated testing and simulation [21,22].

The Estimation of Scale Parameter (ESP) tool, which is based on a bottom-up region merging technique, can generate optimal segmentation parameters through an iterative segmentation process. In this study, the ESP tool was used to evaluate the optimal segmentation parameters of the images in the study area. The optimal segmentation scale can be identified from the Local Variance (LV) of the homogeneity of the segmented objects and its rate of change (ROC), which improves the accuracy and reliability of land use information extraction [23]. The LV and ROC calculation formulas are, respectively, given as follows:

$$LV = \frac{1}{m} \times \sum_{1}^{m} \left(C_{L} - \bar{C}_{L}\right)^{2}, \tag{1}$$

$$ROC = \frac{L_{i} - L_{i - 1}}{L_{i - 1}} \times 100\%, \tag{2}$$

where $C_{L}$ is the luminance value of a single image in the $L$th band, $\bar{C}_{L}$ is the mean brightness of all objects in the $L$th band of the image, and $m$ is the total number of image objects; ROC is the rate of change of $LV$ (%), $L_{i}$ is the average standard deviation of the $i$th object layer of the target layer, and $L_{i - 1}$ is the average standard deviation of the $(i - 1)$th object layer of the target layer.
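As a minimal sketch of how Equations (1) and (2) are used to shortlist candidate scales, the snippet below computes LV for a set of object brightness values, the ROC between consecutive scale levels, and the scales at which the ROC curve has a local peak. It is not the ESP tool itself; the function names and the LV values are hypothetical and chosen only for illustration.

```python
from statistics import mean


def local_variance(object_values):
    """Eq. (1): mean squared deviation of object brightness values from their mean."""
    c_bar = mean(object_values)
    return sum((c - c_bar) ** 2 for c in object_values) / len(object_values)


def rate_of_change(lv_by_level):
    """Eq. (2): percentage change of LV between consecutive scale levels."""
    return [
        (curr - prev) / prev * 100.0
        for prev, curr in zip(lv_by_level, lv_by_level[1:])
    ]


def roc_peaks(scales, lv_by_level):
    """Return the scales at which ROC is a local maximum (candidate optimal scales)."""
    roc = rate_of_change(lv_by_level)
    roc_scales = scales[1:]  # ROC is only defined from the second level onward
    return [
        roc_scales[i]
        for i in range(1, len(roc) - 1)
        if roc[i] > roc[i - 1] and roc[i] > roc[i + 1]
    ]


# Hypothetical LV values at successive scale parameters (not data from the study).
scales = [10, 20, 30, 40, 50, 60]
lv = [2.0, 2.2, 2.3, 2.6, 2.7, 2.75]
print(roc_peaks(scales, lv))
```

Peaks found this way are only candidates; the final scale is still chosen by comparing the segmentation results visually at each candidate.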

3.2. Feature Data Set Construction

To improve the accuracy of land use classification and irrigation area identification in the study area, a total of 30 features, including index, shape, and texture features, were selected to form the feature data set (Table 4). In this study, the 13 bands of the Sentinel-2 data were used as the original bands, and three drought features were used among the index features: the Normalized Difference Vegetation Index (NDVI), Soil-Adjusted Vegetation Index (SAVI), and Optimized Soil-Adjusted Vegetation Index (OSAVI). At the same time, the Normalized Difference Built-Up Index (NDBI) and Modified Normalized Difference Water Index (MNDWI) were used to ensure the accuracy of land use classification in the study area. The red-edge bands are a distinctive feature of Sentinel-2 data, and red-edge index features are particularly sensitive to changes in vegetation health. Therefore, in addition to the conventional vegetation indices, three red-edge indices related to the crop growth process were added: Normalized Vegetation Red Edge1 (NDVIre1), Normalized Vegetation Red Edge2 (NDVIre2), and Normalized Vegetation Red Edge3 (NDVIre3). Additionally, four texture features were calculated using the GLCM-based texture feature function provided by eCognition software to analyze spatial variations within the data while avoiding redundancy. Finally, to make the features used for classification richer and more effective in separating the target classes, five shape features were adopted to describe the image objects of the study area.
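For reference, the three vegetation indices can be sketched from red and near-infrared surface reflectance. The formulas below are the commonly published forms and are an assumption here, since Table 4 is not reproduced in this text: SAVI uses a soil adjustment factor of 0.5 and OSAVI a constant of 0.16. For Sentinel-2, the red and near-infrared bands are usually taken as B4 and B8, with reflectance scaled to the range 0–1.

```python
def ndvi(nir, red):
    """Normalized Difference Vegetation Index."""
    return (nir - red) / (nir + red)


def savi(nir, red, soil_factor=0.5):
    """Soil-Adjusted Vegetation Index; soil_factor=0.5 is the commonly used default."""
    return (1.0 + soil_factor) * (nir - red) / (nir + red + soil_factor)


def osavi(nir, red):
    """Optimized Soil-Adjusted Vegetation Index with the usual 0.16 constant."""
    return (nir - red) / (nir + red + 0.16)


# Hypothetical reflectance values for a sparsely vegetated object (not study data).
nir_reflectance, red_reflectance = 0.30, 0.18
for name, index in (("NDVI", ndvi), ("SAVI", savi), ("OSAVI", osavi)):
    print(name, round(index(nir_reflectance, red_reflectance), 4))
```

The extra term in the SAVI and OSAVI denominators damps the index where total reflectance is low, which is the mechanism behind their reduced sensitivity to bright soil backgrounds. The functions assume positive reflectance values, so no zero-denominator guard is included.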

3.3. Object-Oriented Classification Algorithms

In this study, three algorithms, RF, CART decision tree, and SVM, were used to classify the ground objects in the study area and automatically extract object-oriented land use information. Then, based on the characteristics of the study area, the most suitable classification algorithm was selected to classify the irrigation area of the study area. Random forest [24,25,26] is a classification algorithm proposed by American scientist Leo Breiman, which can efficiently deal with multi-dimensional feature data sets and seek the optimal solution of category assignment through cross-validation of sample features. Random forest has the advantages of rapid training speed, insensitivity to sample size, high classification accuracy, and strong anti-noise ability and is one of the machine learning algorithms widely used for analyzing large amounts of agricultural remote sensing data. The CART decision tree algorithm [27,28] is a classical supervised learning algorithm that can be used for both classification and regression tasks. Based on the decision tree structure, CART generates a tree model that can effectively classify or regress the data by selecting the optimal features for partitioning. By analyzing feature values, CART sets an appropriate split value at each node, so that classification proceeds hierarchically, level by level. The main characteristics of the CART decision tree algorithm are that it is easy to understand and implement while being suitable for a variety of practical problems in the field. The SVM [29] is a binary classification model whose basic form is a linear classifier with the maximum margin defined on the feature space, and this maximum margin differentiates it from the perceptron; the method constructs the optimal classification hyperplane in the attribute space by using the principle of structural risk minimization to obtain the globally optimal classifier. This ensures that the expected risk over the whole sample space satisfies a certain upper bound.
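The node-splitting step of CART described above can be sketched for a single feature. The sketch assumes the Gini impurity criterion, which is the standard choice for CART classification trees; it is not the eCognition implementation, and the function names and SAVI values are hypothetical.

```python
from collections import Counter


def gini(labels):
    """Gini impurity of a list of class labels (0.0 for a pure node)."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def best_threshold(values, labels):
    """Find the threshold on one feature that minimizes the weighted Gini impurity."""
    pairs = sorted(zip(values, labels))
    n = len(pairs)
    best = (None, float("inf"))
    for i in range(1, n):
        if pairs[i - 1][0] == pairs[i][0]:
            continue  # no split is possible between identical feature values
        threshold = (pairs[i - 1][0] + pairs[i][0]) / 2.0
        left = [label for _, label in pairs[:i]]
        right = [label for _, label in pairs[i:]]
        score = (len(left) * gini(left) + len(right) * gini(right)) / n
        if score < best[1]:
            best = (threshold, score)
    return best


# Hypothetical peak-season SAVI values for image objects (not data from the study).
savi_values = [0.12, 0.15, 0.18, 0.42, 0.47, 0.55]
classes = ["non-irrigated"] * 3 + ["irrigated"] * 3
threshold, impurity = best_threshold(savi_values, classes)
print(threshold, impurity)
```

A full CART tree repeats this search over every feature at every node, and a random forest builds many such trees on resampled data and combines their votes.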

3.4. Identification of Irrigation Areas in the Study Area

Land use classification is crucial for identifying irrigated areas. The accuracy of land use classification directly affects the accuracy of subsequent irrigation area identification. Improving the accuracy of land use classification can help researchers effectively distinguish irrigated lands from other land use types. The Santun River irrigation area lies in an area where it is not possible to grow rainfed crops. The main crops in the irrigation area, cotton, wheat, and corn, as well as garden and forest land, all need irrigation. However, some farmland wasteland and abandoned land exist in the irrigated area; these areas do not need irrigation but still have a small amount of vegetation cover. To effectively distinguish between irrigated and non-irrigated areas, it is necessary to accurately capture the differences in crop spectral properties and indicators that result from irrigation. Therefore, this study used the long-term NDVI, SAVI, and OSAVI data sets for each year from 2019 to 2023 and used the Savitzky–Golay (S–G) filtering method to reconstruct the NDVI, SAVI, and OSAVI time-series data acquired from the Sentinel-2A satellite. The spectral and phase characteristics of the measured irrigation and non-irrigation points were analyzed.

Among the vegetation indices, the NDVI is suitable for monitoring the growth dynamics of vegetation and can measure the photosynthetically active biomass in plants. However, NDVI is sensitive to soil brightness and atmospheric effects and may be disturbed by these factors, affecting the accuracy of monitoring results. Compared with the NDVI, soil and atmosphere have a relatively small impact on the SAVI, so the SAVI can better correct the soil background effect and improve the estimation accuracy of vegetative biomass. In arid regions, SAVI data are effectively resistant to effects from soil background noise and can be used to improve the detection accuracy of measures of vegetation cover under these conditions. Meanwhile, the OSAVI is a recently developed index that is more flexible to changes in high soil background values. In semi-arid grassland areas, the OSAVI is feasible as a potential indicator for estimating green biomass and vegetation cover, which can better adapt to changes in soil background and improve the monitoring effect in arid areas.

At the same time, the S–G filter can effectively smooth the NDVI time-series data, remove the noise and outliers, and make the change trend of this vegetation index clearer and more stable. The S–G filtering method can better extract the long-term change trend of the vegetation index, and S–G filtering has a better smoothing effect and long-term trend extraction ability in processing the vegetation indices based on Sentinel-2 images [30,31,32].
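A minimal sketch of Savitzky–Golay smoothing is given below, using the standard 5-point quadratic kernel with weights (−3, 12, 17, 12, −3)/35. The window length and polynomial order used in the study are not stated, so these are assumptions, as are the evenly spaced dates and the series values. In practice a library routine such as `scipy.signal.savgol_filter` would normally be used.

```python
SG_COEFFS = (-3.0, 12.0, 17.0, 12.0, -3.0)  # 5-point quadratic Savitzky-Golay weights
SG_NORM = 35.0


def savgol_5pt(series):
    """Smooth an evenly spaced series with the 5-point quadratic Savitzky-Golay kernel.

    The first and last two values are returned unchanged (a simplification).
    """
    values = list(series)
    smoothed = values[:]
    for i in range(2, len(values) - 2):
        window = values[i - 2 : i + 3]
        smoothed[i] = sum(c * v for c, v in zip(SG_COEFFS, window)) / SG_NORM
    return smoothed


# Hypothetical SAVI series with a noise-like dip at index 4 (not data from the study).
savi_series = [0.10, 0.14, 0.22, 0.35, 0.20, 0.52, 0.58, 0.55, 0.40, 0.22]
print([round(v, 3) for v in savgol_5pt(savi_series)])
```

Because the kernel is a local least-squares polynomial fit, it reproduces any quadratic trend exactly while pulling isolated outliers back toward their neighbours, which is why it preserves the shape of the seasonal curve better than a plain moving average.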

3.5. Classification Accuracy Evaluation

Overall accuracy (OA), which refers to the ratio of correctly classified pixels to the total number of pixels, is a commonly used metric for evaluating classification results. The kappa coefficient measures the agreement between the classification result and the reference data more rigorously than OA because it accounts for the agreement expected by chance. Producer’s accuracy measures the proportion of pixels of a category in the reference data that are correctly classified, while user’s accuracy measures the proportion of pixels assigned to a category in the classification result that actually belong to that category [33]. The overall accuracy and kappa coefficient values are affected by the producer’s accuracy and user’s accuracy. Therefore, this study evaluated the classification results of land objects in the study area obtained with RF, the CART decision tree, and the SVM. Producer’s accuracy (PA), user’s accuracy (UA), overall accuracy (OA), and the kappa coefficient were used for quantitative evaluation. The calculation formulas are as follows:

$$PA = \frac{X_{ii}}{X_{+i}} \times 100\%, \tag{3}$$

$$UA = \frac{X_{ii}}{X_{i+}} \times 100\%, \tag{4}$$

$$OA = \frac{\sum_{i = 1}^{k} X_{ii}}{N}, \tag{5}$$

$$Kappa = \frac{N \sum_{i = 1}^{k} X_{ii} - \sum_{i = 1}^{k} (X_{i+} X_{+i})}{N^{2} - \sum_{i = 1}^{k} (X_{i+} X_{+i})}, \tag{6}$$

where $N$ is the total number of pixels; $k$ is the number of classes; $X_{ii}$ refers to the number of correctly classified pixels of class $i$; $X_{i+}$ refers to the total number of pixels of class $i$ in the classification result; and $X_{+i}$ refers to the total number of pixels of class $i$ in the reference data.
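Equations (3)–(6) can be checked on a small confusion matrix. The sketch below assumes that rows hold the classified categories and columns the reference categories, so that row sums correspond to $X_{i+}$ and column sums to $X_{+i}$; the function name and the matrix values are hypothetical.

```python
def accuracy_metrics(matrix):
    """Compute PA, UA, OA and kappa from a square confusion matrix.

    Rows are classified categories and columns are reference categories,
    so row sums are X_{i+} and column sums are X_{+i}.
    """
    k = len(matrix)
    n = sum(sum(row) for row in matrix)
    row_tot = [sum(row) for row in matrix]
    col_tot = [sum(matrix[r][c] for r in range(k)) for c in range(k)]
    diag = [matrix[i][i] for i in range(k)]
    pa = [diag[i] / col_tot[i] * 100.0 for i in range(k)]  # Eq. (3)
    ua = [diag[i] / row_tot[i] * 100.0 for i in range(k)]  # Eq. (4)
    oa = sum(diag) / n  # Eq. (5)
    chance = sum(row_tot[i] * col_tot[i] for i in range(k))
    kappa = (n * sum(diag) - chance) / (n ** 2 - chance)  # Eq. (6)
    return pa, ua, oa, kappa


# Hypothetical two-class matrix (irrigated, non-irrigated); not data from the study.
cm = [[45, 5],
      [10, 40]]
pa, ua, oa, kappa = accuracy_metrics(cm)
print(pa, ua, oa, kappa)
```

In this example, 85 of 100 pixels lie on the diagonal, so OA is 0.85, while kappa is lower at 0.70 because half of the agreement would be expected by chance given the row and column totals.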

4. Results and Analysis

4.1. Optimal Scale Optimization Results

Image segmentation is the primary step for segmenting an image into spatially continuous and spectrally homogeneous units to match the classification objective in object-oriented analysis. After segmentation, the results must be evaluated to select the optimal segmentation scale. Selecting the optimal scale can make the segmentation of the target complete while ensuring the internal homogeneity of the object [34].

To ensure the rationality and reliability of parameter settings in multi-scale segmentation, this study carried out segmentation tests and comparative analysis with different combinations of parameters. The following parameter settings were determined: shape index weight = 0.1, compactness index weight = 0.5, and weight of the band layer = 1. According to the ESP tool, the ROC curve showed four peaks, at segmentation scale parameters of 70, 123, 127, and 136 (Figure 4). The larger the segmentation scale, the larger the segmented patches and the more prone they are to under-segmentation; conversely, smaller scales lead to over-segmentation. After comparison experiments at these segmentation scales, 70 was selected as the segmentation scale parameter for use in this study (Figure 4). The goal was to ensure that the segmentation process can accurately capture the multi-scale features in the image while maintaining the stability and reliability of the segmentation results.

4.2. Analysis of Temporal Characteristics of Vegetation Indices

The three vegetation indices, the NDVI, SAVI, and OSAVI, were calculated based on Sentinel-2 time-series data. As can be seen from Figure 5, smoothing the vegetation index time-series curves with the Savitzky–Golay filter effectively reduces the influence of noise; eliminates jaggedness, abruptness, and anomalies; and makes the trend of each vegetation index smoother and clearer. In Figure 5, the horizontal coordinates indicate the dates of the remote sensing imagery used in 2023, and the vertical coordinates indicate the value of the vegetation index. The time-series curve of each vegetation index after filtering and smoothing can more fully reflect the time-series characteristics of crops. At the same time, different vegetation indices show different curve fluctuation trends for different ground objects. In the comparison between irrigated and non-irrigated sites, the time series of the three vegetation indices had roughly the same change trend, but a comparison of Figure 5a–c shows that the change at irrigated sites was significantly greater than that at non-irrigated sites, especially in the critical period of crop growth from April to June and during the harvest period from August to October. However, a comparison of the three vegetation indices from June to August shows that the change trends of the SAVI and OSAVI were more significant than those of the NDVI, indicating that the SAVI and OSAVI could better capture the characteristics of non-irrigated points and thus discriminate more accurately between non-irrigated points and other land features.

Another core content of this study is the identification of non-irrigation sites (agricultural wasteland and abandoned land) and other land features. In the sowing period of crops from March to May, the difference in spectral characteristics of crop land is small, and it is difficult to identify non-irrigated areas. May to October is the critical period for crop growth, and it is also the critical period for identifying agricultural wasteland and abandoned land. During this period, although agricultural wasteland and abandoned land do not need irrigation, they still support a small amount of vegetation cover. From March to May, the NDVI and OSAVI did not have a good identification effect, and the time-series curves overlapped in many places (Figure 6a,c). During this period, the identification ability of NDVI time-series data for urban villages, industrial and mining land, and non-irrigated areas was weak, while the identification ability of OSAVI time-series data for other land and non-irrigated areas was also weak. During the period from May to October, the NDVI was better than the OSAVI in identifying non-irrigated areas, and non-irrigated areas were difficult to distinguish from urban, village, industrial, and mining land when using the OSAVI time-series. Figure 6b shows that non-irrigated areas can be distinguished from the other four land features during March to May. Meanwhile, the SAVI was also effective in distinguishing non-irrigated areas from other land features from May to October. Therefore, the SAVI is one of the more suitable indices for distinguishing non-irrigated areas from other land features and ground objects.

4.3. Accuracy Analysis of Land Use Classification and Optimization of the Classification Algorithm in the Study Area

By analyzing the classification results in Figure 7, it is obvious that these algorithms differ in the classification process. From the classification results, the CART decision tree algorithm shows more obvious misclassification between forest land and cropland between 2019 and 2023. The most noticeable misclassification by the CART decision tree algorithm occurred in forest land in 2019 and in towns, villages, and industrial and mining land in 2023. The SVM algorithm misclassified transportation land, urban village, and industrial and mining land into cultivated land and forest land in 2022, and the cultivated land was misclassified into transportation land in 2019. These results indicate that the SVM algorithm has some problems in terms of stability. In general, the CART decision tree algorithm performed better in distinguishing urban, village, and industrial and mining land from transportation land. The RF algorithm had the best performance among the three classification algorithms, and its recognition of urban villages and industrial and mining land was more stable. The recognition effect of cultivated land and forest land was also better than that of the CART decision tree and SVM algorithms. Therefore, according to the results of this study, the RF algorithm is stable and has high accuracy for land use classification in the study area.

According to the accuracy evaluation results of the three algorithms, RF, CART decision tree, and SVM from 2019 to 2023 shown in Figure 8, the RF algorithm performed the best, and its kappa coefficient and overall accuracy were higher than those of the other two algorithms in all years. From 2019 to 2023, the kappa coefficient of RF was 0.8294, 0.8881, 0.9207, 0.8982, and 0.886, and the overall accuracy was 92.85%, 88.81%, 92.07%, 89.82%, and 88.6%, respectively. The CART decision tree algorithm followed; its kappa coefficient fluctuated between 0.6756 and 0.747, and the overall accuracy varied between 84.37% and 86.4% for all years. The kappa coefficient of SVM was between 0.4514 and 0.7421, and the overall accuracy was between 71.87% and 85.71% for all years. In particular, as shown in Figure 8c, the overall accuracy and kappa coefficient of the RF algorithm were the highest in 2021, reaching 95.16% and 0.9207, respectively. Figure 8b shows that the overall accuracy and kappa coefficient of the CART decision tree algorithm peaked in 2020, which were 86.4% and 0.7937, respectively. In summary, for the identification of irrigation areas in the study area, the RF algorithm performed the best and is the preferred algorithm.
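For readers unfamiliar with the two accuracy measures used here, the following minimal sketch shows how overall accuracy and the kappa coefficient are derived from a confusion matrix. The function name and the 2 × 2 matrix are hypothetical and are not the validation data of this study.

```python
def overall_accuracy_and_kappa(matrix):
    """Return (overall accuracy, Cohen's kappa) for a square confusion matrix.

    matrix[i][j] is the number of validation samples of reference class i
    that the classifier assigned to class j.
    """
    total = sum(sum(row) for row in matrix)
    if total == 0:
        raise ValueError("confusion matrix is empty")
    # Observed agreement: share of samples on the diagonal.
    observed = sum(matrix[i][i] for i in range(len(matrix))) / total
    # Chance agreement: product of row and column marginals, summed over classes.
    row_totals = [sum(row) for row in matrix]
    col_totals = [sum(col) for col in zip(*matrix)]
    expected = sum(r * c for r, c in zip(row_totals, col_totals)) / (total * total)
    if expected == 1.0:
        raise ValueError("kappa is undefined when chance agreement equals 1")
    kappa = (observed - expected) / (1.0 - expected)
    return observed, kappa


# HYPOTHETICAL confusion matrix (rows: reference, columns: predicted).
example = [[45, 5],
           [10, 40]]
oa, kappa = overall_accuracy_and_kappa(example)
print(round(oa, 4), round(kappa, 4))
```

For this example, 85 of 100 samples lie on the diagonal (overall accuracy 0.85) and the chance agreement is 0.5, so kappa is 0.70. Kappa is lower than overall accuracy because it discounts the agreement expected by chance.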

4.4. Identification of Irrigation Areas and Area Changes in the Study Area

Based on the comparison of the three algorithms, the RF algorithm was used to identify the irrigated area in the study area from 2019 to 2023; the recognition accuracy is shown in Figure 9. The kappa coefficients for 2019 to 2023 were 0.9011, 0.8887, 0.8967, 0.9009, and 0.9265, respectively, all above 0.88 and above 0.9 in three of the five years, indicating a high level of agreement. The overall accuracy (OA) for 2019 to 2023 was 94.34%, 90.22%, 92.03%, 93.23%, and 94.63%, respectively, exceeding 90% in every year. This means that the model correctly classified most of the samples.

We compared the irrigated area calculated by this method with the statistical data. The error was within 5% in each of the five years; in 2019, the calculated irrigated area differed from the statistical data by only 109 hm², indicating that this method is feasible for extracting the irrigated area of the study area. Table 5 shows that from 2019 to 2023, the calculated and statistical irrigated areas increased by nearly 3483 and 4341 hm², respectively. The irrigated area maps of the study area from 2019 to 2023 are shown in Figure 10.
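The arithmetic behind this comparison can be checked with a short sketch. The extracted areas are those reported in the abstract; because the statistical areas of Table 5 are not reproduced in this text, the 2019 relative error is approximated with the extracted area as the denominator, which is an assumption made only for illustration.

```python
# Irrigated areas extracted for 2019-2023 in hm^2, as reported in the abstract.
extracted_area_hm2 = {2019: 44417, 2020: 42915, 2021: 43411, 2022: 48908, 2023: 47900}


def relative_error_percent(difference, reference):
    """Absolute difference expressed as a percentage of a reference area."""
    return abs(difference) / reference * 100.0


# Net change of the extracted irrigated area over the study period.
net_change = extracted_area_hm2[2023] - extracted_area_hm2[2019]

# Reported 2019 difference between the extracted and statistical areas (hm^2).
diff_2019 = 109
# ASSUMPTION: the extracted area stands in for the statistical area in the
# denominator, because the statistical value is not given in the text.
approx_error_2019 = relative_error_percent(diff_2019, extracted_area_hm2[2019])

print(net_change, round(approx_error_2019, 2))
```

The net change is 3483 hm², matching the increase quoted for the calculated area, and a difference of 109 hm² corresponds to roughly 0.25% of the 2019 area, well inside the 5% bound.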

5. Discussion

5.1. Comparison and Analysis of Classification Algorithms

Sentinel-2A remote sensing data from 2019 to 2023 were used in this study for image classification, and three different classification algorithms were compared. The RF algorithm achieved its highest accuracy on the 2021 data (see Section 4.3) and also performed well in the other years. The RF algorithm can deal with complex data while maintaining a high level of accuracy and strong stability. This is consistent with the result of Wang et al. [35], who found that the RF algorithm performed best in the extraction of cotton planting area in the Mosuowan Reclamation area of Xinjiang. This finding is also consistent with the results of Wang et al. [36], who adopted the RF algorithm in their study of land use classification in the agricultural area of Sheli Township, Da’an City, Baicheng City, China. In that study, the accuracy of the RF algorithm was better than that of the SVM algorithm, which is also consistent with the results of the present experiment. In the present study, the SVM algorithm was the least stable of the three algorithms. Over the five-year study period, its kappa coefficient was highest in 2020 (0.7421) and lowest in 2023 (only 0.4514). The SVM algorithm has high computational complexity, is sensitive to parameter tuning, and is not well suited to large-scale data sets. These are factors that should be considered when selecting a classifier. The overall accuracy of the CART decision tree algorithm remained above 80% throughout the five-year study period, but its performance was the worst in 2022, even with more sample points, with an overall accuracy of 82.37% and a kappa coefficient of 0.7176. Although the CART decision tree algorithm can be constructed rapidly, it may suffer from overfitting when dealing with complex data sets. The quality of the data, including completeness, accuracy, outliers, and class distribution, affects both the construction and the final quality of the decision tree to a certain extent. This result is consistent with the findings of Bai et al. [37], who showed that the decision tree algorithm depends strongly on the training sample data set.

In summary, choosing an appropriate classifier algorithm requires consideration of a combination of factors such as accuracy, speed, ease of use, and the data set used. Random forest is a good choice for Sentinel-2A image classification, especially when dealing with complex data sets. However, if the data set is relatively simple, CART decision trees may be a faster option. The sensitivity to parameters and computational complexity of an SVM may require more work to tune parameters, and it is not suitable for large-scale data sets. When choosing a classifier, it is necessary to consider the specific application scenarios and data characteristics comprehensively.

5.2. Analysis of the Effect of Vegetation Index on Extraction of Irrigated Area

The study area is the Santun River irrigation area in the arid region of Xinjiang, China. The area continues to face problems such as the improper allocation of water resources, which leaves some cultivated land without enough water for crops and turns it into agricultural wasteland. In addition, an imperfect drainage system has resulted in soil salinization, causing cultivated land to be abandoned. Therefore, the timely monitoring of agricultural wasteland and abandoned land that was previously arable is very important. Vegetation indices obtained through remote sensing techniques, such as the NDVI, SAVI, and OSAVI, play an important role in monitoring and assessing vegetative conditions in irrigated areas [38,39,40]. The NDVI is one of the most commonly used vegetation indices in studies employing remotely sensed data but may underestimate the growth condition of vegetation. Both the SAVI and OSAVI are modified versions of the NDVI that reduce the influence of the soil background by introducing a soil adjustment factor, which matters especially in areas with low vegetation cover or high soil reflectance. The SAVI uses an adjustable soil adjustment parameter L to correct the NDVI, while the OSAVI uses a fixed value of L, which simplifies the calculation; both indices are more robust than the NDVI where the soil background signal is strong. The results of the present study are consistent with those of Wang et al. [41] and Zhang et al. [42], who also used the SAVI in arid areas of Xinjiang. In the identification of irrigated areas in arid regions, the SAVI can help researchers identify agricultural wasteland and abandoned land while accurately identifying irrigated areas. For arid regions similar to our study area, where land feature types are complex and diverse and agricultural wasteland and abandoned cropland are widespread, combining the random forest algorithm with the SAVI is a worthwhile approach to irrigated area extraction.

5.3. Analysis of the Causes of Changes in Irrigation Area in the Study Area from 2019 to 2023

Both the irrigated area and the woodland area in the study area showed an overall increasing trend during the five years from 2019 to 2023 (Figure 11). Table 5 shows that the irrigated area in the study area increased by a total of 4341 hm² between 2019 and 2023. This increase was influenced by several factors. First, the spatial extent of forest land in the study area showed an overall increase of 2906 hm² from 2019 to 2023. Forest coverage is relatively low in Xinjiang because of the dry climate and lack of precipitation. To improve this situation, the government agencies of Xinjiang have launched a number of ecological protection and restoration projects, including natural forest protection and afforestation projects [43].

Second, the Santun River Irrigation District in Changji City has carried out the “14th Five-Year Plan” large-scale irrigation district renewal and modernization project, along with the construction of digital twin irrigation districts, aiming to improve the irrigation conditions in the district and increase the efficiency of water resource utilization [44]. In addition, climate change and increasing food demand are among the factors driving the observed increase in irrigated area in the study region. With a lack of precipitation caused by climate change, agricultural production is under greater pressure, so it is necessary to expand the irrigated area to increase food production to meet market demand. In summary, the Santun River Irrigation District in Changji City underwent significant modernization and intelligent upgrading from 2019 to 2023, and these changes should lead to an increase in irrigation area.

5.4. Uncertainty Analysis and Prospects

Although this study used long time-series remote sensing images from 2019 to 2023, field surveys of the sample points were carried out only in 2023. For the other years, sample points were determined by comparing the spectral curves and index characteristics of the 2023 field sample points with the data of those years, in combination with Google Earth imagery and Amap data. Although the results performed well, the sample data for years other than 2023 may therefore contain some errors and uncertainties.

Sentinel-2A remote sensing images are easily affected by weather. Clouds, fog, rain, water vapor, and other atmospheric particles scatter and absorb the light reflected from the ground before it reaches the sensor, thereby reducing the contrast and sharpness of the imagery. Under these conditions, the ground information could not be captured effectively, so some data were missing. If missing or degraded remote sensing data are used for feature construction and model training, the true characteristics of the surface may be obscured by clouds and other occlusions, so the features built from these data will not adequately represent the attributes of the ground objects, which weakens the interpretive ability of the model. Data quality is also one of the important factors affecting the accuracy of remote sensing image classification: missing or inaccurate input data introduce errors into the classification model and reduce classification accuracy.
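To make the effect of missing observations concrete, the sketch below fills cloud-induced gaps in a vegetation index time series by linear interpolation between the nearest valid dates. This is a common pre-processing idea shown only for illustration; it is not a step reported in this study, and the function name, day-of-year values, and index values are hypothetical.

```python
def fill_gaps_linear(days, values):
    """Fill None entries by linear interpolation between valid neighbours.

    `days` must be sorted in ascending order. Gaps before the first or after
    the last valid observation take the nearest valid value.
    """
    valid = [(d, v) for d, v in zip(days, values) if v is not None]
    if not valid:
        raise ValueError("no valid observations to interpolate from")
    filled = []
    for d, v in zip(days, values):
        if v is not None:
            filled.append(float(v))
            continue
        before = [(vd, vv) for vd, vv in valid if vd < d]
        after = [(vd, vv) for vd, vv in valid if vd > d]
        if before and after:
            (d0, v0), (d1, v1) = before[-1], after[0]
            filled.append(v0 + (v1 - v0) * (d - d0) / (d1 - d0))
        else:
            nearest = before[-1] if before else after[0]
            filled.append(float(nearest[1]))
    return filled


# HYPOTHETICAL day-of-year stamps and SAVI values; None marks a cloudy date.
doy = [100, 130, 160, 190, 220]
savi = [0.20, None, 0.60, 0.70, None]
print(fill_gaps_linear(doy, savi))
```

Interpolated values are only estimates: a gap that spans a rapid growth or harvest phase can hide exactly the change that separates irrigated from non-irrigated land, which is the uncertainty discussed above.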

Supervised classification is a widely used classification method in the field of remote sensing image processing. With continuous technological advances, in addition to the traditional supervised classification algorithm, many new classification algorithms have emerged, such as the improved U-Net remote sensing image extraction algorithm [45,46]. These new algorithms are usually based on machine learning and deep learning theory [47,48,49], which can better handle complex data and improve the accuracy and efficiency of classification. Therefore, future research will consider the novel algorithms discussed above.

6. Conclusions

Based on Sentinel-2A time-series data from 2019 to 2023, this paper conducted irrigation area monitoring research using the Santun River irrigation area as the study area and comprehensively used vegetation index data sets, texture features, and shape features to carry out land use classification research. The recognition accuracy of three classification algorithms, specifically RF, CART decision tree, and SVM, was analyzed. The classification algorithm was optimized to identify the irrigation area in the study area, and the extraction ability of NDVI, SAVI, and OSAVI time-series data in the study area was explored. The following conclusions were drawn:

(1): The SAVI has more advantages than the OSAVI and NDVI in the identification of agricultural wasteland and abandoned land in arid areas. Therefore, it is best to use the SAVI for the identification of irrigated areas in arid regions.
(2): Compared with the CART decision tree and SVM, RF performed well in classification accuracy, processing speed, user-friendliness, and the prevention of overfitting. In this study, using RF, the overall accuracy of land use classification reached 94.22%, and the kappa coefficient reached 0.92. The overall accuracy of irrigation area identification reached 94.63%, and the kappa coefficient reached 0.92.
(3): From 2019 to 2023, the total irrigated area increased by 4341 hm², and the forest land increased by 2906 hm². In 2022, the maximum irrigated area reached 50,686 hm², and the forest land area reached 9306 hm².

Author Contributions

Conceptualization, Y.X. and Y.J.; methodology, Y.X.; formal analysis, Q.L.; data curation, H.X.; writing—original draft, L.Y.; visualization, L.Y.; supervision, H.T. and M.A.; project administration, H.T. and M.A.; funding acquisition, H.T. All authors have read and agreed to the published version of the manuscript.

Funding

This study was financially supported by the Xinjiang Uygur Autonomous Region Major Project (2023A02002-1); the National Natural Science Foundation of China (52369513); the Xinjiang Key Laboratory of Water Conservancy Engineering Safety and Water Disaster Prevention Open Project (ZDSYS-YJS-2023-07, ZDSYS-YJS-2023-10); the Top-level Project of the Belt and Road Water and Sustainable Development Science and Technology Fund of the National Key Laboratory of Water Disaster Defense (2020491611); and a group-supporting study abroad project funded by the Xinjiang Uygur Autonomous Region People’s Government in 2019.

Institutional Review Board Statement

Not applicable.

Data Availability Statement

The Sentinel-2 data used in the present study were downloaded from https://dataspace.copernicus.eu/explore-data/data-collections/sentinel-data/sentinel-2 (accessed on 23 September 2024).

Acknowledgments

The authors would like to express their special thanks to Qi Li for her great support and help in this research, and we would also like to thank the editor, Owen Zhang, and three anonymous reviewers whose valuable comments greatly improved this paper.

Conflicts of Interest

The authors declare no conflicts of interest.

References

1. Chen, F.; Zhao, H.; Roberts, D.; Van de Voorde, T.; Batelaan, O.; Fan, T.; Xu, W. Mapping center pivot irrigation systems in global arid regions using instance segmentation and analyzing their spatial relationship with freshwater resources. Remote Sens. Environ. 2023, 297, 113760.
2. Kang, S. National water conservation initiative for promoting water-adapted and green agriculture and highly-efficient. China Water Resour. 2019, 13, 1–6.
3. Deng, M.; Tao, W.; Wang, Q.; Su, L.; Ma, C.; Ning, S. Theory and Technical Guarantee System Construction of Modern Ecological Irrigation District in Northwest China. Trans. Chin. Soc. Agric. Mach. 2022, 53, 1–13.
4. Li, F.; Fan, J.; Guan, X.; Liu, H.; Ying, F. Status and countermeasures of agricultural water saving development in irrigation areas of Xinjiang. J. Huazhong Agric. Univ. 2024, 43, 93–98.
5. Zhang, W.; Shao, J. Development and Prospect of Remote Sensing Monitoring Technology for Agricultural Irrigation. Water Sav. Irrig. 2019, 4, 102–108.
6. Arancibia, P.L.J.; Ticehurst, C.J.; Yu, Y.; McVicar, R.T.; Marvanek, S.P. Feasibility of monitoring floodplain on-farm water storages by integrating airborne and satellite LiDAR altimetry with optical remote sensing. Remote Sens. Environ. 2024, 302, 113992.
7. Xie, Y.; Lark, J.T.; Brown, F.J.; Gibbs, H.K. Mapping irrigated cropland extent across the conterminous United States at 30m resolution using a semi-automatic training approach on Google Earth Engine. ISPRS J. Photogramm. Remote Sens. 2019, 155, 136–149.
8. Ozdogan, M.; Yang, Y.; Allez, G.; Cervantes, C. Remote Sensing of Irrigated Agriculture: Opportunities and Challenges. Remote Sens. 2010, 2, 2274–2304.
9. Thenkabail, P.S.; Biradar, C.M.; Noojipady, P.; Dheeravath, V.; Li, Y.; Velpuri, M. Global irrigated area map (GIAM), derived from remote sensing, for the end of the last millennium. Int. J. Remote Sens. 2009, 30, 3679–3733.
10. Ambika, A.K.; Wardlow, B.; Mishra, V. Remotely sensed high resolution irrigated area mapping in India for 2000 to 2015. Sci. Data 2016, 3, 160118.
11. Pervez, S.M.; Brown, F.J. Mapping Irrigated Lands at 250-m Scale by Merging MODIS Data and National Agricultural Statistics. Remote Sens. 2010, 2, 2388–2412.
12. Liu, Y.; Wu, W.; Li, Z.; Zhou, Q. Extracting irrigated cropland spatial distribution in China based on time-series NDVI. Trans. Chin. Soc. Agric. Eng. 2017, 33, 276–284.
13. Wang, L.; Cai, M.; Bai, X.; Ji, S.; Jia, B.; Feng, X.; Chang, X. Estimation of Irrigation Area in Hetao Irrigation District Based on MODIS Analysis. J. Inn. Mong. Norm. Univ. (Nat. Sci. Ed.) 2015, 50, 44–52.
14. Ketchum, D.; Jencso, K.; Maneta, P.M.; Melton, F.; Jones, O.M.; Huntington, J. IrrMapper: A Machine Learning Approach for High Resolution Mapping of Irrigated Agriculture Across the Western U.S. Remote Sens. 2020, 12, 2328.
15. Simões, C.J.S.; Júnior, P.S.N. Spatial evolution of irrigated areas using remote sensing–the Medium Paraíba do Sul Valley, Southeast of Brazil. Rev. Ambiente Água 2007, 1, 72–83.
16. Wang, Y.; Zhou, Y.; Wang, S. Parcel-scale Irrigated Farmland Mapping using Time-series Sentinel-2 Images. Water Sav. Irrig. 2023, 11, 91–98.
17. Deines, M.J.; Kendall, D.A.; Hyndman, W.D. Annual Irrigation Dynamics in the U.S. Northern High Plains Derived from Landsat Satellite Data. Geophys. Res. Lett. 2017, 44, 9113–9520.
18. Xu, C.; Lyu, J.; Liu, Y.; Jiang, Y. Remote sensing monitoring method of cultivated land irrigation area in Baojixia Irrigation District based on GEE. J. Drain. Irrig. Mach. Eng. (JDIME) 2022, 40, 1167–1172.
19. Liang, B.; Liu, X.; Hao, Y.; Chu, B.; Tang, Z. Extraction of desert vegetation information based on five vegetation indices. Arid Zone Res. 2023, 40, 647–654.
20. Xiang, X.; Du, J.; Jacinthe, P.A.; Zhao, B.; Zhou, H.; Liu, H.; Song, K. Integration of tillage indices and textural features of Sentinel-2A multispectral images for maize residue cover estimation. Soil Tillage Res. 2022, 221, 105405.
21. Liu, S.; Zhu, H. Object-oriented land use classification based on ultra-high resolution images taken by unmanned aerial vehicle. Trans. Chin. Soc. Agric. Eng. 2020, 36, 87–94.
22. Zhao, Y.; Tian, Z.; Li, W.; Xue, Z.; Zhu, J. Study on the refined classification method of mangrove tree species based on Sentinel-2 MSI images combined with object-oriented. Mar. Sci. Bull. 2023, 25, 43–62.
23. Drăguţ, L.; Eisank, C. Automated object-based classification of topography from SRTM data. Geomorphology 2012, 141–142, 21–33.
24. Wang, Y.; Jin, S.; Dardanelli, G. Vegetation Classification and Evaluation of Yancheng Coastal Wetlands Based on Random Forest Algorithm from Sentinel-2 Images. Remote Sens. 2024, 16, 1124.
25. Mateen, S.; Nuthammachot, N.; Techato, K. Random forest and artificial neural network-based tsunami forests classification using data fusion of Sentinel-2 and Airbus Vision-1 satellites: A case study of Garhi Chandan, Pakistan. Open Geosci. 2024, 16, 20220595.
26. Wei, P.; Ye, H.; Qiao, S.; Liu, R.; Nie, C.; Zhang, B.; Song, L.; Huang, S. Early Crop Mapping Based on Sentinel-2 Time-Series Data and the Random Forest Algorithm. Remote Sens. 2023, 15, 3212.
27. Xia, T.; Ji, W.; Li, W.; Zhang, C.; Wu, W. Phenology-based decision tree classification of rice-crayfish fields from sentinel-2 imagery in qianjiang, china. Int. J. Remote Sens. 2021, 42, 8124–8144.
28. Tariq, A.; Yan, J.; Gagnon, A.; Khan, M.R.; Mumtaz, F. Mapping of cropland, cropping patterns and crop types by combining optical remote sensing images with decision tree classifier and random forest. Geo-Spat. Inf. Sci. 2023, 26, 302–320.
29. Mabula, M.J.; Kisanga, D.; Pamba, S. Application of machine learning algorithms and Sentinel-2 satellite for improved bathymetry retrieval in Lake Victoria, Tanzania. Egypt. J. Remote Sens. Space Sci. 2023, 26, 619–627.
30. Du, B.; Zhang, J.; Wang, Z.; Mao, D.; Zhang, M.; Wu, B. Crop Mapping based on Sentinel-2A NDVI Time Series Using Object-Oriented Classification and Decision Tree Model. J. Geo-Inf. Sci. 2019, 21, 740–751.
31. Zhang, S.; Yang, L.; Ye, D.; Zhang, F.; Bai, Y.; Li, H.; Chen, J.; Fang, K. Extraction and dynamics of planting structure in Hetao Irrigation District of Inner Mongolia from 2000 to 2021 using deep learning. Trans. Chin. Soc. Agric. Eng. (Trans. CSAE) 2023, 39, 142–150.
32. Wu, X.; Hua, S.; Zhang, S.; Gu, L.; Ma, C.; Li, C. Extraction of Winter Wheat Distribution Information Based on Multi-phenological Feature Indices Derived from Sentinel-2 Data. Trans. Chin. Soc. Agric. Mach. 2023, 54, 207–216.
33. Zhu, X.; Ning, X.; Wang, H.; Zhang, H. Land use classification for optimization segmentation based on high-precision land cover data. Sci. Surv. Mapp. 2021, 46, 140–149.
34. Fan, L.; Wang, Y.; Zhu, H.; Zhang, J. Remote Sensing for the Planting Area of Major Grain Crops in Complex Terrain Regions by Integrating Multiple Spectral Indices with Topographic Features. Chin. J. Agrometeorol. 2023, 44, 845–856.
35. Wang, H.; Zhang, Z.; Kang, X.; Lin, J.; Yin, C.; Ma, L.; Huang, C.; Lyu, X. Cotton planting area extraction and yield prediction based on Sentinel-2A. Trans. Chin. Soc. Agric. Eng. 2022, 38, 205–214.
36. Wang, D.; Jiang, Q.; Li, Y.; Guan, H.; Zhao, P.; Xi, J. Land use classification of farming areas based on time series Sentinel-2A/B data and random forest algorithm. Remote Sens. Land Resour. 2020, 32, 236–243.
37. Bai, X. Research of Remote Sensing Image Classification Based on Decision Tree Method. Master’s Thesis, Inner Mongolia Normal University, Hohhot, China, 2013.
38. Balata, D.; Gama, I.; Domingos, T.; Proença, V. Using Satellite NDVI Time-Series to Monitor Grazing Effects on Vegetation Productivity and Phenology in Heterogeneous Mediterranean Forests. Remote Sens. 2022, 14, 2322.
39. Ren, H.; Zhou, G.; Zhang, F. Using negative soil adjustment factor in soil-adjusted vegetation index (SAVI) for aboveground living biomass estimation in arid grasslands. Remote Sens. Environ. 2018, 209, 439–445.
40. Fern, R.R.; Foxley, E.A.; Bruno, A.; Morrison, M.L. Suitability of ndvi and osavi as estimators of green biomass and coverage in a semi-arid rangeland. Ecol. Indic. 2018, 94, 16–21.
41. Wang, N.; Zhou, M.; Wei, X.; Guo, Y. Extraction of vegetation cover and optimization of vegetation indices in a desert hinterland oasis. Bull. Soil Water Conserv. 2022, 42, 197–205+213.
42. Zhang, J.; Zhang, L.; Zhang, X.; Li, J.; Yuan, X. Inversion of Aboveground Biomass of Desert Shrub Vegetation in Junggar Basin. J. Xinjiang Agric. Univ. 2019, 42, 202–209.
43. Cao, W. Measures for Forestry Protection and Development in Xinjiang Area. Agric. Sci. Technol. Equip. 2020, 2020, 22–23.
44. Zhang, Y.; Bian, X.; Zhang, H.; Li, N.; Gao, Q.; Zhang, B. Research on the Application Prospect of Digital Twin in Large Irrigation Area. J. Irrig. Drain. 2022, 41 (Suppl. S2), 71–76.
45. Xiang, J.; Xing, Y.; Wei, W.; Jiang, J.; Mo, D. Dynamic Detection of Forest Change in Hunan Province Based on Sentinel-2 Images and Deep Learning. Remote Sens. 2023, 15, 628.
46. Li, G.; Cui, J.; Han, W.; Zhang, H.; Huang, S.; Chen, H. Crop type mapping using time-series sentinel-2 imagery and u-net in early growth periods in the hetao irrigation district in china. Comput. Electron. Agric. 2022, 203, 107478.
47. Da Costa, L.B.; De Carvalho, O.L.F.; De Albuquerque, A.O.; Gomes, R.A.T.; Guimaraes, R.F.; Abilio, D.C.J.O. Deep semantic segmentation for detecting eucalyptus planted forests in the brazilian territory using sentinel-2 imagery. Geocarto Int. 2022, 37, 6538–6550.
48. do Nascimento Bendini, H.; Fieuzal, R.; Carrere, P.; Clenet, H.; Galvani, A.; Allies, A.; Ceschia, É. Estimating Winter Cover Crop Biomass in France Using Optical Sentinel-2 Dense Image Time Series and Machine Learning. Remote Sens. 2024, 16, 834.
49. Liu, P.; Ren, C.; Wang, Z.; Jia, M.; Yu, W.; Ren, H.; Xia, C. Evaluating the Potential of Sentinel-2 Time Series Imagery and Machine Learning for Tree Species Classification in a Mountainous Forest. Remote Sens. 2024, 16, 293.

Figure 1. Overview map of the study area: location (a) Xinjiang, China; (b) within the Changji Hui Autonomous Prefecture, Xinjiang Autonomous Region, China; (c) within the Santun River Basin; and (d) Santun River irrigation area.

Figure 2. Distribution and number of sample sites in the study area from 2019 to 2023. (a) Distribution of sample points for each year from 2019 to 2023. (b) Total sample points from 2019 to 2023. (c) Number of validation sample points from 2019 to 2023.

Figure 3. Workflow of data preprocessing and data set construction, construction of classification algorithms and ground object classification, and irrigation area extraction in the study area.

Figure 4. Optimal segmentation scale for estimation of scale parameter.

Figure 5. Vegetation index time-series curves for eight time points from 24 March to 25 October, 2023. (a) 2023 Normalized Difference Vegetation Index (NDVI); (b) 2023 Soil-Adjusted Vegetation Index (SAVI); (c) 2023 Optimized Soil-Adjusted Vegetation Index (OSAVI).

Figure 6. Comparison of vegetation index time-series curves of transportation land; towns and villages; industrial and mining land; other lands; and non-irrigated points. (a) 2023 Normalized Difference Vegetation Index (NDVI); (b) 2023 Soil-Adjusted Vegetation Index (SAVI); (c) 2023 Optimized Soil-Adjusted Vegetation Index (OSAVI).

Figure 7. Land use classification diagram for each year from 2019 to 2023.

Figure 9. Accuracy of irrigated area identification in the study area for each year from 2019 to 2023.

Figure 11. Spatial extent (km²) of irrigated area and woodland in the study area for each year from 2019 to 2023.

Table 1. Sentinel-2 image capture schedule.

Serial Number	Original Date	Serial Number	Original Date	Serial Number	Original Date
1	19 April 2019	16	16 August 2020	31	5 October 2021
2	28 June 2019	17	31 August 2020	32	10 October 2021
3	28 July 2019	18	30 September 2020	33	18 April 2022
4	27 August 2019	19	5 October 2020	34	18 May 2022
5	21 September 2019	20	15 October 2020	35	22 July 2022
6	6 October 2019	21	30 October 2020	36	5 September 2022
7	16 October 2019	22	13 April 2021	37	15 November 2022
8	31 October 2019	23	23 May 2021	38	24 March 2023
9	8 April 2020	24	2 July 2021	39	28 April 2023
10	28 April 2020	25	1 August 2021	40	18 May 2023
11	8 May 2020	26	11 August 2021	41	27 June 2023
12	28 May 2020	27	31 August 2021	42	12 July 2023
13	2 June 2020	28	5 September 2021	43	16 August 2023
14	22 June 2020	29	15 September 2021	44	5 October 2023
15	17 July 2020	30	25 September 2021	45	25 October 2023

Table 2. Coordinates of typical irrigation and non-irrigation points.

Typical Sample Point Name	Coordinates
Irrigation point 1	43°55′48″ N, 87°1′24″ E
Irrigation point 2	43°59′37″ N, 87°7′1″ E
Irrigation point 3	44°6′39″ N, 87°19′34″ E
Irrigation point 4	44°10′17″ N, 87°14′50″ E
Non-irrigation point 1	44°8′12″ N, 87°3′34″ E
Non-irrigation point 2	44°0′30″ N, 87°11′18″ E
Non-irrigation point 3	44°13′30″ N, 87°11′7″ E
Non-irrigation point 4	44°0′41″ N, 87°13′8″ E

Table 3. Classification system table of study areas.

Types	Description	Sentinel-2A Image	Amap	Google Earth
01 Land under cultivation	The overall hue and color in the remote sensing images of cultivated land usually appears in rich green, brown, and yellow tones based on different plant phenological stages; the texture and structure in images of cultivated land usually exhibit clear continuous block patterns.
03 Woodlands	The overall tone and color in remote sensing images of forest land usually appear in dark green and light green tones; the texture and structure in images of forest land usually exhibit a clear striped pattern.
11 Land for water and water conservation facilities	Lakes and rivers usually exhibit naturally curved or locally straight shapes; reservoirs usually exhibit a more regular geometry, and dams exhibit elongated shapes that are perpendicular to the flow of a stream or river.
12 Land for transportation	In remote sensing images, transportation corridors often show an obvious linear or network structure. Railways and roads appear as continuous lines with a relatively fixed width in the image.
20 Towns and villages and industrial and mining land	The hue of urban land is usually relatively uniform, the scale of rural residential land is usually small, and the distribution is relatively scattered; remote sensing images of industrial and mining land usually show regular shapes and structures, such as rectangles and circles.
23 Other land	Other land types typically have a relatively uniform hue, with good uniformity of spectral reflectance, and show distinct shapes and individual structures in texture and structure.

Table 4. Feature data sets.

Feature Name	Characteristic Variable	Description/Calculation Formula
Spectral Band	B1, B2, B3, B4, B5, B6, B7, B8, B8A, B9, B10, B11, B12	All raw Sentinel-2 bands.
Shape Feature	Length/width	Aspect ratio of an image object.
	Compactness	Describes the compactness of an image object.
	Density	Describes the distribution of image objects in the pixel space.
	Roundness	Describes how similar an image object is to an ellipse.
	Rectangular Fit	Describes how well an image object matches rectangles of similar size and scale.
Texture	Homogeneity	Measures the smoothness of the image texture. A higher value means that the texture is more evenly distributed in space and that details change less.
	Dissimilarity	Measures how much the gray levels of pixel pairs differ, reflecting the local contrast in an image.
	Entropy	Measures the uncertainty (disorder) of pixel gray levels in an image, reflecting how the object surface is organized and arranged.
	Correlation	Measures the linear correlation between the gray levels of pixels in an image, that is, the trend of gray-level change.
Vegetation Index	SAVI	Soil-Adjusted Vegetation Index [(B8 − B4) × (1 + L)/(B8 + B4 + L)], where L is the soil adjustment factor
	OSAVI	Optimized Soil-Adjusted Vegetation Index [(B8 − B4) × 1.16/(B8 + B4 + 0.16)]
	NDVI	Normalized Difference Vegetation Index [(B8 − B4)/(B8 + B4)]
	NDBI	Normalized Difference Built-Up Index [(B11 − B8)/(B11 + B8)]
	MNDWI	Modified Normalized Difference Water Index [(B3 − B11)/(B3 + B11)]
Red edge index	NDVI_re1	Normalized Vegetation Red Edge1 [(B8A − B5)/(B8A + B5)]
	NDVI_re2	Normalized Vegetation Red Edge2 [(B8A − B6)/(B8A + B6)]
	NDVI_re3	Normalized Vegetation Red Edge3 [(B8A − B7)/(B8A + B7)]

Note: Sentinel-2 bands were used in all vegetation and red-edge index calculations.
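
The index formulas in Table 4 can be sketched per pixel as follows. This is an illustrative example rather than the processing chain used in the study: the function names are hypothetical, reflectance is assumed to be scaled to 0–1, the soil adjustment factor L = 0.5 is an assumed (commonly used) value, and NDVI is computed with the standard near-infrared/red pair (B8, B4).

```python
def normalized_difference(a, b):
    """Return (a - b) / (a + b), or NaN when the denominator is zero."""
    denom = a + b
    return (a - b) / denom if denom != 0 else float("nan")


def spectral_indices(bands, soil_factor=0.5):
    """Compute the vegetation and red-edge indices for one pixel.

    `bands` maps Sentinel-2 band names ("B3", "B4", ...) to reflectance
    values scaled to 0-1 (an assumption of this sketch). `soil_factor`
    is L in the SAVI formula; 0.5 is an assumed default.
    """
    b3, b4, b8, b11 = bands["B3"], bands["B4"], bands["B8"], bands["B11"]
    b8a = bands["B8A"]
    indices = {
        "SAVI": (b8 - b4) * (1 + soil_factor) / (b8 + b4 + soil_factor),
        "OSAVI": (b8 - b4) * 1.16 / (b8 + b4 + 0.16),
        "NDVI": normalized_difference(b8, b4),
        "NDBI": normalized_difference(b11, b8),
        "MNDWI": normalized_difference(b3, b11),
    }
    # Red-edge variants pair the narrow NIR band B8A with B5, B6, and B7.
    for i, name in enumerate(("B5", "B6", "B7"), start=1):
        indices[f"NDVI_re{i}"] = normalized_difference(b8a, bands[name])
    return indices


if __name__ == "__main__":
    # Made-up reflectance values for a single vegetated pixel.
    pixel = {"B3": 0.1, "B4": 0.1, "B5": 0.2, "B6": 0.3, "B7": 0.4,
             "B8": 0.5, "B8A": 0.5, "B11": 0.2}
    for name, value in spectral_indices(pixel).items():
        print(f"{name}: {value:.4f}")
```

The same arithmetic applies element-wise to whole image bands; repeating it for each acquisition date yields the index time series used as classification parameters.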

Table 5. Calculation of irrigated area and statistical data comparison table.

Year	Calculated Irrigated Area (hm²)	Statistical Irrigated Area (hm²)	Difference (hm²)	Relative Error (%)
2023	47,900	48,867	967	1.97
2022	48,908	50,686	1778	3.50
2021	43,411	43,993	582	1.32
2020	42,915	44,746	1831	4.09
2019	44,417	44,526	109	0.24
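
The difference and error columns of Table 5 follow from simple arithmetic on the two area columns. A minimal sketch is given below, assuming that the error is the absolute difference divided by the statistical area; the names `TABLE_5` and `area_error` are hypothetical.

```python
# (year, calculated irrigated area, statistical irrigated area), all in hm²,
# copied from Table 5.
TABLE_5 = [
    (2023, 47_900, 48_867),
    (2022, 48_908, 50_686),
    (2021, 43_411, 43_993),
    (2020, 42_915, 44_746),
    (2019, 44_417, 44_526),
]


def area_error(calculated, statistical):
    """Return (absolute difference in hm², relative error in percent).

    The relative error is taken with respect to the statistical area,
    which is an assumption of this sketch.
    """
    difference = abs(statistical - calculated)
    return difference, 100 * difference / statistical


if __name__ == "__main__":
    for year, calculated, statistical in TABLE_5:
        difference, error = area_error(calculated, statistical)
        print(f"{year}: difference = {difference} hm², error = {error:.2f}%")
```

Under this assumption, the sketch reproduces the tabulated differences, and every yearly error stays below the 5% bound reported in the abstract.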

© 2024 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

Share and Cite

MDPI and ACS Style

Yu, L.; Xie, H.; Xu, Y.; Li, Q.; Jiang, Y.; Tao, H.; Aihemaiti, M. Identification and Monitoring of Irrigated Areas in Arid Areas Based on Sentinel-2 Time-Series Data and a Machine Learning Algorithm. Agriculture 2024, 14, 1693. https://doi.org/10.3390/agriculture14101693
