Article

A Study on the Object-Based High-Resolution Remote Sensing Image Classification of Crop Planting Structures in the Loess Plateau of Eastern Gansu Province

Key Laboratory of Remote Sensing of Gansu Province, Northwest Institute of Eco-Environment and Resources, Chinese Academy of Sciences, Lanzhou 730000, China
* Author to whom correspondence should be addressed.
Remote Sens. 2024, 16(13), 2479; https://doi.org/10.3390/rs16132479
Submission received: 30 May 2024 / Revised: 26 June 2024 / Accepted: 4 July 2024 / Published: 6 July 2024

Abstract: The timely and accurate acquisition of information on the distribution of the crop planting structure in the Loess Plateau of eastern Gansu Province, one of the most important agricultural areas in Western China, is crucial for promoting the fine management of agriculture and ensuring food security. This study uses multi-temporal high-resolution remote sensing images to determine optimal segmentation scales for various crops, employing the estimation of scale parameter 2 (ESP2) tool and the Ratio of Mean Absolute Deviation to Standard Deviation (RMAS) model. The Canny edge detection algorithm is then applied for multi-scale image segmentation. By incorporating crop phenological factors and using the L1-regularized logistic regression model, we optimized 39 spatial feature factors covering spectral, textural, geometric, and index features. Within a multi-level classification framework, the Random Forest (RF) classifier and a Convolutional Neural Network (CNN) model are used to classify the cropping patterns in four test areas based on the multi-scale segmented images. The results indicate that integrating the Canny edge detection algorithm with the optimal segmentation scales calculated using the ESP2 tool and the RMAS model produces crop parcels with more complete boundaries and better separability. Additionally, optimizing spatial features using the L1-regularized logistic regression model, combined with phenological information, enhances classification accuracy. Within the object-based image classification (OBIC) framework, the RF classifier achieves higher accuracy in classifying cropping patterns. The overall classification accuracies for the four test areas are 91.93%, 94.92%, 89.37%, and 90.68%, respectively. By introducing crop phenological factors, this study effectively improves the extraction precision of the fragmented agricultural planting structures in the Loess Plateau of eastern Gansu Province. Its findings have important application value in crop monitoring, crop management, food security, and other related fields.

1. Introduction

Located in Northwest China, the Loess Plateau is a typical traditional agricultural area with low precipitation, where agricultural development is dominated by dry farming [1]. The Loess Plateau of eastern Gansu Province, a typical rain-fed agricultural region, lies in an ecologically fragile zone of China, where precipitation is unevenly distributed in time and space, with large interannual variability and an uneven distribution within the year. Corresponding to the gradual increase in precipitation from northwest to southeast, the crop distribution in this region shows an obvious gradient from southeast to northwest [2]. For example, apples, which need plenty of sunshine and moisture, are mainly grown in the southeast, while buckwheat, which is cold- and drought-tolerant and has a short growing period, is mainly grown in the northwest. Corn and wheat are the main crop types in the Loess Plateau of eastern Gansu Province, accounting for more than 50% of the total cropped area; they are distributed across the plateau, with more in the southeast and less in the northwest. The fine classification of crop planting structures is of great significance for agricultural decision-making, sustainable agricultural development, and food security [3].
With short revisit periods, low costs, wide coverage, and many other advantages, remote sensing has become one of the most popular methods for mapping crop spatial distribution at the regional scale [4,5,6,7]. To date, scholars in China and abroad have conducted in-depth and extensive studies on crop planting structure recognition, mainly around key issues such as the number of remote sensing images, image object features [8,9,10,11], and classifier selection [12,13].
In terms of the number of remote sensing images used, crop recognition methods fall roughly into two categories: those based on single-temporal images and those based on time-series images. The former identify the key growing season of crops [14] and then obtain their spatial distribution information [15,16,17]. At present, medium-spatial-resolution images, represented by Landsat and Sentinel, are among the most widely used remote sensing data [18]. With suitable spatial resolutions and spectral bands, they contribute to increased accuracy in regional crop identification. However, because the crop spectrum is affected by many factors, such as crop type, soil background, farming activities, and cloudy and rainy weather, and because the information obtained from single-temporal images is limited, misclassification or omission frequently occurs in crop identification results, which cannot meet the needs of high-precision crop mapping. Compared with single-temporal image classification, multi-temporal remote sensing images can map crops more accurately by making full use of their phenological characteristics. China has implemented a major project for a high-resolution Earth observation system and successfully launched a series of domestic high-resolution satellites (GF-1, GF-2, and GF-6) [19,20]. The combined use of these domestic high-resolution satellites can effectively improve crop monitoring capability and offers great potential for the fine identification of crops with fragmented distribution in complex areas of the Loess Plateau. For example, Zhou et al. [21] effectively extracted crop information in rainy areas from multi-temporal GF-1 images, and Xia et al. [22] used multi-temporal GF-6 images for intensive observations of crops in cloudy and rainy areas, characterizing the seasonal change patterns of typical crops and achieving high-precision mapping of crop planting structures.
Crop classification methods based on object features mainly include pixel-level classification and object-based image classification. Pixel-level classification focuses on local information and ignores the correlation between objects. Salt-and-pepper noise due to the spectral differences of within-class pixels and the "mixed pixels" resulting from the inter-class pixel proximity effect significantly compromise crop classification accuracy, making it difficult to meet the needs of practical applications [16,17]. With the segmented object as its basic unit, object-based image classification is more targeted and has the advantages of reducing the amount of data involved in computation, smoothing image noise, and further exploiting object shape, texture, and other features for classification. For example, Su et al. [23] made full use of 58 spatial feature factors of crops and completed high-precision regional crop classification using the object-based image classification method, and Karimi et al. [24] achieved high-precision crop classification of multi-temporal Landsat remote sensing images with the object-oriented classification method. Common crop remote sensing classifiers include, but are not limited to, the Maximum Likelihood method, the Support Vector Machine (SVM), and Random Forest [25,26,27]. In recent years, deep learning has gradually become the mainstream approach in image pattern recognition by virtue of its hierarchical feature representation, efficient operation, and end-to-end automatic learning [28,29]. The Convolutional Neural Network (CNN), one of the fastest-developing algorithms in deep learning, has been widely used in crop classification tasks [30].
As one of the prerequisites of object-based image classification, the optimal-scale segmentation of crops has a direct impact on the size of the generated objects and the precision of crop extraction [31]. The estimation of scale parameter 2 (ESP2) tool is the most commonly used means of screening the optimal segmentation scale for different crops [32]. However, given the small spectral differences between crop types and the great influence of noise on segmentation results, it is difficult to separate crops correctly using multi-resolution segmentation alone. Canny edge detection can obtain more accurate edge information and filter noise in less time [33]. Moreover, although object-based image classification combines multiple features, it also increases the dimensionality of the feature space and reduces data-processing efficiency, and redundant features may introduce noise and even lower classification accuracy. For this reason, the efficient construction of the feature space has become a key factor in object-based image classification. For example, Löw et al. [34] used Random Forest (RF) to obtain the best features for crop classification, improving computational efficiency and classification accuracy. Chen et al. [35] compared three feature dimensionality reduction methods for crop features, namely Random Forest (RF), Mutual Information (MI), and L1 regularization, and found that L1 regularization performed best in crop classification.
In summary, object-based classification and multi-temporal feature classification of high-resolution images are the mainstream directions of current studies on crop remote sensing recognition, and how to integrate the two remains a key scientific problem and technical difficulty. Based on time-series remote sensing data from GF-1, GF-2, and GF-6, this paper conducted a case study on four representative test areas from the southeast to the northwest of the Loess Plateau of eastern Gansu Province. It constructed the optimal segmentation scale set of different crops using the ESP2 tool, the RMAS model, and Canny edge detection; performed feature selection with an L1-regularized logistic regression model; carried out the fine classification of crops in the test areas using the object-based and Random Forest methods; and used a Convolutional Neural Network to cross-verify the results, aiming to provide new ideas and methods for the regional classification of rain-fed crops in the Loess Plateau of eastern Gansu Province.

2. Materials and Methods

2.1. Overview of the Study Area

The city of Qingyang is located in the Shaan-Gan-Jin plateau zone of the Loess Plateau in China, also known as the Loess Plateau of eastern Gansu Province. This Loess Plateau gully area in the east of Gansu Province lies in the middle reaches of the Yellow River, at 106°20′–108°45′E and 35°15′–37°10′N, with a total area of 27,119 km² [36]. As typical dry farmland supported by rain-fed agriculture in Northwestern China, the southern part of the city of Qingyang has a mild climate featuring good coordination among light, heat, and water resources, making it the best climatic area for agricultural production in eastern Gansu Province, known as the "Granary of eastern Gansu". The timely and accurate acquisition of crop type, area, and spatial distribution information is of great significance for improving the crop planting structure and implementing the precise management of agricultural production. According to the characteristics of crop distribution, four representative test areas in the city of Qingyang were selected, located in Huanxian County (I), Zhenyuan County (II), Xifeng District (III), and Ningxian County (IV), along the gradient from northwest to southeast. Crops in test areas I and II were mainly corn, wheat, and buckwheat, while those in test areas III and IV were mainly corn, wheat, and apples (Figure 1).
The spatial distribution of precipitation in the Loess Plateau of eastern Gansu Province is characterized by more precipitation in the southeast and less in the northwest. As a result, the spatial distribution of crop planting structures also follows certain rules, with buckwheat and other typical cold- and drought-tolerant crops grown in the northwest and apples and winter wheat in the southeast [37]. Phenology refers to the different growth periods of crops within the growing season, during which the growth state and physiological characteristics of crops translate into different remote sensing features, such as reflectance and vegetation indices, which can be used to recognize different crop types [38]. Therefore, in crop classification and recognition, the accuracy and reliability of classification can be improved by exploiting phenological changes. Under the dual effect of climate and the cropping cycle, crops vary greatly in their phenological periods. For example, winter wheat is usually sown in mid-September and ripens in late June of the following year; corn is sown in mid-April and ripens in late September; buckwheat is sown in mid-July and ripens in early September; and apples blossom in April–May and ripen in September–October (Figure 2).

2.2. Data

High-resolution remote sensing images have attracted wide attention by virtue of their ability to provide more detailed information on objects. Gaofen-1, -2, and -6 are high-resolution remote sensing satellites launched and operated by China in recent years. These satellites, capable of acquiring high-precision remote sensing data [19,39], can be used to study the classification and identification of crop planting structures in the city of Qingyang. The Gaofen-1 satellite, launched in 2013, has a panchromatic image resolution of 2 m and a single-satellite imaging swath width of more than 60 km. The Gaofen-2 satellite, launched in 2014, has a panchromatic image resolution of 1 m and a swath width of more than 45 km. The Gaofen-6 satellite, launched in 2018, has a panchromatic image resolution better than 2 m and a swath width greater than 90 km (Table 1). With good technical indicators and data quality, these high-resolution remote sensing satellites can meet the needs of remote sensing classification and the recognition of crop planting structures in the city of Qingyang.

2.3. Methodology

This experiment classified four typical crops (corn, wheat, buckwheat, and apples) using high-resolution images as the data source. In addition to these crop types, the four test areas also include natural vegetation and artificial construction land, but no lakes or water systems. Considering the different phenological periods of the crops, this study proposes a multi-level crop classification method: images are segmented at multiple levels using different segmentation scale parameters, forming an inherited hierarchical structure between object layers, and classification at different levels is achieved by combining the phenological characteristics of each crop at different periods.
The main steps for multi-level crop classification are as follows: First, create Level 1 based on scale-1 segmentation, distinguishing artificial construction land, bare land, and vegetation areas using the NDVI < 0.15 threshold. In vegetation areas, classify the crops using images of crops at corresponding phenological periods as the base layer. In other areas, re-distinguish artificial construction land, bare land, and vegetation areas according to the threshold conditions. Successively create Level 2 and Level 3 in the vegetation areas, repeating the process until the classification of different crops is completed (Figure 3).
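As a minimal illustration of the Level-1 split described above, the following Python sketch applies the NDVI threshold to NumPy reflectance arrays; only the 0.15 threshold comes from the text, while the function and variable names are illustrative:

```python
import numpy as np

def level1_vegetation_mask(red, nir, threshold=0.15):
    """Level-1 split: True where NDVI >= threshold (vegetation areas passed
    on to Level 2/3 crop classification), False for construction/bare land.
    The 0.15 threshold is the value given in the text."""
    red = np.asarray(red, dtype=float)
    nir = np.asarray(nir, dtype=float)
    ndvi = (nir - red) / (nir + red + 1e-10)  # guard against division by zero
    return ndvi >= threshold

# Toy reflectance patches standing in for real GF imagery.
rng = np.random.default_rng(0)
mask = level1_vegetation_mask(rng.random((64, 64)), rng.random((64, 64)))
print(mask.mean())  # fraction of pixels flagged as vegetation
```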
Based on this multi-level scheme, classification at each layer first determines the optimal segmentation scale parameters for different crops using the ESP2 optimal scale evaluation tool and the RMAS model in eCognition Developer 10.3, with high-resolution images as the data source, and then segments the crops by incorporating the Canny edge detection results from the corresponding phenological periods. Because phenological periods differ significantly among crops, this study selected 40 feature factors, using the change across phenological periods (the ratio of NDVI values between phenological periods) as the phenological feature factor. The optimal spatial features among the remaining 39 spatial feature factors were selected using the L1-regularized logistic regression model. Finally, the phenological feature factor and the selected spatial features were used to support the subsequent crop classification.
This study selected 150 samples each of corn, wheat, buckwheat, and apples, as well as 40 samples of natural vegetation, all of which were obtained through field surveys. To ensure sufficient samples for verification, the samples were divided into the training and test sets in a 1:1 ratio. The RF algorithm was employed to interpret and classify the crop planting structure in four test areas of the city of Qingyang, in conjunction with a CNN and an object-based optimal segmentation scale. Cross-validation was performed for the results of both methods. The confusion matrix, overall accuracy, and Kappa coefficient were used to evaluate the accuracy of the two model algorithms. Figure 4 shows the technical flow chart of this test.

2.3.1. Canny Edge Detection

The Canny edge detection algorithm aims to ensure that the detected edge is close to the actual edge and to maintain its continuity. It minimizes break-points and effectively controls the probability of misses and false detection. The steps of the algorithm are as follows:
Images usually contain noise, which produces relatively large gradient values at nearby pixels and is easily misidentified as edges. To address this, a Gaussian smoothing filter is employed to suppress the noise [40]. The core of the Gaussian smoothing filter is the Gaussian function, expressed as follows:
$$G(x, y) = \frac{1}{2\pi\sigma^{2}} \exp\left(-\frac{x^{2} + y^{2}}{2\sigma^{2}}\right)$$
wherein (x, y) denotes the pixel coordinates within the Gaussian kernel and σ denotes the standard deviation.
Next, the first-order derivatives of the image in the X and Y directions are calculated using a 2 × 2 convolution kernel to determine the gradient at each pixel, yielding the gradients $E_x$ and $E_y$ [41]. The gradient magnitude M and gradient direction θ of each pixel are then calculated as follows:
$$M = \sqrt{E_x^{2} + E_y^{2}}$$
$$\theta = \arctan\left(\frac{E_y}{E_x}\right)$$
To ensure the accuracy of edge detection, non-maximum suppression is necessary [42]. This step scans the gradient magnitude map and retains only the local maxima along the gradient direction; near these maxima there may be areas called "ridge strips", whose non-maximum pixels must be set to zero so that only the pixel with the largest local gradient is retained. Edge detection following the above steps identifies the true edges of an image with relative precision, but some edge pixels are still affected by various factors. To avoid false edges, a hysteresis thresholding strategy is adopted, which uses dual-threshold detection to filter out spurious edge pixels [43,44].
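These steps (Gaussian smoothing, gradient computation, non-maximum suppression, and dual-threshold hysteresis) are bundled in common image libraries. A minimal sketch with scikit-image follows, where the sigma and threshold values are illustrative and random data stands in for a real image band:

```python
import numpy as np
from skimage import feature

# Illustrative single band (e.g., NIR) of a pan-sharpened GF scene,
# replaced by random data here so the sketch runs standalone.
rng = np.random.default_rng(0)
band = rng.random((512, 512))

# feature.canny() performs Gaussian smoothing (sigma), gradient computation,
# non-maximum suppression, and dual-threshold hysteresis in one call.
edges = feature.canny(band, sigma=2.0, low_threshold=0.05, high_threshold=0.15)

# The boolean edge map can then be exported as an extra image layer and
# weighted alongside R/G/B/NIR in multi-resolution segmentation.
print(edges.shape, edges.dtype)
```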

2.3.2. Image Segmentation

Unlike single-scale segmentation, multi-resolution segmentation (MRS) is an image segmentation algorithm that forms multiple geographic object layers by fully utilizing information such as object structure, texture, spatial characteristics, and the relationships between adjacent objects, aiming to better reflect the actual physical form of land cover in remote sensing images [45,46]. This study employs the MRS algorithm of eCognition Developer 10.3, whose core parameters include the layer weights, the segmentation scale, and the homogeneity criterion parameters, with the latter two having the most significant influence on segmentation results. During execution, the MRS algorithm merges pixels from the bottom up, optimizing the segmentation according to homogeneity and heterogeneity criteria and achieving pixel aggregation at different scales within the same image.

2.3.3. Optimal Segmentation Scale Estimation

Developed by Drǎguţ et al. [47], the estimation of scale parameter 2 (ESP2) is a tool used to estimate the maximum heterogeneity in image segmentation results at different scales. It evaluates the rate of change of the mean local variance of the segmentation results across scales, with peaks of the rate-of-change curve indicating the maximum heterogeneity of the image objects. Because of the diversity of images, ESP2 often yields multiple candidate values for the optimal segmentation scale [48]. Therefore, a quantitative evaluation of the segmentation results using the rate of change (ROC) is required to assess their reliability and accuracy; the optimal segmentation scale is the one at which the ROC curve reaches its maximum. The ROC is calculated as follows:
$$ROC = \left[\frac{L - L_{-1}}{L_{-1}}\right] \times 100$$
wherein L denotes the mean local variance at the target level and $L_{-1}$ denotes the mean local variance at the next lower level, which serves as the base.
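A direct transcription of this formula, assuming the mean local variance values per candidate scale have been exported from ESP2 (the numbers below are toy values):

```python
import numpy as np

def lv_roc(local_variance):
    """Rate of change (ROC, %) of mean local variance between consecutive
    candidate scales (smallest scale first); peaks of the returned curve
    mark candidate optimal segmentation scales."""
    lv = np.asarray(local_variance, dtype=float)
    return (lv[1:] - lv[:-1]) / lv[:-1] * 100.0

# Toy LV series for a sequence of increasing segmentation scales.
print(lv_roc([14.1, 15.0, 16.4, 16.9, 18.6, 18.9, 19.1, 20.4]))
```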
For the spatial and spectral feature information of medium- and high-resolution remote sensing images, the internal homogeneity and heterogeneity of objects can be measured from the standard deviation and the absolute mean difference to neighbors. The Ratio of Mean Difference to Neighbors (ABS) to Standard Deviation (RMAS) between segmented objects and their neighborhoods quantifies the quality of the segmentation results [49]. Using the RMAS as the objective function, this method selects the optimal scale from segmentation results of different sizes. When a segmented object is internally similar in spectral and textural characteristics, its internal standard deviation is small and it is highly distinguishable from adjacent objects; at this point the RMAS reaches its maximum, and the corresponding scale is considered the optimal segmentation scale.
$$RMAS = \sum_{L=1}^{f} w_{L} \frac{\Delta C_{L}}{S_{L}}$$
$$\Delta C_{L} = \frac{1}{k} \sum_{i=1}^{m} \sum_{j=1}^{m} k_{ij} \left| \overline{C_{L}} - C_{Li} \right|$$
$$S_{L} = \sqrt{\frac{1}{n-1} \sum_{i=1}^{n} \left( C_{Li} - \overline{C_{L}} \right)^{2}}$$
wherein L denotes the band index, f the number of bands, $w_L$ the weight of band L, $\Delta C_L$ the absolute difference between an image object in band L and the mean of its neighborhood, $S_L$ the standard deviation of the image object in band L, $C_{Li}$ the gray value of the ith pixel in band L, $\overline{C_L}$ the mean of band L, n the number of pixels in the segmented object, m the number of objects adjacent to the segmented object, k the boundary length of the segmented object, and $k_{ij}$ the length of the common border between the ith image object and the jth adjacent object.
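A sketch of the RMAS computation for one segmented object, assuming the per-band neighbor differences and internal standard deviations have already been exported (e.g., from eCognition); the function name and toy values are illustrative:

```python
import numpy as np

def rmas(delta_c, std, weights=None):
    """RMAS for one segmented object: the weighted sum over bands of
    Delta C_L / S_L, following the formulas above.

    delta_c : per-band mean absolute difference to neighboring objects.
    std     : per-band internal standard deviation of the object.
    weights : per-band weights w_L (equal weights by default)."""
    delta_c = np.asarray(delta_c, dtype=float)
    std = np.asarray(std, dtype=float)
    w = np.ones_like(delta_c) if weights is None else np.asarray(weights, float)
    return float(np.sum(w * delta_c / (std + 1e-10)))

# Toy 4-band example; the scale whose segmentation maximizes the mean RMAS
# over all objects is taken as the optimal scale for that crop.
print(rmas([12.3, 10.8, 9.6, 21.4], [5.1, 4.7, 4.0, 8.8]))
```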

2.3.4. Feature Factor

Because crops are diverse, different crops often exhibit similar spectral characteristics at certain stages. Using the Normalized Difference Vegetation Index (NDVI) as a phenological characteristic makes it possible to screen reference-date images for specific crops, serving as a feature factor that effectively distinguishes crops at specific stages [50]. This study introduced the phenological characteristics of different crops as key factors for classification. Additionally, the study established a comprehensive object feature set by synthesizing the spectral, texture, and geometric features of the images [51,52]. It preliminarily selected 39 spatial feature factors covering four kinds of features, namely spectrum, texture, geometry, and index, detailed as follows:
(1)
Spectral features (SPECs): the means of the four spectral bands, namely the red band (Mean_R), green band (Mean_G), blue band (Mean_B), and near-infrared band (Mean_NIR), together with the maximum difference value (Max_diff), the brightness value (Brightness), and the standard deviation (Std) of the different bands.
(2)
Texture features (GLCMs, GLDVs): texture features refer to the spatial relationships between the gray levels of adjacent pixels and reflect a regional characteristic rather than that of a single pixel, determined by the distribution of a given pixel and its neighbors. The most common texture descriptors are the gray-level co-occurrence matrix (GLCM) and the gray-level difference vector (GLDV). This paper selects the all-direction GLCM Mean, GLCM Entropy, GLCM Homogeneity, GLCM Standard Deviation, GLCM Dissimilarity, GLCM Contrast, GLCM Angular 2nd Moment, and GLCM Correlation, as well as the all-direction GLDV Mean, GLDV Entropy, GLDV Contrast, and GLDV Angular 2nd Moment.
(3)
Geometric features (GEOMs): a total of 13 shape and extent features of objects, including area, length/width ratio, length, width, border length, shape index, density, asymmetry, roundness, border index, compactness, elliptic fit, and rectangular fit.
(4)
Index features (INDEs): including the Enhanced Vegetation Index (EVI), the Normalized Difference Vegetation Index (NDVI), the red/green ratio (R/G), and the Ratio Vegetation Index (RVI).
The details of these are shown in Table 2.
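As one concrete example of the index features listed above, the following sketch assumes the standard formulations of these indices (the exact definitions used in the study are those of Table 2); the EVI coefficients follow the common convention, and the inputs are illustrative object-mean reflectances:

```python
import numpy as np

def index_features(blue, green, red, nir):
    """Index features (INDEs) per object, assuming standard formulations."""
    blue, green, red, nir = (np.asarray(b, dtype=float)
                             for b in (blue, green, red, nir))
    eps = 1e-10  # guard against division by zero
    return {
        "NDVI": (nir - red) / (nir + red + eps),
        "EVI": 2.5 * (nir - red) / (nir + 6.0 * red - 7.5 * blue + 1.0),
        "R/G": red / (green + eps),
        "RVI": nir / (red + eps),
    }

# Illustrative object-mean reflectances (blue, green, red, NIR).
print(index_features(0.05, 0.08, 0.06, 0.45))
```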
Taking the texture, shape, geometric, index, and other feature factors, and referring to the actual crop types in the segmentation results, this study uses the L1-regularized logistic regression model to optimize the features for each crop. By constraining the model weights, L1 regularization generates a sparse weight matrix and lowers the order of the polynomial to a reasonable range. In regularized models, variables with non-zero coefficients are considered important. This regularization method helps effectively reduce the complexity and instability caused by high-dimensional input, mitigate the risk of overfitting, and screen out important variables of practical significance. For a given dataset $D = \{(x_1, y_1), (x_2, y_2), \ldots, (x_m, y_m)\}$, an embedded feature selection model based on L1 regularization can be expressed as:
$$\min_{\omega} \sum_{i=1}^{m} \left( y_i - \omega^{T} x_i \right)^{2} + \lambda \left\| \omega \right\|_{1}$$
wherein ω is the coefficient vector of the linear fit, λ is the regularization parameter, and $\|\omega\|_1$ is the L1 norm of the coefficient vector.
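A sketch of this selection step with scikit-learn, using synthetic stand-ins for the per-object feature matrix and crop labels; the C = 0.9 and 1000-iteration settings follow those reported in Section 3.2:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.preprocessing import StandardScaler

# Synthetic stand-ins: a (n_objects, 39) matrix of spatial feature factors
# per segmented object and the corresponding crop labels.
rng = np.random.default_rng(0)
X = rng.normal(size=(500, 39))
y = rng.integers(0, 4, size=500)  # four crop classes

X_std = StandardScaler().fit_transform(X)

# The L1 penalty drives the coefficients of uninformative features to
# exactly zero, leaving a sparse weight matrix.
clf = LogisticRegression(penalty="l1", solver="liblinear", C=0.9, max_iter=1000)
clf.fit(X_std, y)

selected = np.any(clf.coef_ != 0.0, axis=0)  # one coefficient row per class
print("retained feature indices:", np.flatnonzero(selected))
```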

2.3.5. Random Forest Classification

The Random Forest (RF) classifier is a non-parametric classifier based on random sampling and multiple independent decisions. It predicts the target category by building several independent decision trees, each constructed from a sub-training set drawn by bootstrap sampling of all training samples, with a random subset of features considered at each split. The classification error of each tree is estimated from the samples left out of its sub-training set (the out-of-bag samples). During classification, the RF classifier synthesizes the independent predictions of all decision trees and determines the most likely category of the target by voting. Two parameters are required to build an RF classifier: (1) the number of features n considered at each split, and (2) the total number of decision trees k. In this experiment, k is set to 100 and n to a single randomly selected splitting variable, with the goal of minimizing the generalization error and the correlation between trees, and preventing overfitting in the classification process [53].
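A sketch of this configuration with scikit-learn, using synthetic stand-ins for the optimized feature matrix and labels; n_estimators=100 and max_features=1 mirror the k and n settings above, and the 1:1 split mirrors the sampling scheme described in Section 2.3:

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

# Synthetic stand-ins for the optimized per-object features and crop labels.
rng = np.random.default_rng(0)
X = rng.normal(size=(640, 12))
y = rng.integers(0, 5, size=640)

# 1:1 training/test split, as in the sampling scheme described earlier.
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.5, random_state=0)

# k = 100 trees; max_features=1 corresponds to a single randomly selected
# splitting variable per node; out-of-bag samples estimate the tree error.
rf = RandomForestClassifier(n_estimators=100, max_features=1,
                            oob_score=True, random_state=0)
rf.fit(X_tr, y_tr)
print("OOB accuracy:", rf.oob_score_)
print("test accuracy:", rf.score(X_te, y_te))
```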

2.3.6. Convolutional Neural Network

A Convolutional Neural Network (CNN) is a mathematical processing model that simulates the structure and function of biological neural networks. With a shared convolutional kernel, a CNN has an excellent high-dimensional data processing capacity, and reduces the need for complex operations such as data preprocessing and additional feature extraction. It boasts efficient automatic feature extraction, a strong self-learning ability, and high prediction precision and accuracy [54].
A CNN is mainly composed of convolutional layers, pooling layers, and fully connected layers. As the core of a CNN, the convolutional layer performs convolution operations on the input matrix with several different convolution kernels, accounts for most of the computation in the network, and obtains feature information from the data via an activation function such as the Rectified Linear Unit (ReLU) [55]. The output of the nth feature map of the ith layer is expressed as:
$$X_{n}^{i} = f\left( \sum_{t \in M_{n}} X_{t}^{i-1} \ast K_{tn}^{i} + b_{n}^{i} \right)$$
wherein $M_n$ denotes the nth convolution region of the (i−1)th layer, $X_t^{i-1}$ denotes an element in this region, $K_{tn}^{i}$ denotes the weight matrix of the convolution kernel, and $b_n^{i}$ denotes the bias term of the network.
The pooling layer, i.e., the subsampling layer, is used to reduce the model weight parameters, enable network lightweighting on the premise of retaining features, speed up the operation, avoid overfitting, and output new feature information with less data [56]. Features extracted from sampling at the pooling layer are expressed as follows:
$$X_{n}^{i} = f\left( w_{n}^{i} \cdot \mathrm{down}\left( X_{t}^{i-1} \right) + b_{n}^{i} \right)$$
wherein $w_n^{i}$ denotes the weight matrix of the network and $\mathrm{down}(X_t^{i-1})$ denotes the subsampling function. After features are extracted through multiple convolutional and pooling layers, the fully connected layer assembles and classifies all local features through its weight matrix. Hidden layers contained in the fully connected part can effectively improve the model's generalization ability.
Of the samples from the test area, 50% are randomly selected as training samples, with each sample sized at 64 × 64 pixels. To maintain the high-resolution features of the images, we only rotated the images to augment the samples, with the rotation angle incremented every 30° within a range of 0–330°. After extensive testing, we found that the best results are achieved by setting the number of hidden layers to 3 and the convolution kernel size to 5. After extracting crops with the CNN, we obtain a heatmap of the classification results; setting the membership function range to (0, 1), we derive the final crop classification by applying membership-degree classification to the CNN heatmap in conjunction with object-based multi-resolution segmentation.
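A PyTorch sketch consistent with the settings reported above (4-band 64 × 64 patches, three hidden convolutional layers, 5 × 5 kernels); the channel widths, pooling choice, and classifier head are assumptions rather than the authors' exact architecture:

```python
import torch
import torch.nn as nn

class CropCNN(nn.Module):
    """Patch classifier sketch: three hidden convolution blocks with 5x5
    kernels on 64x64, 4-band (R/G/B/NIR) input patches."""

    def __init__(self, n_classes: int, in_bands: int = 4):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(in_bands, 32, kernel_size=5, padding=2),
            nn.ReLU(), nn.MaxPool2d(2),   # 64 -> 32
            nn.Conv2d(32, 64, kernel_size=5, padding=2),
            nn.ReLU(), nn.MaxPool2d(2),   # 32 -> 16
            nn.Conv2d(64, 128, kernel_size=5, padding=2),
            nn.ReLU(), nn.MaxPool2d(2),   # 16 -> 8
        )
        self.classifier = nn.Linear(128 * 8 * 8, n_classes)

    def forward(self, x):
        x = self.features(x)
        return self.classifier(x.flatten(1))

# Rotation augmentation as in the text: each training patch is rotated
# every 30 degrees within 0-330, i.e., angles = range(0, 360, 30).
model = CropCNN(n_classes=5)
logits = model(torch.randn(2, 4, 64, 64))  # smoke test on a dummy batch
print(logits.shape)  # (batch, n_classes)
```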

2.3.7. Accuracy Evaluation

To evaluate the results of the two models, this study adopts both qualitative and quantitative approaches. Qualitative evaluation visually compares and cross-verifies the classification results of the two models. Quantitative evaluation assesses the accuracy of the classification results using no fewer than 50% of the field-validated samples within the segmentation results of the test area.
Due to the uneven quantity of samples across different categories, the confusion matrix based on classification results was used to evaluate the accuracy, and the overall accuracy (OA) and Kappa coefficient indexes were calculated. The Kappa coefficient and the OA reflect the overall classification performance, and their values can be calculated using the confusion matrix. The greater the indicator value, the higher the classification accuracy of the category [57]. These indicators are calculated using the following formulas:
$$OA = \frac{\sum_{i=1}^{K} N_{ii}}{N_{total}} \times 100\%$$
$$Kappa = \frac{N_{total} \sum_{i=1}^{K} N_{ii} - \sum_{i=1}^{K} N_{i+} N_{+i}}{N_{total}^{2} - \sum_{i=1}^{K} N_{i+} N_{+i}}$$
wherein $N_{ii}$ denotes the number of correctly classified samples of Category i, $N_{i+}$ the number of reference samples of Category i, $N_{+i}$ the number of samples predicted as Category i, $N_{total}$ the total number of samples, and K the total number of categories.
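A direct transcription of the two formulas, taking any K × K confusion matrix (rows as reference classes, columns as predicted classes); the 3 × 3 matrix below is a toy example:

```python
import numpy as np

def oa_and_kappa(cm):
    """Overall accuracy (%) and Kappa coefficient from a K x K confusion
    matrix, following the two formulas above."""
    cm = np.asarray(cm, dtype=float)
    n_total = cm.sum()
    diag = np.trace(cm)                                  # sum of N_ii
    marginal = (cm.sum(axis=1) * cm.sum(axis=0)).sum()   # sum of N_i+ * N_+i
    oa = diag / n_total * 100.0
    kappa = (n_total * diag - marginal) / (n_total ** 2 - marginal)
    return oa, kappa

# Toy 3-class confusion matrix.
print(oa_and_kappa([[50, 2, 1], [3, 45, 2], [0, 4, 43]]))
```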

3. Results and Analysis

3.1. Segmentation Results Combined with Canny Edge Detection

When the object information is complex, multi-resolution segmentation often performs poorly; for example, the segmentation results of a crop may contain more noise and have fuzzy outlines. To address this issue, our study uses the edge information detected by the Canny operator as a feature factor in multi-resolution segmentation.
This method preliminarily determines the optimal segmentation scale range of the crops using the ESP2 tool before segmentation, and then calculates the optimal segmentation scale for each crop using the RMAS model. The ESP2 tool uses the local variance (LV) of object homogeneity and its rate of change (ROC) as the quantitative evaluation indicators for screening, with the shape and compactness factors set to 0.1 and 0.5, respectively. To balance the relative contributions of each band and the edge detection layer, the weights of the R, G, B, and NIR bands and the Canny edge detection layer are all set to 1. More reasonable segmentation results are obtained when the ROC of the local variance reaches its maximum; therefore, this study selected the segmentation scales corresponding to ROC peaks as candidate optimal segmentation scales, while prioritizing the LV indicator (Figure 5). The candidate optimal segmentation scales from preliminary screening were 35, 45, 70, 95, 105, 125, 135, 170, and 220. After repeated tests at scale increments of 5, the optimal crop segmentation scale range was determined to be between 35 and 95. Within this range, we calculated the RMAS values of the different crops in turn and took the scale with the maximum RMAS value as the optimal segmentation scale, as shown in Figure 6. The final optimal segmentation scale is 35 for buckwheat, 65 for wheat and apples, and 55 for corn.
Multi-resolution segmentation results with and without Canny edge information are compared at the optimal segmentation scale (Figure 7). The comparison shows that, at the same segmentation scale, image objects obtained by multi-resolution segmentation alone cannot effectively distinguish adjacent features. When object edge information participates in segmentation, the objects are more complete, with clearer outlines, stronger separability, and better segmentation quality, in addition to reduced salt-and-pepper noise.

3.2. Feature Factor Optimization

This study compiled statistics on the NDVI values of different crops and the NDVI ratios of adjacent months to track the growth status and changes of the crops. The NDVI trend and the month with the largest rate of change were used to select reference images for crop classification. As indicated by the statistical calculations (Figure 8), the reference images for wheat, corn, apples, and buckwheat were selected in May, July, August, and September, respectively. On this basis, the NDVI ratios between phenological periods were used as feature factors to facilitate the classification and interpretation of crops and improve classification accuracy.
Selecting higher-dimensional features leads to redundancy among them. To reduce redundant computation, this study uses the L1-regularized logistic regression model [58] to quantitatively analyze the 39 features. After many tests, we found that better results are obtained by setting the regularization parameter C to 0.9 and the number of iterations to 1000. Because feature factors vary between crops in the classification process, we optimized the feature factors for each crop and compiled statistics on the contribution rates of the feature factors for the different crop categories. Figure 9 shows the feature factors of the different crops screened using the L1-regularized logistic regression model.
Statistics show that the number of feature factors for each of the four crops is greatly reduced after optimization, with texture and spectral features accounting for a large proportion, followed by geometric and index features. The numbers of feature factors selected for corn, wheat, apples, and buckwheat are 9, 12, 10, and 16, respectively. As shown in Figure 9, corn has a total of nine auxiliary feature factors, dominated by the texture and spectral feature factors GLDV_Ang_2 and Mean_NIR, with weights of 3.89 and 1.24, respectively. Since corn ripens mainly in July and August, it appears deep red in false-color images during this period, with obvious texture features. The GLCM Angular 2nd Moment reflects the uniformity of image texture and, together with the reflection characteristics of the near-infrared band, can be used as a cofactor to identify corn. There are a total of 13 auxiliary classification feature factors for wheat, dominated by spectral and texture factors. Its main spectral features are Mean_NIR and Max_diff, with weights of −3.95 and 1.69, respectively. During the growing season of wheat, which mainly spans March to May, wheat appears light red with an even texture, making it easier to distinguish from other crops; in addition to the high near-infrared reflectance, the maximum band difference value effectively supports wheat classification. Its main texture feature is GLDV_Mean_, with a weight of −1.18. The features of apples are dominated by GLCM_StdDe and Standard_NIR, with weights of −1.61 and −1.97, respectively. During the growing season of apples, which mainly spans August to September, the texture of apple orchards is more distinctive than that of the other crops, and changes in the band standard deviation reflect apple information well. The dominant factors of buckwheat are the RVI, GLDV_Mean_, Layer_NIR, and GLCM_Dissi, with weights of 3.23, 2.54, −2.14, and 2.82, respectively. This is because buckwheat mainly ripens in September and October, which is also the ripening season of apples and corn; buckwheat, however, is more delicate in texture than the other crops and appears pink in false-color images, while the other crops appear darker. The RVI, which reflects the growth state of crops, has the highest weight. Additionally, Mean_NIR among the spectral features is more effective for classification because crop reflectance is usually higher in the near-infrared band.

3.3. Object-Based Crop Classification Results

The classification results from the four typical test areas (Figure 10) indicate that both the RF and CNN models categorize crops into regular blocks, which align with the orderly distribution of crop planting structures in the test plots. To further assess the classification accuracy of the different models, field sample data were used for accuracy validation. The confusion matrix evaluation method was employed, with the overall accuracy and Kappa coefficient as evaluation parameters to assess the accuracy of the RF and CNN models. The accuracy validation results for the four test areas are shown in Table 3.
According to Table 3, both the RF and CNN models show high accuracy in classifying crops in test area I. Overall, whether measured by the Kappa coefficient or the overall classification accuracy, the RF model outperforms the CNN model in all four test areas. In test area I, the Kappa coefficient calculated for each crop category is higher for the RF model than for the CNN model; overall, the Kappa coefficients of the RF and CNN models are 0.89 and 0.87, respectively, and the overall accuracy of both models exceeds 90%. Test area II has the same crop categories as test area I but a smaller planting area of buckwheat. The Kappa coefficient for buckwheat classification by both models is smaller than that for the other two crops because buckwheat is planted over a small area with insufficient training samples; with so few samples, model overfitting occurs, resulting in low accuracy [59]. Nevertheless, both the RF and CNN models achieve high overall classification accuracies of 94.92% and 93.43%, respectively. Test areas III and IV, dominated by wheat, corn, and apples, also feature high overall classification accuracies: in test area III, the overall accuracy of the RF and CNN models reached 89.37% and 88.94%, respectively, while in test area IV it was 90.68% and 90.18%, respectively.
This paper compares the accuracy of the results of the two methods and provides statistics on the classified area and proportion for the four test areas (Figure 11, Figure 12, Figure 13 and Figure 14). In test area I, among wheat, corn, and buckwheat, corn has the largest planting area in the classification results: the area classified by the RF model is 0.56 km², accounting for 22.74%, and that of the CNN model is 0.46 km², accounting for 18.54%. Test area II has the same crop planting structure as test area I, and corn likewise has the largest classified area: 0.86 km² (34.84%) for the RF model and 0.92 km² (37.19%) for the CNN model. Buckwheat has the smallest area, with 0.08 km² and 0.20 km² classified by the RF and CNN models, accounting for 3.12% and 7.94%, respectively. In test area III, corn again has the largest area of any crop, with 0.73 km² and 0.77 km² classified by the RF and CNN models, respectively, while the apple areas classified by the two models are 0.43 km² and 0.33 km², respectively. In these results, the corn area classified by the RF model is larger than that of the CNN model, while the apple area classified by the RF model is smaller. According to field investigation statistics, corn and apples are both widely planted in this area; the main reason for the discrepancy is that both mature in August and September and appear similar in false-color remote sensing images during this period. Compared with the RF model, the CNN model can better adapt to complex data patterns. In addition to the crops studied, other vegetation also exists in the test area, and the interference of this diverse and complex vegetation in the classification process prevents the RF model from fully classifying the crops. In test area IV, apples have the largest area, with 0.80 km² and 1.02 km² classified by the RF and CNN models, respectively. As can be seen from the classification result map, the independent apple blocks classified by the CNN model are larger. According to the field investigation and the interpretation of satellite remote sensing images, the planting structures in this test area are regularly distributed in blocks; as the CNN model grouped adjacent crops into a unified whole, this may be the main reason why its classified apple area is larger than that of the RF model. Overall, the classification results in the four test areas vary because the two models use different algorithms, and the difference is particularly significant in the test areas where corn and apples are planted. However, in terms of the spatial distribution and quantities of the classified crops, the results of the CNN and RF models are highly consistent and coherent.

4. Discussion

4.1. Evaluation of the RF and CNN Model Results

By virtue of its excellent robustness, the RF model can effectively process noisy data. Although accurate crop classification results have been obtained with this model [60,61], the crops in those study areas are usually simple in planting structure or identical in their growing periods, making the classification targets relatively easy to distinguish. Building on scale segmentation and feature factor selection, this study newly introduced the phenological feature factor to effectively distinguish the crop planting structures in the study area. Similarly, the CNN model, with its strong learning capacity, not only enables the autonomous learning of crop features but also captures complex nonlinear relationships for the effective classification of crops with complex features [62,63]. Nonetheless, the CNN model still requires improvement in accurately delineating crop boundaries at different scales; by combining the CNN model with multi-scale segmentation, we achieved the fine classification of crop planting structures.
Both models exhibit good applicability in interpreting crop planting structures, and the classification results largely correspond to the actual distribution. The classification results demonstrate good separability, with fewer instances of misclassification and omission, allowing for better extraction of crop information in areas with higher heterogeneity. In comparison, the accuracy of the RF model is slightly higher than that of the CNN model overall, and the RF-based classification results are closer to the actual distribution of crops in the study area. The results obtained through both approaches effectively distinguish corn and apples, categories with small spectral differences, and overcome salt-and-pepper noise, making the boundary information of different crops clearer and the results more accurate. At the same time, the CNN model demands more computing power and a larger sample size and is better suited to vegetation classification problems at larger scales with more sample data. Limited by the small size of the test areas and the insufficient sample data, the accuracy of the CNN remains lower. Moreover, Lin et al. [64] also found that for the fine classification of vegetation in small-scale areas with small samples, traditional machine learning methods perform better than deep learning methods. Although the RF model produced better classification results than the CNN model in the four test areas, this study still has limitations, such as its small sample size and the lack of transferable remote sensing image datasets; as a result, the fine classification of crops in small areas based on the CNN model fails to fully leverage its advantages. Nevertheless, deep learning approaches have great potential for image classification [65]. On the one hand, traditional machine learning approaches require the manual selection of samples for feature extraction, while deep learning automatically extracts high-level semantic features and performs better on large-scale datasets [66]. On the other hand, traditional machine learning approaches require the manual selection of training samples for each classification and generalize poorly, while the features learned by deep learning approaches generalize better and are more robust. Therefore, we can expand crop classification datasets in the future and conduct research based on better deep convolutional networks to leverage the value of deep learning for crop classification in more regions.

4.2. Evaluation of the Effect of Combining Scale Segmentation with Feature Optimization

Compared with traditional pixel-based crop classification studies, object-based image classification fully considers the spectral, texture, and geometric features of different crops, greatly improving classification accuracy [67,68]. However, to extract crop features accurately, it is important to better segment and effectively distinguish the different crops in the image. Despite good results from multi-scale segmentation algorithms, issues such as misclassification, omission, and the loss of detail caused by low image resolution still occur [69,70]. In addition, only single-season crops in the study area were classified. This study proposes integrating the crop edge profile information obtained by the Canny edge detection algorithm into the segmentation process. Crop edge profile information can serve as a cofactor to classify crop planting structures more accurately, but different crops require different segmentation scales, and crop segmentation results at a single scale often show over-segmentation or under-segmentation. The ESP2 optimal scale evaluation tool combined with the RMAS model can effectively segment the contour information of different crops. Figure 15 shows the false-color original images of the different crops, and Figure 16 shows the segmentation results at the optimal scale for each crop combined with Canny edge detection. As the figures show, the contour edges of the different crops are relatively clear, and the results are consistent with the actual ground objects. The classification results based on multi-resolution segmentation are further improved under the multi-level classification framework.
Due to the similar false-color images of corn and apples, the texture features of both crops occupy a high proportion in the feature factors. According to the statistics of the GLCM Ang. 2nd Moment feature factor values of corn in the four test areas (Figure 17), test area I has the smallest value, and test area IV has the largest. This is because the Loess Plateau of eastern Gansu Province is a rain-fed crop area with better soil quality and fertility than the northwest region, creating better conditions for crop growth and resulting in larger single-crop areas in the southeast. Therefore, the area of a single corn block in test area I is smaller than that in test area IV. The larger block contains more surface details and texture information, making the texture features rougher [71,72]. Test area II shows consistency with test area III in terms of its size and has a corn area slightly smaller than that in test area III, resulting in a slightly larger feature factor value. The dominant texture feature factor of apples is GLCM_StdDe. Similarly, test area IV has more pixels, with a significant difference in texture features, and a relatively large standard deviation. Hence, the texture feature value of test area IV is larger than that of test area III.
Furthermore, crop features enhance the interpretability of crops. Classifying crops with well-selected features not only reduces the dimensionality of the feature space and the computational complexity but also improves classification accuracy [73]. However, in most studies, the features retained after dimensionality reduction are applied jointly to the classification of all crops [68]. In this study, the L1-regularized logistic regression model was used to select the best feature factors for each crop individually; these, together with the phenological feature factor, laid the foundation for using feature factors to assist crop classification.

5. Conclusions

This study selected four typical test areas in the city of Qingyang. It employed the optimal segmentation scales calculated by the ESP2 tool and the RMAS model, combined with Canny edge detection, to segment the crops in the study area, while selecting optimal spatial features with the L1-regularized logistic regression model. It also incorporated phenological feature information and analyzed the classification results of the RF and CNN models under the OBIC framework using multi-level classification. The key conclusions are as follows:
(1)
The multi-resolution segmentation that integrates the Canny edge detection algorithm helps improve the boundary integrity and separability of segmented objects. In addition, the best segmentation results of corn, buckwheat, wheat, and apples are obtained at the segmentation scales of 55, 35, 65, and 65, respectively.
(2)
The redundancy of feature factors for different crops has been greatly reduced after optimization. The best classification results are achieved by combining phenological feature factors with reference images of different crops.
(3)
The two classification models under the multi-level classification framework achieve high accuracy, with the RF model overall superior to the CNN model. Future studies can focus on further refining the models and methods to improve the accuracy and applicability of crop classification.
Although this study has mitigated salt-and-pepper noise to some extent through multi-scale segmentation combined with Canny edge detection, the generated objects still diverge from true crop “plots”, and the classification is not fully automated. In larger, more complex crop areas with frequent cloud and rain, it will be important to continuously optimize the algorithms, improve the automation and accuracy of remote sensing mapping, and expand the application value of China's high-resolution data.

Author Contributions

Conceptualization, Y.Q.; methodology, R.Y. and H.Z.; software, H.W.; validation, J.Z. (Jinlong Zhang); formal analysis, X.M.; investigation, J.Z. (Juan Zhang); data curation, C.M. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by Provincial Industrialization Application Project of China High-Resolution Earth Observation System (CHEOS) of the State Administration of Science, Technology and Industry for National Defense of PRC (Grant No. 92-Y50G34-9001-22/23).

Data Availability Statement

The data used to support the findings of this study are available from the corresponding author upon request.

Conflicts of Interest

The authors declare no conflicts of interest.

References

  1. Zhu, Y.; Jia, X.; Qiao, J.; Shao, M. What Is the Mass of Loess in the Loess Plateau of China? Sci. Bull. 2019, 64, 534–539.
  2. Zhang, Q.; Wei, W.; Chen, L.; Yang, L. The Joint Effects of Precipitation Gradient and Afforestation on Soil Moisture across the Loess Plateau of China. Forests 2019, 10, 285.
  3. Li, C. Research of the Development of the Western Agriautural Industrization—Taking Qingyang City of Gansu Province as An Example. J. Northwest A&F Univ. (Soc. Sci. Ed.) 2010, 10, 37–41.
  4. Hu, Q.; Sulla-Menashe, D.; Xu, B.; Yin, H.; Tang, H.; Yang, P.; Wu, W. A Phenology-Based Spectral and Temporal Feature Selection Method for Crop Mapping from Satellite Time Series. Int. J. Appl. Earth Obs. Geoinf. 2019, 80, 218–229.
  5. Ozdarici-Ok, A.; Ok, A.O.; Schindler, K. Mapping of Agricultural Crops from Single High-Resolution Multispectral Images-Data-Driven Smoothing vs. Parcel-Based Smoothing. Remote Sens. 2015, 7, 5611–5638.
  6. Blickensdörfer, L.; Schwieder, M.; Pflugmacher, D.; Nendel, C.; Erasmi, S.; Hostert, P. Mapping of Crop Types and Crop Sequences with Combined Time Series of Sentinel-1, Sentinel-2 and Landsat 8 Data for Germany. Remote Sens. Environ. 2022, 269, 112831.
  7. Faqe Ibrahim, G.R.; Rasul, A.; Abdullah, H. Improving Crop Classification Accuracy with Integrated Sentinel-1 and Sentinel-2 Data: A Case Study of Barley and Wheat. J. Geovisualization Spat. Anal. 2023, 7, 22.
  8. Ji, S.; Zhang, C.; Xu, A.; Shi, Y.; Duan, Y. 3D Convolutional Neural Networks for Crop Classification with Multi-Temporal Remote Sensing Images. Remote Sens. 2018, 10, 75.
  9. Yang, S.; Gu, L.; Li, X.; Jiang, T.; Ren, R. Crop Classification Method Based on Optimal Feature Selection and Hybrid CNN-RF Networks for Multi-Temporal Remote Sensing Imagery. Remote Sens. 2020, 12, 3119.
  10. Wang, L.; Ma, H.; Li, J.; Gao, Y.; Fan, L.; Yang, Z.; Yang, Y.; Wang, C. An Automated Extraction of Small- and Middle-Sized Rice Fields under Complex Terrain Based on SAR Time Series: A Case Study of Chongqing. Comput. Electron. Agric. 2022, 200, 107232.
  11. Orynbaikyzy, A.; Gessner, U.; Mack, B.; Conrad, C. Crop Type Classification Using Fusion of Sentinel-1 and Sentinel-2 Data: Assessing the Impact of Feature Selection, Optical Data Availability, and Parcel Sizes on the Accuracies. Remote Sens. 2020, 12, 2779.
  12. Johnson, B.A. High-Resolution Urban Land-Cover Classification Using a Competitive Multi-Scale Object-Based Approach. Remote Sens. Lett. 2013, 4, 131–140.
  13. Heumann, B.W. An Object-Based Classification of Mangroves Using a Hybrid Decision Tree-Support Vector Machine Approach. Remote Sens. 2011, 3, 2440–2460.
  14. Yang, C.; Everitt, J.H.; Murden, D. Evaluating High Resolution SPOT 5 Satellite Imagery for Crop Identification. Comput. Electron. Agric. 2011, 75, 347–354.
  15. Mou, A.H.; Li, B.H.; Zhou, C.Y.; Zheng, D.Y.; Dong, E.R.; Cao, F.J. Estimating Winter Wheat Straw Amount and Spatial Distribution in Qihe County, China, Using GF-1 Satellite Images. J. Renew. Sustain. Energy 2021, 13, 013102.
  16. Meng, S.; Zhong, Y.; Luo, C.; Hu, X.; Wang, X.; Huang, S. Optimal Temporal Window Selection for Winter Wheat and Rapeseed Mapping with Sentinel-2 Images: A Case Study of Zhongxiang in China. Remote Sens. 2020, 12, 226.
  17. Zhang, P.; Hu, S.; Li, W.; Zhang, C. Parcel-Level Mapping of Crops in a Smallholder Agricultural Area: A Case of Central China Using Single-Temporal VHSR Imagery. Comput. Electron. Agric. 2020, 175, 105581.
  18. Zou, J.; Huang, Y.; Chen, L.; Chen, S. Remote Sensing-Based Extraction and Analysis of Temporal and Spatial Variations of Winter Wheat Planting Areas in the Henan Province of China. Open Life Sci. 2018, 13, 533–543.
  19. Shan, X.; Zhang, J. Does the Rational Function Model’s Accuracy for GF1 and GF6 WFV Images Satisfy Practical Requirements? Remote Sens. 2023, 15, 2820.
  20. Wu, J.; Li, Y.; Zhong, B.; Liu, Q.; Wu, S.; Ji, C.; Zhao, J.; Li, L.; Shi, X.; Yang, A. Integrated Vegetation Cover of Typical Steppe in China Based on Mixed Decomposing Derived from High Resolution Remote Sensing Data. Sci. Total Environ. 2023, 904, 166738.
  21. Zhou, Q.; Yu, Q.; Liu, J.; Wu, W.; Tang, H. Perspective of Chinese GF-1 High-Resolution Satellite Data in Agricultural Remote Sensing Monitoring. J. Integr. Agric. 2017, 16, 242–251.
  22. Xia, T.; He, Z.; Cai, Z.; Wang, C.; Wang, W.; Wang, J.; Hu, Q.; Song, Q. Exploring the Potential of Chinese GF-6 Images for Crop Mapping in Regions with Complex Agricultural Landscapes. Int. J. Appl. Earth Obs. Geoinf. 2022, 107, 102702.
  23. Su, T.; Zhang, S. Object-Based Crop Classification in Hetao Plain Using Random Forest. Earth Sci. Inform. 2021, 14, 119–131.
  24. Karimi, N.; Sheshangosht, S.; Eftekhari, M. Crop Type Detection Using an Object-Based Classification Method and Multi-Temporal Landsat Satellite Images. Paddy Water Environ. 2022, 20, 395–412.
  25. Li, J.; Shen, Y.; Yang, C. An Adversarial Generative Network for Crop Classification from Remote Sensing Timeseries Images. Remote Sens. 2021, 13, 65.
  26. Du, X.; Zare, A. Multiple Instance Choquet Integral Classifier Fusion and Regression for Remote Sensing Applications. IEEE Trans. Geosci. Remote Sens. 2019, 57, 2741–2753. [Google Scholar] [CrossRef]
  27. Li, D.; Yang, F.; Wang, X. Study on Ensemble Crop Information Extraction of Remote Sensing Images Based on SVM and BPNN. J. Indian Soc. Remote Sens. 2017, 45, 229–237. [Google Scholar] [CrossRef]
  28. Hinton, G.E.; Osindero, S.; Teh, Y.W. A Fast Learning Algorithm for Deep Belief Nets. Neural Comput. 2006, 18, 1527–1554. [Google Scholar] [CrossRef]
  29. Penatti, O.A.B.; Nogueira, K.; Dos Santos, J.A. Do deep features generalize from everyday objects to remote sensing and aerial scenes domains? In Proceedings of the 2015 IEEE Conference on Computer Vision and Pattern Recognition Workshops (CVPRW), Boston, MA, USA, 7–12 June 2015; pp. 44–51. [Google Scholar] [CrossRef]
  30. LeCun, Y.; Boser, B.; Denker, J.S.; Henderson, D.; Howard, R.E.; Hubbard, W.; Jackel, L.D. Backpropagation Applied to Handwritten Zip Code Recognition. Neural Comput. 1989, 1, 541–551. [Google Scholar] [CrossRef]
  31. Kanda, F.; Kubo, M.; Muramoto, K. Watershed segmentation and classification of tree species using high resolution forest imagery. In Proceedings of the IGARSS 2004. 2004 IEEE International Geoscience and Remote Sensing Symposium, Anchorage, AK, USA, 20–24 September 2004; Volume 6, pp. 3822–3825. [Google Scholar] [CrossRef]
  32. Wei, H.; Hu, Q.; Cai, Z.; Yang, J.; Song, Q.; Yin, G.; Xu, B. An Object- and Topology-Based Analysis (OTBA) Method for Mapping Rice-Crayfish Fields in South China. Remote Sens. 2021, 13, 4666. [Google Scholar] [CrossRef]
  33. Shao, P. Study on Main Features Information Extraction Technology of High-Resolution Remotely Sensed Image Based on Multiresolution Segmentation. Master’s Thesis, Jilin University, Changchun, China, 2015. [Google Scholar]
  34. Löw, F.; Michel, U.; Dech, S.; Conrad, C. Impact of Feature Selection on the Accuracy and Spatial Uncertainty of Per-Field Crop Classification Using Support Vector Machines. ISPRS J. Photogramm. Remote Sens. 2013, 85, 102–119. [Google Scholar] [CrossRef]
  35. Chen, Z.; Jia, K.; Li, Q.; Xiao, C.; Wei, D.; Zhao, X.; Wei, X.; Yao, Y.; Li, J. Hybrid Feature Selection for Cropland Identification Using GF-5 Satellite Image. Natl. Remote Sens. Bull. 2022, 26, 1383–1394. [Google Scholar] [CrossRef]
  36. Zhang, K.; Zhang, M.; Du, J. Spatial and Temporal Variation Characteristics of Surface Humid Condition in Qingyang from 1981 to 2016. Chin. Agric. Sci. Bull. 2019, 35, 101–106. [Google Scholar]
  37. Nolan, S.; Unkovich, M.; Yuying, S.; Lingling, L.; Bellotti, W. Farming Systems of the Loess Plateau, Gansu Province, China. Agric. Ecosyst. Environ. 2008, 124, 13–23. [Google Scholar] [CrossRef]
  38. Zhao, Y.; Zhu, W.; Wei, P.; Fang, P.; Zhang, X.; Yan, N.; Liu, W.; Zhao, H.; Wu, Q. Classification of Zambian Grasslands Using Random Forest Feature Importance Selection during the Optimal Phenological Period. Ecol. Indic. 2022, 135, 108529. [Google Scholar] [CrossRef]
  39. Ghimire, P.; Lei, D.; Juan, N. Effect of Image Fusion on Vegetation Index Quality-A Comparative Study from Gaofen-1, Gaofen-2, Gaofen-4, Landsat-8 OLI and MODIS Imagery. Remote Sens. 2020, 12, 1550. [Google Scholar] [CrossRef]
  40. Saleh, M.A.; Ameen, Z.S.; Altrjman, C.; Al-Turjman, F. Computer-Vision-Based Statue Detection with Gaussian Smoothing Filter and EfficientDet. Sustainability 2022, 14, 11413. [Google Scholar] [CrossRef]
  41. Yu, H.; Gu, X.; Wang, S. The edge detection of river model based on self-adaptive Canny Algorithm and connected domain segmentation. In Proceedings of the 2010 8th World Congress on Intelligent Control and Automation, Jinan, China, 7–9 July 2010. [Google Scholar] [CrossRef]
  42. Zhao, W.Q.; Yan, H.; Shao, X.Q. Object detection based on improved non-maximum suppression algorithm. J. Image 2018, 23, 1676–1685. [Google Scholar]
  43. Jiang, F.; Wang, G.; He, P.; Zheng, C.; Xiao, Z.; Wu, Y. Application of Canny Operator Threshold Adaptive Segmentation Algorithm Combined with Digital Image Processing in Tunnel Face Crevice Extraction. J. Supercomput. 2022, 78, 11601–11620. [Google Scholar] [CrossRef]
  44. Li, P.; Shi, T.; Zhao, Y.; Lu, A. Design of Threshold Segmentation Method for Quantum Image. Int. J. Theor. Phys. 2020, 59, 514–538. [Google Scholar] [CrossRef]
  45. Tab, F.A.; Naghdy, G.; Mertins, A. Scalable Multiresolution Color Image Segmentation. Signal Process. 2006, 86, 1670–1687. [Google Scholar] [CrossRef]
  46. Dian, Y.Y.; Fang, S.H.; Yao, C.H. Change detection for high-resolution images using multilevel segment method. J. Remote Sens. 2016, 20, 129–137. [Google Scholar]
  47. Drǎguţ, L.; Csillik, O.; Eisank, C.; Tiede, D. Automated Parameterisation for Multi-Scale Image Segmentation on Multiple Layers. ISPRS J. Photogramm. Remote Sens. 2014, 88, 119–127. [Google Scholar] [CrossRef] [PubMed]
  48. Ma, H. Object-Based Remote Sensing Image Classification of Forest Based on Multi-Level Segmentation. Master’s Thesis, Beijing Forestry University, Beijing, China, 2014. [Google Scholar]
  49. Zhang, J.; Wang, Y.; Li, Y.; Wang, X. An Object-Oriented Optimal Scale Choice Method for High Spatial Resolution Remote Sensing Image. Sci. Technol. Rev. 2009, 27, 91–94. [Google Scholar]
  50. Ashourloo, D.; Shahrabi, H.S.; Azadbakht, M.; Rad, A.M.; Aghighi, H.; Radiom, S. A Novel Method for Automatic Potato Mapping Using Time Series of Sentinel-2 Images. Comput. Electron. Agric. 2020, 175, 105583. [Google Scholar] [CrossRef]
  51. Wang, M.; Fei, X.; Zhang, Y.; Chen, Z.; Wang, X.; Tsou, J.Y.; Liu, D.; Lu, X. Assessing Texture Features to Classify Coastal Wetland Vegetation from High Spatial Resolution Imagery Using Completed Local Binary Patterns (CLBP). Remote Sens. 2018, 10, 778. [Google Scholar] [CrossRef]
  52. Zhu, Y.; Zeng, Y.; Zhang, M. Extract of Land Use/Cover Information Based on HJ Satellites Data and Object-Oriented Classification. Nongye Gongcheng Xuebao/Trans. Chin. Soc. Agric. Eng. 2017, 33, 258–265. [Google Scholar] [CrossRef]
  53. Fu, T.; Ma, L.; Li, M.; Johnson, B.A. Using Convolutional Neural Network to Identify Irregular Segmentation Objects from Very High-Resolution Remote Sensing Imagery. J. Appl. Remote Sens. 2018, 12, 1. [Google Scholar] [CrossRef]
  54. Hubel, D.H.; Wiesel, T.N. Receptive fields, binocular interaction and functional architecture in the cat’s visual cortex. J. Physiol. 1962, 160, 106–154. [Google Scholar] [CrossRef]
  55. Uchida, K.; Tanaka, M.; Okutomi, M. Coupled Convolution Layer for Convolutional Neural Network. Neural Netw. 2018, 105, 197–205. [Google Scholar] [CrossRef]
  56. Fortuna-Cervantes, J.M.; Ramírez-Torres, M.T.; Mejía-Carlos, M.; Murguía, J.S.; Martinez-Carranza, J.; Soubervielle-Montalvo, C.; Guerra-García, C.A. Texture and Materials Image Classification Based on Wavelet Pooling Layer in CNN. Appl. Sci. 2022, 12, 3592. [Google Scholar] [CrossRef]
  57. Więckowska, B.; Kubiak, K.B.; Jóźwiak, P.; Moryson, W.; Stawińska-Witoszyńska, B. Cohen’s Kappa Coefficient as a Measure to Assess Classification Improvement Following the Addition of a New Marker to a Regression Model. Int. J. Environ. Res. Public Health 2022, 19, 10213. [Google Scholar] [CrossRef] [PubMed]
  58. An, B.; Zhang, B. Logistic Regression with Image Covariates via the Combination of L1 and Sobolev Regularizations. PLoS ONE 2020, 15, e0234975. [Google Scholar] [CrossRef] [PubMed]
  59. Ma, M.; Chen, J.; Liu, W.; Yang, W. Ship Classification and Detection Based on CNN Using GF-3 SAR Images. Remote Sens. 2018, 10, 2043. [Google Scholar] [CrossRef]
  60. Tatsumi, K.; Yamashiki, Y.; Canales Torres, M.A.; Taipe, C.L.R. Crop Classification of Upland Fields Using Random Forest of Time-Series Landsat 7 ETM+ Data. Comput. Electron. Agric. 2015, 115, 171–179. [Google Scholar] [CrossRef]
  61. Wang, Z.; Zhao, Z.; Yin, C. Fine Crop Classification Based on UAV Hyperspectral Images and Random Forest. ISPRS Int. J. Geoinf. 2022, 11, 252. [Google Scholar] [CrossRef]
  62. Wang, Y.; Zhang, Z.; Feng, L.; Ma, Y.; Du, Q. A New Attention-Based CNN Approach for Crop Mapping Using Time Series Sentinel-2 Images. Comput. Electron. Agric. 2021, 184, 106090. [Google Scholar] [CrossRef]
  63. Zhao, H.; Chen, Z.; Jiang, H.; Jing, W.; Sun, L.; Feng, M. Evaluation of Three Deep Learning Models for Early Crop Classification Using Sentinel-1A Imagery Time Series-a Case Study in Zhanjiang, China. Remote Sens. 2019, 11, 2673. [Google Scholar] [CrossRef]
  64. Lin, Y.; Zhang, W.; Yu, J.; Zhang, H. Fine Classification of Urban Vegetation Based on UAV Images. China Environ. Sci. 2022, 42, 2852–2861. [Google Scholar] [CrossRef]
  65. Ismail Fawaz, H.; Forestier, G.; Weber, J.; Idoumghar, L.; Muller, P.A. Deep Learning for Time Series Classification: A Review. Data Min. Knowl. Discov. 2019, 33, 917–963. [Google Scholar] [CrossRef]
  66. Mishra, C.; Gupta, D.L. Deep Machine Learning and Neural Networks: An Overview. IAES Int. J. Artif. Intell. (IJ-AI) 2017, 6, 66. [Google Scholar] [CrossRef]
  67. Zou, B.; Xu, X.; Zhang, L. Object-Based Classification of PolSAR Images Based on Spatial and Semantic Features. IEEE J. Sel. Top Appl. Earth Obs. Remote Sens. 2020, 13, 609–619. [Google Scholar] [CrossRef]
  68. Kavzoglu, T.; Tonbul, H.; Yildiz Erdemir, M.; Colkesen, I. Dimensionality Reduction and Classification of Hyperspectral Images Using Object-Based Image Analysis. J. Indian Soc. Remote Sens. 2018, 46, 1297–1306. [Google Scholar] [CrossRef]
  69. Ding, Z.; Wang, T.; Sun, Q.; Wang, H. Adaptive Fusion with Multi-Scale Features for Interactive Image Segmentation. Appl. Intell. 2021, 51, 5610–5621. [Google Scholar] [CrossRef]
  70. Jiang, N.; Shi, H.; Geng, J. Multi-Scale Graph-Based Feature Fusion for Few-Shot Remote Sensing Image Scene Classification. Remote Sens. 2022, 14, 5550. [Google Scholar] [CrossRef]
  71. Chen, S.; Useya, J.; Mugiyo, H. Decision-Level Fusion of Sentinel-1 SAR and Landsat 8 OLI Texture Features for Crop Discrimination and Classification: Case of Masvingo, Zimbabwe. Heliyon 2020, 6, e05358. [Google Scholar] [CrossRef] [PubMed]
  72. Khojastehnazhand, M.; Roostaei, M. Classification of Seven Iranian Wheat Varieties Using Texture Features. Expert Syst. Appl. 2022, 199, 117014. [Google Scholar] [CrossRef]
  73. Zhang, D.; Ying, C.; Wu, L.; Meng, Z.; Wang, X.; Ma, Y. Using Time Series Sentinel Images for Object-Oriented Crop Extraction of Planting Structure in the Google Earth Engine. Agronomy 2023, 13, 2350. [Google Scholar] [CrossRef]
Figure 1. The study area’s (a) geographical distribution of the Loess Plateau and Gansu Province in China; (b) a digital elevation model of the Loess Plateau of eastern Gansu Province and the distribution of representative test areas; I, II, III, and IV represent locations in Huanxian County, Zhenyuan County, Hot Spring Town in Xifeng District, and Zaosheng Town in Ningxian County, respectively.
Figure 2. The phenological periods of different crops (I, II, III, and IV represent the test areas where staple crops are located).
Figure 3. Hierarchical network diagram of imaging objects, with different colors representing different types of ground objects.
Figure 4. A flowchart of this study’s method: (a) pre-processing and sample data collection; (b) image segmentation and feature selection; (c) the classification and accuracy evaluation of planting structures using various methods.
Figure 5. The evaluation of the optimal segmentation scale using the ESP2 tool. (The dotted lines indicate the optimal segmentation scales for preliminary screening: 35, 45, 70, 95, 105, 125, 135, 170, and 220.)
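The ESP2 tool of Drăguţ et al. [47] flags candidate scales at local peaks in the rate of change of local variance (ROC-LV) across increasing scale parameters. A minimal sketch of that criterion follows; the scale and local-variance values are hypothetical stand-ins for statistics produced by actual segmentations.

```python
# Sketch of the ROC-LV criterion behind ESP2; the numbers below are
# hypothetical, not the study's measurements.
scales = [25, 35, 45, 55, 65, 70, 95, 105]
local_variance = [11.2, 13.8, 14.1, 15.0, 15.9, 16.8, 17.1, 17.2]

def roc_lv(lv_curr, lv_prev):
    """Rate of change of local variance between consecutive scales (%)."""
    return (lv_curr - lv_prev) / lv_prev * 100.0

roc = [roc_lv(c, p) for p, c in zip(local_variance, local_variance[1:])]

# Local peaks in ROC-LV flag candidate optimal scales.
for i in range(1, len(roc) - 1):
    if roc[i] > roc[i - 1] and roc[i] > roc[i + 1]:
        print(f"candidate optimal scale: {scales[i + 1]}")
```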
Figure 6. The RMAS values of different crops. (The optimal segmentation scale is 35 for buckwheat, 65 for wheat and apples, and 55 for corn).
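The RMAS criterion itself reduces to a ratio of two dispersion statistics; one common formulation computes, for each candidate scale, the mean absolute deviation and standard deviation of the objects' mean band values, taking the scale that maximizes the ratio as optimal for a crop. The sketch below uses randomly generated per-object means as stand-ins for segmentation output.

```python
import numpy as np

rng = np.random.default_rng(0)

def rmas(object_means: np.ndarray) -> float:
    """Ratio of Mean Absolute Deviation to Standard Deviation over the
    mean band values of the objects produced at one segmentation scale."""
    mad = np.mean(np.abs(object_means - object_means.mean()))
    return mad / object_means.std()

# Hypothetical per-object mean NIR values at three candidate scales;
# in practice these come from the segmentations pre-screened with ESP2.
candidates = {35: rng.normal(0.45, 0.08, 900),
              55: rng.normal(0.45, 0.10, 420),
              65: rng.normal(0.45, 0.11, 260)}

best_scale = max(candidates, key=lambda s: rmas(candidates[s]))
print(f"scale with the largest RMAS: {best_scale}")
```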
Figure 7. The segmentation map before and after integrating edge information. (The arrows indicate that more detailed objects are segmented after integrating edge information).
Figure 8. The NDVI variation trend of different crops and its variation rate in adjacent months.
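For reference, the NDVI trend and its variation rate in adjacent months, i.e., the phenological factor visualized above, can be computed as follows; the monthly values in the sketch are hypothetical.

```python
import numpy as np

def ndvi(nir: np.ndarray, red: np.ndarray) -> np.ndarray:
    """Normalized Difference Vegetation Index, guarded against division by zero."""
    return (nir - red) / (nir + red + 1e-10)

# Hypothetical object-level monthly NDVI means for one crop (May-September);
# the study derives these from the multi-temporal GF imagery.
monthly_ndvi = np.array([0.31, 0.52, 0.68, 0.71, 0.49])

# Variation rate between adjacent months, used as a phenological feature.
variation_rate = np.diff(monthly_ndvi) / monthly_ndvi[:-1]
print(variation_rate)
```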
Figure 9. The preferred feature factors of different crops.
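The feature-preference step can be approximated with scikit-learn: an L1 penalty shrinks the coefficients of uninformative factors to exactly zero, and the columns with any non-zero coefficient form the preferred set. The data, regularization strength, and solver below are illustrative assumptions, not the study's configuration.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.preprocessing import StandardScaler

# X: objects x 39 spatial feature factors; y: crop labels. Random data
# stands in for the study's spectral/texture/geometric/index features.
rng = np.random.default_rng(42)
X = rng.normal(size=(500, 39))
y = rng.integers(0, 4, size=500)

# The L1 penalty drives coefficients of uninformative features to zero;
# C controls the regularization strength (smaller C = sparser model).
model = LogisticRegression(penalty="l1", solver="liblinear", C=0.1)
model.fit(StandardScaler().fit_transform(X), y)

# Features with a non-zero coefficient for at least one class are retained.
selected = np.where(np.any(model.coef_ != 0, axis=0))[0]
print(f"{len(selected)} features retained:", selected)
```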
Figure 10. The classification results of the RF and CNN models. (I, II, III, and IV represent the four test areas, respectively).
Figure 11. The area and proportion of classification results of the two models in test area I (Huanxian County).
Figure 12. The area and proportion of classification results of the two models in test area II (Zhenyuan County).
Figure 13. The area and proportion of classification results of the two models in test area III (Xifeng District).
Figure 14. The area and proportion of classification results of the two models in test area IV (Ningxian County).
Figure 15. False-color images of the different crops.
Figure 16. The segmentation results combined with Canny edge detection.
Figure 17. The statistics of the texture features of corn and apples.
Table 1. The image data from the mentioned high-resolution satellites (I–IV represent the four test areas, respectively; “/” denotes that no image was used for that area).

Name of Satellite | Spectral Bands (Spatial Resolution) | Products | Number of Images (I, II, III, IV)
GF-1 | Red, Green, Blue, Near-Infrared (NIR) (8 m); Panchromatic (2 m) | Fusion image (2 m) | /, 2, 1, /
GF-2 | Red, Green, Blue, Near-Infrared (NIR) (4 m); Panchromatic (1 m) | Fusion image (1 m) | 4, 1, 1, 4
GF-6 | Red, Green, Blue, Near-Infrared (NIR) (8 m); Panchromatic (2 m) | Fusion image (2 m) | /, 1, 2, /
Table 2. Spatial feature factor information.

Feature Category | Feature Variables | Number
Spectral features | Mean_R, Mean_G, Mean_B, Mean_NIR, Max_diff, Brightness, and Standard Deviation (four bands) | 10
Texture features | GLCM Mean, GLCM Ent, GLCM Homo, GLCM Std, GLCM Dissim, GLCM Contrast, GLCM Ang. 2nd Moment, GLCM Corr, GLDV Mean, GLDV Ent, GLDV Contrast, and GLDV Ang. 2nd Moment | 12
Geometric features | Area, Length/Width, Length, Width, Border Length, Shape Index, Density, Asymmetry, Roundness, Boundary Index, Compactness, Ellipse Fitting, and Rectangle Fitting | 13
Index features | EVI, NDVI, R/G, and RVI | 4
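GLCM texture measures of the kind listed in Table 2 can be reproduced per object patch with scikit-image, as sketched below; the patch, grey-level count, and offsets are hypothetical, and the study derives these features within the object-based workflow rather than with this library.

```python
import numpy as np
from skimage.feature import graycomatrix, graycoprops

# Hypothetical 8-bit patch standing in for the pixels of one segmented object.
rng = np.random.default_rng(1)
patch = rng.integers(0, 64, size=(32, 32), dtype=np.uint8)

# GLCM over four directions at distance 1; symmetric and normalized so the
# derived properties can be averaged for rotation invariance.
glcm = graycomatrix(patch, distances=[1],
                    angles=[0, np.pi / 4, np.pi / 2, 3 * np.pi / 4],
                    levels=64, symmetric=True, normed=True)

for prop in ("contrast", "dissimilarity", "homogeneity",
             "energy", "correlation", "ASM"):
    print(prop, graycoprops(glcm, prop).mean())
```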
Table 3. The accuracy validation of the classification results through random forest and deep learning.

Test Area | Type of Crops | Crop Kappa (RF / CNN) | Overall Kappa (RF / CNN) | Overall Accuracy (RF / CNN)
I | Wheat | 0.92 / 0.90 | 0.89 / 0.87 | 0.92 / 0.91
I | Corn | 0.85 / 0.81 | |
I | Buckwheat | 0.96 / 0.93 | |
II | Wheat | 0.93 / 0.89 | 0.91 / 0.88 | 0.95 / 0.93
II | Corn | 0.91 / 0.87 | |
II | Buckwheat | 0.86 / 0.88 | |
III | Wheat | 0.87 / 0.89 | 0.85 / 0.84 | 0.89 / 0.89
III | Corn | 0.84 / 0.81 | |
III | Apple | 0.85 / 0.80 | |
IV | Wheat | 0.86 / 0.79 | 0.86 / 0.85 | 0.91 / 0.90
IV | Corn | 0.78 / 0.86 | |
IV | Apple | 0.93 / 0.89 | |
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.