*Article* **Response of Different Band Combinations in Gaofen-6 WFV for Estimating of Regional Maize Straw Resources Based on Random Forest Classification**

**Huawei Mou 1,2,3, Huan Li 1,2,4, Yuguang Zhou 1,5,6,7,\* and Renjie Dong 1,2,4,5,8**


**Abstract:** Maize straw is a valuable renewable energy source. The rapid and accurate determination of its yield and spatial distribution can promote improved utilization. At present, traditional straw estimation methods primarily rely on statistical analysis that may be inaccurate. In this study, the Gaofen-6 (GF-6) satellite, which combines high resolution and wide field of view (WFV) imaging characteristics, was used as the information source, and the quantity of maize straw resources and spatial distribution characteristics in Qihe County were analyzed. According to the phenological characteristics of the study area, seven classification classes were determined, including maize, buildings, woodlands, wastelands, water, roads, and other crops, to explore the influence of sample separability and test the responsiveness to different land cover types with different waveband combinations. Two supervised classification methods, support vector machine (SVM) and random forest (RF), were used to classify the study area, and the influence of the newly added bands of GF-6 WFV on the classification accuracy was analyzed. Furthermore, combined with field surveys and agricultural census data, a method for estimating the quantity of maize straw and analyzing the spatial distribution based on a single-temporal remote sensing image and random forests was proposed. Finally, the accuracy of the measurement results was evaluated at the county level. The results showed that the RF model made better use of the newly added bands of GF-6 WFV and improved the accuracy of classification compared with the SVM model; the two red-edge bands improved the accuracy of crop classification and recognition; the purple and yellow bands identified non-vegetation more effectively than vegetation, thus minimizing the "salt-and-pepper noise" of classification results, although the changes to total classification accuracy were not obvious; and the theoretical quantity of maize straw in Qihe County in 2018 was 586.08 kt, which reflects an error of only 2.42% compared to the statistical result. Hence, the RF model based on single-temporal GF-6 WFV can effectively estimate regional maize straw yield and spatial distribution, which lays a theoretical foundation for straw recycling.

**Keywords:** GF-6; maize; straw; support vector machine; random forest; red-edge wavelength

**Citation:** Mou, H.; Li, H.; Zhou, Y.; Dong, R. Response of Different Band Combinations in Gaofen-6 WFV for Estimating of Regional Maize Straw Resources Based on Random Forest Classification. *Sustainability* **2021**, *13*, 4603. https://doi.org/10.3390/su13094603

Academic Editor: C. Ronald Carroll

Received: 10 March 2021 Accepted: 13 April 2021 Published: 21 April 2021

**Publisher's Note:** MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

**Copyright:** © 2021 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

### **1. Introduction**

China is a large country known for its abundant agricultural resources, with agricultural production comprising a significant proportion of its national economy [1]. Crop straw, as a by-product of agricultural production [2], is an indispensable production material in vast rural areas [3]. In recent years, as the rural energy structure has shifted, fewer people have used straw as an energy resource for rural life because of its large volume and scattered distribution, as well as the low degree of industrialization [4]. Furthermore, as the regional, seasonal, and structural surplus of straw has become increasingly prominent, a large amount of straw is still not fully utilized, which severely restricts the development of circular agriculture in China [5,6]. Although the government advocates for the return of straw to the field [7], straw is often discarded or burned in large amounts to allow for timely sowing at the start of the growing season, leading to serious waste of resources and environmental pollution [8]. Straw is beneficial if used but harmful if discarded [9]. With the development of renewable energy technology, biotechnology, circular agriculture, and environmental science, the value of crop straw as a renewable energy source has gradually increased and become widely accepted [10]; straw can be used as bio-fertilizer, feed, raw material, fuel, and base material. Therefore, studying the quantity and spatial distribution of straw resources in China and promoting their comprehensive utilization are necessary for rural development and sustainable agricultural development in China [11].

Although straw quantity has been estimated in many studies, extant methods and findings have several limitations [12]. Firstly, statistical data with low spatial resolution are generally used to analyze straw at the county level or above, which limits the value of the data for detailed spatial analyses [13]. Secondly, the quantity of straw resources cannot be calculated in a timely manner, because the agricultural census can only be completed in the following year at the earliest. Moreover, to realize the comprehensive utilization of straw, we must not only estimate the straw yield but also fully consider the spatial distribution of regional straw [14]. Crop straw resources can be recycled and utilized more efficiently by considering the relationship between the supply and demand of regional straw resources and optimizing straw recycling and comprehensive utilization [15].

With the in-depth application of remote sensing technology in crop area extraction, growth monitoring [16], and yield estimation [17], the use of remote sensing technology to analyze the yield and spatial distribution of straw has become a major development direction for straw resource investigation [18]. The spatial characteristics of crop planting in China exhibit complex structures and fragmentation. Therefore, when straw quantity is estimated over large areas, high-resolution remote sensing images produce very large volumes of data to process, whereas low-resolution images lead to a rapid decline in measurement accuracy [19]. Classification using single-temporal remote sensing images of the "key phenological period" combined with multi-characteristic parameters and sensitive bands has become an important method for current crop type identification [20]. The response characteristics of different wavebands to different crops can be used to optimize the combination of wavebands, so that the spectral difference and class separability between different crop types are significantly improved and the accurate investigation and analysis of different crop straw resources can be realized [21].

Maize, which accounts for approximately one-fifth of grain crops in China, is the third-largest grain crop after rice and wheat [22]. Therefore, the quantity and spatial distribution of maize straw in the region are of great significance to the collection, storage, and transportation as well as comprehensive utilization of straw.

The Gaofen-6 (GF-6) satellite, part of China's high-resolution major special project satellite series, adds four bands with central wavelengths of 710 nm, 750 nm, 425 nm, and 610 nm, which can provide richer spectral information for agricultural research [23]. This technological advancement is important for improving the spectral information characteristics of China's medium and high-resolution satellites [24].

To improve the comprehensive utilization of straw and accelerate the development of its scale, industrialization, and commercialization, the quantity and spatial distribution of straw need to be studied in order to plan the recycling network and select sites for straw utilization factories. In this study, the effects of different wavebands on the classification of different land cover types were analyzed based on GF-6 satellite imagery, and the quantity and spatial distribution characteristics of maize straw in Qihe County were estimated to provide data support for recycling and effective utilization of regional straw. The research contents included: (1) exploring and analyzing the impact of different band combinations on sample separability, (2) analyzing the classification accuracy of the support vector machine (SVM) and random forest (RF) classification models under different band combinations for different land cover types, and (3) proposing a method for estimating the quantity and spatial distribution of maize straw based on planting area.


### **2. Materials and Methods**

### *2.1. Research Area*

The study area is located in Qihe County, which is in the southernmost part of Dezhou City, Shandong Province, China, at latitude range 36°24′37″–37°1′44″ N and longitude range 116°23′28″–116°57′35″ E (Figure 1). The annual average temperature is 15 °C, and the area lies in a warm temperate, sub-humid monsoon climate zone with four distinct seasons and mild weather patterns. The land area of the study area is approximately 1411 km<sup>2</sup>, of which arable land comprises 840 km<sup>2</sup>. It is flat, with an average elevation of 26 m above mean sea level, and is an alluvial plain in the lower reaches of the Yellow River. It is also an important food production area in Shandong Province. The main food crops are winter wheat and maize; peanuts, soybeans, and cotton are also planted.

**Figure 1.** Research area: (**a**) Shandong Province; (**b**) Qihe County.

### *2.2. Data Source and Preprocessing*

The GF-6 satellite was successfully launched on 2 June 2018 and is mainly used for precision agriculture observation and forestry resource investigation. It employs an 8-band complementary metal-oxide-semiconductor detector and is equipped with a 2 m panchromatic/8 m multi-spectral high-resolution camera and a 16 m multi-spectral medium-resolution wide field of view (WFV) camera. The details of GF-6 WFV are shown in Table 1. For the first time in China, the "red-edge" band, which can effectively reflect the unique spectral characteristics of crops, was added, which has greatly improved the monitoring of agriculture, forestry, grassland, and other resources.


**Table 1.** Parameters of Gaofen 6 (GF-6) wide field of view (WFV).

The image of the research area taken on 9 September 2018 was selected for analysis, as this date corresponds to the maize grain-filling period. The 1A-level image, downloaded from the China Centre for Resources Satellite Data and Application (CCRSDA) (http://www.cresda.com/CN/index.shtml (accessed on 7 February 2020)), required preprocessing by radiometric calibration, atmospheric correction, and orthorectification [25]; all preprocessing was performed in ENVI (Version 5.3, Research System Inc., Boulder, CO, USA). Atmospheric correction was performed using the Fast Line-of-sight Atmospheric Analysis of Spectral Hypercubes (FLAASH) model [26], and the spectral response function was provided by the CCRSDA. The rational polynomial coefficients (RPC) model, which is based on rational functions, was used for orthorectification without control points. The 2019 agricultural census data, including the planting area and yields of maize, were obtained from the local government website (http://dztj.dezhou.gov.cn/n3100530/n3100065/index.html (accessed on 27 April 2020)), and the administrative boundary vector data of the study area were downloaded from the Resource and Environmental Science and Data Center (http://www.resdc.cn/data.aspx?DATAID=202 (accessed on 7 February 2020)). SuperMap (iDesktop 8C, SuperMap Software Co., Ltd., Beijing, China) was used to process these data and transform the coordinate system. All spatial data were converted into the universal transverse Mercator (WGS84 UTM 45N) projection.

Two types of samples were used in this study, namely training and verification samples, most of which were obtained through ground surveys using OvitalMap (V8.7.1, Beijing Ovital Software Co., Ltd., Beijing, China) in June 2018. In addition, with the support of higher spatial resolution image data, historical data, and expert knowledge, we also acquired a portion of the training samples through manual visual interpretation. A total of 689 samples were acquired in this study, covering maize, buildings, woodlands, wastelands, water, roads, and other crops. According to the proportions of the different land cover types, 250 samples were randomly selected as verification samples, and the remaining 439 were used as training samples. The training samples were used to classify land cover types in the research area. The supervised classification method was used to obtain the planting area of maize, from which the yield and distribution of maize straw were estimated. The verification samples were used to evaluate the classification accuracy of different land cover types. All samples were randomly collected to cover the entire study area as much as possible, and they were quadrats of single crops to reduce noise and ensure classification accuracy.
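The proportional selection of verification samples described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the per-class sample counts are hypothetical (the text reports only the total of 689 and the 250 verification samples), and the largest-remainder rounding rule and the function names are choices made for this sketch, not the procedure actually used in the study.

```python
import random

# Hypothetical per-class sample counts; only their total (689) comes from the text.
HYPOTHETICAL_COUNTS = {
    "maize": 200, "buildings": 100, "woodlands": 90, "wastelands": 70,
    "water": 60, "roads": 69, "other crops": 100,
}


def proportional_allocation(class_counts, n_select):
    """Share n_select among classes in proportion to class size (largest remainder)."""
    total = sum(class_counts.values())
    quotas = {c: n * n_select / total for c, n in class_counts.items()}
    alloc = {c: int(q) for c, q in quotas.items()}  # round each quota down first
    leftover = n_select - sum(alloc.values())
    # Hand the remaining picks to the classes with the largest fractional parts.
    by_remainder = sorted(quotas, key=lambda c: quotas[c] - alloc[c], reverse=True)
    for c in by_remainder[:leftover]:
        alloc[c] += 1
    return alloc


def split_samples(samples_by_class, n_verification, seed=0):
    """Randomly draw verification samples per class; the rest become training samples."""
    rng = random.Random(seed)
    counts = {c: len(ids) for c, ids in samples_by_class.items()}
    alloc = proportional_allocation(counts, n_verification)
    training, verification = {}, {}
    for c, ids in samples_by_class.items():
        picked = set(rng.sample(ids, alloc[c]))
        verification[c] = sorted(picked)
        training[c] = [s for s in ids if s not in picked]
    return training, verification


samples = {c: [f"{c}-{k}" for k in range(n)] for c, n in HYPOTHETICAL_COUNTS.items()}
training, verification = split_samples(samples, 250)
print({c: len(v) for c, v in verification.items()})
```

Fixing the random seed keeps the split reproducible, which is convenient when the same verification samples must be reused across several band combinations.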

When the maize was being harvested in October 2018, three quadrats of 5 m × 10 m were selected in the southern, central, and northern regions of Qihe County, respectively, to count the number of maize plants and weigh the straw (15% moisture content). These data were used to estimate the maize straw yield.

### *2.3. Classification of Land Cover Types Using Different Band Combinations*

The maize in the research area was in the grain-filling stage when the image was acquired, and the main land cover types were determined by ground investigation as maize, buildings, wasteland, water, woodland, roads, vegetables, cotton, and soybean. Soybean and cotton were planted less widely than the other crops, and their spectral characteristics were similar to those of vegetables. Hence, vegetables, cotton, soybeans, and a very small number of other crops were classified as other crops, and the identification and statistics of maize were the focus of classification in this study. In summary, we divided the study area into seven final land cover types: maize, buildings, woodland, wasteland, water, roads, and other crops. First, layer stacking was performed on the preprocessed image, and five schemes (Table 2) were designed to test the newly added bands. Second, two machine learning methods, SVM [27] and RF [28], were used to classify the research area [29]. Finally, the classification results of the two methods were analyzed to assess the influence of the red-edge, purple, and yellow bands on the recognition of various land cover types and to verify the improvement in classification accuracy provided by the newly added bands of GF-6 WFV compared with GF-1 WFV.

**Table 2.** Classification schemes with different band combinations.


SVM is based on statistical learning theory and seeks an optimal hyperplane as a decision function in a high-dimensional space. The number of free parameters used in the SVM does not depend on the number of input features, so the number of features need not be reduced to avoid overfitting. SVM provides a generic mechanism to fit the surface of the hyperplane to the data through a kernel function, such as a linear, polynomial, or sigmoid kernel. RF is a combination of tree predictors that performs well on noisy and weakly discriminative data and is insensitive to the initialization of parameters [30]. RF also requires fewer user-defined parameters than SVM, and they are easier to define. In this paper, SVM was trained with a linear kernel, and EnMAP-Box [31] was used for RF classification.

### *2.4. Class Separability Assessment*

Class separability, which is a measure of similarity between classes, can be determined from the spectral values of the samples. There are four widely used quantitative measures for class separability: divergence, transformed divergence (TD), Bhattacharyya distance, and Jeffries–Matusita distance (JM) [32]. Divergence is one of the most popular separability measures used in remote sensing and can be calculated from the mean and variance-covariance matrices of the data representing feature classes. The TD is the standardized form of divergence, which can minimize the effect of several well-separated classes that may increase the average divergence value and make the divergence measure misleading. The Bhattacharyya distance and the JM can be used to estimate the probability of correct classification, and the JM can suppress high separability values by transforming the Bhattacharyya distance values to a specific range.
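To make the relationship between the Bhattacharyya distance and the JM concrete, the sketch below computes both for a single band, assuming that each class is normally distributed in that band. The reflectance values are hypothetical and the function names are introduced only for illustration; an operational assessment would use all bands of a scheme together with full mean vectors and covariance matrices.

```python
import math
from statistics import mean, variance


def bhattacharyya_distance(values_a, values_b):
    """Bhattacharyya distance between two classes in one band (Gaussian assumption)."""
    m1, m2 = mean(values_a), mean(values_b)
    v1, v2 = variance(values_a), variance(values_b)
    # First term: separation of the class means relative to the pooled spread.
    mean_term = 0.25 * (m1 - m2) ** 2 / (v1 + v2)
    # Second term: difference between the class variances.
    var_term = 0.5 * math.log((v1 + v2) / (2.0 * math.sqrt(v1 * v2)))
    return mean_term + var_term


def jeffries_matusita(values_a, values_b):
    """JM = 2 * (1 - exp(-B)); bounded to the range [0, 2]."""
    return 2.0 * (1.0 - math.exp(-bhattacharyya_distance(values_a, values_b)))


# Hypothetical red-edge reflectance samples for three classes.
maize = [0.30, 0.32, 0.31, 0.29, 0.33]
other_crops = [0.27, 0.30, 0.28, 0.31, 0.29]
water = [0.03, 0.04, 0.05, 0.04, 0.03]

jm_maize_other = jeffries_matusita(maize, other_crops)
jm_maize_water = jeffries_matusita(maize, water)
print(round(jm_maize_other, 3), round(jm_maize_water, 3))
```

Because the exponential saturates, very large Bhattacharyya distances all map to JM values close to 2, which is how the JM suppresses high separability values.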

### *2.5. Maize Straw Estimation*

The goal of county-level straw estimation was to determine the type and quantity of straw resources. The total theoretical quantity of straw was considered the maximum quantity that can be produced in a certain area each year [33]. This value was estimated from the crop planting area and the straw resource density as follows:

$$P_T = \sum_{i=1}^{n} D_i \cdot A_i \tag{1}$$

where: *P<sub>T</sub>* is the theoretical total quantity of straw (t); *i* indexes the crop types and *n* is their number (only maize straw was counted in this study, so *n* = 1); *D<sub>i</sub>* is the straw resource density of the *i*th crop (t/km<sup>2</sup>); and *A<sub>i</sub>* is the planting area of the *i*th crop (km<sup>2</sup>).

$$D_i = \frac{1000}{m} \sum_{j=1}^{m} \frac{C_{ij}}{S_{ij}} \tag{2}$$

where: *j* indexes the sampling regions and *m* is their number; *C<sub>ij</sub>* is the theoretical total quantity of straw resources of the *i*th crop in area *j* (kg); *S<sub>ij</sub>* is the planting area of the *i*th crop in area *j* (m<sup>2</sup>). The factor 1000 converts kg/m<sup>2</sup> to t/km<sup>2</sup>.
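As a worked illustration of Equations (1) and (2), the following Python sketch reads the division in Equation (2) as an average over the sampling regions. All numbers are hypothetical: the straw weights, the use of one 50 m<sup>2</sup> quadrat (5 m × 10 m) per region, and the maize planting area are placeholders rather than measurements from this study, and the function names are introduced only for this example.

```python
def straw_density(quadrats):
    """Equation (2): mean of C_ij / S_ij over sampling regions, in t/km^2.

    quadrats is a list of (straw weight in kg, quadrat area in m^2) pairs.
    Multiplying by 1000 converts kg/m^2 to t/km^2.
    """
    ratios = [weight_kg / area_m2 for weight_kg, area_m2 in quadrats]
    return 1000.0 * sum(ratios) / len(ratios)


def theoretical_straw(densities, areas):
    """Equation (1): sum of density (t/km^2) times planting area (km^2), in t."""
    return sum(d * a for d, a in zip(densities, areas))


# Hypothetical straw weights for the southern, central, and northern regions.
hypothetical_quadrats = [(35.0, 50.0), (40.0, 50.0), (45.0, 50.0)]
hypothetical_maize_area_km2 = 700.0  # placeholder for the classified maize area

density = straw_density(hypothetical_quadrats)
total_t = theoretical_straw([density], [hypothetical_maize_area_km2])
print(f"density = {density:.1f} t/km^2, total = {total_t / 1000:.1f} kt")
```

With only maize counted, the sum in Equation (1) has a single term, so the total is simply the maize straw density multiplied by the maize planting area obtained from classification.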

### *2.6. Accuracy Verification*

A confusion matrix [34] was used to evaluate the accuracy of the classification results based on the verification samples of the ground survey. The evaluation indicators include overall accuracy (OA), user's accuracy (UA), producer's accuracy (PA), and kappa coefficient (KC). The OA and KC reflect the overall classification effect, while the PA and UA represent omission and misclassification errors, respectively. Since accuracy is not necessarily normally distributed, the non-parametric Wilcoxon test for paired samples was conducted to evaluate the changes in OA and PA of each land cover type between different band combinations. In addition, the planting area of maize can be verified against statistical census data.
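The four indicators can be derived from a confusion matrix as in the minimal Python sketch below. The three-class matrix is hypothetical and much smaller than the seven-class matrices of this study, and the row/column convention (rows are reference classes, columns are classified classes) is an assumption of the example.

```python
def accuracy_metrics(matrix):
    """Return OA, per-class PA, per-class UA, and the kappa coefficient.

    matrix[i][j] is the number of verification samples whose reference class
    is i and whose classified class is j.
    """
    k = len(matrix)
    total = sum(sum(row) for row in matrix)
    diagonal = [matrix[i][i] for i in range(k)]
    reference_totals = [sum(matrix[i]) for i in range(k)]  # row sums
    classified_totals = [sum(matrix[i][j] for i in range(k)) for j in range(k)]  # column sums

    oa = sum(diagonal) / total
    pa = [diagonal[i] / reference_totals[i] for i in range(k)]   # 1 - omission error
    ua = [diagonal[i] / classified_totals[i] for i in range(k)]  # 1 - commission error
    # Agreement expected by chance, from the marginal totals.
    expected = sum(r * c for r, c in zip(reference_totals, classified_totals)) / total ** 2
    kappa = (oa - expected) / (1.0 - expected)
    return oa, pa, ua, kappa


# Hypothetical confusion matrix for three classes.
hypothetical_matrix = [
    [50, 3, 2],
    [4, 40, 6],
    [1, 5, 39],
]
oa, pa, ua, kappa = accuracy_metrics(hypothetical_matrix)
print(f"OA = {oa:.3f}, KC = {kappa:.3f}")
```

A low PA for a class means many of its reference samples were missed, whereas a low UA means many pixels assigned to that class actually belong elsewhere.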

According to the yield of maize and the straw-grain ratio, the straw yield of maize could be calculated; that is, the theoretical total quantity of maize straw used as validation data was the product of maize yield and straw-grain ratio, and the maize yield was obtained using annual census data [35].

### **3. Results and Discussion**
