1. Introduction
As an important grain crop, sustainable wheat production plays a crucial role in ensuring food security in the context of rapidly increasing population worldwide. Aboveground biomass (AGB) usually serves as an important factor in crop growth monitoring and grain yield prediction due to its strong relevance with crop growth status and soil fertility. Moreover, the information on crop biomass is required in calculating critical nitrogen concentration and nitrogen nutrition index, which can well indicate crop nitrogen status and thus facilitate the adjustment of in-season nitrogen management [
1,
2]. Conventional measurements of crop biomass mainly involve massive manpower investment and destructive sampling in the field [
3]. These ground-based methods are time-consuming and not suitable for large-scale crop monitoring, since most of agronomical parameters exhibit high spatial variability within field [
4]. Therefore, it is imperative to develop advanced and effective technologies to timely and economically obtain crop biomass information during growing seasons over large areas.
Over the past few decades, remote sensing technology has been gradually recognized as an effective and non-invasive tool for estimating crop biomass and characterizing its spatial and temporal variability. Compared with synthetic aperture radar (SAR) and light detection and ranging (LiDAR), optical remote sensing is still an attractive choice to obtain crop biomass information due to its relatively low cost and the availability of new optical sensors with fine spectral, temporal, and spatial resolutions. The vegetation indices (VIs) derived from multispectral broadband remotely sensed data have long been used as indirect measurements of crop biomass, such as normalized difference vegetation index (NDVI), enhanced vegetation index (EVI), and normalized difference water index (NDWI) [
5,
6,
7]. However, these vegetation indices have limited value and often lose sensitivity for dense crop canopies when biomass exceeds a certain amount. This VI-AGB insensitivity is known as the saturation problem. Recent studies have asserted that hyperspectral remotely sensed data have great potential to overcome the saturation problem in crop biomass estimation, because they offer approximately continuous spectra and can describe vegetation canopy reflectance in greater detail as compared to multispectral data [
3,
8,
9,
10]. The red-edge region, the transition of red-near infrared, has been shown to have high information content for crop biomass. In the study of Hansen and Schjoerring [
3], the newly constructed NDVI (718, 720 nm) yielded high accuracy for the estimation of winter wheat biomass. Fu et al. [
9] extracted sensitive features from red-edge spectra using band-depth analysis and combined them with partial least squares regression to estimate winter wheat biomass, achieving higher accuracy than did the optimal NDVI (1097, 980 nm). Marshall and Thenkabail [
10] also highlighted the importance of red-edge spectra when comparing the predictive performance of multispectral broadband and hyperspectral narrowband spaceborne remote sensed data for crop biomass estimation. Apart from the wavebands in red edge, several wavebands located in the near-infrared region have been reported to be sensitive to crop biomass. Yao et al. [
11] indicated that the continuous wavelet feature located at 1197 nm yielded more accurate estimates of wheat biomass than the existing narrowband hyperspectral VIs. Fu et al. [
9] evaluated the predictive performance of all possible two-band combinations in the form of NDVI for winter wheat biomass estimation and found that the optimal NDVI was calculated by the wavebands 1097 and 980 nm. Marshall and Thenkabail [
10] indicated that three near-infrared wavebands of Hyperion EO-1 hyperspectral imagery including 914, 1130, and 1320 nm were crucial for crop biomass estimation. In spite of the appealing characteristics of hyperspectral data, many studies associated with hyperspectral remote sensing of crop biomass utilized hand-held devices or ground platforms to acquire hyperspectral data. They are not applicable to crop biomass estimation at large scale. With the accomplishment of upcoming hyperspectral spaceborne missions, the situation will change, and vast hyperspectral data will be available for crop monitoring. However, the weather conditions and specific characteristics of satellite sensors largely limit the availability of high quality hyperspectral images, which satisfy the requirements of precision crop monitoring.
Currently, the rapid development of unmanned aerial vehicles (UAVs) and compact optical sensors opens a new perspective for quantifying crop biomass. The UAV remote sensing systems have shown great potential to fill the gap between ground-based devices and spaceborne and manned airborne remote sensing systems. Compared with ground-based platforms, UAV platforms were no longer confined to a small survey area and can largely improve the efficiency of data collection. The relatively low-cost UAV systems were more accessible for researchers and farmers than spaceborne and manned airborne systems. Moreover, the data acquired by UAV systems usually have higher temporal and spatial resolutions as compared with the existing spaceborne remote sensing systems. A variety of imaging data including red green blue (RGB) digital, multispectral, and hyperspectral images acquired from UAV platforms have been exploited in crop biomass estimation. Bendig et al. [
12] indicated that barley biomass was strongly correlated with canopy height, which was extracted from point clouds generated by UAV-based RGB imaging. However, these photogrammetry-derived plant heights were not reliable for dense crop canopies [
13]. Moreover, the change in canopy height was not obvious during late growth stages. Therefore, it is not feasible to estimate crop biomass only based on plant height during the entire growth period. Lu et al. [
14] reported that the combination of VI and canopy height derived from UAV-based RGB images produced higher estimation accuracy of wheat biomass than the use of VI or canopy height alone. However, the VIs derived from UAV-based RGB images also suffered from the saturation problem. To obtain accurate estimates of potato aboveground biomass, Li et al. [
15] established a random forest regression mode, which combined canopy height derived from UAV-based RGB images and multiple hyperspectral indices calculated from UAV-based hyperspectral images. Of these UAV remote sensing systems, the UAV systems equipped with an RGB digital camera have gained great popularity in precision agriculture, owing to its low cost and easy integration with other agricultural management tools. Due to the limitation of UAV-based RGB VIs and canopy height in crop biomass estimation, researchers have begun to mine spatial features from ultrahigh-spatial-resolution RGB images (<10 cm). Texture, a typical spatial feature, is expected to be more effective in imagery with finer spatial resolution, since more discriminative structural details can be shown [
16]. Zheng et al. [
17] proposed the normalized difference texture index (NDTI), which was calculated by the gray level co-occurrence matrix (GLCM)-based textures. Their experimental results demonstrated the superiority of the NDTI to the evaluated multispectral VIs for the estimation of rice biomass. In the study of Yue et al. [
18], the GLCM-based textural features were extracted from UAV-based RGB images with five different spatial resolutions and exhibited better predictive performance in the wheat biomass estimation as compared with traditional hyperspectral VIs and RGB-based VIs. Commonly, crop canopy size and plant density vary with growth stages and field management practices. Therefore, it is difficult to characterize the spatial changes of crop canopy using textural features with a single scale. Better crop estimates may be possible if multiscale textures can be well extracted. Motivated by this idea, a multiscale image texture extraction method is proposed in the study. The proposed texture extraction method takes good advantage of Gabor filters and GLCM analyses. Gabor filters can provide multiscale and multidirectional spatial features that are tolerant to local illumination changes, while GLCM-based textures have the capability of characterizing the local heterogeneity in the tone values of pixels as well as their spatial distribution. To the best of our knowledge, the predictive performance of multiscale textural features has rarely been evaluated in crop biomass estimation.
It is critical to employ appropriate regression methods to correlate the extracted features with crop biomass. Recently, non-parametric regression methods have exhibited superiority over parametric regression methods in crop traits retrieval, especially machine learning based regression techniques. Studies are now saturated with multiple examples of machine learning regression algorithms to estimate a variety of crop traits from remotely sensed data [
19,
20,
21]. However, it is less informative to treat the established machine learning regression model solely as a black box. Remotely sensed data analyses should always push toward better understanding of the underlying mechanisms and identification of the important spatial and spectral features that contribute most to crop trait of interest. An extensive interest has been shown in applying kernel-based machine learning regession algorithms for crop trait estimation, because they have high generalization performance and good ability to characterize non-linear relationships in a global manner. Owing to the study of Ustun et al. [
22], the kernel-based machine learning regression (e.g., support vector machine regression and kernel partial least squares regression) can be interpreted like partial least squares regression (PLSR) rather than a black box technique. The proposed visualization method makes it possible to interpret the driving force behind the established kernel-based regression model. Least squares support vector machine (LSSVM), as a variant of SVM, also applies structural risk minimization and is superior to artifical neural networks (ANNs) in terms of model generalization when limited training samples are available [
23]. At present, only limited studies have used LSSVM to retrieve vegetation biophyiscal parameters [
24,
25]. Moreover, the interpretation of LSSVM model has rarely been conducted. In this study, PLSR and LSSVM were selected as representatives of non-parametric linear and nonlinear regression techniques and used to model wheat biomass. Hyperspectral features describe the average tonal variations in multiple wavebands, whereas textural features contain information concerning the spatial distribution for tonal variations winthin a waveband [
26].
The goal of this study was to investigate if the proposed multiscale textures extracted from UAV digital images could compete with hyperspectral information for winter wheat biomass estimation. The specific objectives of this study were to (1) compare the predictive performance of optimal NDVI and continuum removal of red-edge spectra with different bandwidths; (2) evaluate the effectiveness of the proposed multiscale texture extraction method (hereafter referred to as Multiscale_Gabor_GLCM) and determine the optimal textural features for winter wheat biomass estimation; and (3) demonstrate if the combination of multiscale textures and spectral features (i.e., optimal NDVI and continuum removal of red-edge spectra) could further improve estimation accuracy.
2. Materials and Methods
2.1. Experimental Design
Winter wheat field experiments were conducted during three growing seasons (2014–2015, 2017–2018, and 2018–2019) at the Xiaotangshan Experiment Site (40.175°~40.188°N, 116.436°~116.451°E, geographic WGS84), Changping district, Beijing, China. The soil is classified as fine loam with an organic content of 15.3~19.1 mg/kg, a nitrate-nitrogen (NO3-N) content of 8.21~41.71 mg/kg, an ammonium-nitrogen (NH3-N) content of 0.45~8.2 mg/kg, an available potassium content of 60.58~109.61 mg/kg, and an available phosphorus content of 3.14~21.18 mg/kg at top 30 cm soil layer. This experimental station is characterized by a warm temperate and semi-humid continental monsoon climate. The minimum and maximum temperature are −10 ℃ and 40 ℃, respectively. Most of precipitation occurs in summer and the annual amount is about 620 mm.
The trails were undertaken in experimental plots and considered three experimental variables included different cultivars, irrigation rates, and nitrogen (N) application levels (
Table 1). Experiment 1 (Exp. 1) followed an orthogonal experimental design and involved three replications of two varieties (Zhongmai 175 and Jing 9843), four N application rates, and three irrigation levels. In this experiment, forty-eight sampling plots were designed with each covering an area of 48 m
2 (6 m × 8 m). Exp. 2 and Exp. 3 were conducted in two consecutive growing seasons (2017–2018 and 2018–2019), and both of them followed a randomized complete block design. There were 64 and 32 sampling plots for the two experiments, respectively. The covering area of each plot was about 135 m
2 (9 × 15 m). The two experiments conducted four repetitions and considered two winter wheat varieties (Lunxuan 167 and Jingdong 18) and four N application rates. The detailed experiment designs can be referred to the studies of Xu et al. [
27] and Fu et al. [
20]. For all experiments, 50% nitrogen fertilizer was applied prior to seeding, and the left was applied at the jointing stage. Other field management practices followed the local farmlands for winter wheat production.
2.2. Data Collection
2.2.1. UAV Images Acquisition and Pre-Processing
For Exp. 1 and Exp. 2, an eight-rotor UAV equipped with a high-definition digital camera was used to acquire RGB digital images of winter wheat canopy. The camera (Sony DSC-QX100, Sony, Tokyo, Japan; 20.2 megapixel; 64° field of view) was attached to a gimbal, which could compensate the UAV movement during the flight and guaranteed nearly nadir image acquisition. For Exp. 3, a DJI phantom 3 with embedded RGB digital camera (1/2.3-inch CMOS; 12 megapixel) was used to capture images of winter wheat canopy. The UAV systems flew autonomously over the study area according to the predefined flight route programmed by the DJI ground station. The longitudinal and lateral overlaps were set to 80% and 75%, respectively. The flight altitude was about 50 m, corresponding to a ground sampling distance of about 1 cm. The integration time of cameras was adjusted automatically according to the light conditions of each flight to avoid over-saturation and ensure the maximum dynamics. The camera focal length was 10 mm and the aperture was set to f/11 and f/3.2 for Exp. 1 and Exp. 2, respectively. The camera focal length was 9 mm, and the aperture was set to f/5.6 for Exp. 3. All flights were accomplished under cloud-free daylight conditions between 10:00 a.m. and 2:00 p.m. (Beijing local time).
Prior to extracting informative features from UAV-based RGB imagery, image-preprocessing must be conducted, and the main procedures included aligning images, generating dense point cloud, building mesh, constructing texture, and mosaicking image. The generation of ortho-rectified and geo-referred mosaic image depends on the structure from motion algorithm and mosaic blending theory [
28,
29]. We co-registered the RGB orthomosaics acquired at different growth stages using the image-to-image registration workflow of ENVI (Exelis Visual Information Solution, Boulder, CO, USA) with a geometric accuracy of <0.5 pixel. To maintain the radiometric consistency of multitemporal remotely sensed images, relative radiometric correction was conducted following the algorithm in the literature [
30]. The orthomosaic images for the three experiments were all resampled to a ground sampling distance of 1 cm through nearest-neighbor interpolation. To further avoid the influence of illumination variation, the normalization was implemented through dividing each pixel value by the corresponding sum of the pixel value of all bands [
31]. For each sampling plot, a region of interest (ROI) was selected from the center of plot area to exclude border effect. Consequently, a total of 384 ROIs were cropped, including 192 ROIs (448 × 553 pixels) for Exp. 1, 128 ROIs (621 × 930 pixels) for Exp. 2, and 64 ROIs (621 × 930 pixels) for the Exp. 3.
2.2.2. Hyperspectral Data Acquisition and Biomass Measurements
After UAV flight missions, the field hyperspectral reflectance measurements of wheat canopy were acquired by a field spectroradiometer (Analytical Spectral Devices, Boulder, CO. USA) with a spectral range of 350~2500 nm under cloud-free conditions around solar noon. The spectral resolution is 1.4 nm between 350 and 1050 nm, while the spectral resolution is 2 nm between 1050 and 2500 nm. Spectral data were usually interpolated to a spectral resolution of 1 nm for further analysis by the corresponding software. The sensor was placed vertically approximately 1.3 m above wheat canopy, with a field of view of 25°. Prior to each measurement, the radiance of a white standard spectral reference panel was recorded to convert the radiance measurements of each plot to reflectance. To ensure the reliability of spectral measurments, ten replicate spectra were taken for each plot, and the average spectrum of them was used to represent the spectrum of the plot. The spectra in the spectral region of 400~1350 nm were used for the subsequent analyses, due to their high signal to noise ratio. To further suppress noise in the spectral measurements, a Savitzky–Golay filter with a frame size of 19 was used.
After the collection of hyperspectral data, ground-based plant sampling was undertaken, and an area of 0.25 m2 wheat plants was clipped at ground level for each plot. The cut plants were sealed in a plastic bag and sent to laboratory immediately. These samples were firstly green-killed in an oven at 105 °C for about 30 minutes and then dried at 80 °C until a constant weight. The dry aboveground biomass was expressed on the basis of ground area according to the number of plant stems per unit area.
2.3. Multiscale Textures Extraction by Fusing Gabor Filter and GLCM Analyses
To capture the changes in winter wheat canopy during growing season well, multiscale textural features derived from high-definition UAV-based RGB images were required and expected to be more suitable for biomass estimation. To this end, we proposed a new texture extraction method, which took advantages of two-dimensional (2D) Gabor filter and GLCM analyses. For illustrative purposes,
Figure 1 shows the schematic of the proposed multiscale texture extraction method (Multiscale_Gabor_GLCM). GLCM is computed for various angular relationships and distances between nearest pixel pairs in the image [
26]. The textures derived from GLCM can well represent the spatial distribution of gray tones rather than the solely tones. The human visual system is regarded to have a special function to process information in multiresolution levels, which can be mimicked by changing the scale parameter in 2D Gabor filters [
32]. Gabor filters are actually a series of local spatial band-pass filters, which are characterized by good spatial localization, orientation selectivity, and frequency selectivity [
32]. In this study, four orientations ([0°, 45°, 90°, 135°]) and five scales (ν ∈ [0, 1, 2, 3, 4]) were considered to generate a bank of Gabor filters, which approximately uniformly covered the spatial-frequency domain. The settings of other parameters were the same with the literature [
20]. According to our previous study [
20], for UAV-based RGB images of winter wheat canopy, there was no significant heterogeneity in both Gabor- and GLCM-based textures of different orientations, and the textures of 45° were recommended. Thus, we only considered the orientation of 45° in the subsequent Gabor representation and GLCM analyses. For a ROI of winter wheat, each image band was convoluted with the Gabor filter bank, and a total of 15 Gabor magnitude images were generated. Then GLCM-based textures were extracted from each Gabor magnitude image to form multiscale image textural features. The used GLCM-based textures mainly included mean (Mean), variance (Var), homogeneity (Hom), contrast (Con), dissimilarity (Dis), entropy (Ent), second moment (Sec), and correlation (Cor) [
26]. Consequently, a total of 120 textural features were extracted for each ROI image.
The suffixes indicating specific feature scale and image band followed the names of textures for convenience and illustrative purpose. For instance, “Hom-G-S2”represents the texture homogeneity at scale 2 derived from green band, “Sec-R-S1”represents the texture second moment at scale 1 derived from red band, and “Cor-B-S5”represents the texture correlation at scale 5 derived from blue band.
2.4. Biomass Related Hyperspectral Feature Extraction
To extract hyperspectral features sensitive to wheat biomass, two methods were conducted including optimal narrowband vegetation index and continuum removal of red-edge spectra.
2.4.1. Narrowband Vegetation Index
Two-band normalized difference vegetation index (NDVI) was used to select biomass specific wavebands. All possible two-pair combinations of 951 wavelengths were used to calculate the narrowband NDVI-like indices. Then, the coefficient of determination (R2) was obtained in the univariate linear regression of these NDVI-like indices and winter wheat biomass based on the calibration dataset. According to the two-dimension (2D) correlation scalogram, the wavelength combinations resulting in the highest R2 values formed the optimal NDVI-like indices and were identified as sensitive spectral features to winter wheat biomass.
2.4.2. Continuum Removal of Red-Edge Spectra with Different Bandwidths
Red-edge spectral region has proved to have abundant information for crop biomass. To effectively extract spectral features from red edge, continuum removal was conducted on the spectra between 550 and 750 nm. Comparing with original hyperspectral data, continuum removal enables direct comparison of individual absorption features from a common baseline. Considering the strong intercorrelation among adjacent hyperspectral wavebands, the spectra between 550 and 750 nm under different bandwidths were simulated with full width at half maximum (FWHM) ranging from 1 to 21 nm and a step of 2 nm. The spectral response function was simulated by Gaussian function (
f) and calculated with the following Equations (1) and (2).
where
b represents the wavelength in the spectral response range for a certain band width;
is the central wavelength of interest;
is standard variance; and
ln is natural logarithm operator.
Apart from the reduction of the hyperspectral dimension, simulating spectra with different bandwidths was conducive to investigating the influence of bandwidth on the predictive performance of red-edge spectral features for winter wheat biomass estimation.
2.5. Non-Parametric Modeling Methods: PLSR and LSSVM
In this study, two different regression techniques were analyzed, including a non-parametric linear regression (i.e., PLSR) and a non-parametric nonlinear regression (i.e., LSSVM), so that improvements in estimation accuracy of features–algorithm pairs could be evaluated. Moreover, the visualization of LSSVM facilitated the identification of influential textural features for winter wheat biomass and improved the understanding of the established predictive models.
2.5.1. Partial Least Squares Regression
As a computationally efficient modeling algorithm, PLSR has been widely used to establish predictive models of crop traits based on a variety of features, due to its ability to cope with multi-collinearity and high-dimensional data. Rather than vegetation indices only using several wavebands, PLSR can incorporate full-spectrum data into modeling analysis through decomposing high-dimensional data into several orthogonal latent factors as well as maximizing the covariance between dependent and independent variables [
33]. The leave-one-out cross-validation (LOOCV) was used to select the optimal latent factors, which is crucial to model performance. To make compromises between model predictive ability and simplicity, the criterion of adding an extra factor to the model was that the root mean square error of LOOCV was decreased by no less than 2% [
34,
35].
2.5.2. Least Squares Support Vector Machine
As a variant of SVM, LSSVM inherits many strengths of standard SVM, including application of structural risk minimization, being capable of processing data with high-dimension and non-normal distribution, and overcoming the problem of small sample [
23]. However, the solution of LSSVM is accomplished through solving a set of equality constraints instead of inequality ones in standard SVM [
23]. It avoid solving a quadratic programming and largely improves the efficency of model establishement. LSSVM has been widely adopted in characterizing the nonlinear relationships between crop traits and various features. However, the driving force behind the established LSSVM regression models has been rarely interpreted and visualized. In this study, we took advantage of the visualization method for support vector regression (SVR) and investigated the influential textures for winter wheat biomass estimation. In the study of Ustun et al. [
22], the interpretation of a SVR model is realized by computing the inner product of the independent variable matrix and the vector of Lagrange multipliers (α-vector). The length of the resulting new vector is the number of independent variables. This new vector can be used to indicate which independent variables are highly correlated to the dependent variables of interest, like the partial least squares (PLS) loadings or regression coefficients. After identifying the most relevant independent variables through the initial LSSVM regression model, the final LSSVM regression was rebuilt with relevant independent variables. During the LSSVM regression modeling, the regularization parameter and kernel function parameters such as the standard variance of radial basis function (RBF) kernel were determined by LOOCV in the study.
2.6. Modeling Framework and Validation
To construct and validate winter wheat biomass predictive models, the three-year experimental data were split into calibration and validation datasets. The calibration dataset was used to adjust the parameters of models, while the validation dataset was used to confirm the true predictive power of the established models and had no effect on model construction. The validation dataset consisted of the data from the first replicate collected in 2015 and the fourth replicate collected in 2018 and 2019. The left data made up the calibration dataset. Consequently, there were 272 and 112 samples in calibration and validation datasets, respectively.
Figure 2 illustrates the modeling framework of this study. To validate the effectiveness of the proposed texture extraction method Multiscale_Gabor_GLCM, the GLCM-based textures were directly extracted from UAV-based RGB images and used to construct winter wheat biomass predictive model. To illustrate the necessity of the extraction of multiscale textures, the predictive performance of the textures from each scale was evaluated. The spectral-based predictive models included the linear regression model based on the optimal NDVI-like and the models based on the continuum removal of red-edge spectra with different bandwidths. The PLSR and LSSVM analyses based on the combination of multiscale textures and biomass related spectral features were conducted to ascertain whether the combination could improve estimation accuracy. Before modeling analyses, the combined feature matrices were standardized to reduce the influence of different ranges of feature values. The coefficient of determination (R
2), root mean squared error (RMSE), and mean absolute error (MAE) between observations and estimations were used to assess the accuracy of biomass predicted by different models.