1. Introduction
Wheat, one of the world’s three major cereal crops, plays a pivotal role in maintaining global food supply and security. Accurately and efficiently estimating wheat growth and yield is crucial for various agricultural practices, including field management and evaluation. Parameters such as plant height (PH) and the leaf area index (LAI) are effective indicators of wheat growth [1,2,3], serving as important phenotypic parameters for assessing field management practices. Additionally, yield is closely associated with crop growth, serving as a key parameter in agricultural production processes and informing efficient fertilization strategies and germplasm evaluation [4,5].
The current methods for monitoring crop growth primarily rely on field surveys, which involve experiments conducted throughout the entire crop growth period in designated plots, typically carried out by breeding experts. Similarly, yield-related phenotypes require manual calculations after wheat maturation and threshing, involving long experimental cycles, low estimation efficiency, high labor intensity, and significant time costs. By contrast, with the advancement of remote sensing technology, its efficiency, comprehensive information acquisition, and independence from terrain conditions have led to its widespread application in agriculture [6,7,8,9,10]. Since 1970, satellite remote sensing has been extensively used for large-scale crop yield prediction due to its excellent spatial, temporal, and spectral resolution. However, satellite data struggle to simultaneously meet the demands of spatial and temporal resolution and are susceptible to external environmental factors such as cloud cover [11,12,13]. The conflict between the demand for spectral data for growth monitoring and the capabilities of satellite data acquisition platforms is becoming increasingly apparent. Emerging unmanned aerial vehicle (UAV) remote sensing platforms serve as an important means of spatial data collection, offering advantages such as high spatial resolution, low cost, and high efficiency [14,15,16]. UAVs can partially compensate for the shortcomings of existing satellite platforms and have been applied in various fields, including vegetation cover calculation [17], ecological environment monitoring [18], multi-crop detection [19], and pest or disease detection [20]. Moreover, because UAV remote sensing can quickly, accurately, and non-destructively acquire canopy spectral data within a certain range, it has irreplaceable advantages in estimating wheat growth and yield [21,22,23].
Currently, UAV remote sensing for estimating wheat growth and yield mainly relies on models established using vegetation indices (VIs) [24]. Zheng et al. addressed the issue of background differences in wheat cultivation under film cover by using spectral purification techniques to calculate 14 visible and near-infrared spectral indices, and utilized different machine learning algorithms to predict wheat phenotypic parameters [25]. Zhao et al. constructed a wheat phenotypic prediction model based on a hierarchical linear model, achieving an accurate prediction of wheat phenotypes at different growth stages [26]. Zhang et al. aimed to improve the prediction accuracy of wheat phenotypes by combining the normalized difference texture index (NDTI) from RGB images with vegetation indices, and built a model for predicting wheat LAI using the random forest regression method [27]. Li et al. proposed a new Residual Soil Adjusted Red Edge Index (RSARE), effectively enhancing the prediction accuracy of early wheat LAI [28]. Elazab et al. compared the effects of the normalized difference vegetation index (NDVI) and normalized green-red difference index (NGRDI) in predicting grain yield under different irrigation conditions, verifying the superiority of NGRDI in this regard [29]. Zhang et al. correlated the growth characteristics of winter wheat with various vegetation indices on a time series axis, identified the optimal vegetation index combination, and established a winter wheat yield prediction model using the Bayesian optimization of the CatBoost (BO-CatBoost) regression method [30]. Shafiee et al. considered factors such as the sunlight angle, camera model, and phenological period when predicting wheat yield, and demonstrated that the green normalized difference vegetation index (GNDVI) consistently had a high correlation with yield [31]. Wang et al. developed a CNN-GRU deep learning framework, determining the optimal time for predicting yield through experiments and proving the effectiveness of a regional-scale wheat yield estimation [32].
However, most studies only use data from a single growth stage to estimate wheat yield, and both the accuracy and stability of these estimates need to be improved. Furthermore, yield estimation often stops at model establishment, without integrating yield with growth phenotypes. This study utilizes winter wheat UAV multispectral data as a data source and proposes a wheat growth and yield estimation method based on the genetic algorithm-support vector regression (GA-SVR) algorithm. By establishing the optimal estimation model, trends in wheat growth and yield under different nitrogen application levels are discussed based on the obtained data, providing technical support for the application of UAV remote sensing platforms in agricultural production management.
3. Results
3.1. Correlation Analysis between Wheat Nutritional Growth Phenotypes and Vegetation Indices
In this study, wheat nutritional growth phenotypes were analyzed with LAI and PH as dependent variables, while vegetation indices served as the independent variables. The analysis was conducted for different nitrogen fertilizer application rates, and the results are presented in Figure 3 and Figure 4.
At the nitrogen fertilizer application level N0, the correlation coefficients between all vegetation indices and the LAI ranged from 0.12 to 0.68 (Figure 3a). At nitrogen fertilizer application level N1, significant correlations were observed between all vegetation indices and the LAI, with correlation coefficients ranging from 0.38 to 0.63 (Figure 3b). Notably, the GNDVI and NDVI exhibited the highest correlation coefficients with the LAI, at 0.63 and 0.56, respectively. As shown in Figure 3c, at the nitrogen fertilizer application level N2, the absolute correlation values between all vegetation indices and the LAI ranged from 0.19 to 0.72. The GNDVI attained the highest correlation across the different nitrogen fertilizer application levels. These results suggest that the correlation between the wheat LAI and vegetation indices varies with the nitrogen fertilizer application rate, although the GNDVI and NDVI consistently exhibited strong correlations with the LAI across nitrogen fertilizer levels. Most wheat plants reached their peak nutritional growth when nitrogen fertilizer was applied at level N1, resulting in maximum leaf coverage and reduced soil exposure due to leaf overlap. Consequently, the interference from soil pixels and shadows on wheat canopy vegetation parameters was minimal, thus establishing the strongest association between vegetation indices and the wheat LAI. In addition, the mean absolute deviation (MAD) of the LAI ground data collected under the different nitrogen fertilization gradients was calculated as 0.34 (N0), 0.46 (N1), and 0.60 (N2), with a smaller MAD indicating better consistency and reliability of the data. The collected wheat LAI data therefore had better consistency and reliability at the N0 and N1 levels. At the N2 level, the dispersion of the wheat LAI values increased and the MAD rose, so the consistency of the data decreased slightly, but the reliability was still good.
At nitrogen fertilizer application level N0, significant correlations existed between all vegetation indices and PH, with correlation coefficients ranging from 0.13 to 0.64 (Figure 4a). At nitrogen fertilizer application level N1, the correlation coefficients between all vegetation indices and PH ranged from 0.01 to 0.71 (Figure 4b), with the NDRE and LCI exhibiting the highest correlation coefficients with PH, at 0.71 and 0.53, respectively. As depicted in Figure 4c, at nitrogen fertilizer application level N2, the absolute correlation values between all vegetation indices and PH ranged from 0.13 to 0.45 (p < 0.01). The correlation between PH and vegetation indices therefore varied with the nitrogen fertilizer application level. Although increasing nitrogen fertilizer application accelerated wheat plant growth, a higher PH may lead to an increased leaf tilt angle. Overall, in all three scenarios, each vegetation index had a certain degree of correlation with wheat PH. The MAD of the PH ground data collected under the different nitrogen fertilizer application gradients was 1.84 (N0), 1.70 (N1), and 2.19 (N2). These results indicate that the PH data collected under the different nitrogen fertilizer application gradients had good consistency and reliability.
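The two statistics used in this subsection can be sketched as follows. This is a minimal illustration that assumes Pearson's correlation coefficient (the text above does not name the coefficient) and the mean absolute deviation taken around the arithmetic mean; the function names are hypothetical and the inputs are not data from this study.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / sqrt(var_x * var_y)


def mean_absolute_deviation(values):
    """Mean absolute deviation of the values around their arithmetic mean."""
    n = len(values)
    mean = sum(values) / n
    return sum(abs(v - mean) for v in values) / n
```

In this sketch, one call to `pearson_r` per vegetation index and nitrogen level would yield one cell of a correlation figure, and one call to `mean_absolute_deviation` per nitrogen level would yield one MAD value.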
3.2. Construction of Wheat LAI and PH Inversion Models
A total of 108 sets of sample data were collected, consisting of wheat canopy multispectral reflectance data, vegetation indices calculated from these data, and corresponding ground-truth measurements of wheat plots. Among these, 70% of the data (76 sets) were randomly selected as the training set to construct the inversion model for the wheat LAI and PH, while the remaining 30% (32 sets) were reserved as the test set for model evaluation.
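A random 70/30 partition of this kind can be sketched as below. The rounding rule (rounding the training count up) is an assumption chosen because it reproduces the set sizes reported in this section; the function name, the integer-percentage parameter, and the seed are hypothetical.

```python
import random


def split_train_test(samples, train_percent=70, seed=0):
    """Randomly split samples into training and test sets.

    The training count is rounded up using integer arithmetic, an assumed
    convention that matches the 76/32 split of 108 samples reported above.
    """
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)
    n_train = (len(shuffled) * train_percent + 99) // 100  # ceiling division
    return shuffled[:n_train], shuffled[n_train:]


train_ids, test_ids = split_train_test(range(108))
print(len(train_ids), len(test_ids))
```

Fixing the seed keeps the partition reproducible, so that models built from different vegetation index combinations are compared on the same test samples.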
Based on the correlation results obtained from Figure 3, the correlation values between each vegetation index and the LAI were averaged across the nitrogen fertilizer application rates. The indices were then ranked as follows: the GNDVI, NDVI, OSAVI, NDRE, and LCI. The top two, three, four, and all five vegetation indices in this ranking were selected as inputs for the GA-SVR regression algorithm to construct the wheat LAI inversion model. The results are presented in Table 2 and Figure 5.
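The ranking and subset selection described above can be sketched as follows, assuming the ranking uses the mean absolute correlation across nitrogen levels. The index names and correlation values in the example are placeholders, not values from Figure 3, and the function names are hypothetical.

```python
def rank_indices(correlations):
    """Rank index names by mean absolute correlation, highest first.

    correlations maps an index name to its correlation coefficients,
    one per nitrogen fertilizer level.
    """
    mean_abs = {
        name: sum(abs(r) for r in values) / len(values)
        for name, values in correlations.items()
    }
    return sorted(mean_abs, key=mean_abs.get, reverse=True)


def top_k_subsets(ranked, k_min=2):
    """Return the top-k prefixes of the ranking for k = k_min .. len(ranked)."""
    return [ranked[:k] for k in range(k_min, len(ranked) + 1)]


# Placeholder correlations for three hypothetical indices (not study data).
example = {
    "VI_A": [0.10, -0.90],
    "VI_B": [0.30, 0.30],
    "VI_C": [0.20, 0.20],
}
ranked = rank_indices(example)
subsets = top_k_subsets(ranked)
print(ranked, subsets)
```

Each subset would then serve as the input feature set of one candidate model.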
When the model was constructed using the top two vegetation indices, namely the GNDVI and NDVI, the coefficient of determination (R²) reached a maximum of 0.82, with the root mean square error (RMSE) falling within an acceptable range. However, as the number of vegetation indices used in model construction increased, the coefficient of determination gradually decreased. When all five vegetation indices were used, the coefficient of determination of the constructed model dropped to a minimum of 0.56, and the RMSE reached its maximum at 0.34.
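For reference, the two evaluation metrics can be computed as in the sketch below. It assumes the common definition of the coefficient of determination as one minus the ratio of the residual to the total sum of squares; some studies instead report the squared correlation between observed and predicted values, and the text above does not state which was used. The function names are hypothetical.

```python
from math import sqrt


def rmse(observed, predicted):
    """Root mean square error between observed and predicted values."""
    n = len(observed)
    return sqrt(sum((o - p) ** 2 for o, p in zip(observed, predicted)) / n)


def r_squared(observed, predicted):
    """Coefficient of determination, 1 - SS_res / SS_tot (assumed definition)."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot
```

Both functions would be applied to the held-out test set, so that a model with more input indices is not rewarded simply for fitting the training data more closely.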
The vegetation indices calculated from wheat canopy multispectral reflectance data and the corresponding measured wheat PH data in the study area were selected as the training sample dataset. While 70% of the data (76 sets) were randomly chosen as the training set to construct the wheat PH inversion model, the remaining 30% (32 sets) were designated as the test set for model evaluation.
Based on the correlation results between vegetation indices and PH obtained from Figure 4, different numbers of vegetation indices were selected as inputs for constructing the wheat PH inversion model using the GA-SVR regression algorithm. The results are presented in Table 3 and Figure 6. The coefficient of determination (R²) of all constructed models exceeded 0.61. Notably, when using the top three vegetation indices based on correlation sorting, although the coefficient of determination was slightly lower compared to models using only the top two vegetation indices, the root mean square error was smaller, indicating the best overall model performance.
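The genetic-algorithm half of GA-SVR can be illustrated with the sketch below. It is not the implementation used in this study: the operators (tournament selection, blend crossover, uniform reset mutation, elitism), the gene encoding (base-10 logarithms of two SVR hyperparameters, here called C and gamma), and all names are assumptions. The fitness shown is a stand-in quadratic; in an actual GA-SVR it would be replaced by the cross-validated error of an SVR trained with the decoded hyperparameters.

```python
import random


def genetic_search(fitness, bounds, pop_size=20, generations=30,
                   crossover_rate=0.8, mutation_rate=0.1, seed=0):
    """Minimise fitness over real-valued genes constrained to bounds."""
    rng = random.Random(seed)

    def random_individual():
        return [rng.uniform(lo, hi) for lo, hi in bounds]

    def tournament(scored):
        # Return the better of two randomly drawn individuals.
        a, b = rng.sample(scored, 2)
        return a[1] if a[0] <= b[0] else b[1]

    population = [random_individual() for _ in range(pop_size)]
    best_score, best = float("inf"), None
    for _ in range(generations):
        scored = [(fitness(ind), ind) for ind in population]
        for score, ind in scored:
            if score < best_score:
                best_score, best = score, list(ind)
        next_population = [list(best)]  # elitism: carry over the best so far
        while len(next_population) < pop_size:
            p1, p2 = tournament(scored), tournament(scored)
            if rng.random() < crossover_rate:
                w = rng.random()  # blend crossover stays inside the bounds
                child = [w * g1 + (1 - w) * g2 for g1, g2 in zip(p1, p2)]
            else:
                child = list(p1)
            for i, (lo, hi) in enumerate(bounds):
                if rng.random() < mutation_rate:
                    child[i] = rng.uniform(lo, hi)  # uniform reset mutation
            next_population.append(child)
        population = next_population
    return best, best_score


def stand_in_fitness(genes):
    # Placeholder objective with its minimum at log10(C)=1, log10(gamma)=-2.
    # A real GA-SVR would return a cross-validated SVR error here.
    log_c, log_gamma = genes
    return (log_c - 1.0) ** 2 + (log_gamma + 2.0) ** 2


bounds = [(-2.0, 3.0), (-4.0, 1.0)]  # assumed search ranges in log10 space
best, best_score = genetic_search(stand_in_fitness, bounds)
print(best, best_score)
```

Searching in logarithmic space is a common choice for SVR hyperparameters because useful values often span several orders of magnitude.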
3.3. Wheat Yield Estimation Model
Since wheat yield was related not only to vegetation indices but also to wheat nutritional growth phenotypes, the five calculated vegetation indices and the wheat nutritional growth phenotypes (LAI and PH) were taken as independent variables, while yield served as the dependent variable. An importance analysis was conducted separately, and the results are shown in Figure 7.
Under all nitrogen fertilizer application gradients, the vegetation indices GNDVI, LCI, and NDRE exhibited relatively strong correlations with yield, with the LCI showing the highest correlation. Additionally, the correlation between wheat phenotypic traits and yield showed a relatively stable trend. The correlation between wheat plant height and yield was higher than that between the leaf area index and yield, and the overall correlations decreased with increasing nitrogen fertilizer application levels.
A total of 36 sets of data were collected, including vegetation indices calculated from wheat canopy multispectral reflectance data and corresponding ground-truth measurements of wheat plots. While 70% of the data (26 sets) were randomly selected as the training set to construct the wheat yield inversion model, the remaining 30% (10 sets) were used for model evaluation.
Using the GA-SVR regression algorithm, wheat yield inversion models were constructed using five vegetation indices, wheat nutritional growth phenotypes, and combinations of both as inputs. The results are presented in Table 4.
When only vegetation indices were used to establish the model, performance was best with the top three vegetation indices in the correlation ranking, giving the highest coefficient of determination (R²) of 0.70 and a root mean square error (RMSE) of 68.5. When a combination of vegetation indices and phenotypic traits was used to build the model, its overall performance was slightly better compared to models constructed using only vegetation indices or phenotypic traits. With an R² of 0.67 and an RMSE of 77.8, this approach can improve the accuracy of wheat yield estimation to some extent, especially when it is challenging to obtain vegetation indices with high correlations. However, when only wheat nutritional growth phenotypes were used to establish the model, the overall performance was poorer, with an R² of only 0.52 and an RMSE of 90.8.
3.4. Comparative Analysis of Different Models
Using the optimal combinations of vegetation indices selected above, namely the GNDVI and NDVI, NDRE, LCI, as well as all five vegetation indices correlated with the LAI, PH, and yield, the wheat LAI, PH, and yield inversion models were constructed separately using the partial least squares regression, random forest regression, and support vector machine regression methods. The results are presented in Table 5.
In constructing the wheat LAI inversion model, the determination coefficients of the models generated using different regression methods were relatively consistent, hovering around 0.70. Notably, the random forest regression approach yielded a smaller root mean square error (RMSE) of 0.11. For the wheat PH inversion models developed using the partial least squares, random forest, and support vector machine regression techniques, the R² values were 0.56, 0.68, and 0.65, respectively, accompanied by corresponding RMSE values of 5.8, 4.7, and 6.1. All models achieved R² values exceeding 0.62, with RMSE values within an acceptable range, except for the model established using partial least squares regression. Notably, both the random forest and support vector machine regression models exhibited commendable performance.
Regarding the wheat yield inversion models, those built using partial least squares, random forest, and support vector machine regression methods yielded R² values of 0.52, 0.64, and 0.66, respectively, with corresponding RMSE values of 90.1, 83.7, and 79.8. Both the random forest and support vector machine inversion models demonstrated superior determination coefficients and acceptable root mean square errors.
In summary, among the wheat LAI, PH, and yield inversion models developed with the different regression methods, the random forest and support vector machine models consistently showed robust determination coefficients and acceptable root mean square errors. Moreover, owing to the support vector machine’s strong performance on the relatively small dataset, it exhibited notably higher determination coefficients and smaller root mean square errors compared to other models. Overall, the use of the GA-SVR regression method led to varying degrees of improvement in the performance of the wheat LAI, PH, and yield inversion models. Furthermore, given the strong correlation between the LAI and vegetation indices, the improvement was greatest for the LAI model.
4. Discussion
4.1. Nutritional Growth and Yield Correlation in Wheat
Wheat yield is influenced by various factors, among which PH and the LAI are two crucial nutritional growth characteristics. PH reflects the height and structure of wheat plants, while the LAI provides information about leaf coverage and growth status.
In field cultivation, taller wheat plants typically possess more photosynthetically active areas, facilitating more efficient sunlight absorption and photosynthesis, thus enhancing yield. A moderate PH also helps maintain favorable ventilation and light conditions, promoting growth. Apart from the risk of lodging associated with excessively tall plants, there is a strong correlation between wheat PH and yield. The LAI, as an indicator reflecting vegetation leaf distribution and density, implies more leaf coverage with a higher index, providing more photosynthetically active areas conducive to increased yield. However, an excessive LAI may result in an excessive allocation of light energy to the lower parts of the plant, leading to the underutilization of photosynthesis. Hence, there exists a certain correlation between the LAI and yield (as shown in Figure 8a,b).
In practical agricultural production, remote sensing technology can be employed to monitor and evaluate wheat PH and the LAI, facilitating the real-time detection of abnormal plant growth and providing scientific decision support for field management, thereby estimating and improving wheat yield.
4.2. Correlation of Wheat LAI, PH, Yield, and Nitrogen Fertilizer Application
The nutritional growth of wheat mainly includes the division and growth of embryonic root meristems, as well as the differentiation and development of leaves and stems. Reproductive growth refers to the growth of reproductive organs such as wheat spikes. When the application of nitrogen fertilizer exceeds a certain level, the nutritional growth of wheat outpaces reproductive growth, accelerating the growth of stems and leaves, which may lead to lodging, excessive growth, and prolonged maturity, resulting in reduced yield and quality. The relationship between the wheat LAI, PH, and nitrogen fertilizer application is shown in the figure, and it can be observed that the change in the LAI is more pronounced. When nitrogen fertilizer is applied at level N1, wheat nutritional growth increases and the LAI is significantly higher than at nitrogen level N0. However, because the LAI measurement is related not only to the leaf area but also to the leaf inclination angle, both of which increase at level N2, the LAI values change irregularly and their correlation with nitrogen fertilizer application decreases. Additionally, although the changing trend of PH with nitrogen fertilizer application is relatively small, it can also be observed from the graph that wheat PH at level N1 is significantly higher than that without nitrogen fertilizer (N0), indicating that nitrogen fertilizer application significantly promotes the growth of wheat stems. There is a strong correlation between nitrogen fertilizer application and PH when nitrogen fertilizer is applied without causing nitrogen stress. At level N2, the nutritional growth of most wheat varieties peaks and the change in PH is small. However, a few varieties may experience excessive growth due to excessive nitrogen fertilizer application.
Furthermore, based on the yield trend of different wheat varieties in gradient nitrogen fertilizer application field experiments (as shown in Figure 9), wheat varieties can be roughly divided into three categories: those whose yield increases with increasing nitrogen fertilizer application, those whose yield first increases and then decreases with increasing nitrogen fertilizer application, and those whose yield decreases with increasing nitrogen fertilizer application. Varieties whose yield first increases and then decreases are the most numerous and are referred to as normal varieties: they exhibit a significant increase in yield when nitrogen fertilizer is applied at normal levels compared to when no nitrogen fertilizer is applied, but excessive nitrogen fertilizer application affects reproductive growth, leading to a decrease in yield. In addition, this study classifies nitrogen-efficient wheat into two categories. Firstly, there are two types of wheat varieties with more pronounced trends, showing a significant increase in yield under high nitrogen fertilizer application. These wheat varieties are classified as the first category of nitrogen-efficient wheat, which can continue to increase yield in a high-nitrogen environment at twice the normal application of nitrogen fertilizer without affecting wheat reproductive growth. Secondly, the yield of the second category of nitrogen-efficient wheat decreases with increasing nitrogen fertilizer application. These wheat varieties can achieve normal or even higher yields with only the residual accumulated nitrogen in the field, but once nitrogen fertilizer is applied, wheat yield decreases, so they are categorized as having high nitrogen use efficiency.
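The three-way grouping of varieties by yield trend can be expressed as a simple rule, sketched below. The labels, the tolerance parameter, and the reading of the intermediate category as a rise from N0 to N1 followed by a fall from N1 to N2 are interpretations of the description above, not a procedure stated in this study; the yield values in the example are placeholders.

```python
def classify_nitrogen_response(yield_n0, yield_n1, yield_n2, tolerance=0.0):
    """Label a variety by its yield trend across the N0, N1, and N2 levels.

    A change counts as a rise or fall only if it exceeds the tolerance,
    which is a hypothetical parameter for ignoring measurement noise.
    """
    rises_first = yield_n1 - yield_n0 > tolerance
    falls_first = yield_n0 - yield_n1 > tolerance
    rises_second = yield_n2 - yield_n1 > tolerance
    falls_second = yield_n1 - yield_n2 > tolerance
    if rises_first and rises_second:
        return "increasing"
    if falls_first and falls_second:
        return "decreasing"
    if rises_first and falls_second:
        return "increase-then-decrease"
    return "other"


# Placeholder yields for illustration only (not measurements from Figure 9).
print(classify_nitrogen_response(400.0, 600.0, 500.0))
```

A nonzero tolerance would keep small plot-to-plot differences from being read as a genuine trend.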
4.3. Analysis of Vegetation Indices Changes under Different Nitrogen Fertilizer Application Levels
Vegetation indices are parameters obtained through remote sensing technology and are widely used in agriculture. These indices provide information about vegetation growth status, chlorophyll content, and moisture status. The five vegetation indices used in this study are strongly related to wheat nutritional growth phenotypes and yield. After the application of nitrogen fertilizer, the total nutritional and reproductive growth of wheat tends to increase. Overall, chlorophyll content shows an increasing trend, followed by a decrease in the growth rate after reaching a threshold or a decrease in growth due to high nitrogen stress. However, there are subtle differences in different vegetation indices. The LCI and NDRE exhibit similar changes under different gradient nitrogen fertilizer application levels (as shown in Figure 10a,b). Both indices can characterize wheat growth conditions to varying degrees. Compared with the other three vegetation indices, they focus more on the characteristics of the red edge band, which is sensitive to chlorophyll. Therefore, they can better assess chlorophyll content and indirectly reflect wheat photosynthetic efficiency and growth status, facilitating the prediction of wheat growth and yield. The GNDVI and NDVI focus more on the information of the near-infrared band, and both can characterize wheat density and growth conditions (as shown in Figure 10c,d). When nitrogen fertilizer is applied, the nutritional growth of wheat intensifies, the LAI increases, and wheat gradually becomes denser, resulting in an increase in the values of these two vegetation indices, making them more suitable for monitoring wheat nutritional growth. In addition, compared to other vegetation indices, the OSAVI pays more attention to soil influence (as shown in Figure 10e). Due to variations in nitrogen fertilizer application levels, the nitrogen content in the soil also changes accordingly. As a result, the spectral data of the soil undergo more noticeable changes. Compared to other vegetation indices, the OSAVI shows more pronounced trends under different nitrogen fertilizer application levels. Therefore, using the OSAVI may lead to better performance in predicting wheat yield under different basal fertilizer application rates or in different planting areas.
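The band dependence discussed above can be made concrete with the commonly cited definitions of the five indices, sketched below. These formulas are assumptions drawn from general usage: the definitions applied in this study are not given in the text shown here and may differ (for example, some OSAVI variants include a 1.16 multiplier). The function name and the reflectance values in the example are hypothetical.

```python
def vegetation_indices(green, red, red_edge, nir):
    """Compute five vegetation indices from band reflectances (0 to 1)."""
    return {
        # Near-infrared contrasted with red or green reflectance.
        "NDVI": (nir - red) / (nir + red),
        "GNDVI": (nir - green) / (nir + green),
        # Red-edge based indices, sensitive to chlorophyll content.
        "NDRE": (nir - red_edge) / (nir + red_edge),
        "LCI": (nir - red_edge) / (nir + red),
        # Soil-adjusted index with the usual 0.16 adjustment term.
        "OSAVI": (nir - red) / (nir + red + 0.16),
    }


# Placeholder reflectances for a dense green canopy (not study data).
indices = vegetation_indices(green=0.10, red=0.05, red_edge=0.20, nir=0.50)
for name, value in indices.items():
    print(f"{name}: {value:.3f}")
```

Written this way, it is visible that the NDRE and LCI share the same red-edge numerator, which is consistent with the similar behavior of the two indices noted above, while the OSAVI differs from the NDVI only by the soil adjustment term in the denominator.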
5. Conclusions
This study proposed a detection method based on genetic algorithm-improved support vector regression (GA-SVR). Based on the correlation analysis of different vegetation indices with wheat phenotypes and yields, the optimal combination of vegetation indices was selected as the input, and a regression prediction model was established using the GA-SVR algorithm, thereby realizing the growth monitoring and yield estimation of wheat. The comprehensive performance of the wheat LAI and PH models established with the GA-SVR regression method surpassed that of the other machine learning models. When vegetation index combinations with higher correlations were used, the constructed models achieved a better estimation of the wheat LAI and PH, with R² values of 0.82 and 0.71, respectively, and RMSE values of 0.09 and 2.7, all within acceptable ranges. The wheat yield estimation model based on vegetation indices outperformed the model based on nutritional growth phenotypes, with an R² of 0.70 and an RMSE of 68.5. Future work is expected to incorporate soil nitrogen content as a factor in model construction and apply the proposed method to wheat datasets covering a wider range of vegetation indices, regions, and environmental conditions.
The proposed method is applicable to wheat growth monitoring and yield estimation in a field environment, and its accuracy met the requirements. It provides an effective UAV remote sensing technique for monitoring wheat growth and estimating yield, in addition to technical support for wheat yield assessment and field management.