Yield Prediction of Winter Wheat at Different Growth Stages Based on Machine Learning

Lou, Zhengfang; Lu, Xiaoping; Li, Siyi

doi:10.3390/agronomy14081834

by Zhengfang Lou, Xiaoping Lu * and Siyi Li

Key Laboratory of Spatio-Temporal Information and Ecological Restoration of Mines of Natural Resources of the People’s Republic of China, Henan Polytechnic University, Jiaozuo 454003, China

* Author to whom correspondence should be addressed.

Agronomy 2024, 14(8), 1834; https://doi.org/10.3390/agronomy14081834

Submission received: 29 July 2024 / Revised: 14 August 2024 / Accepted: 18 August 2024 / Published: 20 August 2024

(This article belongs to the Special Issue Applications of Machine Learning and Remote Sensing in Crop and Vegetation Monitoring)

Abstract

Accurate and timely prediction of crop yields is crucial for ensuring food security and promoting sustainable agricultural practices. This study developed a winter wheat yield prediction model using machine learning techniques, incorporating remote sensing data and statistical yield records from Henan Province, China. The core of the model is an ensemble voting regressor, which integrates ridge regression, gradient boosting, and random forest algorithms. This study optimized the hyperparameters of the ensemble voting regressor and conducted an in-depth comparison of its yield prediction performance with that of other mainstream machine learning models, assessing the impact of key hyperparameters on model accuracy. This study also explored the potential of yield prediction at different growth stages and its application in yield spatialization. The results demonstrate that the ensemble voting regressor performed exceptionally well throughout the entire growth period, with an R² of 0.90, an RMSE of 439.21 kg/ha, and an MAE of 351.28 kg/ha. Notably, during the heading stage, the model’s prediction performance was particularly impressive, with an R² of 0.81, an RMSE of 590.04 kg/ha, and an MAE of 478.38 kg/ha, surpassing models developed for other growth stages. Additionally, by establishing a yield spatialization model, this study mapped county-level yield predictions to the pixel level, visually illustrating the spatial differences in land productivity. These findings provide reliable technical support for winter wheat yield prediction and valuable references for crop yield estimation in precision agriculture.

Keywords: machine learning; winter wheat; growth stage; yield prediction; food security

1. Introduction

Accurate crop yield prediction is vital for ensuring food security and promoting sustainable agricultural development [1]. Wheat, one of the world’s three major staple crops, constitutes 40% of the global food supply, thus playing a pivotal role in global food security [2]. With the rapid advancement of satellite-based Earth observation technologies, the significance of utilizing remote sensing techniques in the research of large-scale winter wheat yield prediction has become increasingly apparent.

Crop yield is influenced by various factors such as weather, climate, soil, and field management practices [3]. Crop yield prediction commonly employs both physical and statistical modeling approaches. Physical models typically utilize crop growth models to simulate the dynamic changes in crop growth and the formation of yield [4,5]. However, the complexity of the parameters required by physical models, including crop varieties, soil types, and climate variables, limits their application in large-scale predictions [6,7]. Statistical models predict yields by establishing relationships between crop production and inherent crop and environmental characteristics [8]. Remote sensing technology provides the data foundation for such statistical models [9]. Remote sensing data offer wide spatial coverage and a broad spectral range and can therefore capture a variety of crop characteristics, supporting tasks such as monitoring crop growth [10], identifying crop pests and diseases [11,12], and estimating weed density [13]. Vegetation indices (VIs), derived from remote sensing data, are among the most common inputs in crop yield forecasting. VIs are more sensitive to vegetation conditions than the original reflectance values and can better capture changes in vegetation conditions, such as crop growth and health status [14]. Temperature and evapotranspiration data can characterize crop health or stress [15]. Remote sensing products within various spectral ranges have been extensively employed in crop yield prediction. These include VIs [16,17,18], surface reflectance (SR) [19], leaf area index (LAI) [20], fraction of photosynthetically active radiation (FPAR) [21], solar-induced fluorescence (SIF) [9], land surface temperature (LST) [22], and gross primary productivity (GPP) [23], among others. Data-driven statistical models can exploit these products directly and are widely used for crop yield prediction on a large scale [19,24,25].
Although remote sensing provides a rich variety of data for crop yield prediction, the relationships among crop yield, crop biochemical information, and growth conditions are usually nonlinear, and statistical models built only on linear relationships fit poorly at large scales [26].

Machine learning possesses the ability to discern nonlinear relationships between target and feature variables, effectively aiding quantitative remote sensing research [27]. Currently, various machine learning algorithms, such as ridge regression (RR), Gaussian process regression (GPR), random forest (RF), Lasso regression (Lasso), support vector machine (SVM), and gradient boosting, among others, are extensively applied in crop yield prediction driven by remote sensing data [28,29,30,31,32]. However, single machine learning algorithms exhibit instability in crop yield prediction. For instance, Pang et al. employed the random forest (RF) algorithm with high-resolution imagery, meteorological variables, and yield data to predict wheat yields in the southeastern region of Australia, where the predictive performance in one planting area significantly lagged behind that of the other two areas [30]. Similarly, Zhou et al. utilized remote sensing data and climate variables, employing the RF, SVM, and Lasso algorithms for yield prediction in winter wheat planting areas in China, revealing substantial disparities in the predictive accuracy among the three machine learning algorithms [33]. Moreover, utilizing feature variables across the entire growth period of winter wheat for yield prediction obscures the potential variations in predictive capabilities across different growth stages, thereby limiting the timeliness of governmental decision-making. Zhou et al., considering both spectral features and agronomic trait parameters, assessed the impact of different growth stages on yield prediction outcomes [17]. Zhao et al., employing inputs such as cumulative biomass, climate adaptability indices, and extreme climate indices in a statistical regression model, predicted wheat yields in the North China Plain and evaluated the performance of yield prediction models concerning different growth stages [34]. 
While the previous studies explored the predictive performance across different growth stages, they all employed a single machine learning model for yield prediction, overlooking the impact of the model itself on the predictive potential across various growth stages. Additionally, the yield prediction results in the past were mostly presented at the county level and did not downscale county-level yield data to pixel-level resolution. Pixel-level yield information is crucial in helping the government take necessary measures in the agricultural production process to achieve yield maximization.

To address the aforementioned issues, this study utilized eight parameters, namely normalized difference vegetation index (NDVI), land surface temperature (LST), gross primary productivity (GPP), enhanced vegetation index (EVI), fraction of photosynthetically active radiation (FPAR), potential evapotranspiration (PET), actual evapotranspiration (ET), and leaf area index (LAI). Combining these parameters with winter wheat yield statistics, an ensemble voting model based on the gradient boosting, random forest, and ridge regression algorithms was constructed. This study analyzed the yield prediction potential across different growth stages of winter wheat and established a spatialization model for winter wheat yield at both county and pixel levels (Figure 1).

2. Materials and Methods

2.1. Study Area

This study focuses on Henan Province, a primary winter wheat cultivation area in China, located between 31°23′ and 36°22′ N and between 110°21′ and 116°39′ E (Figure 2). The region encompasses diverse land-use types, ranked in descending order by proportion: arable land, forest land, built-up land, water bodies, grassland, and unused land [35]. The climate is characterized as subtropical and temperate monsoon, featuring distinct seasonal variations, with an average annual temperature ranging from 12 to 16 °C and annual precipitation between 500 and 900 mm [36]. In the study area, winter wheat is typically sown in October and harvested from late May to early June of the following year [37]. Precipitation in both spring and winter benefits winter wheat and other early spring crops. However, natural precipitation alone is insufficient to meet the growth requirements of winter wheat, prompting local farmers to extract groundwater for irrigation as an additional water source [38].

2.2. Data and Pre-Processing

2.2.1. Statistical Data

The county-level total yield and total planting area data for winter wheat in Henan Province were sourced from the Statistical Yearbook of Henan Province published by the Henan Provincial Bureau of Statistics [39]. Unit yield data for each county-level entity were obtained by dividing the total yield by the total planting area. Considering factors such as administrative changes and occasional missing statistical data, this study selected counties with complete records from 2012 to 2021 (Figure 3) as the modeling and analysis samples, resulting in an effective sample size of 1020 records.
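
The derivation of unit yield and the selection of counties with complete records can be sketched as follows. This is only an illustrative sketch: the record layout, the county names, the numbers, and the units (kg and ha) are assumptions made for the example and are not taken from the statistical yearbook.

```python
def unit_yield(total_yield_kg, planted_area_ha):
    """Return yield per unit area (kg/ha), or None if a value is missing or the area is not positive."""
    if total_yield_kg is None or planted_area_ha is None or planted_area_ha <= 0:
        return None
    return total_yield_kg / planted_area_ha


def complete_counties(records, years):
    """Keep only counties that have a valid unit yield for every required year."""
    by_county = {}
    for county, year, total_yield_kg, planted_area_ha in records:
        value = unit_yield(total_yield_kg, planted_area_ha)
        if value is not None:
            by_county.setdefault(county, {})[year] = value
    return {
        county: yields
        for county, yields in by_county.items()
        if all(year in yields for year in years)
    }


# Hypothetical records: (county, year, total yield in kg, planted area in ha).
records = [
    ("County A", 2012, 3.0e8, 50_000.0),
    ("County A", 2013, None, 50_500.0),  # missing yield, so County A is dropped
    ("County B", 2012, 2.4e8, 40_000.0),
    ("County B", 2013, 2.5e8, 40_000.0),
]
kept = complete_counties(records, years=[2012, 2013])
```

In this toy example only County B survives the completeness filter, which mirrors the way counties with administrative changes or missing data were excluded.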

2.2.2. Winter Wheat Vector Data and Phenological Periods

The reliability of crop yield prediction using agri-environmental variables depends on the spatial aggregation of environmental variables. The use of annual masks for specific crop groups can effectively improve the accuracy of yield estimates compared to the use of general cropland masks [40]. The winter wheat vector data in this study were extracted from 10 m resolution Sentinel-2 satellite remote sensing images via an object-oriented deep learning method. In order to ensure the accuracy and quality of the data, the satellite images used underwent rigorous pre-processing, including atmospheric correction, radiometric correction, and geometric correction, to eliminate possible sensor errors and atmospheric effects. At the same time, in order to ensure consistency between the extracted winter wheat planting distribution data and the actual planting situation, the confusion matrix was calculated by combining the ground survey sample data, and the Kappa coefficient was 0.82. Considering that Henan Province is the main production area of winter wheat and its crop cultivation structure is relatively stable, the winter wheat vector data in 2021 were selected as the mask data for extracting the model feature parameters. Table 1 shows the winter wheat phenology calendar in Henan Province.

2.2.3. Remote Sensing Data

Winter wheat yield is affected by many complex factors [41], and yield prediction models that consider multiple factors estimate yield more accurately [33,42]. In this paper, EVI and NDVI, which reflect vegetation growth, and GPP, FPAR, LST, LAI, ET, and PET, which represent ecosystem functions, were derived from MODIS remote sensing products. The data type, resolution, and source are detailed in Table 2. To ensure data consistency and accuracy, the remote sensing data were pre-processed according to the quality control bands of the MODIS data products, masked with the winter wheat vector data of each county, and averaged over each county. In addition, some feature variables may contain outliers that could affect the model’s prediction performance. For this reason, the feature variables were scaled with RobustScaler, which is robust to outliers, to reduce their impact on the prediction results and thereby improve the reliability and accuracy of the prediction.
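
The effect of robust scaling can be illustrated with a small standard-library sketch. It assumes the default behaviour of scikit-learn's RobustScaler (centering on the median and dividing by the interquartile range); the settings actually used in this study are not stated, and the sample values are hypothetical.

```python
import statistics


def robust_scale(values):
    """Center on the median and divide by the interquartile range (IQR)."""
    # method="inclusive" uses linear interpolation between data points.
    q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    scale = iqr if iqr != 0 else 1.0  # avoid division by zero for constant features
    return [(value - median) / scale for value in values]


# Hypothetical feature column with one extreme outlier.
feature = [1.0, 2.0, 3.0, 4.0, 100.0]
scaled = robust_scale(feature)
```

Because the median and IQR are computed from the central part of the distribution, the outlier (100.0) does not compress the other values, which is what would happen if the mean and standard deviation were used instead.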

2.3. Machine Learning Methods for Yield Prediction

Commonly used statistical approaches to yield prediction include linear regression, deep learning, and machine learning. Linear regression relies on simple weights and coefficients but fails to capture nonlinear relationships within the data. Deep learning models are often more complex and require training on large datasets to achieve outstanding performance. In contrast, machine learning models address the shortcomings of both approaches. They can capture linear or nonlinear relationships between data features and target variables, are well-suited to small datasets, have relatively low model complexity, and are less prone to overfitting.

2.3.1. Linear Models and Regularization Methods

In this study, two regularized linear regression models, ridge regression and elastic net regression, were first used to predict winter wheat yield. Ridge regression addresses multicollinearity among predictors by introducing an L2 regularization term [43]. The regularization strength was set to α = 1.0 to ensure robust performance with highly correlated features. Elastic net regression, which combines L1 and L2 regularization (with l1_ratio = 0.5), handles both sparsity and multicollinearity, making it particularly suitable for high-dimensional data.

2.3.2. Decision Tree Models and Their Extensions

This study also utilized various decision tree-based models to capture the nonlinear characteristics of winter wheat yield, including the decision tree regressor, extra tree regressor, and random forest regressor. The decision tree regressor recursively partitions the data space and makes predictions within each partition, making it particularly suitable for noisy data [44]. To mitigate the risk of overfitting, the absolute error criterion (criterion = “absolute_error”) was employed and a random state was set. The extra tree regressor enhances model robustness by introducing randomness during node splitting [45], using a random splitter (splitter = “random”) and the squared error criterion (criterion = “squared_error”). The random forest regressor reduces model variance by integrating multiple decision trees, showing excellent performance, especially in handling high-dimensional and missing data. In this study, we used 100 trees (n_estimators = 100) and full feature selection (max_features = 1.0) to ensure the model’s robustness.

2.3.3. Distance-Based Models

This study also employed the k-nearest neighbors regressor (KNeighborsRegressor), an instance-based learning method that makes predictions by calculating the distance between new samples and those in the training set [46]. In this research, three nearest neighbors (n_neighbors = 3) and the Minkowski distance metric with p = 2 (i.e., the Euclidean distance) were used. This approach is particularly suitable for analyzing datasets with smaller sample sizes or low-dimensional feature subsets.
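
The prediction rule of a k-nearest neighbors regressor can be sketched in a few lines. The sketch assumes uniform neighbor weights (the prediction is the plain mean of the neighbors' targets); the training samples are hypothetical and the function name is illustrative.

```python
def knn_predict(train_features, train_targets, query, k=3, p=2):
    """Predict the target of `query` as the mean target of its k nearest training samples."""
    distances = []
    for features, target in zip(train_features, train_targets):
        # Minkowski distance; p = 2 gives the Euclidean distance.
        distance = sum(abs(a - b) ** p for a, b in zip(features, query)) ** (1.0 / p)
        distances.append((distance, target))
    distances.sort(key=lambda pair: pair[0])
    nearest = distances[:k]
    return sum(target for _, target in nearest) / len(nearest)


# Hypothetical one-dimensional feature (e.g., a scaled index) and yields in kg/ha.
train_features = [[0.0], [1.0], [2.0], [10.0]]
train_targets = [5000.0, 5200.0, 5400.0, 8000.0]
prediction = knn_predict(train_features, train_targets, query=[1.2])
```

The distant sample at 10.0 is not among the three nearest neighbors of the query and therefore has no influence on the prediction.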

2.3.4. Support Vector Regression Models

This study also used support vector regression (SVR and NuSVR) for yield prediction. Both models are based on the theory of support vector machines (SVM) and are well-suited for handling small sample sizes and nonlinear problems. NuSVR controls the proportion of support vectors through the parameter ν; here it was used with a linear kernel function (kernel = “linear”) and a regularization parameter C = 0.5. In contrast, SVR was used with a radial basis function (RBF) kernel (kernel = “rbf”) to handle nonlinear relationships. Although SVR is advantageous for small samples and nonlinear problems, its higher computational complexity limits its application to large-scale data and high-dimensional features [47].

2.3.5. Ensemble Models

To further enhance model performance, this study employed various ensemble methods, including the gradient boosting regressor, AdaBoost regressor, and voting regressor. The gradient boosting regressor improves predictive accuracy by incrementally fitting new models [37]. The learning rate was set to 0.2 (learning_rate = 0.2) and 200 iterations were used (n_estimators = 200). The AdaBoost regressor increases the model’s accuracy by adjusting sample weights to handle difficult-to-predict samples more effectively [38], with 50 weak learners (n_estimators = 50) and a default learning rate (learning_rate = 1.0). The voting regressor enhances overall model performance by combining the predictions of three base learners: ridge, random forest, and gradient boosting. Each base learner was individually tuned, with ridge set to α = 0.01, random forest using a random state, and gradient boosting with a learning rate of 0.2 and 200 iterations. This approach leverages the strengths of multiple models, making it suitable for complex regression problems that require a balance between bias and variance [48]. Figure 4 provides a visual representation of the ensemble methods used in this study.
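
As an illustration, a voting ensemble with the hyperparameters reported above could be assembled with scikit-learn roughly as follows. This is a minimal sketch rather than the authors' code: the random seed, the equal (unweighted) voting, the random forest settings carried over from Section 2.3.2, and the synthetic data are all assumptions made for the example.

```python
import numpy as np
from sklearn.ensemble import (
    GradientBoostingRegressor,
    RandomForestRegressor,
    VotingRegressor,
)
from sklearn.linear_model import Ridge


def build_voting_regressor(random_state=0):
    """Combine ridge, random forest, and gradient boosting by averaging their predictions."""
    ridge = Ridge(alpha=0.01)
    forest = RandomForestRegressor(
        n_estimators=100, max_features=1.0, random_state=random_state
    )
    boosting = GradientBoostingRegressor(
        learning_rate=0.2, n_estimators=200, random_state=random_state
    )
    return VotingRegressor(
        estimators=[("ridge", ridge), ("rf", forest), ("gb", boosting)]
    )


# Synthetic stand-in for eight scaled remote sensing features and yield in kg/ha.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 8))
y = 6000 + 400 * X[:, 0] - 250 * X[:, 1] + rng.normal(scale=50, size=200)

model = build_voting_regressor().fit(X, y)
predictions = model.predict(X[:5])

# With no weights given, the ensemble output is the mean of the base predictions.
base_mean = np.mean([est.predict(X[:5]) for est in model.estimators_], axis=0)
```

Averaging lets the linear ridge component and the two tree-based components offset one another's errors, which is the motivation given for the ensemble in this section.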

2.4. Winter Wheat Yield Prediction

To assess the performance of various machine learning algorithms in yield prediction, initially, we constructed models using data spanning the entire growth period. We partitioned the county-level modeling factors for each growth stage of winter wheat, calculating the mean values of these factors during the corresponding growth stage as feature variables for model construction. Data from 2012 to 2019 served as the training set for training the models, including the ensemble voting model based on ridge, random forest, and gradient boosting, along with 10 other commonly used machine learning models. The data from 2020 were used as the validation set and the data from 2021 as the test set in the evaluation of the predictive accuracy of the models.
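
The year-based partition described above can be sketched as follows. The sample layout (a dictionary with a "year" key) and the toy values are hypothetical; only the year boundaries come from the text.

```python
def split_by_year(samples, train_years=range(2012, 2020), val_year=2020, test_year=2021):
    """Assign each county-year sample to the training, validation, or test set by its year."""
    split = {"train": [], "val": [], "test": []}
    for sample in samples:
        year = sample["year"]
        if year in train_years:
            split["train"].append(sample)
        elif year == val_year:
            split["val"].append(sample)
        elif year == test_year:
            split["test"].append(sample)
    return split


# Hypothetical samples: one county observed in every year from 2012 to 2021.
samples = [
    {"county": "County A", "year": year, "yield_kg_ha": 6000.0 + 10 * (year - 2012)}
    for year in range(2012, 2022)
]
parts = split_by_year(samples)
```

Splitting by year rather than at random keeps the validation and test years entirely unseen during training, which matches the forecasting use case.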

To explore the yield prediction potential across different growth stages, we chose the machine learning models with high accuracy throughout the entire growth period, together with the ensemble voting model constructed in this study, for modeling analysis at each growth stage. In this process, each class of modeling factors was partitioned based on the winter wheat growth stage and used as a set of feature variables for model construction. We continued to use data from 2012 to 2019 as the training set, 2020 as the validation set, and 2021 as the test set. The algorithms were then employed for yield prediction at each growth stage, followed by accuracy validation.
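
The stage-wise averaging of a modeling factor can be sketched as follows. The stage names and date windows below are placeholders invented for the example; the actual phenology calendar is the one given in Table 1, and the observation values are hypothetical.

```python
from datetime import date

# Placeholder stage windows (NOT the calendar from Table 1).
HYPOTHETICAL_STAGES = {
    "stage_1": (date(2021, 4, 1), date(2021, 4, 20)),
    "stage_2": (date(2021, 4, 21), date(2021, 5, 10)),
}


def stage_means(observations, stages):
    """Average one variable of one county over each growth-stage window.

    observations: list of (acquisition date, county-mean value) pairs.
    """
    result = {}
    for name, (start, end) in stages.items():
        values = [value for day, value in observations if start <= day <= end]
        result[name] = sum(values) / len(values) if values else None
    return result


# Hypothetical county-mean NDVI composites.
ndvi = [
    (date(2021, 4, 7), 0.70),
    (date(2021, 4, 15), 0.74),
    (date(2021, 4, 23), 0.78),
    (date(2021, 5, 1), 0.72),
]
features = stage_means(ndvi, HYPOTHETICAL_STAGES)
```

Repeating this for each of the eight factors yields one feature vector per county, year, and growth stage.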

2.5. Accuracy Assessment

This study constructed winter wheat yield prediction models using historical yield data and remote sensing data, followed by accuracy evaluation. All constructed yield prediction models underwent validation using K-fold cross-validation, with root mean square error (RMSE), mean absolute error (MAE), and coefficient of determination (R²) employed as evaluation metrics.

\mathrm{MAE} = \frac{\sum_{i=1}^{n} |y_{i} - y_{p}|}{n}, (1)

\mathrm{RMSE} = \sqrt{\frac{\sum_{i=1}^{n} (y_{i} - y_{p})^{2}}{n}}, (2)

R^{2} = 1 - \frac{\sum_{i=1}^{n} (y_{i} - y_{p})^{2}}{\sum_{i=1}^{n} (y_{i} - \bar{y})^{2}}, (3)

where n represents the number of samples, y_{i} denotes the actual observed value for the ith sample, y_{p} is the predicted value for the ith sample, and \bar{y} is the mean of all the observed values.
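
Equations (1) to (3) translate directly into code. The following standard-library sketch uses hypothetical observed and predicted yields in kg/ha.

```python
import math


def mae(observed, predicted):
    """Mean absolute error, Equation (1)."""
    return sum(abs(o - p) for o, p in zip(observed, predicted)) / len(observed)


def rmse(observed, predicted):
    """Root mean square error, Equation (2)."""
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(observed, predicted)) / len(observed))


def r_squared(observed, predicted):
    """Coefficient of determination, Equation (3)."""
    mean_observed = sum(observed) / len(observed)
    residual = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    total = sum((o - mean_observed) ** 2 for o in observed)
    return 1.0 - residual / total


# Hypothetical yields in kg/ha.
observed = [5000.0, 6000.0, 7000.0]
predicted = [5100.0, 5900.0, 7200.0]
scores = (mae(observed, predicted), rmse(observed, predicted), r_squared(observed, predicted))
```

MAE and RMSE are expressed in the units of the yield (kg/ha), whereas R² is dimensionless; RMSE penalizes large errors more strongly than MAE.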

2.6. Construction of Spatialization Model for Yield

By constructing a yield spatialization model, we aimed to depict productivity visually at the pixel level and reveal spatial variations in winter wheat yield across Henan Province. The model utilized the county-level averages of each feature variable as a bridge, linking the feature variables at the pixel level to the county-level predicted yields. This process allowed for the back-calculation of county-level winter wheat yield data to the pixel level. The yield spatialization model is represented by Equation (4):

y_{pixel} = \frac{y_{pre} \times \sum_{j=1}^{N} \frac{\mathrm{Feature}_{j}}{\mathrm{Feature}_{mean,j}}}{N}, (4)

where N represents the number of modeling driver types, which is 8 in this study; \mathrm{Feature}_{j} is the pixel value of the jth class of modeling driver factors; \mathrm{Feature}_{mean,j} is the average pixel value of the jth class of modeling driver factors (its county-level average); y_{pre} is the predicted county-level yield; and y_{pixel} is the yield for each pixel.
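
Equation (4) can be sketched as a small function that scales the predicted county yield by the mean ratio of pixel value to average value across the driver factors. The feature names and numbers below are hypothetical, and only two factors are used for brevity (the study uses eight).

```python
def pixel_yield(county_yield, pixel_features, mean_features):
    """Downscale a county-level yield to one pixel following Equation (4)."""
    ratios = [pixel_features[name] / mean_features[name] for name in mean_features]
    return county_yield * sum(ratios) / len(ratios)


# Hypothetical county prediction (kg/ha) and average feature values.
county_yield = 6000.0
mean_features = {"NDVI": 0.60, "LAI": 2.0}

# A pixel identical to the average reproduces the county yield.
average_pixel = pixel_yield(county_yield, {"NDVI": 0.60, "LAI": 2.0}, mean_features)

# A pixel whose factors are all 10% above the average gets a 10% higher yield.
vigorous_pixel = pixel_yield(county_yield, {"NDVI": 0.66, "LAI": 2.2}, mean_features)
```

The design keeps the county prediction as the anchor: pixels only redistribute yield around it according to how their feature values compare with the average.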

3. Results

3.1. Comparison of Accuracy in Yield Prediction Algorithms for Entire Growth Period

This study utilized winter wheat yield data as the target variable and constructed 11 winter wheat yield prediction models covering the entire growth period in Henan Province. Table 3 summarizes the accuracy of these models on the validation and test sets for winter wheat yield prediction.

In the validation set, the ensemble voting model achieved the highest R² value, reaching 0.90. Following closely were ridge, gradient boosting, and random forest, with R² values ranging from 0.69 to 0.86, while other models had R² values below 0.69. Simultaneously, the ensemble voting model obtained the lowest RMSE and MAE at 439.21 kg/ha and 351.28 kg/ha, respectively. Ridge, gradient boosting, and random forest exhibited relatively higher RMSE and MAE, ranging from 509.50 to 756.39 kg/ha and 389.66 to 611.46 kg/ha, respectively. The RMSE and MAE for other models fell within the range of 847.14 to 1332.69 kg/ha and 725.36 to 1165.48 kg/ha, respectively.

In the test set, the ensemble voting model continued to achieve the highest R² value and the lowest RMSE and MAE, with values of 0.90, 424.44 kg/ha, and 313.92 kg/ha, respectively. Following closely were ridge, gradient boosting, and random forest, with R² values ranging from 0.75 to 0.79 and RMSE and MAE ranging from 609.80 to 676.49 kg/ha and 475.07 to 517.72 kg/ha, respectively. Other models exhibited R² values below 0.75, with RMSE and MAE ranging from 821.96 to 1316.57 kg/ha and 614.30 to 1161.38 kg/ha, respectively. The results from the validation and test sets indicated that the ensemble voting model outperformed other models in predicting winter wheat yield across the entire growth period in Henan Province.

To assess the contribution of each sub-model to the overall performance of the ensemble voting model, we conducted an ablation experiment. The ensemble voting model comprised three sub-models: ridge, random forest, and gradient boosting. These sub-models were removed one at a time, and the resulting model was evaluated using R², RMSE, and MAE; Table 4 summarizes the changes in model performance during the ablation experiment. The removal of any sub-model led to a decrease in model performance in both the validation and test sets, with the largest decline observed when removing ridge and the smallest when removing random forest. Specifically, R² decreased by 0.02 to 0.13, while RMSE increased by 38.41 to 213.29 kg/ha and MAE increased by 30.34 to 174.76 kg/ha.
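
The logic of such an ablation can be sketched as follows, assuming that the base learners are trained independently and combined by an unweighted average, so that removing a sub-model amounts to averaging the remaining predictions. The predictions are hypothetical and do not reproduce Table 4.

```python
import math


def rmse(observed, predicted):
    """Root mean square error between observed and predicted values."""
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(observed, predicted)) / len(observed))


def vote(prediction_lists):
    """Average the predictions of several models sample by sample."""
    return [sum(values) / len(values) for values in zip(*prediction_lists)]


def ablation(observed, base_predictions):
    """Return the full-ensemble RMSE and the RMSE change when each sub-model is removed."""
    full = rmse(observed, vote(list(base_predictions.values())))
    changes = {}
    for name in base_predictions:
        remaining = [p for other, p in base_predictions.items() if other != name]
        changes[name] = rmse(observed, vote(remaining)) - full
    return full, changes


# Hypothetical observed yields and base-model predictions (kg/ha).
observed = [5000.0, 6000.0, 7000.0]
base_predictions = {
    "ridge": [5100.0, 6100.0, 7100.0],
    "rf": [4900.0, 5900.0, 6900.0],
    "gb": [5000.0, 6000.0, 7300.0],
}
full_rmse, rmse_changes = ablation(observed, base_predictions)
```

A positive change means that the ensemble loses accuracy without that sub-model; in this toy data removing "gb" happens to help, which would not be expected for a well-tuned ensemble.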

From Figure 5, it is evident that the ensemble voting model achieved the highest accuracy on both the validation and test sets, demonstrating robust performance. Figure 6 and Figure 7 depict scatter plots between predicted values and actual values for models with an R² value exceeding 0.75 in the validation and test sets, respectively. These plots provide a visual representation of the performance and residual information of each predictive model on both the validation and test sets. The results further confirm the effectiveness and superiority of the ensemble voting model in winter wheat yield prediction.

3.2. Comparison of Accuracy in Yield Prediction Algorithms for Individual Growth Periods

This study used winter wheat yield as the target variable and the modeling factors of each growth stage as the feature variables. The high-accuracy machine learning algorithms, namely gradient boosting, random forest, ridge, and ensemble voting, were employed for modeling and analysis at each growth stage throughout the winter wheat growing season. Figure 8 illustrates the accuracy of the four models in predicting winter wheat yield on the validation and test sets. For the validation set, considering the entire growth period, the ranges of R², RMSE, and MAE across different models were 0.20–0.81, 585.34–1211.34 kg/ha, and 468.07–1040.78 kg/ha, respectively. On an individual growth stage basis, the differences in R², RMSE, and MAE for each model ranged from 0.05 to 0.30, 48.32 to 260.84 kg/ha, and 3.01 to 198.16 kg/ha, respectively. Ensemble voting showed higher yield prediction accuracy than the other three models. For the test set, considering the entire growth period, the ranges of R², RMSE, and MAE across different models were 0.33–0.77, 645.66–1101.03 kg/ha, and 534.93–877.08 kg/ha, respectively. On an individual growth stage basis, the differences in R², RMSE, and MAE for each model ranged from 0.16 to 0.42, 268.30 to 372.83 kg/ha, and 263.02 to 403.70 kg/ha, respectively. The heading stage exhibited the highest yield prediction accuracy.

In both the validation and test sets, the ensemble voting model consistently exhibited stable predictive capabilities. Figure 9 illustrates the scatter plot and residual information for the ensemble voting model during the heading stage. The curve in the graph indicates a well-fitted linear relationship between the predicted and actual yields during the wheat heading stage.

3.3. Pixel-Level Spatialization of Yield

Based on the aforementioned research, the stability and superiority of the ensemble voting model were confirmed, particularly in predicting winter wheat yields during the heading stage. Therefore, this study adopted the ensemble voting model, utilizing various feature variables during the wheat heading stage as inputs, to predict the winter wheat yield in Henan Province for the year 2021. By employing the yield spatialization model, the predicted results at the county level were mapped to a more refined pixel scale. The enhancement in spatial resolution allowed for more accurate capture of the geographical variations in winter wheat productivity, unveiling local growth characteristics and yield fluctuations within specific regions.

Figure 10 demonstrates the spatial distribution of winter wheat yield at the pixel scale in Henan Province. It can be observed from the figure that the high-yielding areas of winter wheat in Henan Province are mainly concentrated in the eastern region, and the yield in the western region is relatively low. In addition, the planting structure of winter wheat is more fragmented in the western region. The more concentrated distribution of high-yielding plots highlights the existence of more favorable conditions for winter wheat cultivation in the eastern region. The results of this spatialization of yields help us to more comprehensively understand the geographic variability of winter wheat yields in Henan Province and provide visual support for agricultural decision-making and management.

4. Discussion

4.1. Performance Comparison of Winter Wheat Yield Prediction Models

We compared commonly used machine learning models with the ensemble voting model constructed in this study. The R² values of the different models ranged from 0.03 to 0.90, demonstrating significant performance differences. Among these models, the ensemble voting model exhibited the highest R² and the lowest RMSE and MAE for both the full growth cycle and individual growth cycles. This model integrates ridge, gradient boosting, and random forest using a weighted approach, so that the base learners compensate for one another's errors [49,50], which helps to lower the reducible error [51] and shows the substantial advantages of ensemble learning in overall prediction performance. When selecting the best ensemble method for a given problem, it is important to consider the suitability of the setup (such as class imbalance and high dimensionality) as well as computational costs [52]. Owing to their fast training speed and low computational cost, ensemble learning methods are widely used in various fields, including short-term power load forecasting, cost estimation, and plasma reaction dynamics modeling [53,54,55].

4.2. Analysis of Yield Prediction Potential for Individual Growth Periods

The spatial heterogeneity of the soil and physiological characteristics of crops change during different growth stages [18,56,57,58]. Figure 8 illustrates the accuracy trends of various yield prediction models for winter wheat. In this study, we observed significant differences in the accuracy of winter wheat yield predictions across different growth stages. The accuracy of yield predictions gradually increased with the growth of winter wheat, peaking at the heading stage before declining. The peak prediction potential at the heading stage is likely due to the formation of spikes and ears, which stabilize the plant’s morphology and structure. Nutrient accumulation and growth conditions play a crucial role in the final yield of winter wheat. The subsequent decline in prediction accuracy may be attributed to the inclusion of vegetation index data among the features. After the heading stage, nutrients are transferred from the stems and leaves to the grains, leading to a decrease in chlorophyll in the leaves. This reduction in chlorophyll affects the vegetation index data related to chlorophyll, thereby reducing its correlation with winter wheat yield and resulting in decreased accuracy of the yield prediction models.

4.3. Analysis of Spatialization in Yield Research

Small-scale yield prediction for winter wheat is crucial for understanding planting structures and achieving optimal agricultural resource allocation. Previous studies typically used methods such as drone imagery and crop growth models for yield estimation at the field level [4,56]. However, these methods often struggle to cover provincial scales simultaneously, and some crop growth models require numerous parameters. Due to spatial heterogeneity in soil, weather, and environmental factors, unifying some of these parameters can be challenging.

In this study, the ensemble voting model was used to predict winter wheat yield during the heading stage, which offers the greatest yield prediction potential. By applying a spatial yield model, winter wheat yield estimates at the county level were downscaled to the pixel level. This approach not only meets the need for county-level yield prediction but also visually represents winter wheat productivity at a finer scale. The feasibility and practicality of this method make it a powerful tool for supporting agricultural decision-making and management, providing a scientific basis for precision agriculture.

5. Conclusions

In this study, we estimated the yield of winter wheat in Henan Province using eight feature variables, including LAI, LST, and GPP, along with historical yield data. We proposed an ensemble voting model composed of gradient boosting, random forest, and ridge. The results show that the ensemble voting model demonstrated the highest accuracy among various machine learning models, both across the entire growth period and within individual growth stages, highlighting the stability and predictive accuracy of this approach for crop yield estimation. Additionally, we found that the heading stage had the greatest yield prediction potential, which may be linked to the stabilization of the wheat plants’ morphology and nutrient accumulation during this period. This finding provides critical information for the early allocation of agricultural resources, greatly aiding in the achievement of food security and precision agriculture. By constructing a yield spatialization model, we refined the county-level yield predictions to the pixel level, avoiding the complexities and computational difficulties associated with direct pixel-level yield estimation, thus offering an effective solution for pixel-level winter wheat yield prediction.

Nevertheless, this study has some uncertainties and areas for improvement. Crop yield is influenced by multiple factors, including climate, soil properties, and human management practices, and the features selected for modeling may not cover all key factors, which could make prediction performance unstable. Future research should consider incorporating additional features to improve the model's comprehensiveness and accuracy. Furthermore, the relatively limited training data may affect how well the model trains and generalizes. Future studies could use longer time series or data augmentation methods such as generative adversarial networks to increase the sample size and thereby better capture the long-term trends and cyclical variations in winter wheat yield.
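
As an illustration of the ensemble summarized in these conclusions, the sketch below builds a voting regressor from ridge regression, gradient boosting, and random forest with scikit-learn. The data are synthetic (eight features generated by `make_regression`), and the hyperparameters and equal weights are assumptions chosen for the example, not the tuned values from the study.

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.ensemble import (
    GradientBoostingRegressor,
    RandomForestRegressor,
    VotingRegressor,
)
from sklearn.linear_model import Ridge
from sklearn.metrics import r2_score
from sklearn.model_selection import train_test_split

# Synthetic stand-in for eight county-level features; not the study's data.
X, y = make_regression(n_samples=300, n_features=8, noise=10.0, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0
)

# Three base learners combined by (unweighted) averaging of their predictions.
ensemble = VotingRegressor(
    estimators=[
        ("ridge", Ridge(alpha=1.0)),
        ("gb", GradientBoostingRegressor(random_state=0)),
        ("rf", RandomForestRegressor(n_estimators=200, random_state=0)),
    ]
)
ensemble.fit(X_train, y_train)
pred = ensemble.predict(X_test)

# With no weights given, the ensemble output equals the mean of the members' outputs.
member_mean = np.mean([m.predict(X_test) for m in ensemble.estimators_], axis=0)
r2 = r2_score(y_test, pred)
```

Because the ensemble is a plain average, a member can be left out simply by removing it from `estimators`, which is one way an ablation such as the one in Table 4 could be set up.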

Author Contributions

Conceptualization, X.L. and Z.L.; methodology and data curation, Z.L. and S.L.; writing—original draft preparation, Z.L.; writing—review and editing, X.L. All authors have read and agreed to the published version of the manuscript.

Funding

This research and the APC were funded by the National Key Research and Development Plan of China (No. 2016YFC0803103).

Data Availability Statement

The original contributions presented in the study are included in the article; further inquiries can be directed to the corresponding author.

Conflicts of Interest

The authors declare no conflicts of interest.

References

1. Tian, H.; Wang, P.; Tansey, K.; Han, D.; Zhang, J.; Zhang, S.; Li, H. A Deep Learning Framework under Attention Mechanism for Wheat Yield Estimation Using Remotely Sensed Indices in the Guanzhong Plain, PR China. Int. J. Appl. Earth Obs. Geoinf. 2021, 102, 102375.
2. Chen, P.; Li, Y.; Liu, X.; Tian, Y.; Zhu, Y.; Cao, W.; Cao, Q. Improving Yield Prediction Based on Spatio-Temporal Deep Learning Approaches for Winter Wheat: A Case Study in Jiangsu Province, China. Comput. Electron. Agric. 2023, 213, 108201.
3. Xu, X.; Gao, P.; Zhu, X.; Guo, W.; Ding, J.; Li, C.; Zhu, M.; Wu, X. Design of an Integrated Climatic Assessment Indicator (ICAI) for Wheat Production: A Case Study in Jiangsu Province, China. Ecol. Indic. 2019, 101, 943–953.
4. Zhuo, W.; Fang, S.; Gao, X.; Wang, L.; Wu, D.; Fu, S.; Wu, Q.; Huang, J. Crop Yield Prediction Using MODIS LAI, TIGGE Weather Forecasts and WOFOST Model: A Case Study for Winter Wheat in Hebei, China during 2009–2013. Int. J. Appl. Earth Obs. Geoinf. 2022, 106, 102668.
5. Ren, Y.; Li, Q.; Du, X.; Zhang, Y.; Wang, H.; Shi, G.; Wei, M. Analysis of Corn Yield Prediction Potential at Various Growth Phases Using a Process-Based Model and Deep Learning. Plants 2023, 12, 446.
6. Cao, J.; Zhang, Z.; Tao, F.; Zhang, L.; Luo, Y.; Han, J.; Li, Z. Identifying the Contributions of Multi-Source Data for Winter Wheat Yield Prediction in China. Remote Sens. 2020, 12, 750.
7. Li, L.; Wang, B.; Feng, P.; Li Liu, D.; He, Q.; Zhang, Y.; Wang, Y.; Li, S.; Lu, X.; Yue, C.; et al. Developing Machine Learning Models with Multi-Source Environmental Data to Predict Wheat Yield in China. Comput. Electron. Agric. 2022, 194, 106790.
8. Cao, J.; Wang, H.; Li, J.; Tian, Q.; Niyogi, D. Improving the Forecasting of Winter Wheat Yields in Northern China with Machine Learning–Dynamical Hybrid Subseasonal-to-Seasonal Ensemble Prediction. Remote Sens. 2022, 14, 1707.
9. Cai, Y.; Guan, K.; Lobell, D.; Potgieter, A.B.; Wang, S.; Peng, J.; Xu, T.; Asseng, S.; Zhang, Y.; You, L.; et al. Integrating Satellite and Climate Data to Predict Wheat Yield in Australia Using Machine Learning Approaches. Agric. For. Meteorol. 2019, 274, 144–159.
10. Madugundu, R.; Al-Gaadi, K.A.; Tola, E.; Edrris, M.K.; Edrees, H.F.; Alameen, A.A. Optimal Timing of Carrot Crop Monitoring and Yield Assessment Using Sentinel-2 Images: A Machine-Learning Approach. Appl. Sci. 2024, 14, 3636.
11. Kaur, P.; Harnal, S.; Tiwari, R.; Upadhyay, S.; Bhatia, S.; Mashat, A.; Alabdali, A.M. Recognition of Leaf Disease Using Hybrid Convolutional Neural Network by Applying Feature Reduction. Sensors 2022, 22, 575.
12. Nagaraju, M.; Chawla, P.; Upadhyay, S.; Tiwari, R. Convolution Network Model Based Leaf Disease Detection Using Augmentation Techniques. Expert Syst. 2022, 39, e12885.
13. Mishra, A.M.; Harnal, S.; Gautam, V.; Tiwari, R.; Upadhyay, S. Weed density estimation in soya bean crop using deep convolutional neural networks in smart agriculture. J. Plant Dis. Prot. 2022, 129, 593–604.
14. Khan, S.N.; Li, D.; Maimaitijiang, M. A Geographically Weighted Random Forest Approach to Predict Corn Yield in the US Corn Belt. Remote Sens. 2022, 14, 2843.
15. Proutsos, N.D.; Fotelli, M.N.; Stefanidis, S.P.; Tigkas, D. Assessing the Accuracy of 50 Temperature-Based Models for Estimating Potential Evapotranspiration (PET) in a Mediterranean Mountainous Forest Environment. Atmosphere 2024, 15, 662.
16. Guo, Y.; Wang, H.; Wu, Z.; Wang, S.; Sun, H.; Senthilnath, J.; Wang, J.; Robin Bryant, C.; Fu, Y. Modified Red Blue Vegetation Index for Chlorophyll Estimation and Yield Prediction of Maize from Visible Images Captured by UAV. Sensors 2020, 20, 5055.
17. Zhou, H.; Yang, J.; Lou, W.; Sheng, L.; Li, D.; Hu, H. Improving Grain Yield Prediction through Fusion of Multi-Temporal Spectral Features and Agronomic Trait Parameters Derived from UAV Imagery. Front. Plant Sci. 2023, 14, 1217448.
18. Zhou, X.; Zheng, H.B.; Xu, X.Q.; He, J.Y.; Ge, X.K.; Yao, X.; Cheng, T.; Zhu, Y.; Cao, W.X.; Tian, Y.C. Predicting Grain Yield in Rice Using Multi-Temporal Vegetation Indices from UAV-Based Multispectral and Digital Imagery. ISPRS J. Photogramm. Remote Sens. 2017, 130, 246–255.
19. Sun, J.; Di, L.; Sun, Z.; Shen, Y.; Lai, Z. County-Level Soybean Yield Prediction Using Deep CNN-LSTM Model. Sensors 2019, 19, 4363.
20. Wang, J.; Si, H.; Gao, Z.; Shi, L. Winter Wheat Yield Prediction Using an LSTM Model from MODIS LAI Products. Agriculture 2022, 12, 1707.
21. Tan, C.; Wang, D.; Zhou, J.; Du, Y.; Luo, M.; Zhang, Y.; Guo, W. Remotely Assessing Fraction of Photosynthetically Active Radiation (FPAR) for Wheat Canopies Based on Hyperspectral Vegetation Indexes. Front. Plant Sci. 2018, 9, 776.
22. Schwalbert, R.A.; Amado, T.; Corassa, G.; Pott, L.P.; Prasad, P.V.V.; Ciampitti, I.A. Satellite-Based Soybean Yield Forecast: Integrating Machine Learning and Weather Data for Improving Crop Yield Prediction in Southern Brazil. Agric. For. Meteorol. 2020, 284, 107886.
23. Khan, S.N.; Li, D.; Maimaitijiang, M. Using Gross Primary Production Data and Deep Transfer Learning for Crop Yield Prediction in the US Corn Belt. Int. J. Appl. Earth Obs. Geoinf. 2024, 131, 103965.
24. Peng, B.; Guan, K.; Pan, M.; Li, Y. Benefits of Seasonal Climate Prediction and Satellite Data for Forecasting U.S. Maize Yield. Geophys. Res. Lett. 2018, 45, 9662–9671.
25. Huber, F.; Yushchenko, A.; Stratmann, B.; Steinhage, V. Extreme Gradient Boosting for Yield Estimation Compared with Deep Learning Approaches. Comput. Electron. Agric. 2022, 202, 107346.
26. Zhang, L.; Zhang, Z.; Luo, Y.; Cao, J.; Xie, R.; Li, S. Integrating Satellite-Derived Climatic and Vegetation Indices to Predict Smallholder Maize Yield Using Deep Learning. Agric. For. Meteorol. 2021, 311, 108666.
27. Fei, S.; Hassan, M.A.; Xiao, Y.; Su, X.; Chen, Z.; Cheng, Q.; Duan, F.; Chen, R.; Ma, Y. UAV-Based Multi-Sensor Data Fusion and Machine Learning Algorithm for Yield Prediction in Wheat. Precis. Agric. 2023, 24, 187–212.
28. Ahmed, A.A.M.; Sharma, E.; Jui, S.J.J.; Deo, R.C.; Nguyen-Huy, T.; Ali, M. Kernel Ridge Regression Hybrid Method for Wheat Yield Prediction with Satellite-Derived Predictors. Remote Sens. 2022, 14, 1136.
29. Wang, Y.; Shi, W.; Wen, T. Prediction of Winter Wheat Yield and Dry Matter in North China Plain Using Machine Learning Algorithms for Optimal Water and Nitrogen Application. Agric. Water Manag. 2023, 277, 108140.
30. Pang, A.; Chang, M.W.L.; Chen, Y. Evaluation of Random Forests (RF) for Regional and Local-Scale Wheat Yield Prediction in Southeast Australia. Sensors 2022, 22, 717.
31. Kumar, S.; Attri, S.D.; Singh, K.K. Comparison of Lasso and Stepwise Regression Technique for Wheat Yield Prediction. J. Agrometeorol. 2019, 21, 188–192.
32. Son, N.-T.; Chen, C.-F.; Cheng, Y.-S.; Toscano, P.; Chen, C.-R.; Chen, S.-L.; Tseng, K.-H.; Syu, C.-H.; Guo, H.-Y.; Zhang, Y.-T. Field-Scale Rice Yield Prediction from Sentinel-2 Monthly Image Composites Using Machine Learning Algorithms. Ecol. Inform. 2022, 69, 101618.
33. Zhou, W.; Liu, Y.; Ata-Ul-Karim, S.T.; Ge, Q.; Li, X.; Xiao, J. Integrating Climate and Satellite Remote Sensing Data for Predicting County-Level Wheat Yield in China Using Machine Learning Methods. Int. J. Appl. Earth Obs. Geoinf. 2022, 111, 102861.
34. Zhao, Y.; Xiao, D.; Bai, H.; Tang, J.; Liu, D.L.; Qi, Y.; Shen, Y. The Prediction of Wheat Yield in the North China Plain by Coupling Crop Model with Machine Learning Algorithms. Agriculture 2023, 13, 99.
35. Zhao, Y.; Zhang, Y.; Yang, Y.; Li, F.; Dai, R.; Li, J.; Wang, M.; Li, Z. The Impact of Land Use Structure Change on Utilization Performance in Henan Province, China. Int. J. Environ. Res. Public Health 2023, 20, 4251.
36. Huang, J.; Zhou, L.; Zhang, F.; Hu, Z.; Tian, H. Responses of Yield Variability of Summer Maize in Henan Province, North China, to Large-Scale Atmospheric Circulation Anomalies. Theor. Appl. Clim. 2021, 143, 1655–1665.
37. Xie, Y.; Shi, S.; Xun, L.; Wang, P. A Multitemporal Index for the Automatic Identification of Winter Wheat Based on Sentinel-2 Imagery Time Series. GIScience Remote Sens. 2023, 60, 2262833.
38. Wang, Y.; Zhang, Y.; Zhang, R.; Li, J.; Zhang, M.; Zhou, S.; Wang, Z. Reduced Irrigation Increases the Water Use Efficiency and Productivity of Winter Wheat-Summer Maize Rotation on the North China Plain. Sci. Total Environ. 2018, 618, 112–120.
39. National Bureau of Statistics of China. Data and Statistics. National Bureau of Statistics of China. Available online: https://data.stats.gov.cn (accessed on 1 October 2022).
40. Ronchetti, G.; Manfron, G.; Weissteiner, C.J.; Seguini, L.; Nisini Scacchiafichi, L.; Panarello, L.; Baruth, B. Remote Sensing Crop Group-Specific Indicators to Support Regional Yield Forecasting in Europe. Comput. Electron. Agric. 2023, 205, 107633.
41. Wang, X.; Huang, J.; Feng, Q.; Yin, D. Winter Wheat Yield Prediction at County Level and Uncertainty Analysis in Main Wheat-Producing Regions of China with Deep Learning Approaches. Remote Sens. 2020, 12, 1744.
42. Tian, L.; Wang, C.; Li, H.; Sun, H. Yield Prediction Model of Rice and Wheat Crops Based on Ecological Distance Algorithm. Environ. Technol. Innov. 2020, 20, 101132.
43. Khalaf, G.; Månsson, K.; Shukur, G. Modified Ridge Regression Estimators. Commun. Stat.-Theory Methods 2013, 42, 1476–1487.
44. Polat, K.; Guenes, S. A novel hybrid intelligent method based on C4.5 decision tree classifier and one-against-all approach for multi-class classification problems. Expert Syst. Appl. 2009, 36, 1587–1592.
45. Tian, H.; Cheng, L.; Wu, D.; Wei, Q.; Zhu, L. Regional Monitoring of Leaf Chlorophyll Content of Summer Maize by Integrating Multi-Source Remote Sensing Data. Agronomy 2023, 13, 2040.
46. Appelhans, T.; Mwangomo, E.; Hardy, D.R.; Hemp, A.; Nauss, T. Evaluating machine learning approaches for the interpolation of monthly air temperature at Mt. Kilimanjaro, Tanzania. Spat. Stat. 2015, 14, 91–113.
47. Song, J.; Zhang, L.; Jiang, Q.; Ma, Y.; Zhang, X.; Xue, G.; Shen, X.; Wu, X. Estimate the Daily Consumption of Natural Gas in District Heating System Based on a Hybrid Seasonal Decomposition and Temporal Convolutional Network Model. Appl. Energy 2022, 309, 118444.
48. Mienye, I.D.; Sun, Y. A Survey of Ensemble Learning: Concepts, Algorithms, Applications, and Prospects. IEEE Access 2022, 10, 99129–99149.
49. Sagi, O.; Rokach, L. Ensemble Learning: A Survey. WIREs Data Min. Knowl. Discov. 2018, 8, e1249.
50. Sankalpa, C.; Kittipiyakul, S.; Laitrakun, S. Forecasting Short-Term Electricity Load Using Validated Ensemble Learning. Energies 2022, 15, 8567.
51. Yulisa, A.; Park, S.H.; Choi, S.; Chairattanawat, C.; Hwang, S. Enhancement of Voting Regressor Algorithm on Predicting Total Ammonia Nitrogen Concentration in Fish Waste Anaerobiosis. Waste Biomass Valorization 2023, 14, 461–478.
52. Banfield, R.E.; Hall, L.O.; Bowyer, K.W.; Kegelmeyer, W.P. Ensemble Diversity Measures and Their Application to Thinning. Inf. Fusion 2005, 6, 49–62.
53. Hanicinec, M.; Mohr, S.; Tennyson, J. A Regression Model for Plasma Reaction Kinetics. J. Phys. D Appl. Phys. 2023, 56, 374001.
54. Phyo, P.-P.; Byun, Y.-C.; Park, N. Short-Term Energy Forecasting Using Machine-Learning-Based Ensemble Voting Regression. Symmetry 2022, 14, 160.
55. Natras, R.; Soja, B.; Schmidt, M. Ensemble Machine Learning of Random Forest, AdaBoost and XGBoost for Vertical Total Electron Content Forecasting. Remote Sens. 2022, 14, 3547.
56. Perros, N.; Kalivas, D.; Giovos, R. Spatial Analysis of Agronomic Data and UAV Imagery for Rice Yield Estimation. Agriculture 2021, 11, 809.
57. Sagan, V.; Maimaitijiang, M.; Bhadra, S.; Maimaitiyiming, M.; Brown, D.R.; Sidike, P.; Fritschi, F.B. Field-Scale Crop Yield Prediction Using Multi-Temporal WorldView-3 and PlanetScope Satellite Data and Deep Learning. ISPRS J. Photogramm. Remote Sens. 2021, 174, 265–281.
58. Xu, W.; Yang, W.; Chen, S.; Wu, C.; Chen, P.; Lan, Y. Establishing a Model to Predict the Single Boll Weight of Cotton in Northern Xinjiang by Using High Resolution UAV Remote Sensing Data. Comput. Electron. Agric. 2020, 179, 105762.

Figure 1. Workflow of winter wheat yield prediction.

Figure 2. Distribution map of land-use types in Henan Province in 2021.

Figure 3. Statistical graph of production data.

Figure 4. Overall architecture diagram of multi-model ensemble methods.

Figure 5. Comparison of model prediction accuracy for the entire growth period of winter wheat.

Figure 6. Scatter plots and residuals of the validation set models for the entire growth period. Subfigures (a–d) depict the scatter plots and residual information for ensemble voting, ridge, gradient boosting, and random forest models, respectively.

Figure 7. Scatter plots and residuals of the test set models for the entire growth period. Subfigures (a–d) depict the scatter plots and residual information for ensemble voting, ridge, gradient boosting, and random forest models, respectively.

Figure 8. Accuracy of single growth period models. Subfigures show heatmaps of (a) R² for the validation set, (b) R² for the test set, (c) RMSE for the validation set, (d) RMSE for the test set, (e) MAE for the validation set, and (f) MAE for the test set.

Figure 9. Scatter plots of the ensemble voting model's yield predictions during the heading stage. Subfigures show the scatter plot for (a) the validation set and (b) the test set.

Figure 10. Spatialization of winter wheat yield prediction in Henan Province, 2021.

Table 1. Phenological calendar of winter wheat in Henan Province.

Phenology	Emergence	Tillering	Overwintering	Green-Up	Jointing	Heading	Milk Ripening	Maturation
Time	Late September to late October	Early November to early December	Mid-December to mid-February	Mid-February to mid-March	Mid-March to early April	Mid-April to late April	May	Early June to late June

Table 2. Summary of winter wheat production forecast data.

Category	Variables	Spatial Resolution	Temporal Resolution	Sources
Statistical data	Yield	County-level	Yearly	Statistical Yearbook [39]
Statistical data	Wheat area	County-level	Yearly	Statistical Yearbook [39]
Vector data	Wheat	County-level	Yearly	Sentinel 2 Image Extraction
Vegetation index	EVI	500 m	8-day	MOD09A1
	NDVI	500 m	8-day	MOD09A1
	FPAR	500 m	8-day	MOD15A2H
	LST	1 km	8-day	MOD11A2
Ecological data	GPP	500 m	8-day	MOD17A2H
Ecological data	LAI	500 m	8-day	MOD15A2H
Hydrological data	ET	500 m	8-day	MOD16A2
Hydrological data	PET	500 m	8-day	MOD16A2

Table 3. Accuracy of 11 yield prediction models.

Models	Validation R²	Validation RMSE (kg/ha)	Validation MAE (kg/ha)	Test R²	Test RMSE (kg/ha)	Test MAE (kg/ha)
Elastic Net	0.52	933.28	767.95	0.52	930.86	754.47
Gradient Boosting	0.78	631.74	507.25	0.78	633.30	462.28
Random Forest	0.69	756.39	611.46	0.75	676.49	517.72
Ridge	0.86	509.50	389.66	0.79	609.80	475.07
Adaboost	0.61	848.46	725.36	0.57	878.83	731.44
KNeighbors	0.61	847.14	685.25	0.63	821.96	614.30
DecisionTree	0.55	904.67	628.91	0.49	954.56	700.81
ExtraTree	0.32	1116.44	812.83	0.38	1054.95	752.55
NuSVR	0.25	1172.38	1036.04	0.26	1156.47	1021.89
SVR	0.03	1332.69	1165.48	0.04	1316.57	1161.38
Ensemble Voting	0.90	439.21	351.28	0.90	424.44	313.92
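
For readers who want to reproduce the three scores in Table 3 on their own predictions, the sketch below computes R², RMSE, and MAE. It assumes the standard definitions (R² as one minus the ratio of residual to total sum of squares); the observed and predicted values are made-up numbers in kg/ha, not data from the study.

```python
import math

def regression_scores(y_true, y_pred):
    """Return (R², RMSE, MAE) for two equal-length sequences of numbers."""
    n = len(y_true)
    mean_true = sum(y_true) / n
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))  # residual sum of squares
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)          # total sum of squares
    r2 = 1.0 - ss_res / ss_tot
    rmse = math.sqrt(ss_res / n)
    mae = sum(abs(t - p) for t, p in zip(y_true, y_pred)) / n
    return r2, rmse, mae

# Hypothetical county yields (kg/ha) and model predictions.
observed = [5000.0, 6000.0, 7000.0, 8000.0]
predicted = [5200.0, 5900.0, 7300.0, 7800.0]
r2, rmse, mae = regression_scores(observed, predicted)
```

RMSE and MAE carry the units of the target (kg/ha here), while R² is unitless, which is why the table reports them side by side.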

Table 4. Accuracy of ablation experimental models.

Removed Model	Validation R²	Validation RMSE (kg/ha)	Validation MAE (kg/ha)	Test R²	Test RMSE (kg/ha)	Test MAE (kg/ha)
Gradient Boosting	0.82	569.00	458.29	0.86	506.82	386.23
Random Forest	0.88	477.62	381.62	0.88	474.53	349.19
Ridge	0.77	652.50	526.04	0.78	636.83	470.86
None (Initial Model)	0.90	439.21	351.28	0.90	424.44	313.92

© 2024 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
