Article

Soybean Lodging Classification and Yield Prediction Using Multimodal UAV Data Fusion and Deep Learning

1 College of Information and Technology, Jilin Agricultural University, Changchun 130118, China
2 College of Land Science and Technology, China Agricultural University, Beijing 100193, China
3 Keshan Branch of Heilongjiang Academy of Agricultural Sciences, Qiqihar 161000, China
4 Bayannur Modern Agriculture and Animal Husbandry Development Center, Bayannur 015000, China
5 School of Environment, Geography, and Sustainability, Western Michigan University, Kalamazoo, MI 49008, USA
6 Institute of Crop Science, Chinese Academy of Agricultural Sciences, Beijing 100081, China
* Author to whom correspondence should be addressed.
Remote Sens. 2025, 17(9), 1490; https://doi.org/10.3390/rs17091490
Submission received: 14 January 2025 / Revised: 3 March 2025 / Accepted: 17 April 2025 / Published: 23 April 2025

Abstract

UAV remote sensing is widely used in the agricultural sector due to its non-destructive, rapid, and cost-effective advantages. This study utilized two years of field data with multisource fused imagery of soybeans to evaluate lodging conditions and investigate the impact of lodging grade information on yield prediction. Unlike traditional approaches that build empirical lodging models from band reflectance, vegetation indices, and texture features, this research introduces a transfer learning framework that employs a ResNet18 encoder to extract features directly from raw images, bypassing complex manual feature extraction. To address the imbalance in the lodging dataset, the Synthetic Minority Over-sampling Technique (SMOTE) was applied in the feature space to balance the training set. The findings reveal that deep learning effectively extracts meaningful features from UAV imagery, outperforming traditional methods in lodging grade classification across all growth stages. At 65 days after emergence (DAE), lodging grade classification using ResNet18 features achieved the highest accuracy (accuracy = 0.76, recall = 0.76, F1 score = 0.73), significantly exceeding the performance of traditional methods. However, classification accuracy was relatively low in plots with higher lodging grades (lodging grades = 3, 5, 7), with an accuracy of 0.42 and an F1 score of 0.56. After applying the SMOTE module to balance the samples, the classification accuracy in plots with higher lodging grades improved to 0.65, an increase of 54.76%. To improve yield prediction accuracy, this study integrates lodging information with other features, such as canopy spectral reflectance, vegetation indices, and texture features, using two multimodal data fusion strategies: input-level fusion (ResNet-EF) and intermediate-level fusion (ResNet-MF). The results show that the intermediate-level fusion strategy consistently outperforms input-level fusion in yield prediction accuracy across all growth stages. Specifically, the intermediate-level fusion model incorporating measured lodging grade information achieved the highest prediction accuracy at 85 DAE (R2 = 0.65, RMSE = 529.56 kg/ha). Furthermore, when predicted lodging information was used, the model’s performance remained comparable to that with measured lodging grades, underscoring the critical role of lodging factors in enhancing yield estimation accuracy.

1. Introduction

Soybean is a significant food and oilseed crop, essential to global food production and agriculture [1,2]. During soybean growth, various environmental factors and cultivation practices can lead to lodging, which disrupts normal development and mechanized harvesting, ultimately reducing yield and quality. Currently, manual field surveys are the primary method for assessing crop lodging, but they are inefficient and inadequate for high-throughput crop breeding and production management. Therefore, there is an urgent need for a rapid, accurate, non-destructive, and quantitative method to monitor soybean lodging stress.
Recent advancements in low-altitude UAV monitoring technology have made it a vital tool for field-scale crop growth and lodging assessment. UAVs offer non-destructive, cost-effective monitoring with high spatial resolution, the ability to integrate imagery and spectral data, and the capacity to capture images under cloud cover [3,4]. By combining UAV imagery with machine learning, features such as canopy structure, vegetation indices, texture, spectral data, and temperature can be extracted and used as inputs for machine learning models, enabling high-throughput lodging classification [5,6,7,8]. Yang, Huang [9] integrated digital surface models (DSM), texture information, and spectral data with a decision tree classification model, achieving 96.17% accuracy in identifying rice lodging. Han, Yang [5] combined UAV imagery with logistic regression to predict the probability of maize lodging, emphasizing the significance of canopy structural features in distinguishing lodging at the plot scale. Rajapaksa, Eramian [10] applied texture features and support vector machines (SVM) to differentiate lodging in wheat and canola. However, studies using deep learning for lodging grade classification remain relatively scarce. In particular, the potential of deep learning models to directly utilize raw UAV imagery for field-scale lodging classification requires further investigation.
Existing studies on lodging classification primarily rely on visible light and multispectral sensors to identify lodging traits, while the potential of hyperspectral imagery remains underexplored. Wang [6] distinguished between lodged and non-lodged winter wheat using an unsupervised classification method applied to UAV RGB imagery. Dai [7] extracted information on lodged cotton from vegetation indices and texture details obtained from UAV multispectral imagery. Han [5] and Chu [8] identified lodged corn by extracting texture features, spectral characteristics, canopy structure, and topography from UAV-collected RGB and multispectral imagery. Moreover, the feature extraction processes in these methods are often complex, and the combination of various feature types can significantly influence the accuracy of classification models. Therefore, there is an urgent need for an automated feature extraction method to improve the effectiveness and consistency of lodging classification models.
In field experiments, the imbalance between lodging and non-lodging plots often compromises the accuracy and reliability of lodging classification models. SMOTE (Synthetic Minority Oversampling Technique) is a widely used data augmentation method that addresses class imbalance by generating synthetic samples between minority class samples to balance the class distribution. Han, Yang [11] combined the SMOTE-ENN preprocessing method with an XGBoost classifier to classify maize lodging severity using RGB images, achieving a classification accuracy of 94.5%. Sarkar, Zhou [12] demonstrated that SMOTE-ENN effectively distinguished soybean lodging traits across four classifiers, achieving a classification accuracy of up to 96%. Chemchem, Alin [13] applied undersampling and SMOTE techniques for wheat yield prediction and found that SMOTE significantly improved training scores and AUC-ROC values. However, the synthetic samples generated by SMOTE are typically based on manually selected features, such as vegetation indices, and their effectiveness in deep learning feature vector spaces remains an area that requires further exploration.
In crop breeding, evaluating the growth performance of various genotypes is essential for selecting those with the highest yield and stability. Considerable research has been conducted on crop yield estimation using UAV imagery. Zhu, Li [14] employed a UAV remote sensing platform equipped with a multispectral camera to capture image data of wheat at the tillering, heading, and filling stages. They developed linear models based on nine different vegetation indices and estimated yield using the least squares method. Gong, Duan [15] utilized a UAV remote sensing platform equipped with a six-band multispectral camera to develop a product model combining vegetation indices and leaf-related abundance. Their results showed that the product of the normalized difference vegetation index (NDVI) and short-petiole leaf abundance achieved the highest estimation accuracy for canola yield under varying nitrogen treatments, with estimation errors below 13%. Fei, Hassan [16] applied a multisensor UAV platform and machine learning algorithms for data fusion and ensemble learning, successfully predicting wheat yield, with the highest R2 value reaching 0.692. Yang [17] proposed a deep convolutional neural network model to predict rice yield at the maturation stage. This CNN model extracts key spatial features related to yield from high-resolution RGB images. Zhong [18] used statistical data as ground references, employing artificial neural networks and deep CNN models to map winter wheat automatically without the need for field samples. Hao [19] proposed a classification model, GL-CNN, based on convolutional neural networks (CNNs) to identify the optimal growth stage of leafy vegetables. While these studies primarily utilized spectral reflectance, texture, and other image-derived features for yield estimation, they overlooked the impact of lodging on crop yield prediction.
Karmakar [20] investigated the application of multimodal remote sensing (MRS) techniques in crop monitoring, highlighting that the integration of data from various remote sensing devices enables more accurate and comprehensive analysis of crop conditions, thus optimizing decision making and enhancing crop yields. Maimaitijiang [21] assessed a UAV-based multimodal data fusion method, employing a deep neural network (DNN) framework to integrate RGB, multispectral, and thermal sensor data for accurate soybean grain yield prediction. The results demonstrate that multimodal data fusion significantly improves prediction accuracy. Zhang [22] performed multistage phenotypic analysis of soybeans using multimodal UAV sensor data. The study findings indicate that the fusion of multimodal data markedly enhanced the accuracy of the XGBoost model in predicting the Leaf Area Index (LAI), exhibiting particularly strong performance across various growth stages. The type and severity of lodging in crops significantly affect yield [23]. Fischer and Stapper [24] demonstrated that lodging with a 45° stem inclination resulted in smaller yield losses compared to lodging at an 80° inclination. Similarly, Kendall, Holmes [25] quantified canola yield losses due to lodging, showing that lodging to 90° during the flowering period led to a 46% yield reduction, while lodging at 45° resulted in a yield decrease of approximately 20%. Berry and Spink [26] found that when wheat lodging occurs at a 90° angle to the vertical, it resulted in a yield reduction of approximately 61%. Therefore, integrating lodging severity monitoring data with traditional image-derived features is expected to enhance yield prediction accuracy. The main objectives of this study are as follows: (1) to propose a novel deep learning framework for automatically extracting fused features from visible and hyperspectral images to accurately classify soybean lodging severity; (2) to assess the impact of incorporating the SMOTE module into the framework on lodging classification accuracy; and (3) to determine the optimal data fusion strategy for yield estimation and evaluate the potential contribution of lodging information to yield prediction.

2. Materials and Methods

2.1. Experimental Area Profile

The experiment was performed in the two consecutive years of 2022 and 2023 at the Keshan Branch of the Heilongjiang Academy of Agricultural Sciences, located in Qiqihar City, Heilongjiang Province, China (48°01′21.7″N, 125°51′06.5″E) (Figure 1). Keshan County is situated in a typical continental monsoon climate zone, characterized by strong monsoon influences, abundant seasonal rainfall, and ample sunlight throughout the year. The average temperature during the soybean growing season is 16 °C, with precipitation primarily occurring between June and August, totaling approximately 380 mm. This region is part of Heilongjiang Province’s fertile black soil zone, known for its rich soil and favorable physical and chemical properties, making it a significant grain production base.
The experiments involved 263 soybean varieties in 2022 and 251 in 2023, with two replications per variety in each year. The soybean varieties used in this study are primarily sourced from Northeast China, including several representative varieties that are adapted to cold climates and varying growth periods. The selection of these varieties ensures the representativeness of the dataset, encompassing varieties with different maturity periods, leaf types, and growth characteristics. This enables a more comprehensive evaluation of the lodging severity classification model’s applicability. Approximately 92% of the varieties had a maturity period of 110 days, while the rest matured within 115 to 120 days. Each experimental plot consisted of three rows, each 2 m in length, with a row spacing of 0.2 m. All plots were harvested on 10 October 2022 and 10 October 2023, ensuring that all varieties had reached maturity more than 130 days after emergence (DAE). Field management practices, including irrigation, fertilization, and pest control, followed local agricultural guidelines. After harvest, soybean grains from each plot were cleaned to remove impurities and weighed, with the results recorded as kilograms per hectare.

2.2. Data Collection

2.2.1. UAV Image Acquisition and Preprocessing

The vertical RGB images were captured using a DJI Phantom 4RTK UAV (DJI Innovations, Shenzhen, China) equipped with an FC6310R camera, which has a resolution of 5472 × 3648 and an effective pixel count of 20 million. The UAV operated at an altitude of 30 m, with 80% forward and lateral overlap. The images were stitched using Agisoft PhotoScan Professional (v1.4.5, Agisoft LLC, St. Petersburg, Russia). Following photo alignment and generation of a dense point cloud, a digital elevation model (DEM) and a digital orthophoto map were produced.
The hyperspectral images of soybeans were captured using a UAV hyperspectral imaging system, which included the DJI Matrice 600 Pro hexacopter (DJI Innovations, Shenzhen, China), a gimbal, a microcomputer, and the UHD 185 hyperspectral imaging spectrometer (Cubert GmbH, Ulm, Germany). The UHD 185 is a full-frame, non-scanning, real-time airborne hyperspectral system with a spectral range of 450 to 950 nm, a spectral sampling interval of 4 nm, and coverage of 125 spectral bands [27]. The UAV was operated between 12:00 and 14:00 on clear, cloudless days, with calibration conducted using a standard whiteboard before each flight. The flight altitude was maintained at 30 m, with 80% forward and lateral overlap, and images were acquired at 1-millisecond intervals. The raw images had a resolution of 1000 × 1000 pixels. The images captured by the UHD 185 included both full-color photos (.jpg) and hyperspectral cubes (.cub). These images were then stitched and geometrically corrected using Cube-Pilot (Cubert GmbH, Ulm, Germany) and Agisoft PhotoScan software [28], resulting in the final hyperspectral images of the study area.
UAV RGB and hyperspectral images were acquired on the 55th, 65th, 76th, 85th, and 95th days following soybean emergence during the 2022 and 2023 growing seasons.

2.2.2. Ground Data Collection

The ground measurement data include yield (kg/ha) and lodging grade for each variety. Lodging grades are assessed and determined in the field by breeders, with all plots evaluated under a standardized protocol to minimize subjective errors. The manual assessment of lodging grade is conducted at 55 DAE, as this time point is essential for ensuring a thorough and consistent evaluation. By 55 DAE, all lodging events have fully manifested in the experimental fields, and the soybean plants have transitioned from the rapid growth phase to a relatively stable developmental stage. Lodging grade classification is determined not only by the plant tilt angle but also by the extent of the lodging area within the plot. Specifically, during the assessment, both the tilt angle and the size of the lodged area within each plot are analyzed to ensure a more comprehensive and accurate classification. An upright plant (stem essentially vertical) corresponds to a grade of 1, indicating no lodging; inclination angles of 60°, 45°, and 30° correspond to lodging grades of 3, 5, and 7, respectively.
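For illustration only, this grading rule can be written as a threshold function; the cut-off angles between adjacent grades are our own assumption (the protocol specifies only the nominal angles), and the lodged-area criterion is omitted:

```python
def lodging_grade(inclination_deg: float) -> int:
    """Map a plant inclination angle (degrees from the ground) to a grade.

    Grade 1 is upright (~90 deg); grades 3, 5, and 7 correspond to nominal
    inclinations of 60, 45, and 30 degrees. The midpoint cut-offs below are
    illustrative; field grading also weighs the lodged area in the plot.
    """
    if inclination_deg >= 75.0:
        return 1   # effectively vertical, no lodging
    elif inclination_deg >= 52.5:
        return 3   # around 60 degrees
    elif inclination_deg >= 37.5:
        return 5   # around 45 degrees
    return 7       # around 30 degrees or flatter
```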
After harvest, soybean grains from each plot were thoroughly cleaned to remove debris and foreign matter, ensuring accurate weight measurements. Subsequently, the cleaned grains were weighed using a standardized method, with results expressed in kilograms per hectare to record actual yield data. To reduce experimental error, each plot’s grain weight was measured three times, and the average was calculated as the final result.

2.3. Feature Extraction

2.3.1. Construction of Vegetation Indices

Research has demonstrated that spectral indices derived from the DN values of RGB images can effectively estimate crop phenotypic traits [29]. Table 1 lists 15 vegetation indices selected from RGB vertical images and point clouds, along with 18 indices derived from hyperspectral images. These indices are commonly used to assess plant health [30] and predict crop yield [21]. After eliminating the soil background, the average reflectance values of the 125 hyperspectral bands are used as canopy spectral features.
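As an illustration of how such indices are derived from RGB digital numbers, the sketch below computes the widely used Excess Green index (ExG = 2g − r − b); whether ExG is among the 15 indices in Table 1 is not shown here, so treat it as a representative example:

```python
import numpy as np

def excess_green(rgb: np.ndarray) -> np.ndarray:
    """Excess Green index from an (H, W, 3) array of RGB digital numbers.

    Channels are first chromatically normalized so r + g + b = 1 per pixel,
    the usual preprocessing for RGB vegetation indices.
    """
    rgb = rgb.astype(np.float64)
    total = rgb.sum(axis=2, keepdims=True)
    total[total == 0] = 1.0                 # guard dark pixels
    r, g, b = np.moveaxis(rgb / total, 2, 0)
    return 2 * g - r - b

# A plot-level feature is then the index averaged over vegetation pixels,
# e.g. excess_green(img)[mask].mean() with a soil-free mask.
```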

2.3.2. First-Order Differential

The differential transformation of spectral reflectance helps reduce the influence of low-frequency noise on the target spectrum. The first-order differential (FOD) of the original spectrum was calculated using MATLAB (v2016a, MathWorks, Natick, MA, USA) to analyze the variations in the FOD of soybean under different lodging stresses. The spectral reflectance difference is commonly used as a finite approximation of the derivative, as expressed below:
$$D_{\lambda_i}\ (\mathrm{nm}^{-1}) = \frac{R_{\lambda_{i+1}} - R_{\lambda_{i-1}}}{\lambda_{i+1} - \lambda_{i-1}}$$

In this equation, $\lambda_i$ represents the wavelength of the i-th band, $D_{\lambda_i}$ denotes the FOD at $\lambda_i$, and $R_{\lambda_{i+1}}$ and $R_{\lambda_{i-1}}$ refer to the spectral reflectance at wavelengths $\lambda_{i+1}$ and $\lambda_{i-1}$, respectively.
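A minimal NumPy implementation of this central-difference FOD, applied band-wise to spectra on the UHD 185’s 125-band grid (450–946 nm at 4 nm spacing), might look like this:

```python
import numpy as np

def first_order_differential(reflectance: np.ndarray,
                             wavelengths: np.ndarray) -> np.ndarray:
    """D(lambda_i) = (R(lambda_{i+1}) - R(lambda_{i-1}))
                     / (lambda_{i+1} - lambda_{i-1}) for interior bands.

    `reflectance` has shape (n_samples, n_bands); the two edge bands are
    dropped, so the output has shape (n_samples, n_bands - 2).
    """
    num = reflectance[:, 2:] - reflectance[:, :-2]
    den = wavelengths[2:] - wavelengths[:-2]
    return num / den

wl = np.linspace(450, 946, 125)              # 4 nm band centers
spectra = np.random.rand(10, 125)            # placeholder canopy spectra
fod = first_order_differential(spectra, wl)  # shape (10, 123)
```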

2.4. Soybean Lodging Grade Classification Based on RGB and Hyperspectral Images with SMOTE-ResNet

2.4.1. Representation Learning and Feature Fusion Based on ResNet18

Convolutional neural networks (CNNs) generally demonstrate improved performance with increased depth. For instance, widely used deep networks, such as AlexNet [58], VGG-Net [59], and GoogLeNet [60], feature 5, 19, and 22 convolutional layers, respectively. However, research has highlighted the challenges of training very deep networks due to the “vanishing gradient problem” [61,62]. This issue occurs when gradients diminish progressively during backpropagation, ultimately becoming negligible. As a result, as network depth increases, performance may plateau or even decline rapidly [61,63].
Deep Residual Networks (or ResNet) are a residual learning framework designed to facilitate the training of deep networks [62]. ResNet mitigates the vanishing gradient problem by incorporating identity “skip connections”, which allow gradients to propagate to shallower layers during backpropagation without diminishing. As a result, ResNet enables the training of networks with depths of up to thousands of layers [62].
Figure 2a illustrates the ResNet network architecture used for lodging grade assessment. Each residual module consists of three convolutional layers (Conv), each paired with a batch normalization layer [64] and a ReLU activation function [65]. The kernel sizes of the three convolutional layers are 1 × 1, 3 × 3, and 1 × 1: the first 1 × 1 convolution reduces the feature dimension, and the second restores the original dimension. The conv1 layer spatially downsamples the input with a stride of 2 and is followed by a 3 × 3 max pooling (MaxPool) layer with stride 2 before conv2. Finally, the input RGB and hyperspectral images are each processed through fully connected layers, yielding output feature vectors of dimension 64. ResNet18 was chosen as the encoder because its residual connections mitigate the vanishing gradient problem and ensure more stable training of deep models, while its depth strikes a good balance between computational cost and model performance, making it well suited to the lodging classification task. RGB images, with their higher spatial resolution, provide more accurate information on soybean shape, texture, and structure, whereas hyperspectral images offer richer spectral data capable of detecting the subtle spectral differences associated with varying lodging grades. To exploit the advantages of both, the RGB and hyperspectral feature vectors encoded by ResNet are fused, maximizing the complementarity of spatial and spectral information and enhancing the model’s robustness and classification performance across lodging grades.
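A minimal PyTorch sketch of this dual-encoder design is given below, under stated assumptions: both branches use torchvision’s stock ResNet18, each final fully connected layer is replaced by a 64-unit projection, and the first convolution of the hyperspectral branch is widened to accept 125 bands (the paper does not specify how the hyperspectral cube enters the network, so that detail and all identifier names are ours).

```python
import torch
import torch.nn as nn
from torchvision.models import resnet18

class DualResNetEncoder(nn.Module):
    """Sketch of the dual-branch encoder: one ResNet18 for RGB, one for
    hyperspectral cubes, each projecting to a 64-d vector; concatenation
    yields the 128-d fused representation used downstream."""

    def __init__(self, n_bands: int = 125, n_classes: int = 4):
        super().__init__()
        self.rgb = resnet18(weights=None)
        self.rgb.fc = nn.Linear(self.rgb.fc.in_features, 64)

        self.hsi = resnet18(weights=None)
        # Assumed entry point for the 125-band cube: a widened first conv.
        self.hsi.conv1 = nn.Conv2d(n_bands, 64, kernel_size=7,
                                   stride=2, padding=3, bias=False)
        self.hsi.fc = nn.Linear(self.hsi.fc.in_features, 64)

        self.classifier = nn.Linear(128, n_classes)  # grades 1/3/5/7

    def forward(self, rgb_img, hsi_cube):
        fused = torch.cat([self.rgb(rgb_img), self.hsi(hsi_cube)], dim=1)
        return self.classifier(fused), fused  # logits and 128-d features

model = DualResNetEncoder()
logits, feats = model(torch.randn(2, 3, 224, 224),
                      torch.randn(2, 125, 224, 224))
```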
The Cross-Entropy Loss function was used during training, which performs excellently in multiclass classification problems. The formula is as follows:
$$\mathrm{Loss} = -\sum_{i=1}^{C} y_i \log(p_i)$$

where $C$ represents the number of classes, $y_i$ is the true label of the sample, and $p_i$ is the probability predicted by the model for that class. This loss function encourages the model to learn more accurate lodging grade features by minimizing the difference between the predicted class and the true label.
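In practice, this loss is computed from raw network outputs rather than explicit probabilities. A minimal PyTorch sketch (our framework choice; the paper does not name one) showing that a library cross-entropy call matches the formula above:

```python
import torch
import torch.nn as nn

criterion = nn.CrossEntropyLoss()         # applies log-softmax internally
logits = torch.randn(8, 4)                # raw scores for 4 lodging grades
labels = torch.randint(0, 4, (8,))        # integer class indices (0..3)
loss = criterion(logits, labels)

# Equivalent to -sum_i y_i * log(p_i) averaged over the batch, with y the
# one-hot true label and p = softmax(logits):
manual = -torch.log_softmax(logits, dim=1)[torch.arange(8), labels].mean()
assert torch.allclose(loss, manual)
```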

2.4.2. Category Balancing of Feature Vectors Based on the SMOTE Strategy

Sample imbalance in machine learning tasks often reduces a model’s ability to accurately predict specific classes, especially in classification problems. Class imbalance is a well-known challenge for machine learning algorithms in developing high-performance classifiers [66]. The Synthetic Minority Over-sampling Technique (SMOTE) has proven effective as a preprocessing method for handling imbalanced datasets, enhancing the performance of classification models. In this study, SMOTE was applied during the training phase, before the data were fed into the deep learning model, to generate synthetic samples and prevent the model from exhibiting bias towards the majority class, thereby enabling classification of all lodging grades. The number of nearest neighbors was set to 5, and the sampling ratio was configured at 1:1, meaning that each minority class was oversampled until it matched the majority class in size. This setup balances the dataset and mitigates the model’s bias towards the majority class.
After encoding and fusion through ResNet, the RGB and hyperspectral images are transformed into 128-dimensional feature vectors, capturing the essential information from the original images. Figure 3 illustrates the distribution of soybean plots across various lodging levels, revealing that non-lodging plots significantly outnumbered lodging plots. To address this imbalance in lodging samples, this study integrates the SMOTE module into the network framework. To prevent information leakage, SMOTE is applied exclusively to the training set, so that no test set information is introduced during model training and the validity of the evaluation results is preserved. The module generates synthetic samples for minority class feature vectors, improving the classifier’s performance on a balanced dataset (Figure 2b). SMOTE operates by randomly selecting minority class samples in the feature space and generating synthetic samples by interpolating between each selected sample and its nearest minority class neighbors. This approach effectively increases the representation of the minority class while maintaining the diversity of the feature vectors. The resulting balanced feature vectors are then used to train the machine learning model for lodging classification.
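A minimal sketch of this balancing step, using imbalanced-learn’s SMOTE on stand-in 128-d feature vectors with the five nearest neighbors and 1:1 ratio described above; the split-before-oversampling order mirrors the leakage precaution just stated (all data here are placeholders):

```python
from collections import Counter
import numpy as np
from imblearn.over_sampling import SMOTE
from sklearn.model_selection import train_test_split

# Placeholder data: 300 plots x 128-d fused ResNet features, with
# imbalanced lodging grade counts (grade 1 dominates).
X = np.random.rand(300, 128)
y = np.repeat([1, 3, 5, 7], [180, 60, 40, 20])

# Split first, then oversample the training set only, so no synthetic
# sample is derived from (or leaks into) the test set.
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3,
                                          stratify=y, random_state=0)
smote = SMOTE(k_neighbors=5, sampling_strategy='auto', random_state=0)
X_bal, y_bal = smote.fit_resample(X_tr, y_tr)
print(Counter(y_tr), '->', Counter(y_bal))  # minority classes now match
```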

2.4.3. Lodging Classification and Performance Evaluation

To fairly compare the effectiveness of the deep learning-extracted features with that of traditional handcrafted features, we input the automatically extracted feature vectors into traditional machine learning models, such as Random Forest (RF) and Support Vector Machine (SVM), for classification tasks. RF is an ensemble learning method that builds multiple decision trees and combines their outputs for classification. This approach is highly effective for handling large-scale datasets and offers feature importance evaluation during feature selection. Furthermore, the ensemble structure of RF helps mitigate the risk of overfitting commonly associated with individual decision trees, particularly in situations involving noise or class imbalance, thereby enhancing the accuracy of lodging grade predictions. SVM classification operates by identifying the optimal separating hyperplane within the feature space. For lodging grade prediction, SVM effectively manages complex feature spaces and delivers high-accuracy classification results. The fundamental principle of SVM is to find the hyperplane that best separates samples from different classes in a high-dimensional space, ensuring accurate classification. Both classifiers are computationally efficient and widely adopted in machine learning, making them ideal for a fair comparison with deep learning-extracted features.
A random sampling method was employed to divide the dataset, allocating 70% of the samples to the training set and 30% to the testing set. The model’s classification performance was evaluated using accuracy (ACC), recall, and the F1 score. Accuracy measures the proportion of correctly predicted samples out of the total, reflecting the model’s overall effectiveness. Recall quantifies the proportion of true positive samples correctly identified by the model. It is especially crucial in imbalanced datasets, as it assesses the model’s ability to detect minority-class samples. A higher recall indicates that the model is more effective in identifying positive samples, thereby reducing the risk of false negatives. The F1 score, the harmonic mean of precision and recall, is particularly effective for imbalanced datasets. By incorporating penalties for misclassifications, the F1 score provides a more accurate assessment of the model’s ability to detect minority classes. The formulas for calculating accuracy, recall, and the F1 score are presented below:
$$\mathrm{Accuracy} = \frac{TP + TN}{TP + FN + TN + FP}$$

$$\mathrm{Recall} = \frac{TP}{TP + FN}$$

$$F1\ \mathrm{score} = \frac{2\,TP}{2\,TP + FN + FP}$$
TP represents the number of samples correctly classified as the positive class by the model, while TN denotes the number of samples correctly classified as the negative class. FP refers to the number of samples incorrectly classified as the positive class, and FN indicates the number of samples incorrectly classified as the negative class.
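Continuing the sketch above (reusing X_bal, y_bal, X_te, and y_te from the SMOTE step), the balanced feature vectors can be fed to RF and SVM and scored with these metrics; the weighted multiclass averaging is our assumption, as the text does not state an averaging mode:

```python
from sklearn.ensemble import RandomForestClassifier
from sklearn.svm import SVC
from sklearn.metrics import accuracy_score, recall_score, f1_score

for name, clf in [('RF', RandomForestClassifier(random_state=0)),
                  ('SVM', SVC(kernel='rbf'))]:
    clf.fit(X_bal, y_bal)                  # train on SMOTE-balanced features
    pred = clf.predict(X_te)               # evaluate on untouched test set
    print(name,
          'ACC=%.2f' % accuracy_score(y_te, pred),
          'recall=%.2f' % recall_score(y_te, pred, average='weighted'),
          'F1=%.2f' % f1_score(y_te, pred, average='weighted'))
```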

2.5. The Construction and Validation of the Yield Estimation Model

Lodging not only reduces grain yield but also results in various negative consequences, including lower grain quality, higher drying costs, and delays in harvest progress. Figure 4 shows that for lodging grades 1 and 3, the median and mean yields are close, suggesting minimal yield variation at grade 3. However, at lodging grade 5, both the median and mean yields decrease significantly, and at grade 7, the yield further declines, indicating a significant negative correlation between lodging and yield loss. The yield distribution patterns vary notably across lodging grades: low lodging grades (1 and 3) exhibit more concentrated yield distributions, reflected in the wider violin plots, while high lodging grades (5 and 7) show greater dispersion, suggesting substantial variability in yield responses among different varieties under severe lodging conditions. To explore the potential of incorporating lodging grade information into crop yield prediction, this study employs SMOTE-ResNet to assess the spatial distribution and severity of crop lodging. These factors, along with canopy spectral reflectance, vegetation indices, texture, and other features, are incorporated as key variables in the yield prediction model.
This study employs a multimodal data fusion approach, wherein spectral reflectance features, texture matrices, and lodging grade labels are standardized and preprocessed before being uniformly converted into single-band grayscale images for model input. Specifically, each feature matrix is normalized using min–max scaling, linearly mapping values to the pixel range of [0, 255], and subsequently resampled to a spatial resolution of 224 × 224 pixels via bicubic interpolation. The rigorous alignment of spatial dimensions ensures geometric consistency when concatenating feature maps along the channel dimension, enabling the ResNet network to process multiple feature spaces in parallel. Multimodal data fusion can be performed at the data level (early fusion), feature level (mid fusion), or decision level (late fusion). In the ResNet framework, feature-level fusion utilizes both input-level and intermediate-level features. This study employs these two fusion strategies to integrate multimodal information for yield prediction (Figure 5). The input-level fusion structure, illustrated in Figure 5a, combines spectral, texture, and lodging grade data into a high-dimensional feature vector, which is subsequently processed by the ResNet model. Figure 5b presents the intermediate-layer feature fusion structure, which leverages a self-attention mechanism. Features from each modality are independently fed into parallel ResNet encoders, and the encoded outputs produce multiple feature vectors. Given the limited number of soybean samples, the high redundancy of features, and the substantial spatial and spectral semantic information, these features significantly influence yield prediction performance. To aggregate these feature vectors into a unified feature space, while enabling the network to focus on key information and maintain a global perspective, we propose a dual self-attention fusion module. This module comprises a spatial attention module and a band attention module. The spatial attention module captures spatial dependencies by emphasizing relationships between different spatial regions, while the band attention module enhances the modeling of correlations between spectral bands, optimizing multimodal data fusion and feature extraction. The fused feature vector is then passed through a multilayer perceptron (MLP) to perform the final yield prediction, ensuring the integrated features are effectively utilized for accurate outcomes.
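The intermediate-level fusion idea can be sketched in much-reduced form: parallel single-band ResNet18 encoders produce one 64-d token per modality, a multi-head self-attention layer stands in for the dual spatial/band attention described above (which operates on full feature maps, not shown here), and an MLP head regresses yield. All dimensions and identifier names are illustrative assumptions.

```python
import torch
import torch.nn as nn
from torchvision.models import resnet18

def make_encoder(in_ch: int) -> nn.Module:
    """One ResNet18 branch per modality: a single-band 224x224 feature map
    (spectral, texture, or lodging-grade raster) in, a 64-d vector out."""
    net = resnet18(weights=None)
    net.conv1 = nn.Conv2d(in_ch, 64, kernel_size=7,
                          stride=2, padding=3, bias=False)
    net.fc = nn.Linear(net.fc.in_features, 64)
    return net

class MidFusionYield(nn.Module):
    """Reduced sketch of ResNet-MF: parallel encoders, self-attention over
    the modality tokens, and an MLP regression head for yield (kg/ha)."""

    def __init__(self, n_modalities: int = 3):
        super().__init__()
        self.encoders = nn.ModuleList(
            [make_encoder(1) for _ in range(n_modalities)])
        self.attn = nn.MultiheadAttention(embed_dim=64, num_heads=4,
                                          batch_first=True)
        self.head = nn.Sequential(nn.Linear(64 * n_modalities, 64),
                                  nn.ReLU(),
                                  nn.Linear(64, 1))

    def forward(self, modality_maps):        # list of (B, 1, 224, 224)
        tokens = torch.stack([enc(x) for enc, x
                              in zip(self.encoders, modality_maps)], dim=1)
        fused, _ = self.attn(tokens, tokens, tokens)   # (B, M, 64)
        return self.head(fused.flatten(1)).squeeze(1)  # predicted yield

model = MidFusionYield()
maps = [torch.randn(2, 1, 224, 224) for _ in range(3)]
yield_pred = model(maps)                     # shape (2,)
```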
The model’s estimation accuracy is evaluated using the coefficient of determination (R2) and root mean square error (RMSE). The formulas for these calculations are as follows [67]:
$$R^2 = 1 - \frac{\sum_{i=1}^{n} (y_i - \hat{y}_i)^2}{\sum_{i=1}^{n} (y_i - \bar{y})^2}$$

$$\mathrm{RMSE} = \sqrt{\frac{\sum_{i=1}^{n} (y_i - \hat{y}_i)^2}{n}}$$

Here, $y_i$ represents the measured yield, $\hat{y}_i$ the predicted yield, $\bar{y}$ the average measured yield, and $n$ the number of plots.
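A minimal NumPy helper implementing the two formulas exactly as written:

```python
import numpy as np

def r2_rmse(y_true: np.ndarray, y_pred: np.ndarray):
    """Coefficient of determination and root mean square error."""
    ss_res = np.sum((y_true - y_pred) ** 2)        # residual sum of squares
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)
    r2 = 1.0 - ss_res / ss_tot
    rmse = np.sqrt(ss_res / y_true.size)
    return r2, rmse

# e.g. r2, rmse = r2_rmse(measured_kg_ha, predicted_kg_ha)
```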

3. Results

3.1. Spectral Changes in Soybean Canopy Under Lodging Stress

Figure 6a presents the Pearson correlation coefficients between soybean lodging and various spectral wavelengths, demonstrating the effect of lodging on spectral reflectance across different growth stages. During the early stages (55 DAE and 65 DAE), lodging showed a strong correlation with the green and red spectral bands (502–626 nm), with Pearson correlation coefficients ranging from 0.28 to 0.47 for the green band and from 0.15 to 0.36 for the red band. As the growth stages advanced, during the mid-to-late stages (76 DAE and 85 DAE), the correlation coefficients between lodging and the near-infrared spectral band (710–946 nm) ranged from 0.34 to 0.40 and from 0.31 to 0.39, respectively. At the maturity stage (95 DAE), the correlation coefficients between lodging and near-infrared reflectance ranged from 0.23 to 0.32. These findings suggested that the near-infrared spectral band was highly sensitive and held significant potential for lodging grade classification, providing valuable spectral information for precise lodging monitoring and assessment.
The analysis following FOD processing (Figure 6b) revealed the dynamic relationship between lodging and spectral characteristics. Compared to spectral reflectance, most band differentials showed significantly lower correlations with lodging, although some still exhibited relatively high correlations. For instance, during the 55 DAE and 65 DAE stages, the differentials in the green spectral band (498–542 nm) and the red-edge band (662–770 nm) demonstrated notably higher correlation coefficients than other bands, ranging from 0.28 to 0.51 and from 0.25 to 0.52, respectively. During the mid-to-late stages (76 DAE and 85 DAE) and the maturity stage (95 DAE) of soybean growth, the differentials in the red, red-edge, and near-infrared spectral bands exhibited substantially higher correlations compared to other band differentials, with correlation coefficients reaching up to 0.42. These findings suggested that the reflectance information from these bands held significant value in lodging analysis, while the rate of change between bands also provided meaningful supplementary information.

3.2. Lodging Classification

Figure 7 compares the lodging classification accuracy of two models using various inputs, including original reflectance and vegetation indices, FOD, ResNet feature vectors, and the fusion of all features. When a single feature type was used, the ResNet feature vector yielded the highest performance, with an accuracy range of 0.66–0.76, a recall range of 0.62–0.76, and an F1-score range of 0.60–0.73. Notably, at 65 DAE, when RF was used as the classifier, the ResNet feature vector achieved the highest prediction accuracy (ACC = 0.76, recall = 0.76, F1 score = 0.73). In contrast, spectral features and FOD showed suboptimal performance, with F1 scores ranging from 0.55 to 0.64 and from 0.56 to 0.63, respectively. When all features were used as input for lodging classification, the model’s F1 score ranged from 0.59 to 0.72, with the highest classification accuracy achieved at 55 DAE (ACC = 0.74, recall = 0.74, F1 score = 0.72). At 55 DAE, the classification accuracy using all features surpassed that of the ResNet feature vector alone. Overall, while the ResNet feature vector provided the best performance when using a single feature, combining all features improved classification accuracy during the early growth stages, demonstrating that multisource feature fusion enhanced lodging classification performance.

3.3. Sample Balancing Strategy Based on the SMOTE Module

Under natural conditions, lodging plots were significantly fewer than non-lodging plots, creating a pronounced class imbalance in the dataset that complicated classifier development. The experimental results of this study demonstrated that integrating SMOTE significantly improved the classification performance of the ResNet model across all growth stages (Table 2). SMOTE-ResNet consistently outperformed ResNet in terms of accuracy, recall, and F1 score. Although the magnitude of improvement varied across growth stages, the overall trend remained positive, indicating that SMOTE enhanced the model’s ability to identify minority-class samples by increasing their representation through oversampling. This enhancement not only strengthened classification performance but also improved model stability, leading to more reliable predictions.
Figure 8 presents the confusion matrix results for different classification methods on the optimal lodging prediction date (65 DAE). When using original features for lodging prediction, the highest classification accuracy was observed for plots without lodging (lodging grade = 1), with 129 samples correctly predicted. However, a notable misclassification issue occurred for lodging-affected plots (lodging grades = 3, 5, 7), with only 11 out of 84 samples correctly identified. The ResNet model achieved strong performance on plots without lodging, correctly predicting 137 instances, while also demonstrating significant improvement in reducing misclassification for lodging-affected plots. Specifically, 20 out of 52 plots with a lodging grade of 3 were correctly classified. For the 32 plots with lodging grades of 5 and 7, 15 were correctly identified, while the remainder were misclassified as non-lodging. After incorporating the SMOTE module, classification accuracy improved substantially, with correct classifications for lodging grade 3 increasing to 35. Similarly, the prediction accuracy for lodging grades 5 and 7 increased from 7 and 8 correct classifications to 11 and 9, respectively. Although accuracy for non-lodging plots slightly declined, accuracy for lodging-affected plots improved by 57.14%, resulting in a more stable classification accuracy curve. Overall, the inclusion of the SMOTE module significantly enhanced the ResNet model’s detection capability, effectively addressing the sample imbalance issue and markedly improving prediction accuracy for lodging severity levels.
Figure 9 displays the field soybean lodging conditions, comparing the measured data with the predictions from RF and SMOTE-ResNet. It can be seen from Figure 9 that both models predicted lodging distribution patterns that are highly correlated with soybean genotype. However, the classification results from the SMOTE-ResNet model more closely matched the actual measured lodging conditions, demonstrating greater accuracy. In contrast, the lodging classification results from RF showed less consistency with the observed lodging levels.

3.4. Yield Estimation Optimization Based on Lodging Information

The yield estimation results for later growth stages, based on image-derived features, such as vegetation indices, texture, and lodging grade information, are presented in Table 3. Multimodal fusion models were employed for modeling analysis, combining different lodging information at both the input level (ResNet-EF) and the intermediate level (ResNet-MF). The results demonstrated that all models achieved the highest estimation accuracy at 85 DAE (Figure 10). When feature fusion was used as the sole input, ResNet-MF consistently outperformed ResNet-EF at all growth stages. Specifically, the highest estimation accuracy for ResNet-MF was R2 = 0.62, RMSE = 547.18 kg/ha, while for ResNet-EF, the highest was R2 = 0.59, RMSE = 564.97 kg/ha. These findings suggested that the intermediate-level multimodal fusion strategy is more effective in capturing nonlinear relationships between features, thereby improving yield estimation accuracy.
When the model incorporated measured lodging grades as input, prediction accuracy improved significantly. The highest prediction accuracy for ResNet-MF was R2 = 0.65 and RMSE = 529.56 kg/ha, while ResNet-EF achieved R2 = 0.63 and RMSE = 541.25 kg/ha. When predicted lodging information was used as input, the model’s performance was comparable to that with measured lodging grades, with ResNet-MF achieving R2 = 0.63 and RMSE = 539.75 kg/ha, and ResNet-EF achieving R2 = 0.60 and RMSE = 559.04 kg/ha. Compared to models using only feature fusion as input, models incorporating predicted lodging information demonstrated improved yield estimation accuracy.

4. Discussion

4.1. The Impact of Hyperspectral Wavelengths on Lodging Classification

Accurate and timely assessment of crop lodging severity is crucial for effective post-disaster production management and the swift processing of insurance claims [25,68]. In this study, a UAV platform equipped with a UHD185 hyperspectral imaging system was used to capture hyperspectral images, providing soybean plot data and canopy spectral reflectance information. Figure 6 illustrates the dynamic effects of lodging on soybean canopy spectral reflectance and applies FOD transformation to enhance the spectral curve’s variation trends, thereby partially mitigating atmospheric influences [69]. This differentiation process effectively reduces noise, highlighting the influence of lodging on the reflectance across various bands and providing supplementary information to improve lodging classification accuracy.
During the early growth stages (55 DAE and 65 DAE), lodging primarily affected reflectance in the green and red spectral bands [70]. Lodging altered the leaf arrangement, which increased light reflection and scattering within the canopy, therefore affecting the reflectance in the green band. The red band, a region of strong chlorophyll absorption [71], exhibited changes in reflectance after lodging. Alterations in leaf surface reflection patterns led to greater absorption of red light by the leaves. In contrast, during the mid-to-late growth stages (76 DAE, 85 DAE and 95 DAE), lodging had a more pronounced impact on the red-edge and near-infrared bands. Since these bands primarily indicate vegetation health and water status [72,73], the loosening of the canopy and moisture loss due to lodging resulted in increased reflectance in these bands. Therefore, the near-infrared band exhibits high sensitivity and significant potential for lodging grade classification, offering essential spectral information to support accurate lodging monitoring and assessment.

4.2. Comparison of Different Modeling Methods

Feature extraction methods for classification algorithms can be broadly divided into traditional machine learning and deep learning approaches [74]. Modern machine learning techniques primarily rely on deep learning and incorporate artificial intelligence for image processing and data analysis [75]. This study evaluates both traditional machine learning and deep learning algorithms for soybean lodging detection, comparing their performance to identify the most effective approach. In lodging classification, the choice of input features significantly influences classification accuracy. The best performance is achieved using a ResNet18-based deep learning framework. The accuracy advantage of ResNet is particularly pronounced at 65 DAE (ACC = 0.76, recall = 0.76, F1 score = 0.73), significantly outperforming spectral features (F1 score = 0.60) and FOD (F1 score = 0.62). This disparity stems from ResNet’s deep architecture, which automatically extracts lodging-related abstract features (e.g., texture gradients, local geometric deformations), whereas spectral and FOD features rely on manual calculations of limited spectral bands, failing to capture complex spatial patterns. During the early 55 DAE stage, feature fusion slightly improved classification accuracy (ACC = 0.74) compared to ResNet alone (ACC = 0.69), suggesting that spectral features may supplement subtle biochemical signals of early lodging. However, as crops mature (65 DAE), lodging-induced physical structural features become more pronounced, allowing ResNet’s spatial modeling capability to dominate, rendering feature fusion redundant. This contrast confirms that ResNet, through residual skip connections, stably extracts multiscale features to adapt to classification demands across different growth stages.

4.3. Performance of the SMOTE Module

Data imbalance significantly affects model classification performance, particularly in lodging classification tasks, where minority class features (e.g., severely lodged samples) are often underrepresented in the training data [12]. A significant number of lodging samples were misclassified as non-lodging samples, which notably affected the model’s classification accuracy. This issue primarily arises from the inherent class imbalance in the dataset, where non-lodging samples (the majority class, 54.26%) greatly outnumber lodging samples (the minority class, 45.74%). Traditional machine learning approaches inherently prioritize the majority class during optimization, a limitation consistent with standard classifiers in imbalanced scenarios. As a result, predictions are often dominated by the majority class (non-lodging samples), leading to a decrease in classification performance for the minority class (lodging samples). To address this issue, the study incorporated the Synthetic Minority Oversampling Technique (SMOTE) module to improve data distribution. As shown in Figure 8, the SMOTE model effectively rebalanced the sample distribution, resulting in a significant improvement in classification performance for lodged plots. Specifically, recognition accuracy for lodged plots increased from 41.67% to 65.48%. Although the SMOTE module improved lodging grade classification accuracy at the cost of a slight decline in non-lodged plot accuracy, this trade-off is acceptable in soybean lodging monitoring. This is because identification of severely lodged plots is critical, as these plots have a greater impact on yield and field management.
The pre-trained ResNet18 model can be effectively transferred to the lodging severity classification task, particularly for RGB and hyperspectral image feature extraction. Since the model has learned generic low-level and high-level features from large-scale datasets, such as ImageNet, these features can be effectively applied to the target task. However, the validity of this assumption may be influenced by the characteristics of the target dataset, especially since agricultural datasets often contain more noise and complex background variations. As a result, the effectiveness of transfer learning may vary across different datasets. SMOTE addresses class imbalance by generating synthetic samples, on the assumption that these samples adequately represent the actual distribution of the minority class. In lodging severity classification, however, especially in scenarios with limited data, SMOTE may produce oversimplified or inaccurate samples, particularly when there is significant variability within the minority class. SMOTE can reduce bias towards the majority class, but excessive balancing may lead to overfitting of the minority class. To mitigate this issue, we adjusted the number of nearest neighbors and the sampling ratio in SMOTE during experiments and combined it with cross-validation to ensure the model’s generalization capability.
The experimental design of this study included two consecutive years of field trials (2022–2023) conducted at Keshan Farm, Heilongjiang Province, using standardized cultivation practices on uniformly fertile plots. This setup provided valuable data and insights for soybean lodging classification and yield prediction through multimodal UAV data fusion and deep learning. However, the generalizability of the results may be constrained by the study’s geographic scope, as validation was limited to a single region (Northeast China’s cold temperate zone) and 251 soybean varieties. Future research should extend across broader geographic regions and incorporate a more diverse range of varieties to further validate the method’s applicability and enhance model performance under varying environmental conditions.

4.4. The Impact of Lodging on Yield Prediction

Numerous studies have highlighted that lodging is a key factor influencing crop yield prediction [76,77,78]. Based on existing research, this study employs two fusion strategies at the 85 DAE stage to predict yield using actual lodging data. The results demonstrate that the ResNet-MF model achieved the highest yield estimation performance, with an R2 of 0.65 and an RMSE of 529.56 kg/ha, while the ResNet-EF model yielded an R2 of 0.63 and an RMSE of 541.25 kg/ha. Feature-level fusion directly combines raw multimodal data into a high-dimensional feature vector, which may lead to the loss or interference of information between modalities. In contrast, ResNet-MF incorporates two attention mechanisms—channel attention and spatial attention—during the fusion process. The channel attention mechanism allows the model to weight feature channels based on their importance, prioritizing features critical for yield prediction. The spatial attention mechanism emphasizes important spatial regions within the image, further enhancing the model’s focus on key areas. Changes in sample size and modality count also affect the performance of both strategies. Feature-level fusion in ResNet-EF may struggle with high-dimensional feature spaces when directly combining multimodal data, particularly with a large number of modalities. In contrast, ResNet-MF processes each modality’s features independently before fusion, allowing it to better adapt to variations in sample size and modality count. The attention mechanisms enable the model to focus on the most relevant information, thereby improving both robustness and accuracy.
When predicted lodging data were incorporated into the yield prediction, the R2 values for the two models were 0.63 and 0.60, with RMSE values of 539.75 kg/ha and 559.04 kg/ha, respectively. In contrast, when lodging information was not included, the R2 values for the models were 0.62 and 0.59, with RMSE values of 547.18 kg/ha and 564.97 kg/ha, respectively. Although the improvement in accuracy after incorporating lodging information is modest, the data in Table 3 show that the inclusion of lodging information consistently enhances the precision of yield estimation across all soybean growth stages (from 55 DAE to 95 DAE). Specifically, in the ResNet-MF model, the addition of lodging information increased the R2 value from 0.59 to 0.65, indicating enhanced model reliability, particularly during the later growth stages. These results further confirm that, in addition to the significant role of image-derived features in yield prediction, incorporating lodging data and other key physiological traits substantially improves the models’ predictive accuracy. Furthermore, the intermediate-level multimodal fusion strategy better captures the nonlinear relationships between features, enhancing yield estimation accuracy while reducing reliance on measured data. This approach offers substantial application potential for large-scale regional crop management. Given the difficulty of collecting actual lodging data, this study integrates UAV RGB and hyperspectral images and employs the SMOTE-ResNet approach for high-throughput lodging grade classification, providing an efficient and cost-effective solution to improve yield prediction accuracy.

5. Conclusions

This study examines two years of field data from soybean cultivation, using UAV remote sensing technology with deep learning to monitor soybean lodging and estimate yield. ResNet18 was used as a shared encoder to automatically extract feature vectors from both visible light and hyperspectral images. To mitigate class imbalance in lodging grade classification, the SMOTE module was incorporated into the model framework. Furthermore, the potential of integrating various multimodal fusion strategies with lodging grades for yield estimation was also assessed. The results are summarized as follows:
(1)
At all growth stages examined in this study, the feature vectors encoded by ResNet18 consistently outperformed manually extracted image-derived features in lodging classification. This underscores the effectiveness and potential of deep learning-based automated feature extraction methods for accurate soybean lodging monitoring.
(2)
In the context of imbalanced lodging class samples, the minority class initially exhibited low classification accuracy. However, incorporating the SMOTE module into the framework significantly improved the accuracy of the minority class. This enhancement resulted in more balanced classification outcomes across various lodging levels, enhancing the model’s capability to effectively identify lodging samples.
(3)
Incorporating ground-truth lodging grades into the multimodal intermediate-level fusion strategy improved yield estimation accuracy from 0.62 to 0.65. When lodging grades obtained through SMOTE-ResNet classification were introduced, the yield estimation accuracy reached 0.63, comparable to the results obtained using ground-truth lodging grades. This suggests that including lodging information can significantly enhance yield estimation accuracy, providing valuable theoretical support for yield management in precision agriculture.

Author Contributions

Conceptualization, X.X. and Y.F.; Methodology, G.S.; Software, Y.Z.; Validation, Y.F., G.S. and X.X.; Formal Analysis, L.W.; Investigation, C.C.; Resources, L.R.; Data Curation, H.Y.; Writing—Original Draft Preparation, Y.F.; Writing—Review and Editing, L.M.; Visualization, Y.L.; Supervision, L.Q.; Project Administration, Y.G.; Funding Acquisition, Y.M. All authors have read and agreed to the published version of the manuscript.

Funding

This research was supported by the Strategic Priority Research Program of the Chinese Academy of Sciences (grant number XDA0450000, XDA0450200 and XDA0450202) and the key project of ‘Inner Mongolia science and technology promotion action’ (NMKIXM202303).

Data Availability Statement

The raw data supporting the conclusions of this article will be made available by the authors on request.

Acknowledgments

We thank the editors and reviewers for their hard work and valuable advice.

Conflicts of Interest

The authors declare no conflict of interest.

References

  1. Pagano, M.C.; Miransari, M. The importance of soybean production worldwide. In Abiotic and Biotic Stresses in Soybean Production; Elsevier: Amsterdam, The Netherlands, 2016; pp. 1–26.
  2. Liu, Z.; Ying, H.; Chen, M.; Bai, J.; Xue, Y.; Yin, Y.; Batchelor, W.D.; Yang, Y.; Bai, Z.; Du, M. Optimization of China’s maize and soy production can ensure feed sufficiency at lower nitrogen and carbon footprints. Nat. Food 2021, 2, 426–433.
  3. Xiao, S.; Ye, Y.; Fei, S.; Chen, H.; Cai, Z.; Che, Y.; Wang, Q.; Ghafoor, A.; Bi, K.; Shao, K. High-throughput calculation of organ-scale traits with reconstructed accurate 3D canopy structures using a UAV RGB camera with an advanced cross-circling oblique route. ISPRS J. Photogramm. Remote Sens. 2023, 201, 104–122.
  4. Sun, G.; Zhang, Y.; Chen, H.; Wang, L.; Li, M.; Sun, X.; Fei, S.; Xiao, S.; Yan, L.; Li, Y. Improving soybean yield prediction by integrating UAV nadir and cross-circling oblique imaging. Eur. J. Agron. 2024, 155, 127134.
  5. Han, L.; Yang, G.; Feng, H.; Zhou, C.; Yang, H.; Xu, B.; Li, Z.; Yang, X. Quantitative Identification of Maize Lodging-Causing Feature Factors Using Unmanned Aerial Vehicle Images and a Nomogram Computation. Remote Sens. 2018, 10, 1528.
  6. Wang, J.-J.; Ge, H.; Dai, Q.; Ahmad, I.; Dai, Q.; Zhou, G.; Qin, M.; Gu, C. Unsupervised discrimination between lodged and non-lodged winter wheat: A case study using a low-cost unmanned aerial vehicle. Int. J. Remote Sens. 2018, 39, 2079–2088.
  7. Dai, J.; Zhang, G.; Guo, P.; Zeng, T.; Cui, M.; Xue, J. Information extraction of cotton lodging based on multi-spectral image from UAV remote sensing. Trans. CSAE 2019, 35, 63–70.
  8. Chu, T.; Starek, M.J.; Brewer, M.J.; Murray, S.C.; Pruter, L.S. Assessing lodging severity over an experimental maize (Zea mays L.) field using UAS images. Remote Sens. 2017, 9, 923.
  9. Yang, M.-D.; Huang, K.-S.; Kuo, Y.-H.; Tsai, H.P.; Lin, L.-M. Spatial and spectral hybrid image classification for rice lodging assessment through UAV imagery. Remote Sens. 2017, 9, 583.
  10. Rajapaksa, S.; Eramian, M.; Duddu, H.; Wang, M.; Shirtliffe, S.; Ryu, S.; Josuttes, A.; Zhang, T.; Vail, S.; Pozniak, C. Classification of crop lodging with gray level co-occurrence matrix. In Proceedings of the 2018 IEEE Winter Conference on Applications of Computer Vision (WACV), Lake Tahoe, NV, USA, 12–15 March 2018.
  11. Han, L.; Yang, G.; Yang, X.; Song, X.; Xu, B.; Li, Z.; Wu, J.; Yang, H.; Wu, J. An explainable XGBoost model improved by SMOTE-ENN technique for maize lodging detection based on multi-source unmanned aerial vehicle images. Comput. Electron. Agric. 2022, 194, 106804.
  12. Sarkar, S.; Zhou, J.; Scaboo, A.; Zhou, J.; Aloysius, N.; Lim, T.T. Assessment of Soybean Lodging Using UAV Imagery and Machine Learning. Plants 2023, 12, 2893.
  13. Chemchem, A.; Alin, F.; Krajecki, M. Combining SMOTE sampling and machine learning for forecasting wheat yields in France. In Proceedings of the 2019 IEEE Second International Conference on Artificial Intelligence and Knowledge Engineering (AIKE), Sardinia, Italy, 3–5 June 2019.
  14. Zhu, W.; Li, S.; Zhang, X.; Li, Y.; Sun, Z. Estimation of winter wheat yield using optimal vegetation indices from unmanned aerial vehicle remote sensing. Trans. Chin. Soc. Agric. Eng. 2018, 34, 78–86.
  15. Gong, Y.; Duan, B.; Fang, S.; Zhu, R.; Wu, X.; Ma, Y.; Peng, Y. Remote estimation of rapeseed yield with unmanned aerial vehicle (UAV) imaging and spectral mixture analysis. Plant Methods 2018, 14, 1–14.
  16. Fei, S.; Hassan, M.A.; Xiao, Y.; Su, X.; Chen, Z.; Cheng, Q.; Duan, F.; Chen, R.; Ma, Y. UAV-based multi-sensor data fusion and machine learning algorithm for yield prediction in wheat. Precis. Agric. 2023, 24, 187–212.
  17. Yang, Q.; Shi, L.; Han, J.; Zha, Y.; Zhu, P. Deep convolutional neural networks for rice grain yield estimation at the ripening stage using UAV-based remotely sensed images. Field Crops Res. 2019, 235, 142–153.
  18. Zhong, L.; Hu, L.; Zhou, H.; Tao, X. Deep learning based winter wheat mapping using statistical data as ground references in Kansas and northern Texas, US. Remote Sens. Environ. 2019, 233, 111411.
  19. Hao, X.; Jia, J.; Khattak, A.M.; Zhang, L.; Guo, X.; Gao, W.; Wang, M. Growing period classification of Gynura bicolor DC using GL-CNN. Comput. Electron. Agric. 2020, 174, 105497.
  20. Karmakar, P.; Teng, S.W.; Murshed, M.; Pang, S.; Li, Y.; Lin, H. Crop monitoring by multimodal remote sensing: A review. Remote Sens. Appl. Soc. Environ. 2024, 33, 101093.
  21. Maimaitijiang, M.; Sagan, V.; Sidike, P.; Hartling, S.; Esposito, F.; Fritschi, F.B. Soybean yield prediction from UAV using multimodal data fusion and deep learning. Remote Sens. Environ. 2020, 237, 111599.
  22. Zhang, Y.; Yang, Y.; Zhang, Q.; Duan, R.; Liu, J.; Qin, Y.; Wang, X. Toward multi-stage phenotyping of soybean with multimodal UAV sensor data: A comparison of machine learning approaches for leaf area index estimation. Remote Sens. 2022, 15, 7.
  23. Chauhan, S.; Darvishzadeh, R.; Boschetti, M.; Pepe, M.; Nelson, A. Remote sensing-based crop lodging assessment: Current status and perspectives. ISPRS J. Photogramm. Remote Sens. 2019, 151, 124–140.
  24. Fischer, R.; Stapper, M. Lodging effects on high-yielding crops of irrigated semidwarf wheat. Field Crops Res. 1987, 17, 245–258.
  25. Kendall, S.; Holmes, H.; White, C.; Clarke, S.; Berry, P. Quantifying lodging-induced yield losses in oilseed rape. Field Crops Res. 2017, 211, 106–113. [Google Scholar] [CrossRef]
  26. Berry, P.M.; Spink, J. Predicting yield losses caused by lodging in wheat. Field Crops Res. 2012, 137, 19–26. [Google Scholar] [CrossRef]
  27. Zhang, N.; Zhang, X.; Yang, G.; Zhu, C.; Huo, L.; Feng, H. Assessment of defoliation during the Dendrolimus tabulaeformis Tsai et Liu disaster outbreak using UAV-based hyperspectral images. Remote Sens. Environ. 2018, 217, 323–339. [Google Scholar] [CrossRef]
  28. Shu, M.; Zhou, L.; Gu, X.; Ma, Y.; Sun, Q.; Yang, G.; Zhou, C. Monitoring of maize lodging using multi-temporal Sentinel-1 SAR data. Adv. Space Res. 2020, 65, 470–480. [Google Scholar] [CrossRef]
  29. Bendig, J.; Yu, K.; Aasen, H.; Bolten, A.; Bennertz, S.; Broscheit, J.; Gnyp, M.L.; Bareth, G. Combining UAV-based plant height from crop surface models, visible, and near infrared vegetation indices for biomass monitoring in barley. Int. J. Appl. Earth Obs. Geoinf. 2015, 39, 79–87. [Google Scholar] [CrossRef]
  30. Nguyen, C.; Sagan, V.; Skobalski, J.; Severo, J.I. Early detection of wheat yellow rust disease and its impact on terminal yield with multi-spectral UAV-imagery. Remote Sens. 2023, 15, 3301. [Google Scholar] [CrossRef]
  31. Meyer, G.E.; Neto, J.C. Verification of color vegetation indices for automated crop imaging applications. Comput. Electron. Agric. 2008, 63, 282–293. [Google Scholar] [CrossRef]
  32. Woebbecke, D.M.; Meyer, G.E.; Von Bargen, K.; Mortensen, D.A. Color indexes for weed identification under various soil, residue, and lighting conditions. Trans. Asae 1995, 38, 259–269. [Google Scholar] [CrossRef]
  33. Steward, B.L.; Tian, L.F. Real-time machine vision weed-sensing. In Proceedings of the ASAE Annual International Meeting, Orlando, FL, USA, 12–16 July 1998; p. 11. [Google Scholar]
  34. Rasmussen, J.; Ntakos, G.; Nielsen, J.; Svensgaard, J.; Poulsen, R.N.; Christensen, S. Are vegetation indices derived from consumer-grade cameras mounted on UAVs sufficiently reliable for assessing experimental plots? Eur. J. Agron. 2016, 74, 75–92. [Google Scholar] [CrossRef]
  35. Verrelst, J.; Schaepman, M.E.; Koetz, B.; Kneubühler, M. Angular sensitivity analysis of vegetation indices derived from CHRIS/PROBA data. Remote Sens. Environ. 2008, 112, 2341–2353. [Google Scholar] [CrossRef]
  36. Metternicht, G. Vegetation indices derived from high-resolution airborne videography for precision crop management. Int. J. Remote Sens. 2003, 24, 2855–2877. [Google Scholar] [CrossRef]
  37. Kataoka, T.; Kaneko, T.; Okamoto, H.; Hata, S. Crop growth estimation system using machine vision. In Proceedings of the 2003 IEEE/ASME International Conference on Advanced Intelligent Mechatronics (AIM 2003), Kobe, Japan, 20–24 July 2003. [Google Scholar]
  38. Gitelson, A.A.; Kaufman, Y.J.; Stark, R.; Rundquist, D. Novel algorithms for remote estimation of vegetation fraction. Remote Sens. Environ. 2002, 80, 76–87. [Google Scholar] [CrossRef]
  39. Guijarro, M.; Pajares, G.; Riomoros, I.; Herrera, P.; Burgos-Artizzu, X.; Ribeiro, A. Automatic segmentation of relevant textures in agricultural images. Comput. Electron. Agric. 2011, 75, 75–83. [Google Scholar] [CrossRef]
  40. Gamon, J.A.; Surfus, J.S. Assessing leaf pigment content and activity with a reflectometer. New Phytol. 1999, 143, 105–117. [Google Scholar] [CrossRef]
  41. Hague, T.; Tillett, N.D.; Wheeler, H. Automated crop and weed monitoring in widely spaced cereals. Precis. Agric. 2006, 7, 21–32. [Google Scholar] [CrossRef]
  42. Gitelson, A.A.; Gritz, Y.; Merzlyak, M.N. Relationships between leaf chlorophyll content and spectral reflectance and algorithms for non-destructive chlorophyll assessment in higher plant leaves. J. Plant Physiol. 2003, 160, 271–282. [Google Scholar] [CrossRef]
  43. Tucker, C.J. Red and photographic infrared linear combinations for monitoring vegetation. Remote Sens. Environ. 1979, 8, 127–150. [Google Scholar] [CrossRef]
  44. Gitelson, A.A.; Kaufman, Y.J.; Merzlyak, M.N. Use of a green channel in remote sensing of global vegetation from EOS-MODIS. Remote Sens. Environ. 1996, 58, 289–298. [Google Scholar] [CrossRef]
  45. Daughtry, C.S.T.; Walthall, C.; Kim, M.; De Colstoun, E.B.; McMurtrey Iii, J. Estimating corn leaf chlorophyll concentration from leaf and canopy reflectance. Remote Sens. Environ. 2000, 74, 229–239. [Google Scholar] [CrossRef]
  46. Gong, P.; Pu, R.; Biging, G.S.; Larrieu, M.R. Estimation of forest leaf area index using vegetation indices derived from Hyperion hyperspectral data. IEEE Trans. Geosci. Remote Sens. 2003, 41, 1355–1362. [Google Scholar] [CrossRef]
  47. Chen, Y.; Gillieson, D. Evaluation of Landsat TM vegetation indices for estimating vegetation cover on semi-arid rangelands: A case study from Australia. Can. J. Remote Sens. 2009, 35, 435–446. [Google Scholar] [CrossRef]
  48. Gitelson, A.A.; Viña, A.; Verma, S.B.; Rundquist, D.C.; Arkebauer, T.J.; Keydan, G.; Leavitt, B.; Ciganda, V.; Burba, G.G.; Suyker, A.E. Relationship between gross primary production and chlorophyll content in crops: Implications for the synoptic monitoring of vegetation productivity. J. Geophys. Res.-Atmos. 2006, 111, D08S11. [Google Scholar] [CrossRef]
  49. Gitelson, A.A.; Merzlyak, M.N. Remote estimation of chlorophyll content in higher plant leaves. Int. J. Remote Sens. 1997, 18, 2691–2697. [Google Scholar] [CrossRef]
  50. Rouse, J.W., Jr.; Haas, R.H.; Deering, D.; Schell, J.; Harlan, J.C. Monitoring the Vernal Advancement and Retrogradation (Green Wave Effect) of Natural Vegetation. 1974. Available online: https://ntrs.nasa.gov/citations/19740022555 (accessed on 16 April 2025).
  51. Chen, H.; Huang, W.; Li, W.; Niu, Z.; Zhang, L.; Xing, S. Estimation of LAI in Winter Wheat from Multi-Angular Hyperspectral VNIR Data: Effects of View Angles and Plant Architecture. Remote Sens. 2018, 10, 1630. [Google Scholar] [CrossRef]
  52. Rondeaux, G.; Steven, M.; Baret, F. Optimization of soil-adjusted vegetation indices. Remote Sens. Environ. 1996, 55, 95–107. [Google Scholar] [CrossRef]
  53. Roujean, J.L.; Breon, F.M. Estimating par absorbed by vegetation from bidirectional reflectance measurements. Remote Sens. Environ. 1995, 51, 375–384. [Google Scholar] [CrossRef]
  54. Tian, Y.; Li, Y.; Feng, W.; Tian, Y.; Yao, X.; Cao, W. Monitoring leaf nitrogen in rice using canopy reflectance spectra. In Proceedings of the International Symposium on Intelligent Information Technology in Agriculture, Online, 21–23 November 2007. [Google Scholar]
  55. Haboudane, D.; Miller, J.R.; Tremblay, N.; Zarco-Tejada, P.J.; Dextraze, L. Integrated narrow-band vegetation indices for prediction of crop chlorophyll content for application to precision agriculture. Remote Sens. Environ. 2002, 81, 416–426. [Google Scholar] [CrossRef]
  56. Qi, J.; Chehbouni, A.; Huete, A.R.; Kerr, Y.H.; Sorooshian, S. A modified soil adjusted vegetation index. Remote Sens. Environ. 1994, 48, 119–126. [Google Scholar] [CrossRef]
  57. Broge, N.H.; Leblanc, E. Comparing prediction power and stability of broadband and hyperspectral vegetation indices for estimation of green leaf area index and canopy chlorophyll density. Remote Sens. Environ. 2001, 76, 156–172. [Google Scholar] [CrossRef]
  58. Krizhevsky, A.; Sutskever, I.; Hinton, G.E. ImageNet classification with deep convolutional neural networks. Adv. Neural Inf. Process. Syst. 2012, 25, 1097–1105. [Google Scholar] [CrossRef]
  59. Simonyan, K.; Zisserman, A. Very deep convolutional networks for large-scale image recognition. arXiv 2014, arXiv:1409.1556. [Google Scholar]
  60. Szegedy, C.; Liu, W.; Jia, Y.; Sermanet, P.; Reed, S.; Anguelov, D.; Erhan, D.; Vanhoucke, V.; Rabinovich, A. Going deeper with convolutions. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Boston, MA, USA, 7–12 June 2015. [Google Scholar]
  61. Srivastava, R.K.; Greff, K.; Schmidhuber, J. Training very deep networks. arXiv 2015, arXiv:1507.06228. [Google Scholar]
  62. He, K.; Zhang, X.; Ren, S.; Sun, J. Deep residual learning for image recognition. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, NV, USA, 27–30 June 2016. [Google Scholar]
  63. Meng, Z.; Li, L.; Tang, X.; Feng, Z.; Jiao, L.; Liang, M. Multipath residual network for spectral-spatial hyperspectral image classification. Remote Sens. 2019, 11, 1896. [Google Scholar] [CrossRef]
  64. Ioffe, S.; Szegedy, C. Batch normalization: Accelerating deep network training by reducing internal covariate shift. arXiv 2015, arXiv:1502.03167. [Google Scholar]
  65. He, K.; Zhang, X.; Ren, S.; Sun, J. Identity mappings in deep residual networks. In Proceedings of the Computer Vision–ECCV 2016: 14th European Conference, Amsterdam, The Netherlands, 11–14 October 2016; Proceedings, Part IV 14. Springer: Berlin/Heidelberg, Germany, 2016. [Google Scholar]
  66. Batista, G.E.; Prati, R.C.; Monard, M.C. A study of the behavior of several methods for balancing machine learning training data. ACM SIGKDD Explor. Newsl. 2004, 6, 20–29. [Google Scholar] [CrossRef]
  67. Chicco, D.; Warrens, M.J.; Jurman, G. The coefficient of determination R-squared is more informative than SMAPE, MAE, MAPE, MSE and RMSE in regression analysis evaluation. Peerj Comput. Sci. 2021, 7, e623. [Google Scholar] [CrossRef]
  68. Tian, B.; Luan, S.; Zhang, L.; Liu, Y.; Zhang, L.; Li, H. Penalties in yield and yield associated traits caused by stem lodging at different developmental stages in summer and spring foxtail millet cultivars. Field Crops Res. 2018, 217, 104–112. [Google Scholar] [CrossRef]
  69. Zhang, X.; Liu, D.; Ma, J.; Wang, X.; Li, Z.; Zheng, D. Visible Near-Infrared Hyperspectral Soil Organic Matter Prediction Based on Combinatorial Modeling. Agronomy 2024, 14, 789. [Google Scholar] [CrossRef]
  70. Tian, M.; Ban, S.; Yuan, T.; Ji, Y.; Ma, C.; Li, L. Assessing rice lodging using UAV visible and multispectral image. Int. J. Remote Sens. 2021, 42, 8840–8857. [Google Scholar] [CrossRef]
  71. Zhang, T.-X.; Su, J.-Y.; Liu, C.-J.; Chen, W.-H. Potential bands of sentinel-2A satellite for classification problems in precision agriculture. Int. J. Autom. Comput. 2019, 16, 16–26. [Google Scholar] [CrossRef]
  72. Zhan, Z.; Qin, Q.; Ghulam, A.; Wang, D. NIR-red spectral space based new method for soil moisture monitoring. Sci. China Ser. D Earth Sci. 2007, 50, 283–289. [Google Scholar] [CrossRef]
  73. Laroche-Pinel, E.; Albughdadi, M.; Duthoit, S.; Chéret, V.; Rousseau, J.; Clenet, H. Understanding vine hyperspectral signature through different irrigation plans: A first step to monitor vineyard water status. Remote Sens. 2021, 13, 536. [Google Scholar] [CrossRef]
  74. Lu, Y.; Lu, R. Detection of surface and subsurface defects of apples using structured-illumination reflectance imaging with machine learning algorithms. Trans. ASABE 2018, 61, 1831–1842. [Google Scholar] [CrossRef]
  75. Marsland, S. Machine Learning: An Algorithmic Perspective; Chapman and Hall/CRC: London, UK, 2011. [Google Scholar]
  76. Easson, D.; White, E.; Pickles, S. The effects of weather, seed rate and cultivar on lodging and yield in winter wheat. J. Agric. Sci. 1993, 121, 145–156. [Google Scholar] [CrossRef]
  77. Lang, Y.-Z.; Yang, X.-D.; Wang, M.-E.; Zhu, Q.-S. Effects of lodging at different filling stages on rice yield and grain quality. Rice Sci. 2012, 19, 315–319. [Google Scholar] [CrossRef]
  78. Mi, C.; Zhang, X.; Li, S.; Yang, J.; Zhu, D.; Yang, Y. Assessment of environment lodging stress for maize using fuzzy synthetic evaluation. Math. Comput. Model. 2011, 54, 1053–1060. [Google Scholar] [CrossRef]
Figure 1. The geographical location of the study area and experimental site.
Figure 2. SMOTE-ResNet architecture for lodging grade classification. (a) Image feature extraction block: Multiscale feature extraction is achieved through a series of convolutional layers with different kernel sizes (1 × 1, 3 × 3) and multiple channel depths. (b) SMOTE block: A feature vector sample balancing module based on the oversampling strategy. Downstream model: A machine learning model used to complete the lodging grade classification task.
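To make the Figure 2 pipeline concrete, here is a minimal Python sketch assuming a pretrained torchvision ResNet18 as the feature-extraction block, imblearn's SMOTE as the balancing block, and a random forest as the downstream model; the image tensor and lodging labels are placeholders, not the study's data:

```python
# Minimal sketch of the SMOTE-ResNet pipeline; images and labels are placeholders.
import numpy as np
import torch
import torchvision.models as models
from imblearn.over_sampling import SMOTE
from sklearn.ensemble import RandomForestClassifier

# (a) Feature extraction: pretrained ResNet18 with its FC head removed -> 512-D vectors.
encoder = models.resnet18(weights=models.ResNet18_Weights.DEFAULT)
encoder.fc = torch.nn.Identity()
encoder.eval()

images = torch.rand(40, 3, 224, 224)      # placeholder plot image crops
labels = np.repeat([0, 1, 3, 5, 7], 8)    # placeholder lodging grades
with torch.no_grad():
    feats = encoder(images).numpy()       # (40, 512) feature vectors

# (b) SMOTE block: oversample minority grades in the 512-D feature space.
X_bal, y_bal = SMOTE(random_state=0).fit_resample(feats, labels)

# Downstream model: a random forest performs the lodging grade classification.
clf = RandomForestClassifier(n_estimators=200, random_state=0).fit(X_bal, y_bal)
```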
Figure 3. The proportion of plots with different lodging grades in (a) 2022 and (b) 2023.
Figure 4. Relationship between soybean lodging grades and yield distribution.
Figure 5. Two fusion strategies for yield prediction. (a) ResNet-EF: Multimodal raw data are directly merged into a high-dimensional feature vector and input into a single model for processing. (b) ResNet-MF: Features from each modality are independently extracted and fused into a unified feature space using dual self-attention modules for optimized processing.
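A minimal PyTorch sketch of the two strategies follows; the layer widths, the single-head self-attention, and the mean pooling are illustrative assumptions rather than the exact network configuration used in the study:

```python
# Minimal sketch of input-level (ResNet-EF) vs. intermediate-level (ResNet-MF)
# fusion; all dimensions below are illustrative assumptions.
import torch
import torch.nn as nn

class InputLevelFusion(nn.Module):
    """ResNet-EF style: concatenate all modality vectors, then one regressor."""
    def __init__(self, dims, hidden=128):
        super().__init__()
        self.mlp = nn.Sequential(nn.Linear(sum(dims), hidden), nn.ReLU(),
                                 nn.Linear(hidden, 1))
    def forward(self, mods):               # mods: list of (B, d_i) tensors
        return self.mlp(torch.cat(mods, dim=1))

class IntermediateFusion(nn.Module):
    """ResNet-MF style: per-modality encoders fused by self-attention."""
    def __init__(self, dims, hidden=128):
        super().__init__()
        self.encoders = nn.ModuleList([nn.Linear(d, hidden) for d in dims])
        self.attn = nn.MultiheadAttention(hidden, num_heads=1, batch_first=True)
        self.head = nn.Linear(hidden, 1)
    def forward(self, mods):
        tokens = torch.stack([enc(m) for enc, m in zip(self.encoders, mods)], dim=1)
        fused, _ = self.attn(tokens, tokens, tokens)   # modalities attend to each other
        return self.head(fused.mean(dim=1))            # pool and regress yield

spectral, texture, img = torch.rand(4, 35), torch.rand(4, 8), torch.rand(4, 512)
y_ef = InputLevelFusion([35, 8, 512])([spectral, texture, img])    # (4, 1)
y_mf = IntermediateFusion([35, 8, 512])([spectral, texture, img])  # (4, 1)
```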
Figure 6. Correlation coefficient plot between soybean lodging and (a) original spectral wavelengths, (b) first-order differential.
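For panel (b), the first-order differential is simply the band-to-band difference quotient of the reflectance spectrum; the numpy sketch below, with placeholder spectra, wavelengths, and lodging grades, shows how each band's Pearson correlation with lodging can be screened:

```python
# Minimal sketch: first-order differential spectra and their per-band Pearson
# correlation with lodging grade; all arrays here are placeholders.
import numpy as np

rng = np.random.default_rng(0)
spectra = rng.random((100, 150))              # placeholder reflectance (plots x bands)
wl = np.linspace(400, 1000, 150)              # placeholder wavelengths (nm)
lodging = rng.choice([0, 1, 3, 5, 7], 100)    # placeholder lodging grades

fod = np.diff(spectra, axis=1) / np.diff(wl)  # first-order differential spectra
corr = np.array([np.corrcoef(fod[:, i], lodging)[0, 1] for i in range(fod.shape[1])])
best_wl = wl[1:][np.argmax(np.abs(corr))]     # band most correlated with lodging
```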
Figure 7. Comparison of ACC, recall and F1 scores between (ac) RF and (df) SVC. Origin: Original reflectance and vegetation indices. FOD: First-order differential. ResNet: ResNet feature vectors. Fusion: Fusion of all features.
Figure 8. Confusion matrix for the optimal lodging prediction date (65 DAE). (a) Origin, (b) ResNet, (c) SMOTE-ResNet.
Figure 9. Spatial distribution of lodging. (a) Measured lodging, (b) lodging predicted by the random forest model, (c) lodging predicted by the SMOTE-ResNet model.
Figure 10. Yield prediction regression plot at 85 DAE. (a) ResNet-EF; (b) ResNet-EF combined with measured lodging; (c) ResNet-EF combined with estimated lodging; (d) ResNet-MF; (e) ResNet-MF combined with measured lodging; (f) ResNet-MF combined with estimated lodging.
Table 1. Vegetation index calculation formulas based on RGB and hyperspectral images.
| Vegetation Index | Formula | Reference |
| --- | --- | --- |
| g, r, b, re, nir | DN value of each band | / |
| EXR | 1.4 × r − g | [31] |
| EXG | 2 × g − r − b | [32] |
| EXGR | 3 × g − 2.4 × r − b | [33] |
| MGRVI | (g² − r²)/(g² + r²) | [29] |
| NGRDI | (g − r)/(g + r) | [34] |
| RGRI | r/g | [35] |
| PPRb | (g − b)/(g + b) | [36] |
| CIVE | 0.441 × r − 0.811 × g + 0.385 × b + 18.787 | [37] |
| VARI | (g − r)/(g + r − b) | [38] |
| WI | (g − b)/(r − g) | [32] |
| GLA | (2 × g − r − b)/(2 × g + r + b) | [39] |
| RGBVI | (g² − b × r)/(g² + b × r) | [40] |
| VEG | g/(r^k × b^(1−k)), k = 0.667 | [41] |
| COM | 0.25 × EXG + 0.3 × EXGR + 0.33 × CIVE + 0.12 × VEG | [39] |
| COM2 | 0.36 × EXG + 0.47 × CIVE + 0.17 × VEG | [39] |
| CI | nir/re − 1 | [42] |
| DVI | nir − r | [43] |
| GNDVI | (nir − g)/(nir + g) | [44] |
| GRVI | (g − r)/(g + r) | [41,45] |
| MCARI | ((re − r) − 0.2 × (re − g)) × (re/r) | [45] |
| MNVI | 1.5 × (nir² − r)/(nir² + r + 0.5) | [46] |
| MSR | (nir/r − 1)/sqrt(nir/r + 1) | [47] |
| MTCI | (nir − re)/(re − r) | [48] |
| NDRE | (nir − re)/(nir + re) | [49] |
| NDVI | (nir − r)/(nir + r) | [50] |
| NLI | (nir² − r)/(nir² + r) | [51] |
| OSAVI | 1.16 × (nir − r)/(nir + r + 0.16) | [52] |
| RDVI | (nir − r)/sqrt(nir + r) | [53] |
| RVI1 | nir/r | [43] |
| RVI2 | nir/g | [54] |
| TO | 3 × ((re − r) − 0.2 × (re − g) × (re/r))/OSAVI | [55] |
| SAVI | 1.5 × (nir − r)/(nir + r + 0.5) | [56] |
| TVI | 60 × (nir − g) − 100 × (r − g) | [57] |
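As a usage note, the indices above are simple band arithmetic; this numpy sketch computes a handful of them from placeholder per-plot mean band values, with a small epsilon guarding the divisions:

```python
# Minimal sketch: a few Table 1 indices from placeholder band means.
import numpy as np

g, r, b = np.array([0.32]), np.array([0.21]), np.array([0.18])  # visible bands
re, nir = np.array([0.35]), np.array([0.52])                    # red edge and NIR
eps = 1e-9                                                      # guards divisions

EXG   = 2 * g - r - b
NGRDI = (g - r) / (g + r + eps)
NDVI  = (nir - r) / (nir + r + eps)
NDRE  = (nir - re) / (nir + re + eps)
OSAVI = 1.16 * (nir - r) / (nir + r + 0.16)
```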
Table 2. Comparison of classification performance between ResNet and SMOTE-ResNet models at different growth stages.
| Date | ResNet Accuracy | ResNet Recall | ResNet F1 Score | SMOTE-ResNet Accuracy | SMOTE-ResNet Recall | SMOTE-ResNet F1 Score |
| --- | --- | --- | --- | --- | --- | --- |
| 55 DAE | 0.69 | 0.68 | 0.67 | 0.72 | 0.73 | 0.71 |
| 65 DAE | 0.76 | 0.76 | 0.73 | 0.77 | 0.76 | 0.77 |
| 76 DAE | 0.66 | 0.64 | 0.61 | 0.73 | 0.72 | 0.70 |
| 85 DAE | 0.72 | 0.71 | 0.70 | 0.75 | 0.72 | 0.71 |
| 95 DAE | 0.66 | 0.66 | 0.64 | 0.70 | 0.70 | 0.69 |
Table 3. Soybean yield estimation results.
| Date | Metric | ResNet-EF | ResNet-EF + Measured Lodging | ResNet-EF + Estimated Lodging | ResNet-MF | ResNet-MF + Measured Lodging | ResNet-MF + Estimated Lodging |
| --- | --- | --- | --- | --- | --- | --- | --- |
| 55 DAE | R² | 0.19 | 0.22 | 0.21 | 0.20 | 0.22 | 0.21 |
| 55 DAE | RMSE (kg/ha) | 798.01 | 786.45 | 790.72 | 795.33 | 784.98 | 789.18 |
| 65 DAE | R² | 0.40 | 0.42 | 0.41 | 0.40 | 0.44 | 0.43 |
| 65 DAE | RMSE (kg/ha) | 677.63 | 662.63 | 670.03 | 676.48 | 653.92 | 655.13 |
| 76 DAE | R² | 0.47 | 0.50 | 0.49 | 0.48 | 0.51 | 0.49 |
| 76 DAE | RMSE (kg/ha) | 636.13 | 615.23 | 624.27 | 630.20 | 612.41 | 622.94 |
| 85 DAE | R² | 0.59 | 0.63 | 0.60 | 0.62 | 0.65 | 0.63 |
| 85 DAE | RMSE (kg/ha) | 564.97 | 541.25 | 559.04 | 547.18 | 529.56 | 539.75 |
| 95 DAE | R² | 0.49 | 0.53 | 0.52 | 0.50 | 0.54 | 0.53 |
| 95 DAE | RMSE (kg/ha) | 625.59 | 600.69 | 606.48 | 618.34 | 594.25 | 599.53 |
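For reference, the two metrics reported in Table 3 can be computed as below; the yield arrays are placeholders in kg/ha, and np.sqrt is used so the snippet works across scikit-learn versions:

```python
# Minimal sketch: R2 and RMSE as reported in Table 3, on placeholder yields.
import numpy as np
from sklearn.metrics import mean_squared_error, r2_score

y_true = np.array([2800.0, 3150.0, 2400.0, 3600.0])  # measured yield (kg/ha)
y_pred = np.array([2650.0, 3300.0, 2550.0, 3400.0])  # predicted yield (kg/ha)

r2 = r2_score(y_true, y_pred)
rmse = np.sqrt(mean_squared_error(y_true, y_pred))
```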
