Article

Application of UAV-Borne Visible-Infrared Pushbroom Imaging Hyperspectral for Rice Yield Estimation Using Feature Selection Regression Methods

1 School of Information Science and Engineering, Zhejiang Sci-Tech University, Hangzhou 310018, China
2 State Key Laboratory of Rice Biology, China National Rice Research Institute, Hangzhou 311400, China
* Author to whom correspondence should be addressed.
Sustainability 2024, 16(2), 632; https://doi.org/10.3390/su16020632
Submission received: 28 November 2023 / Revised: 5 January 2024 / Accepted: 8 January 2024 / Published: 11 January 2024

Abstract

Rice yield estimation is vital for enhancing food security, optimizing agricultural management, and promoting sustainable development. However, traditional satellite/aerial and ground-based/tower-based platforms face limitations in rice yield estimation, and few studies have explored the potential of UAV-borne hyperspectral remote sensing for this purpose. In this study, we employed a UAV-borne push-broom hyperspectral camera to acquire remote sensing data of rice fields during the filling stage, and machine learning regression algorithms were applied to estimate rice yield. The research comprised three parts: hyperspectral data preprocessing, spectral feature extraction, and model construction. First, the preprocessing of hyperspectral data involved geometric distortion correction, relative radiometric calibration, and rice canopy mask construction. Challenges in geometric distortion correction were addressed by tracking linear features during flight and applying a single-line correction method. Additionally, the NIR reflectance threshold method was applied for rice canopy mask construction, which was subsequently used for average reflectance extraction. Then, spectral feature extraction was carried out to reduce multicollinearity in the hyperspectral data. Recursive feature elimination (RFE) was employed to identify the optimal feature set for model performance. Finally, six machine learning regression models (SVR, RFR, AdaBoost, XGBoost, Ridge, and PLSR) were used for rice yield estimation, achieving significant results. PLSR showed the best R2 of 0.827 with selected features, while XGBoost had the best R2 of 0.827 with full features. In addition, the spatial distribution of absolute error in rice yield estimation was assessed. The results suggest that this UAV-borne imaging hyperspectral approach holds great potential for crop yield estimation, not only for rice but also for other crops.

1. Introduction

Rice is a primary cereal crop that serves as a primary food source for more than half of the world’s population, particularly in East Asia [1]. It is an essential component of the global food system and plays a crucial role in ensuring food security, nutrition, and income generation for millions of people. Rice yield estimation is crucial for optimizing agricultural management, informing land use planning, and facilitating food trade, all of which contribute to food security and sustainable development.
Remote sensing technology has long been an important approach for crop yield estimation [2]. Traditional remote sensing platforms for crop yield estimation can be divided into two major categories: satellite/airborne remote sensing and tower/ground remote sensing, corresponding to macroscopic monitoring and local observation, respectively [3]. However, macroscopic monitoring presents several challenges: (i) low spatial resolution [4], so rice yield estimation results are easily disturbed by background factors such as bare soil, shadows, and other non-plant targets; (ii) the revisit cycle is relatively long [5] and cannot match the growth cycle of rice in timing and frequency, so the timeliness of yield estimation cannot be guaranteed; (iii) heavy reliance on costly data types, with few hyperspectral satellite data sources available for civilian use [6]. Furthermore, although tower/ground-based platforms can provide sufficient temporal and spatial resolution, and their potential to assess plant health status has already been proven, the remote sensing data acquired from these fixed sites typically enable rice yield estimation only within an extremely limited area [7]. Few studies have addressed yield estimation at near-surface medium-to-macro scales because of data acquisition challenges. However, with the advent of advanced UAVs and miniaturized imaging technology, research on UAV-based medium-to-macro scale remote sensing yield estimation for rice has become feasible.
Hyperspectral cameras have emerged as a powerful tool for remote sensing applications in agriculture. Regarding acquisition modes, Wu [8] categorizes them into four main types: point scanning (whiskbroom), line scanning (push-broom), plane scanning, and single shot. Whiskbroom mode acquires bands pixel by pixel, resulting in a BIP cube, while plane scanning creates a BSQ cube from multiple images. Single-shot mode collects data simultaneously, producing a data cube in one integration [9]. Push-broom mode forms a BIL cube from pixel line sequences. Whiskbroom's slowness and plane scanning's incompatibility with platform movement make push-broom and single shot the prevalent solutions [10]. For example, with the help of a UAV-borne single-shot hyperspectrometer, Wang [11] improved rice yield model accuracy over two years. Yet single-shot technology is still at an early developmental stage, which restricts its resolution. Push-broom hyperspectral cameras, being compact and efficient, are well suited for UAV-based agricultural sensing [12]. For example, using remote sensing images acquired by a UAV-mounted push-broom hyperspectral scanner, Feng [13] accurately predicted alfalfa yields. Such UAV platforms, known for timely data collection and precision, can swiftly capture vital vegetation growth stages, highlighting their potential in crop yield estimation.
To leverage the benefits of hyperspectral imagery, researchers have explored numerous agricultural applications, including crop classification, disease detection, stress assessment, and more [14]. Notably, crop yield prediction has emerged as a particularly prominent area of study. For instance, Sun [15] employed full-band reflectivity for potato yield prediction using UAVs but faced challenges such as overfitting due to correlated hyperspectral features. To counter these issues, many studies turn to vegetation indices (VIs), which encapsulate plant health. For instance, Feng [16] leveraged various VIs and red-edge parameters for accurate winter wheat yield estimation using PLSR and ANN methods. Furthermore, sensitive bands, determined through techniques like PCA, can enhance model performance. As an example, Sellami [17] employed PCA in a study on sweet corn to discern crucial bands, which, when combined with regression algorithms, accurately predicted yield under different conditions. Hence, constructing spectral features based on rice canopy reflectance and analyzing their correlation with yield is crucial for enhancing the accuracy of remote sensing-based rice yield estimation.
Generally, yield estimation has relied on process-based crop simulation models that simulate crop growth by integrating known plant physiological characteristics with environmental factors, as seen in models like SIMED [18], ALSIM [19], and DSSAT-CROPGRO-Perennial Forage [20]. While these models, such as ALFALFA 1.4 and ALF2LP, have been successful, they face limitations due to their extensive need for data about crop variety, management practices, and soil conditions, which are often challenging to acquire. Furthermore, the calibration of these mechanistic models is complex due to the intricate nature of crop physiological processes. In contrast, machine learning approaches, which develop empirical relationships between variables and yields, offer advantages in forecasting without relying on specific crop parameters.
Machine learning is widely adopted in remote sensing for crop yield estimation and prediction [21], and machine learning models adeptly handle complex relationships in crop yield estimation, offering adaptability and improved accuracy [22]. In this context, diverse models, including SVR and RFR, have been utilized for yield prediction. For example, Fan [23] used UAV imagery and ridge regression for grain yield estimation, showcasing machine learning's promise in the domain. Further, Yang [24] leveraged a convolutional neural network (CNN) for rice yield estimation from UAV-captured imagery, outpacing traditional methods and underscoring the CNN's potential for broad and timely forecasts. Therefore, building a rice yield estimation model based on machine learning algorithms is of great significance for improving the accuracy of rice yield estimation.
The critical periods of rice growth include the tillering, panicle initiation, booting, and grain-filling stages. Among these, the grain-filling period is particularly important, as it is the phase during which the rice grains accumulate nutrients and determine the final yield. However, there is limited research on rice yield estimation during the grain-filling stage using UAV-based hyperspectral remote sensing data. Given this context, the objectives of this study are (i) investigating the potential of using push-broom hyperspectral images for rice yield estimation and (ii) comparing the performance of different machine learning models for rice yield estimation.

2. Materials and Methods

The graphical abstract of our study is illustrated in Figure 1.

2.1. Experimental Design and Field Data Collection

The study area is located at the China National Rice Research Institute (30°03′00″ N, 119°56′13″ E) in Fuyang County, Hangzhou, Zhejiang Province. An overview of the experimental site is shown in Figure 2. The site covers an area of 3200 m2 (64 m × 50 m), with each plot having an average size of 28 m2. All plots were planted with the same rice variety. It is important to note that the plots in Block 1 received excessive herbicide application during field management, while those in Block 2 were treated with standard herbicide doses. The purpose of applying different herbicide treatments was not to analyze their effects on yield but to increase the variation range of rice yield, thereby making the developed model more robust [11]. The rice was manually harvested in mid-October; after threshing, drying, and weighing, the resulting dry matter yield data were used in this study.

2.2. Aerial Image Acquisition

Aerial image acquisition was conducted on 24 September 2022 at 1 p.m. using UAV-based hyperspectral and RGB cameras. The DJI Matrice 600 Pro hexacopter (DJI Technology Co., Ltd., Shenzhen, China), equipped with RTK and GNSS and providing a horizontal accuracy of ±0.12 m, was used. The hexacopter carried a Specim FX10 hyperspectral push-broom scanner (Specim, Spectral Imaging Ltd., Oulu, Finland). The scanner, with a 32 mm focal length lens offering a 40° field of view, captured 480 pixels and 300 spectral bands at 50 Hz, spanning 385 to 1020 nm. The spectral resolution was 2.2 nm, and the full width at half maximum (FWHM) was 5.5 nm. Pre-flight mission planning used the DJI GS Pro app, with flight speed and height set at 1.2 m/s and 30 m, respectively. The flight missions were designed to keep an 80% lateral overlap, guaranteeing a minimum 33% overlap in the hyperspectral images, which yielded a ground sampling distance (GSD) of 0.021 m/pixel. In addition, a custom-developed PC application, ISUZU_GR 2.5.0 (ISUZU OPTICS Corp., Shanghai, China), was employed to set the scanner parameters and receive feedback. RGB data were acquired using a DJI Mavic Pro 2 quadcopter equipped with a Hasselblad L1D-20C 20 MP digital camera. The camera, with a 10 mm focal length lens and 77° FOV, recorded HD photos at a 12 m flight height and 1.2 m/s velocity. The hardware and software used in this study are shown in Figure 3.

2.3. Aerial Image Preprocessing

2.3.1. RGB Images Mosaic

The processing of orthomosaic maps from UAV-acquired RGB images was automated using Pix4DMapper 4.4.12 software (Pix4D SA, Lausanne, Switzerland). This comprehensive procedure involved aligning 534 raw images, reconstructing a 3D dense point cloud, and generating an orthomosaic map using specific parameters.

2.3.2. Geometric Correction and Ortho-Stitching of the Hyperspectral Data

The hyperspectral image processing was executed in several stages: (1) hyperspectral image swaths were created by clipping the image along each flight line and subsequently applying a geometric correction through the 'single line correction' method [25]; (2) a minimum of eight Ground Control Points (GCPs), chosen for easily identifiable features such as field corners and red line intersections, were used for georeferencing, with the RGB orthomosaic serving as the basemap; (3) the spatial alignment and overlay of the corrected swaths produced a hyperspectral orthomosaic; in overlapping regions, the pixel value was taken from the last mosaicked swath. These procedures were executed using ArcGIS Pro 3.0.1, offering a streamlined and efficient process.

2.3.3. Radiometric Calibration and Average Reflectance Extraction

The process was streamlined into several steps: (1) the Digital Number (DN) values of the hyperspectral orthomosaic were converted into reflectance through radiometric calibration, using an empirical line correction (ELC) method [26] and three radiometric calibration targets with varied reflectance levels (11%, 32%, 51%); (2) the orthomosaic was segmented into plots, and a 0.45 threshold on the 800 nm NIR band was used to isolate rice canopy pixels from the background, excluding lower-reflectance pixels; (3) the pixel spectra from the segmented rice canopy were averaged and denoised using a Savitzky-Golay filter, yielding a cleaner dataset for further analysis. These steps can be executed using Python 3.6 packages such as GDAL and SciPy. The denoised rice canopy spectra for each plot served as the basis for further analysis, and the results of the aerial image preprocessing are shown in Figure 4.
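As a concrete illustration, the sketch below mirrors this three-step workflow in Python. The function and variable names are ours, and the per-band linear fit is one common way to implement the ELC, not necessarily the exact code used in the study.

```python
# Minimal sketch of Section 2.3.3: empirical line correction, NIR-threshold
# canopy masking, and Savitzky-Golay smoothing of the mean canopy spectrum.
import numpy as np
from scipy.signal import savgol_filter

def empirical_line_correction(dn_cube, target_dn, target_reflectance):
    """Fit a per-band gain/offset from calibration targets (e.g., 11%, 32%, 51%).

    dn_cube: (H, W, n_bands) raw DN orthomosaic
    target_dn: (n_targets, n_bands) mean DN over each calibration panel
    target_reflectance: (n_targets,) known panel reflectance
    """
    refl = np.empty_like(dn_cube, dtype=float)
    for b in range(dn_cube.shape[-1]):
        gain, offset = np.polyfit(target_dn[:, b], target_reflectance, 1)
        refl[..., b] = gain * dn_cube[..., b] + offset
    return refl

def plot_mean_spectrum(refl_plot, nir_band_idx, threshold=0.45):
    """Mask background by NIR reflectance, then average and smooth canopy spectra."""
    mask = refl_plot[..., nir_band_idx] > threshold      # keep rice canopy pixels only
    mean_spectrum = refl_plot[mask].mean(axis=0)         # average over canopy pixels
    return savgol_filter(mean_spectrum, window_length=11, polyorder=3)
```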

2.4. Spectral Feature Extraction

2.4.1. Vegetation Indices Construction

Hyperspectral data, comprising hundreds of correlated bands rich in spectral information, pose a challenge due to their multicollinearity, which affects model interpretation and performance [17]. Yet these spectral characteristics of crops reflect physiological changes, and the derived VIs serve as reliable crop yield indicators. Considering these aspects, we selected 40 VIs as spectral features, which were extracted from the original average reflectance (Table A1 in Appendix A). The commonly used NDVI and its alternatives, such as mNDVI, GNDVI, EVI, NDCI, NDRE, MSR, MTVI, CARI, and MCARI, adjust certain wavelengths or introduce scale factors to enhance sensitivity to specific physiological traits. Other indices, such as SAVI, OSAVI, and MSAVI, incorporate adjustment factors to mitigate atmospheric and soil background effects [27]. Additionally, NI_Tian and NI_Wang target nitrogen estimation, CRI focuses on carotenoid content, and WBI is typically employed for water content estimation. A worked example of computing these indices from the extracted reflectance is sketched below.
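The snippet below computes a few of the Table A1 indices from a plot's mean reflectance spectrum; the nearest-band lookup reflects the table's note that each VI was adapted to the closest available SPECIM FX10 band. The names and the random stand-in spectrum are illustrative.

```python
# Illustrative VI computation from a plot's mean canopy reflectance.
import numpy as np

def nearest_band(wavelengths, nm):
    """Index of the band closest to the requested wavelength (nm)."""
    return int(np.argmin(np.abs(np.asarray(wavelengths) - nm)))

def compute_vis(spectrum, wavelengths):
    R = lambda nm: spectrum[nearest_band(wavelengths, nm)]
    return {
        "NDVI":  (R(780) - R(670)) / (R(780) + R(670)),
        "NDRE":  (R(790) - R(720)) / (R(790) + R(720)),
        "OSAVI": (R(860) - R(650)) / (R(860) + R(650) + 0.16),
        "PRI":   (R(531) - R(570)) / (R(531) + R(570)),
    }

# Example: 300 bands spanning 385-1020 nm, as with the Specim FX10.
wavelengths = np.linspace(385, 1020, 300)
spectrum = np.random.rand(300)          # stand-in for a plot's mean reflectance
print(compute_vis(spectrum, wavelengths))
```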

2.4.2. Principal Component Analysis

The hyperspectral reflectance data from the rice canopy were processed using Principal Component Analysis (PCA), a multivariate statistical method, to minimize multicollinearity and optimize spectral information usage. PCA reconfigures correlated variables into uncorrelated Principal Components (PCs), sequentially ordered by their explained variance. In this study, two PCs were selected, which together explained 95.5% of the total variance.
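A minimal sketch of this step with scikit-learn is shown below; standardizing the bands before PCA is our assumption, and the random matrix stands in for the per-plot canopy reflectance.

```python
# PCA on the per-plot canopy reflectance matrix, keeping two components.
import numpy as np
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

X = np.random.rand(70, 300)                    # stand-in: 70 plots x 300 bands
X_std = StandardScaler().fit_transform(X)      # zero mean, unit variance per band
pca = PCA(n_components=2).fit(X_std)
pcs = pca.transform(X_std)                     # (70, 2): PC1 and PC2 per plot
print(pca.explained_variance_ratio_.sum())     # ~0.955 on the real data per the text
```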
The correlation among the spectral features was examined and visualized using a correlation heatmap (Figure 5). The heatmap's color-coded matrix shows the strength and direction of pairwise correlation coefficients between variables. As depicted in Figure 5, a large portion of the spectral features (about 75%) exhibit a correlation coefficient below 0.5. Notably, MSAVI, MCARI, PRI, and PC2 show minimal correlations, implying their independence and limited influence from other factors. Therefore, compared to the direct use of hyperspectral full-band reflectance, our derived spectral features notably reduce inter-variable correlation, facilitating improved performance of machine learning regression models [15].

2.5. Feature Selection

Recursive Feature Elimination (RFE) is an efficient technique for feature selection, operating by progressively excluding the least important features from a dataset. This methodology aims to pinpoint an optimal set of impactful features that substantially influence a model’s performance. The RFE process typically involves iterative steps: Initially, a base estimator is trained using all available features, and the significance of each feature is determined—usually based on the estimator’s coefficients or feature importance. The least crucial feature is then identified and eliminated, and the base estimator is retrained using the remaining features. This process is repeated until the desired number of features is obtained. In our study, we utilized a linear regressor as the base estimator and performed 10 repetitions.
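The sketch below reproduces this procedure with scikit-learn's RFE and a linear base estimator. Aggregating rankings over bootstrap resamples is our reading of how the 10 repetitions could be combined; it is a plausible sketch, not the authors' exact implementation.

```python
# Repeated RFE ranking with a linear base estimator.
import numpy as np
from sklearn.feature_selection import RFE
from sklearn.linear_model import LinearRegression
from sklearn.utils import resample

def repeated_rfe_ranking(X, y, n_repeats=10, seed=0):
    """Average RFE rank per feature over bootstrap resamples (1 = most important)."""
    ranks = np.zeros(X.shape[1])
    for r in range(n_repeats):
        Xb, yb = resample(X, y, random_state=seed + r)   # bootstrap sample of plots
        rfe = RFE(LinearRegression(), n_features_to_select=1).fit(Xb, yb)
        ranks += rfe.ranking_                            # full ranking of all features
    return ranks / n_repeats                             # lower mean rank = more important
```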

2.6. Model Construction

To address the complexities within the dataset, including nonlinear relationships, six machine learning regression models, namely, Support Vector Regression (SVR), Random Forest Regressor (RFR), AdaBoost, XGBoost, Ridge, and Partial Least Squares Regressor (PLSR) were employed.
SVR is a variant of Support Vector Machines, adapted for regression tasks, capable of dealing with nonlinear relationships by transforming input data into a higher-dimensional space. The model attempts to balance between reducing prediction error and maintaining model simplicity.
RFR, an ensemble learning method, employs multiple decision trees, each built using random subsets of features and training data. The final prediction is the average of individual tree predictions, improving generalizability and robustness against noise and outliers [28].
AdaBoost combines multiple weak learners into a robust model. It iteratively trains learners while adjusting the training instance weights. Higher weights are assigned to instances with higher prediction errors, prompting subsequent learners to focus on them [29].
XGBoost is a gradient-boosting algorithm that constructs decision trees sequentially; each tree is added to minimize the loss function. It employs a second-order approximation of the loss function, reducing overfitting, and improving prediction accuracy [30].
Ridge regression handles multicollinearity by adding a regularization term to the loss function, penalizing large coefficients, and ensuring a more balanced distribution of the effect across correlated features.
PLSR reduces dimensionality by projecting both input and output onto a lower-dimensional space, maximizing their covariance. It constructs a linear model in this reduced space, proving advantageous when dealing with multicollinearity or numerous features [31].
In evaluating the computational complexity of the algorithms, we considered the time consumption T and memory usage M as proxies for computational demand. These metrics are critical in gauging the feasibility of algorithms in resource-constrained environments.
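The following sketch instantiates the six regressors and records T and M for a single fit. The hyperparameters are library defaults rather than the tuned settings used in the study, and tracemalloc only tracks Python-level allocations (not memory allocated inside C extensions), so treat the numbers as rough proxies.

```python
# Six regressors plus simple time/memory profiling of a fit.
import time, tracemalloc
from sklearn.svm import SVR
from sklearn.ensemble import RandomForestRegressor, AdaBoostRegressor
from sklearn.linear_model import Ridge
from sklearn.cross_decomposition import PLSRegression
from xgboost import XGBRegressor

MODELS = {
    "SVR": SVR(kernel="rbf"),
    "RFR": RandomForestRegressor(n_estimators=100, random_state=0),
    "AdaBoost": AdaBoostRegressor(random_state=0),
    "XGBoost": XGBRegressor(random_state=0),
    "Ridge": Ridge(alpha=1.0),
    "PLSR": PLSRegression(n_components=10),
}

def fit_with_profiling(model, X, y):
    """Return (fitted model, elapsed seconds T, peak Python memory M in MB)."""
    tracemalloc.start()
    t0 = time.perf_counter()
    model.fit(X, y)
    elapsed = time.perf_counter() - t0
    _, peak = tracemalloc.get_traced_memory()
    tracemalloc.stop()
    return model, elapsed, peak / 1e6
```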

2.7. Cross-Validation and Model Performance Evaluation

Five-fold cross-validation is a widely used statistical technique for evaluating model performance when data are limited. The dataset is divided into five equal subsets. For each model, the training process is repeated five times, using four subsets for training and the remaining subset for testing each time. This process ensures that each subset serves as the test set once.
The model performance was assessed using three widely accepted evaluation metrics: coefficient of determination (R2), root mean square error (RMSE), and mean absolute error (MAE). The formulae for the three evaluation metrics are as follows:
$$R^2 = 1 - \frac{\sum_{i=1}^{n} (y_i - \hat{y}_i)^2}{\sum_{i=1}^{n} (y_i - \bar{y})^2}$$

$$\mathrm{RMSE} = \sqrt{\frac{1}{n} \sum_{i=1}^{n} (y_i - \hat{y}_i)^2}$$

$$\mathrm{MAE} = \frac{1}{n} \sum_{i=1}^{n} \left| y_i - \hat{y}_i \right|$$
where $y_i$ and $\hat{y}_i$ are the observed and predicted yields for plot $i$, respectively, $\bar{y}$ is the mean of the observed yields, and $n$ is the number of plots. Higher R2 values and lower RMSE and MAE values indicate superior prediction performance for regression models.
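Put together, a sketch of the five-fold evaluation with these three metrics could look as follows; shuffling before splitting and averaging the per-fold scores are our assumptions.

```python
# Five-fold cross-validation reporting mean R2, RMSE, and MAE.
import numpy as np
from sklearn.model_selection import KFold
from sklearn.metrics import r2_score, mean_squared_error, mean_absolute_error

def cross_validate(model, X, y, n_splits=5, seed=0):
    r2s, rmses, maes = [], [], []
    for tr, te in KFold(n_splits, shuffle=True, random_state=seed).split(X):
        model.fit(X[tr], y[tr])
        pred = model.predict(X[te])
        r2s.append(r2_score(y[te], pred))
        rmses.append(np.sqrt(mean_squared_error(y[te], pred)))  # RMSE = sqrt(MSE)
        maes.append(mean_absolute_error(y[te], pred))
    return np.mean(r2s), np.mean(rmses), np.mean(maes)
```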

3. Results

3.1. Spatial Distribution of the Rice Yield

The spatial distribution of observed rice yield is shown in Figure 6. According to the initial experimental design, the field should have consisted of 84 rice plots. However, during the drying process, unexpected rainfall caused the ink on 14 of the plot labels to blur, rendering them illegible; since the data from these unmarked plots could not be accurately attributed or verified, they were excluded from the final analysis. Consequently, only 70 rice plots were included in the statistical evaluation. The spatial distribution of rice yield in the experimental field reveals a significant contrast between the northern and southern blocks, designated as Block 1 and Block 2, respectively. The highest yield per unit area was 0.350 kg/m2 for Plot 57, while the lowest was only 0.019 kg/m2 for Plot 19. The overall yield of Block 1 appears lower than that of Block 2, which could be attributed to the excessive application of herbicides during the rice growing period in Block 1; Block 2, treated with the proper herbicide dosage, produced a healthier crop and relatively normal yields. Therefore, despite the reduction in the number of rice plots to 70, it is still possible to visualize and analyze the spatial distribution of rice yield effectively. According to previous studies, a sample size of 70 rice plots is adequate for conducting a regression analysis, ensuring that the results obtained are both statistically significant and convincing [4,13,32].

3.2. Efficacy and Interpretation of Feature Ranking

We applied the RFE strategy to rank the 40 VIs and 2 PCs. The robustness of our feature ranking process is demonstrated by the relatively stable rankings of the VIs and PCs across 50 experiments. These rankings allowed us to discern which indices contributed significantly to the models’ predictive capabilities and which were less influential. For instance, the consistently high rankings of mNDVI and TVI signal their vital role in the models. Both indices, specifically developed to estimate vegetation density and health [33], have proven particularly effective in predicting rice yields. Moreover, MTVI1, an index designed to capture information related to chlorophyll content [34], was consistently placed among the top features, demonstrating its important contribution to the models’ performance. Conversely, indices like PRI, NWI2, and NI_Wang consistently ranked lower, indicating their lesser impact on the models’ predictive accuracy. In addition to these, the PCs’ lower rankings may be attributed to their inability to capture information as relevant for predicting rice yields as VIs. Given that PCA is an unsupervised method, it does not consider the relationship between the features and rice yield data. Consequently, despite capturing a considerable proportion of the variance in the original spectral data, the generated PCs might not necessarily be the best predictors for the target variable.

3.3. Feature Screening

To further investigate the impact of high-ranking features on model performance, we incrementally incorporated the top-ranked features (from a total of 40 VIs and 2 PCs, ranked as shown in Figure 7) into the six machine learning models and assessed the training performance as each feature was added. The training accuracies are illustrated in the line graph in Figure 8, which shows the relationship between the number of features and the R2 value for each model. As the number of features increases, the R2 values generally increase for all models, but the rate of increase diminishes as more features are added. RFR performs best, consistently achieving the highest R2 values across the varying numbers of features, whereas SVR performs worst, consistently showing the lowest R2 values. As a result, the top 15 features were selected as the final features for rice yield prediction. This decision is supported by the observation that, for all models, the increase in R2 starts to plateau around 15 features, indicating that adding more features beyond this point has a diminishing impact on training performance. Selecting the top 15 features thus balances model performance and complexity, making the model more interpretable and computationally efficient. A compact sketch of this screening loop is given below.
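This sketch assumes the training R2 is read off with the regressor's own score method; the function and argument names are ours.

```python
# Incremental feature screening: add RFE-ranked features one at a time and
# record each model's training R2, then look for the plateau (~15 here).
import numpy as np

def incremental_feature_curve(model, X, y, ranked_idx):
    """ranked_idx: feature indices sorted from most to least important."""
    scores = []
    for k in range(1, len(ranked_idx) + 1):
        Xk = X[:, ranked_idx[:k]]
        model.fit(Xk, y)
        scores.append(model.score(Xk, y))   # training R2 with the top-k features
    return np.array(scores)
```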

3.4. Model Performance Evaluation and Comparison

This section evaluates the performance of the six models using the selected features, the full feature set, and the full spectrum, as detailed in Sections 2.4 and 2.5. The test sample results are summarized in Table 1, incorporating the coefficient of determination (R2), root mean square error (RMSE in kg/m2), and mean absolute error (MAE in kg/m2) to determine model precision and accuracy in rice yield prediction. Superior performance is indicated by higher R2 and lower RMSE and MAE values. Using selected features, PLSR and RFR displayed the highest R2 values, at 0.827 and 0.826, respectively, coupled with relatively low RMSE and MAE scores, indicating strong predictive efficacy. Ridge, while boasting the lowest RMSE (0.034), had a slightly lower R2 of 0.808. With full features, XGBoost led with an R2 value of 0.827 and low RMSE and MAE, followed closely by Ridge and PLSR, with R2 values of 0.823 and 0.820, respectively. Using the full spectrum, the models showed a noticeable decrease in performance. For instance, SVR's R2 value dropped to 0.533, while RFR achieved an R2 of 0.649. However, XGBoost displayed a relatively higher R2 value of 0.656 among the models in this category, followed closely by PLSR with an R2 of 0.643.
The computational requirements for the six machine learning models varied, with XGBoost and RFR exhibiting higher memory usage across both selected and full features, reflective of their intensive tree-building processes. In contrast, Ridge and SVR were more memory efficient. Despite the computational load, XGBoost managed to maintain high accuracy, leading in R2 values, particularly with full features. The time efficiency was relatively consistent among the models, suggesting that the choice of feature set did not significantly affect the processing speed. These results highlight a trade-off between memory usage and predictive performance that must be considered in practical applications.
Overall, PLSR and XGBoost consistently demonstrated robust predictive capabilities across all feature sets in estimating rice yield, though there was a clear variance in performance when considering the full spectrum.
Our study, shown in Figure 9, reveals PLSR as the most accurate model with R2 of 0.827 and RMSE of 0.036, indicating a strong correlation between observed and predicted yields. Other models show moderate-to-high performances, while SVR is less favorable.
In addition, we compared our UAV hyperspectral image-based method with approaches employing synthetic aperture radar (SAR), such as Sentinel-1A [35]. Our UAV-borne push-broom hyperspectral camera captures high-resolution spectral signatures with superior spatial resolution, which are essential for detailed rice yield estimation. In contrast, Sentinel-1A SAR's strength lies in its all-weather capability, though it cannot match the spectral granularity of hyperspectral imaging. Performance-wise, our model achieves an R2 of 0.827, outperforming the SAR-based method's 0.67. This difference underscores our method's predictive capability. Although satellites cover larger areas, the precision and adaptability of UAVs make them well suited to regional-scale yield estimation. In essence, while SAR offers broad, all-weather insights, our UAV hyperspectral approach combines precision with spectral depth, making it a preferable choice for regional yield assessments.

3.5. Spatial Distribution of Absolute Residuals

To evaluate the distribution of prediction errors in rice grain yield estimation, Figure 10 presents a visualization of the absolute residuals across the entire experimental area for each of the six models. The absolute error magnitude is depicted with a color bar ranging from 0.002 kg/m2 to 0.102 kg/m2. Lower values in lighter shades represent areas where the predicted yields are closer to the observed yields, indicating higher prediction accuracy. Conversely, higher values in darker shades represent regions where the predicted yields deviate more significantly from the observed yields, indicating lower prediction accuracy. The residuals do not display any distinct spatial pattern of over-prediction or under-prediction.
The adaptability of UAV-based hyperspectral methods to diverse agricultural settings, especially in terms of spatial resolution, is a key challenge. Traditional remote sensing often fails to depict small farm plots accurately, with fine details lost in satellite imagery. UAVs, however, with their superior resolution, as illustrated in Figure 10, offer precise, detailed crop condition insights essential for accurate yield estimations. This ability to provide high-resolution data in various landscapes is crucial for the effectiveness of our rice yield estimation methodology and its broader application in precision agriculture.

4. Discussion

4.1. Optimizing Canopy Mask Extraction

In this research, we employed the NIR reflectance threshold method to formulate the rice canopy mask. Although there is a multitude of methodologies to extract crop canopy masks, including vegetation index thresholds, spectral angle mapping, and machine learning or neural networks, we found these alternatives suboptimal for our case: VIs tend to become saturated, spectral angle mapping can introduce noise that compromises the canopy mask quality, and machine learning requires a large set of labeled samples while producing results similar to the NIR reflectance threshold method. We adopted a threshold of 0.45, following previous studies [36], but also considered an automated selection method using reflectance pixel value histograms [37]. This process identifies peaks and troughs in the histogram, and the threshold is selected midway between the peak vegetation signal corresponding to the rice canopy and the neighboring trough. Interestingly, the outcomes of the empirically determined and automated methods were nearly identical, leading us to choose the simpler approach.
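For reference, one way to implement the automated histogram-based alternative is sketched below; the bin count, prominence cutoff, and fallback value are illustrative choices, not the exact settings used in the study.

```python
# Automated NIR threshold: midpoint between the canopy peak and the trough below it.
import numpy as np
from scipy.signal import find_peaks

def histogram_threshold(nir_reflectance, bins=100):
    counts, edges = np.histogram(nir_reflectance.ravel(), bins=bins, range=(0.0, 1.0))
    centers = (edges[:-1] + edges[1:]) / 2
    peaks, _ = find_peaks(counts, prominence=counts.max() * 0.05)
    if peaks.size == 0:
        return 0.45                                   # fall back to the empirical threshold
    canopy_peak = centers[peaks[-1]]                  # highest-reflectance peak = canopy
    troughs, _ = find_peaks(-counts)                  # troughs are peaks of the negated counts
    below = centers[troughs][centers[troughs] < canopy_peak]
    return (below[-1] + canopy_peak) / 2 if below.size else 0.45
```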
While the NIR reflectance threshold method performs well for high-density crops such as rice, more research is required to enhance the fine-grained canopy mask for low-density crops with wider inter-row spacing. Future studies may also benefit from multi-view analysis techniques to reduce false positives, ensuring that yield estimations accurately reflect true crop performance rather than environmental artifacts [38,39].

4.2. Addressing Scalability Challenges of UAV-Based Hyperspectral Imagery in Rice Yield Estimation

The scalability of UAV-based hyperspectral imagery for rice yield estimation is paramount for widespread adoption. Our proposed method, integrating UAV hyperspectral data with machine learning, has shown potential within a controlled environment. Yet, broader application faces two main challenges: the high costs of hyperspectral technology and the adaptability of the method to varied agricultural landscapes.
The first scalability challenge arises from the significant costs associated with the complex hyperspectral equipment necessary for capturing detailed spectral data, which also requires precise calibration techniques. Such sophistication leads to higher financial outlays compared to more conventional imaging methods. Additionally, the preprocessing of hyperspectral images demands extensive time and expertise, particularly when contrasted with simpler multispectral or RGB data processing, for which more mature software solutions exist (such as Pix4D Mapper 4.4.12 and ENVI 5.6).
The second challenge involves the method’s adaptability to different study areas. UAV-based approaches offer superior spatial resolution over traditional remote sensing methods, which is critical for accurately representing small-scale farm plots, as shown in Figure 10. For example, a typical plot size of 5 m × 3 m could be reduced to a single pixel in satellite imagery, potentially compromising spectral data accuracy due to mixed pixels and canopy separation issues. This is where the high-resolution capabilities of UAVs become invaluable, offering detailed insights into crop conditions that are essential for precise yield estimation.

4.3. Future Work

In our future work, we aim to harness the advancements in sensor technology to mitigate the financial burden of hyperspectral imaging. The anticipated miniaturization and mass production of hyperspectral sensors will pave the way for cost-effective data acquisition [40]. In parallel, we propose developing an automated algorithm tailored for hyperspectral data preprocessing to streamline the workflow and minimize manual intervention. This will complement our multiscale approach that combines high-resolution UAV data with the extensive coverage of satellite imagery, ensuring a balance between detail and cost-efficiency [15]. Moreover, by employing transfer learning, we plan to refine our models with data from subsequent seasons, improving their generalizability and mitigating overfitting [41]. Training with diverse datasets from different regions will also enhance the robustness of our models, extending the scalability of our method for global application and capturing inter-annual variability in yield estimation.

5. Conclusions

This paper describes the application of a UAV-borne push-broom hyperspectral camera for rice yield estimation during the filling stage. First, the preprocessing of remote sensing data, including RGB image mosaicking, hyperspectral image mosaicking, radiometric calibration, and rice canopy segmentation, was performed to obtain the spectral reflectance information of the rice canopy. Next, spectral feature extraction was conducted, as the hyperspectral data contain a wealth of spectral information with high multicollinearity. Then, recursive feature elimination (RFE) was employed for feature screening to identify the optimal feature set contributing to model performance. Finally, six machine learning regression models (SVR, RFR, AdaBoost, XGBoost, Ridge, and PLSR) were selected for rice yield estimation. The performance analysis revealed that these models achieved good results on both selected and full features. Specifically, PLSR achieved the best R2 of 0.827 with selected features, and XGBoost achieved the best R2 of 0.827 with full features. Based on the model performance with selected features, the spatial distribution of the absolute error of rice yield was mapped. This study highlights the feasibility of using UAV hyperspectral remote sensing for rice field yield estimation and suggests that this approach could potentially benefit researchers working with other crops as well.
Furthermore, the applicability of our findings extends beyond academic research, offering practical benefits to agronomists, crop scientists, and agricultural technology developers. These professionals, particularly those in precision agriculture, can leverage our methods to obtain precise data during critical growth stages like grain filling. This enhanced decision-making tool has the potential to optimize crop yields and improve agricultural practices. Our approach, therefore, not only contributes to the scientific understanding of crop yield estimation but also empowers those in the agricultural sector with actionable insights for effective crop management.

Author Contributions

Methodology, Y.S.; Software, Y.S.; Investigation, Z.Y. and J.S.; Resources, Y.Y. and W.T.; Writing—original draft, Y.S.; Writing—review & editing, Y.Z.; Supervision, Y.Z.; Funding acquisition, Y.Z. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by the National Natural Science Foundation of China (NSFC), grant number 61905219; the Fundamental Research Funds of Zhejiang Sci-Tech University, grant number 22222318-Y; and the Zhejiang Science and Technology Cooperation Plan, grant number 2024SNJF073.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

The data presented in this study are available on request from the corresponding author.

Conflicts of Interest

The authors declare no conflict of interest.

Abbreviations

The following abbreviations are used in this manuscript:
VI: Vegetation Index
PCA: Principal Component Analysis
DN: Digital Number
SVR: Support Vector Regression
RFR: Random Forest Regression
PLSR: Partial Least Squares Regression
RMSE: Root Mean Square Error
MAE: Mean Absolute Error
ELC: Empirical Line Correction

Appendix A

Table A1. List of Vegetation Indices with Formulas. This table enumerates the 40 VIs employed in our study, outlining their full names, acronyms, and mathematical formulations, which are integral to our hyperspectral data analysis for rice yield estimation.
| Acronym | Full Form | Formula | Reference |
|---|---|---|---|
| NDVI | Normalized Difference Vegetation Index | (R780 - R670)/(R780 + R670) | [42] |
| DVI | Difference Vegetation Index | R780 - R670 | [43] |
| TVI | Triangular Vegetation Index | 0.5[120(R750 - R550) - 200(R670 - R550)] | [33] |
| RDVI | Renormalized Difference Vegetation Index | (R800 - R670)/√(R800 + R670) | [44] |
| SPVI | Spectral Polygon Vegetation Index | 0.4[3.7(R800 - R670) - 1.2\|R530 - R670\|] | [45] |
| MSAVI | Modified Soil Adjusted Vegetation Index | 0.5[2R800 + 1 - √((2R800 + 1)² - 8(R800 - R670))] | [46] |
| RE | Red Edge Ratio Index | R750/R710 | [47] |
| mNDVI | Modified Normalized Difference Vegetation Index | (R775 - R670)/(R775 + R670) | [43] |
| GNDVI | Green Normalized Difference Vegetation Index | (R860 - R550)/(R860 + R550) | [44] |
| NDRE | Normalized Difference Red-Edge | (R790 - R720)/(R790 + R720) | [48] |
| EVI | Enhanced Vegetation Index | 2.5(R860 - R650)/(R860 + 6R650 - 7.5R470 + 1) | [49] |
| LCI | Leaf Chlorophyll Index | (R850 - R710)/(R850 + R680) | [44] |
| SAVI | Soil Adjusted Vegetation Index | [(R860 - R650)/(R860 + R650 + 0.25)](1 + 0.25) | [44] |
| OSAVI | Optimized Soil Adjusted Vegetation Index | (R860 - R650)/(R860 + R650 + 0.16) | [50] |
| CARI | Chlorophyll Absorption Reflectance Index | (R700 - R670) - 0.2(R700 - R550) | [51] |
| MCARI | Modified Chlorophyll Absorption Reflectance Index | [(R700 - R670) - 0.2(R700 - R550)](R700/R670) | [52] |
| TCARI | Transformed Chlorophyll Absorption Reflectance Index | 3[(R700 - R670) - 0.2(R700 - R550)(R700/R670)] | [53] |
| OVI | Optimal Vegetation Index | (1 + 0.45)(R800² + 1)/(R670 + 0.45) | [51] |
| TCARI/OSAVI | Integrated TCARI and OSAVI | TCARI/OSAVI | [53] |
| MCARI/OSAVI | Integrated MCARI and OSAVI | MCARI/OSAVI | [52] |
| NDCI | Normalized Difference Chlorophyll Index | (R708 - R665)/(R708 + R665) | [47] |
| CIgreen | Chlorophyll Index Green | R730/R530 - 1 | [54] |
| CIred-edge | Chlorophyll Index Red-Edge | R850/R730 - 1 | [54] |
| CIrededge710 | Chlorophyll Index Red-Edge 710 | R750/R710 - 1 | [54] |
| REIP | Red-Edge Inflection Point | 700 + 40[((R670 + R780)/2 - R700)/(R740 - R700)] | [52] |
| RVSI | Red-Edge Vegetation Stress Index | (R714 + R752)/2 - R733 | [52] |
| MTVI1 | Modified Triangular Vegetation Index 1 | 1.2[1.2(R800 - R550) - 2.5(R670 - R550)] | [55] |
| MTCI2 | MERIS Terrestrial Chlorophyll Index 2 | (R754 - R709)/(R709 - R681) | [55] |
| NI_Tian | - | R705/(R717 + R491) | [56] |
| NI_Wang | - | (R924 - R703 + 2R423)/(R924 + R703 - 2R423) | [57] |
| DPI | Double Peak Index | (R688 + R710)/R697² | [58] |
| PRI | Photochemical Reflectance Index | (R531 - R570)/(R531 + R570) | [58] |
| PSRI | Plant Senescence Reflectance Index | (R680 - R500)/R750 | [58] |
| SIPI | Structure Insensitive Pigment Index | (R800 - R445)/(R800 - R680) | [59] |
| CRI | Carotenoid Reflectance Index | 1/R510 - 1/R550 | [59] |
| DD | Double Difference Index | (R749 - R720) - (R701 - R672) | [60] |
| WBI | Water Band Index | R970/R900 | [60] |
| NWI1 | Normalized Water Index 1 | (R970 - R900)/(R970 + R900) | [61] |
| NWI2 | Normalized Water Index 2 | (R970 - R850)/(R970 + R850) | [62] |
| WBI/NDVI | Integrated WBI and NDVI | WBI/NDVI | [62] |

All the VIs were adapted to the SPECIM FX10 band set using the closest bands available. Rλ denotes the reflectance at wavelength λ (nm).

References

  1. Ray, D.K.; Mueller, N.D.; West, P.C.; Foley, J.A. Yield Trends Are Insufficient to Double Global Crop Production by 2050. PLoS ONE 2013, 8, e66428. [Google Scholar] [CrossRef] [PubMed]
  2. Yang, G.; Liu, J.; Zhao, C.; Li, Z.; Huang, Y.; Yu, H.; Xu, B.; Yang, X.; Zhu, D.; Zhang, X.; et al. Unmanned Aerial Vehicle Remote Sensing for Field-Based Crop Phenotyping: Current Status and Perspectives. Front. Plant Sci. 2017, 8, 1111. [Google Scholar] [CrossRef] [PubMed]
  3. Mangalraj, P.; Cho, B.-K. Recent Trends and Advances in Hyperspectral Imaging Techniques to Estimate Solar Induced Fluorescence for Plant Phenotyping. Ecol. Indic. 2022, 137, 108721. [Google Scholar] [CrossRef]
  4. Kanning, M.; Kühling, I.; Trautz, D.; Jarmer, T. High-Resolution UAV-Based Hyperspectral Imagery for LAI and Chlorophyll Estimations from Wheat for Yield Prediction. Remote Sens. 2018, 10, 2000. [Google Scholar] [CrossRef]
  5. Chapman, S.; Merz, T.; Chan, A.; Jackway, P.; Hrabar, S.; Dreccer, M.; Holland, E.; Zheng, B.; Ling, T.; Jimenez-Berni, J. Pheno-Copter: A Low-Altitude, Autonomous Remote-Sensing Robotic Helicopter for High-Throughput Field-Based Phenotyping. Agronomy 2014, 4, 279–301. [Google Scholar] [CrossRef]
  6. Colomina, I.; Molina, P. Unmanned Aerial Systems for Photogrammetry and Remote Sensing: A Review. ISPRS J. Photogramm. Remote Sens. 2014, 92, 79–97. [Google Scholar] [CrossRef]
  7. Sagan, V.; Maimaitijiang, M.; Sidike, P.; Eblimit, K.; Peterson, K.; Hartling, S.; Esposito, F.; Khanal, K.; Newcomb, M.; Pauli, D.; et al. UAV-Based High Resolution Thermal Imaging for Vegetation Monitoring, and Plant Phenotyping Using ICI 8640 P, FLIR Vue Pro R 640, and thermoMap Cameras. Remote Sens. 2019, 11, 330. [Google Scholar] [CrossRef]
  8. Wu, D.; Sun, D.-W. Advanced Applications of Hyperspectral Imaging Technology for Food Quality and Safety Analysis and Assessment: A Review—Part I: Fundamentals. Innov. Food Sci. Emerg. Technol. 2013, 19, 1–14. [Google Scholar] [CrossRef]
  9. Lucieer, A.; Malenovský, Z.; Veness, T.; Wallace, L. HyperUAS-Imaging Spectroscopy from a Multirotor Unmanned Aircraft System: HyperUAS-Imaging Spectroscopy from a Multirotor Unmanned. J. Field Robot. 2014, 31, 571–590. [Google Scholar] [CrossRef]
  10. Adão, T.; Hruška, J.; Pádua, L.; Bessa, J.; Peres, E.; Morais, R.; Sousa, J. Hyperspectral Imaging: A Review on UAV-Based Sensors, Data Processing and Applications for Agriculture and Forestry. Remote Sens. 2017, 9, 1110. [Google Scholar] [CrossRef]
  11. Wang, F.; Yao, X.; Xie, L.; Zheng, J.; Xu, T. Rice Yield Estimation Based on Vegetation Index and Florescence Spectral Information from UAV Hyperspectral Remote Sensing. Remote Sens. 2021, 13, 3390. [Google Scholar] [CrossRef]
  12. Ghamisi, P.; Yokoya, N.; Li, J.; Liao, W.; Liu, S.; Plaza, J.; Rasti, B.; Plaza, A. Advances in Hyperspectral Image and Signal Processing: A Comprehensive Overview of the State of the Art. IEEE Geosci. Remote Sens. Mag. 2017, 5, 37–78. [Google Scholar] [CrossRef]
  13. Feng, L.; Zhang, Z.; Ma, Y.; Du, Q.; Williams, P.; Drewry, J.; Luck, B. Alfalfa Yield Prediction Using UAV-Based Hyperspectral Imagery and Ensemble Learning. Remote Sens. 2020, 12, 2028. [Google Scholar] [CrossRef]
  14. Ray, D.K.; Ramankutty, N.; Mueller, N.D.; West, P.C.; Foley, J.A. Recent Patterns of Crop Yield Growth and Stagnation. Nat. Commun. 2012, 3, 1293. [Google Scholar] [CrossRef] [PubMed]
  15. Zhang, Y.; Yang, W.; Sun, Y.; Chang, C.; Yu, J.; Zhang, W. Fusion of Multispectral Aerial Imagery and Vegetation Indices for Machine Learning-Based Ground Classification. Remote Sens. 2021, 13, 1411. [Google Scholar] [CrossRef]
  16. Feng, H.; Tao, H.; Fan, Y.; Liu, Y.; Li, Z.; Yang, G.; Zhao, C. Comparison of Winter Wheat Yield Estimation Based on Near-Surface Hyperspectral and UAV Hyperspectral Remote Sensing Data. Remote Sens. 2022, 14, 4158. [Google Scholar] [CrossRef]
  17. Sellami, M.H.; Albrizio, R.; Čolović, M.; Hamze, M.; Cantore, V.; Todorovic, M.; Piscitelli, L.; Stellacci, A.M. Selection of Hyperspectral Vegetation Indices for Monitoring Yield and Physiological Response in Sweet Maize under Different Water and Nitrogen Availability. Agronomy 2022, 12, 489. [Google Scholar] [CrossRef]
  18. Schreiber, M.M.; Miles, G.E.; Holt, D.A.; Bula, R.J. Sensitivity Analysis of SIMED1. Agron. J. 1978, 70, 105–108. [Google Scholar] [CrossRef]
  19. ALSIM 1 (Level 2) User's Manual.
  20. Malik, W.; Boote, K.J.; Hoogenboom, G.; Cavero, J.; Dechmi, F. Adapting the CROPGRO Model to Simulate Alfalfa Growth and Yield. Agron. J. 2018, 110, 1777–1790. [Google Scholar] [CrossRef]
  21. Liakos, K.; Busato, P.; Moshou, D.; Pearson, S.; Bochtis, D. Machine Learning in Agriculture: A Review. Sensors 2018, 18, 2674. [Google Scholar] [CrossRef]
  22. Zhang, C.; Kovacs, J.M. The Application of Small Unmanned Aerial Systems for Precision Agriculture: A Review. Precis. Agric. 2012, 13, 693–712. [Google Scholar] [CrossRef]
  23. Fan, J.; Zhou, J.; Wang, B.; De Leon, N.; Kaeppler, S.M.; Lima, D.C.; Zhang, Z. Estimation of Maize Yield and Flowering Time Using Multi-Temporal UAV-Based Hyperspectral Data. Remote Sens. 2022, 14, 3052. [Google Scholar] [CrossRef]
  24. Yang, Q.; Shi, L.; Han, J.; Zha, Y.; Zhu, P. Deep Convolutional Neural Networks for Rice Grain Yield Estimation at the Ripening Stage Using UAV-Based Remotely Sensed Images. Field Crops Res. 2019, 235, 142–153. [Google Scholar] [CrossRef]
  25. Jensen, R.R. Single Line Correction Method to Remove Aircraft Roll Errors in Hyperspectral Imagery. J. Appl. Remote Sens. 2008, 2, 023529. [Google Scholar] [CrossRef]
  26. Wang, C.; Myint, S.W. A Simplified Empirical Line Method of Radiometric Calibration for Small Unmanned Aircraft Systems-Based Remote Sensing. IEEE J. Sel. Top. Appl. Earth Obs. Remote Sens. 2015, 8, 1876–1885. [Google Scholar] [CrossRef]
  27. Naito, H.; Ogawa, S.; Valencia, M.O.; Mohri, H.; Urano, Y.; Hosoi, F.; Shimizu, Y.; Chavez, A.L.; Ishitani, M.; Selvaraj, M.G.; et al. Estimating Rice Yield Related Traits and Quantitative Trait Loci Analysis under Different Nitrogen Treatments Using a Simple Tower-Based Field Phenotyping System with Modified Single-Lens Reflex Cameras. ISPRS J. Photogramm. Remote Sens. 2017, 125, 50–62. [Google Scholar] [CrossRef]
  28. Breiman, L. Random Forests. Mach. Learn. 2001, 45, 5–32. [Google Scholar] [CrossRef]
  29. Goyal, P.; Rani, R.; Singh, K. Comparative Analysis of Machine Learning and Ensemble Learning Classifiers for Alzheimer’s Disease Detection. In Proceedings of the 2022 IEEE International Conference on Current Development in Engineering and Technology, CCET, Bhopal, India, 23 December 2022; pp. 1–6. [Google Scholar]
  30. Chen, T.; Guestrin, C. XGBoost: A Scalable Tree Boosting System. In Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, San Francisco, CA, USA, 13 August 2016; ACM: New York, NY, USA, 2016; pp. 785–794. [Google Scholar]
  31. Wold, S.; Sjöström, M.; Eriksson, L. PLS-Regression: A Basic Tool of Chemometrics. Chemom. Intell. Lab. Syst. 2001, 58, 109–130. [Google Scholar] [CrossRef]
  32. Sun, C.; Feng, L.; Zhang, Z.; Ma, Y.; Crosby, T.; Naber, M.; Wang, Y. Prediction of End-Of-Season Tuber Yield and Tuber Set in Potatoes Using In-Season UAV-Based Hyperspectral Imagery and Machine Learning. Sensors 2020, 20, 5293. [Google Scholar] [CrossRef]
  33. Broge, N.H.; Leblanc, E. Comparing Prediction Power and Stability of Broadband and Hyperspectral Vegetation Indices for Estimation of Green Leaf Area Index and Canopy Chlorophyll Density. Remote Sens. Environ. 2001, 76, 156–172. [Google Scholar] [CrossRef]
  34. Li, F.; Miao, Y.; Feng, G.; Yuan, F.; Yue, S.; Gao, X.; Liu, Y.; Liu, B.; Ustin, S.L.; Chen, X. Improving Estimation of Summer Maize Nitrogen Status with Red Edge-Based Spectral Vegetation Indices. Field Crops Res. 2014, 157, 111–123. [Google Scholar] [CrossRef]
  35. Wang, J.; Dai, Q.; Shang, J.; Jin, X.; Sun, Q.; Zhou, G.; Dai, Q. Field-Scale Rice Yield Estimation Using Sentinel-1A Synthetic Aperture Radar (SAR) Data in Coastal Saline Region of Jiangsu Province, China. Remote Sens. 2019, 11, 2274. [Google Scholar] [CrossRef]
  36. Guo, A.; Huang, W.; Dong, Y.; Ye, H.; Ma, H.; Liu, B.; Wu, W.; Ren, Y.; Ruan, C.; Geng, Y. Wheat Yellow Rust Detection Using UAV-Based Hyperspectral Technology. Remote Sens. 2021, 13, 123. [Google Scholar] [CrossRef]
  37. Otsu, K.; Pla, M.; Duane, A.; Cardil, A.; Brotons, L. Estimating the Threshold of Detection on Tree Crown Defoliation Using Vegetation Indices from UAS Multispectral Imagery. Drones 2019, 3, 80. [Google Scholar] [CrossRef]
  38. Wiseman, Y. Real-Time Monitoring of Traffic Congestions. In Proceedings of the 2017 IEEE International Conference on Electro Information Technology (EIT), Lincoln, NE, USA, 14–17 May 2017; IEEE: Piscataway, NJ, USA, 2017; pp. 501–505. [Google Scholar]
  39. Setio, A.A.A.; Ciompi, F.; Litjens, G.; Gerke, P.; Jacobs, C.; Van Riel, S.J.; Wille, M.M.W.; Naqibullah, M.; Sanchez, C.I.; Van Ginneken, B. Pulmonary Nodule Detection in CT Images: False Positive Reduction Using Multi-View Convolutional Networks. IEEE Trans. Med. Imaging 2016, 35, 1160–1169. [Google Scholar] [CrossRef] [PubMed]
  40. Angel, Y.; Turner, D.; Parkes, S.; Malbeteau, Y.; Lucieer, A.; McCabe, M.F. Automated Georectification and Mosaicking of UAV-Based Hyperspectral Imagery from Push-Broom Sensors. Remote Sens. 2019, 12, 34. [Google Scholar] [CrossRef]
  41. Nevavuori, P.; Narra, N.; Linna, P.; Lipping, T. Crop Yield Prediction Using Multitemporal UAV Data and Spatio-Temporal Deep Learning Models. Remote Sens. 2020, 12, 4000. [Google Scholar] [CrossRef]
  42. Rouse, W.; Haas, R.H. Monitoring vegetation systems in the great plains with erts. NASA Spec. Publ. 1974, 351, 309. [Google Scholar]
  43. Wang, Q.; Adiku, S.; Tenhunen, J.; Granier, A. On the Relationship of NDVI with Leaf Area Index in a Deciduous Forest Site. Remote Sens. Environ. 2005, 94, 244–255. [Google Scholar] [CrossRef]
  44. Gitelson, A.A. Wide Dynamic Range Vegetation Index for Remote Quantification of Biophysical Characteristics of Vegetation. J. Plant Physiol. 2004, 161, 165–173. [Google Scholar] [CrossRef]
  45. Thenkabail, P.S.; Smith, R.B.; De Pauw, E. Hyperspectral Vegetation Indices and Their Relationships with Agricultural Crop Characteristics. Remote Sens. Environ. 2000, 71, 158–182. [Google Scholar] [CrossRef]
  46. Qi, J.; Chehbouni, A.; Huete, A.R.; Kerr, Y.H.; Sorooshian, S. A Modified Soil Adjusted Vegetation Index. Remote Sens. Environ. 1994, 48, 119–126. [Google Scholar] [CrossRef]
  47. Gitelson, A.; Merzlyak, M.N. Spectral Reflectance Changes Associated with Autumn Senescence of Aesculus hippocastanum L. and Acer platanoides L. Leaves. Spectral Features and Relation to Chlorophyll Estimation. J. Plant Physiol. 1994, 143, 286–292. [Google Scholar] [CrossRef]
  48. Gitelson, A.A.; Merzlyak, M.N. Remote Estimation of Chlorophyll Content in Higher Plant Leaves. Int. J. Remote Sens. 1997, 18, 2691–2697. [Google Scholar] [CrossRef]
  49. Huete, A.; Didan, K.; Miura, T.; Rodriguez, E.P.; Gao, X.; Ferreira, L.G. Overview of the Radiometric and Biophysical Performance of the MODIS Vegetation Indices. Remote Sens. Environ. 2002, 83, 195–213. [Google Scholar] [CrossRef]
  50. Rondeaux, G.; Steven, M.; Baret, F. Optimization of Soil-Adjusted Vegetation Indices. Remote Sens. Environ. 1996, 55, 95–107. [Google Scholar] [CrossRef]
  51. Daughtry, C. Estimating Corn Leaf Chlorophyll Concentration from Leaf and Canopy Reflectance. Remote Sens. Environ. 2000, 74, 229–239. [Google Scholar] [CrossRef]
  52. Haboudane, D. Hyperspectral Vegetation Indices and Novel Algorithms for Predicting Green LAI of Crop Canopies: Modeling and Validation in the Context of Precision Agriculture. Remote Sens. Environ. 2004, 90, 337–352. [Google Scholar] [CrossRef]
  53. Haboudane, D.; Miller, J.R.; Tremblay, N.; Zarco-Tejada, P.J.; Dextraze, L. Integrated Narrow-Band Vegetation Indices for Prediction of Crop Chlorophyll Content for Application to Precision Agriculture. Remote Sens. Environ. 2002, 81, 416–426. [Google Scholar] [CrossRef]
  54. Richardson, A.D.; Duigan, S.P.; Berlyn, G.P. An Evaluation of Noninvasive Methods to Estimate Foliar Chlorophyll Content. New Phytol. 2002, 153, 185–194. [Google Scholar] [CrossRef]
  55. Dash, J.; Curran, P.J. The MERIS Terrestrial Chlorophyll Index. Int. J. Remote Sens. 2004, 25, 5403–5413. [Google Scholar] [CrossRef]
  56. Tian, Y.C.; Yao, X.; Yang, J.; Cao, W.X.; Hannaway, D.B.; Zhu, Y. Assessing Newly Developed and Published Vegetation Indices for Estimating Rice Leaf Nitrogen Concentration with Ground- and Space-Based Hyperspectral Reflectance. Field Crops Res. 2011, 120, 299–310. [Google Scholar] [CrossRef]
  57. Wang, W.; Yao, X.; Yao, X.; Tian, Y.; Liu, X.; Ni, J.; Cao, W.; Zhu, Y. Estimating Leaf Nitrogen Concentration with Three-Band Vegetation Indices in Rice and Wheat. Field Crops Res. 2012, 129, 90–98. [Google Scholar] [CrossRef]
  58. Main, R.; Cho, M.A.; Mathieu, R.; O’Kennedy, M.M.; Ramoelo, A.; Koch, S. An Investigation into Robust Spectral Indices for Leaf Chlorophyll Estimation. ISPRS J. Photogramm. Remote Sens. 2011, 66, 751–761. [Google Scholar] [CrossRef]
  59. Sims, D.A.; Gamon, J.A. Relationships between Leaf Pigment Content and Spectral Reflectance across a Wide Range of Species, Leaf Structures and Developmental Stages. Remote Sens. Environ. 2002, 81, 337–354. [Google Scholar] [CrossRef]
  60. Penuelas, J.; Pinol, J.; Ogaya, R.; Filella, I. Estimation of Plant Water Concentration by the Reflectance Water Index WI (R900/R970). Int. J. Remote Sens. 1997, 18, 2869–2875. [Google Scholar] [CrossRef]
  61. Gao, B. NDWI—A Normalized Difference Water Index for Remote Sensing of Vegetation Liquid Water from Space. Remote Sens. Environ. 1996, 58, 257–266. [Google Scholar] [CrossRef]
  62. McFeeters, S.K. The Use of the Normalized Difference Water Index (NDWI) in the Delineation of Open Water Features. Int. J. Remote Sens. 1996, 17, 1425–1432. [Google Scholar] [CrossRef]
Figure 1. Flowchart of feature selection-based machine learning for rice yield estimation using UAV aerial hyperspectral imaging. The flowchart encompasses four stages: data acquisition, preprocessing, feature engineering, and model construction and evaluation. [m plots × n features] is the shape of the dataset, where m and n are the number of plots and features, respectively.
Figure 2. RGB orthomosaic map for experimental site location at CNRRI in Fuyang County, Zhejiang Province, on 24 September 2022 (left). The experimental site was divided into 2 blocks, which are marked by orange dotted lines. Each block consisted of 3 × 14 plots, which are marked with white boxes.
Figure 3. Overview of the UAV system and components used in the study. (a) DJI M600 Pro UAV system; (b) SPECIM FX10 hyperspectral camera; (c) DJI Mavic Pro 2, which served as an auxiliary UAV; and (d) DJI GS Pro, used for flight mission planning.
Figure 4. The aerial image preprocessing: (a) geometric correction of hyperspectral swath; (b) one of the GCPs (the intersection of the red line) for hyperspectral swath georeferencing; (c) hyperspectral orthomosaic; (d) rice canopy segmentation; (e) average rice canopy spectrum with standard deviation for rice canopy.
Figure 5. The correlation coefficient of different VIs and PCs.
Figure 6. Spatial distribution of observed rice yield.
Figure 7. Feature importance statistics in 50 experiments. The vertical axis indicates the frequency of each feature being ranked within the intervals 1–14, 15–28, and 29–42 across the 50 experiments.
Figure 8. Training accuracy of six different models as a function of the number of features.
Figure 9. Scatter plots of observed vs. predicted yields from (a) SVR; (b) RFR; (c) AdaBoost; (d) XGBoost; (e) Ridge; and (f) PLSR.
Figure 10. Spatial distribution of absolute residuals of observed and predicted yields calculated from (a) SVR; (b) RFR; (c) AdaBoost; (d) XGBoost; (e) Ridge; and (f) PLSR.
Table 1. Test accuracies and computational complexity of SVR, RFR, AdaBoost, XGBoost, Ridge, and PLSR in predicting the rice yield.
| Features | Model | R2 | RMSE (kg/m2) | MAE (kg/m2) | Time (s) | Memory (MB) |
|---|---|---|---|---|---|---|
| Selected features | SVR | 0.804 | 0.037 | 0.028 | 4.077 | 20.421 |
| Selected features | RFR | 0.826 | 0.036 | 0.029 | 4.050 | 45.225 |
| Selected features | AdaBoost | 0.807 | 0.040 | 0.030 | 3.906 | 27.090 |
| Selected features | XGBoost | 0.818 | 0.041 | 0.033 | 3.951 | 32.175 |
| Selected features | Ridge | 0.808 | 0.034 | 0.033 | 4.131 | 18.495 |
| Selected features | PLSR | 0.827 | 0.036 | 0.029 | 4.221 | 22.635 |
| Full features | SVR | 0.817 | 0.036 | 0.028 | 4.983 | 24.959 |
| Full features | RFR | 0.805 | 0.035 | 0.028 | 4.950 | 55.275 |
| Full features | AdaBoost | 0.810 | 0.037 | 0.029 | 4.774 | 33.110 |
| Full features | XGBoost | 0.827 | 0.036 | 0.029 | 4.829 | 39.325 |
| Full features | Ridge | 0.823 | 0.037 | 0.029 | 5.049 | 22.605 |
| Full features | PLSR | 0.820 | 0.034 | 0.027 | 5.159 | 27.665 |
| Full spectrum | SVR | 0.533 | 0.059 | 0.054 | 22.122 | 22.690 |
| Full spectrum | RFR | 0.649 | 0.049 | 0.037 | 32.214 | 50.252 |
| Full spectrum | AdaBoost | 0.602 | 0.042 | 0.052 | 27.502 | 30.106 |
| Full spectrum | XGBoost | 0.656 | 0.055 | 0.041 | 28.642 | 35.755 |
| Full spectrum | Ridge | 0.541 | 0.037 | 0.046 | 22.238 | 20.552 |
| Full spectrum | PLSR | 0.643 | 0.034 | 0.041 | 22.204 | 25.153 |

