1. Introduction
Rice, a vital staple crop cultivated across 155 million hectares globally, serves as the primary food source for approximately 3 billion people. By 2050, the global population is projected to reach 9 billion, necessitating a 70% to 100% increase in food production to meet escalating demand [1,2]. Rapid classification and diagnosis of nutrient deficiencies in field-grown rice, followed by timely management interventions, are critical for ensuring healthy crop growth, accelerating the breeding of superior varieties, and safeguarding food security.
To address the need for high-throughput, efficient, and low-cost field detection, unmanned aerial vehicles (UAVs) have emerged as transformative tools in modern agriculture [3]. Equipped with advanced sensors such as RGB cameras, multispectral imagers, and thermal devices, UAVs enable real-time, high-resolution, and non-invasive monitoring of crop health and field conditions [4,5,6]. Over the past decade, UAVs have been widely adopted in precision agriculture [7], facilitating applications such as pest detection [8], crop health assessment [9], yield prediction [10], and nutrient management [11,12]. The integration of UAVs with remote sensing technologies [13], particularly multispectral and hyperspectral imaging, has revolutionized agricultural data collection and analysis [14,15,16]. For instance, vegetation indices (VIs) derived from multispectral data, such as the Normalized Difference Vegetation Index (NDVI), the Green Normalized Difference Vegetation Index (GNDVI), and the Enhanced Vegetation Index (EVI), have been extensively utilized to assess crop health, nutrient conditions, and biomass [17,18]. These indices leverage the spectral reflectance properties of plants to provide insights into their physiological conditions, enabling early detection of nutrient deficiencies, water stress, and disease outbreaks [19,20]. UAVs have also been employed for precision spraying of fertilizers and pesticides, reducing chemical usage and minimizing environmental impact [22,23]. Despite these advancements, challenges such as limited flight time, regulatory restrictions, and data processing complexities remain significant barriers to the widespread adoption of UAVs in agriculture [21,24,25].
Machine learning (ML) algorithms have further enhanced the utility of VIs by enabling the development of predictive models for crop classification, yield estimation, and nutrient diagnosis [26,27]. For instance, random forest (RF) and support vector machine (SVM) methods have been successfully applied in terrain classification, crop trait detection, and the prediction of nitrogen content in crops such as rice, wheat, maize, and cotton [28,29]. The combination of VIs and ML has proven particularly effective in addressing the challenges of spatial and temporal variability in agricultural fields, providing a robust framework for precision nutrient management [30]. Moreover, integrating multiple types of vegetation indices has further enhanced performance in these applications [31,32]. In recent years, deep learning (DL) techniques have gained significant traction in agricultural remote sensing owing to their ability to automatically extract complex features from high-dimensional data. Convolutional neural networks (CNNs) and other deep architectures have been employed for tasks such as crop classification, field segmentation, and disease detection [33]. For example, U-Net, a popular deep learning model, has been used for precise field boundary detection and crop type mapping, while Vision Transformers (ViTs) have shown promise in handling multispectral and multi-temporal data for yield prediction [34,35]. Deep learning models excel at capturing spatial and spectral patterns in UAV imagery, making them well suited to applications such as nutrient deficiency classification and topdressing management [36,37].
However, field fertilization practices do not exhibit an ideal uniform distribution [38]; rather, they vary with application methods and crop growth stages [39]. Additionally, excessive or indiscriminate fertilizer application can adversely affect land productivity [40,41]. Therefore, this study proposes a novel framework for classifying nutrient deficiencies and formulating fertilization management strategies in field-grown rice, aiming to accurately reflect the actual nutrient deficiency conditions during rice growth. The framework integrates visible light VIs, multispectral VIs, and image features extracted by deep neural networks, leveraging the complementary strengths of these feature types to achieve accurate and robust nutrient deficiency classification. By utilizing UAV platforms for high-throughput, regionalized field detection, the framework generates real-time fertilization prescription maps based on actual nutrient deficiency conditions, enabling intelligent and precise management of rice growth. This approach not only enhances the efficiency and accuracy of nutrient diagnosis but also aligns more closely with practical production needs, offering a scalable solution for optimizing rice cultivation and ensuring food security.
2. Materials and Methods
The overall experimental design and technical approach of this study are illustrated in Figure 1. The controlled field experiment was conducted on high-quality ratoon rice under nutrient deficiency conditions. Multispectral imagery of the rice field was captured using an unmanned aerial vehicle (UAV) remote sensing platform, and data were extracted from the imagery using ENVI version 5.3 software (HARRIS Geospatial, Wokingham, UK). Subsequently, visible light VIs and spectral VIs were calculated from the multispectral imagery, while deep features were extracted from the visible light imagery. The VI features, after significance screening, were then fused with the deep features and used as input features for the machine learning models. Machine learning models for classifying nutrient deficiency in the field were constructed using XGBoost, random forest (RF), and support vector machine (SVM) classifiers. Finally, the optimal classification model was employed to identify actual nutrient deficiency conditions in the field, and corresponding fertilization strategies for the ratoon season were formulated based on the deficiency conditions.
2.1. Study Area and Field Experimental Design
As illustrated in Figure 2A, this study was conducted in 2024 at the rice cultivation base of Huazhong Agricultural University in Wuhan, Hubei Province, China (30.474852° N, 114.356769° E). The study focused on a controlled nutrient deficiency experiment for high-quality ratoon rice. The study area is characterized by a subtropical monsoon climate, with favorable environmental conditions, distinct seasons, abundant sunlight, and ample rainfall; the average annual rainfall is 1269 mm, and the average annual temperature ranges between 15.8 °C and 17.5 °C. In this study, the main growing season of the ratoon rice spanned from 30 April 2024 (transplanting date) to 3 August 2024 (main season harvest date), while the ratoon season continued until 10 October 2024 (ratoon season harvest date). Field management practices followed local standards, including sufficient irrigation, necessary herbicide and pesticide applications, and other standard agronomic practices.
In this study, base fertilizer was applied before transplanting, with additional fertilizers (tillering, panicle initiation, and spikelet protection) applied during the main season. To capture nutrient deficiency variations, four fertilization levels (25%, 50%, 75%, and 100% of the standard rate, labeled N1–N4) were implemented. In the ratoon season, bud-promoting and tillering fertilizers were applied at levels consistent with the main season. Detailed fertilization rates are provided in Table 1. The experiment involved six ratoon rice varieties (HHZ, LY6326, ZDQY1610, LLY68812, YY4949, and WLY6312), labeled V1–V6, with the field experimental design illustrated in Figure 2B.
2.2. UAV Platform for Field Imagery Acquisition, Processing, and Analysis
2.2.1. UAV Multispectral Imagery Acquisition
In this study, multispectral imagery of the field was captured one week before the application windows for the spikelet protection fertilizer in the main season (18 June 2024) and the bud-promoting fertilizer in the ratoon season (20 July 2024). These two acquisitions, labeled S1 and S2, guided the corresponding fertilization strategies. The UAV platform used was the DJI M3M RTK quadcopter (SZ DJI Technology Co., Ltd., Shenzhen, China), equipped with a multispectral camera capturing reflectance in five bands: red (R), green (G), and blue (B) visible light, near-infrared (NIR), and red edge (RE). The camera's specific parameters are shown in Figure 3A, with an image resolution of 2592 × 1944 pixels, a field of view of 84°, and an equivalent focal length of 24 mm.
During acquisition, the UAV flew at an altitude of 15 m and a speed of 1.5 m/s, with 75% forward and side overlap. Each flight captured approximately 800 multispectral images, covering the entire experimental field. To ensure accuracy, imagery was acquired under clear, cloudless conditions at around 10:00 a.m. The discrete images were stitched using DJI Terra software version 4.1.0 (SZ DJI Technology Co., Ltd., Shenzhen, China) to generate a complete multispectral image of the field, saved in high-resolution TIFF format, as shown in Figure 3B.
2.2.2. Vegetation Index Features and ROI Extraction
Vegetation indices (VIs) were calculated based on the multispectral imagery captured by the UAV. These indices are linear or nonlinear combinations of multiple spectral bands, and they were used to replace single-band reflectance values. Depending on the types of bands involved in the calculation, VIs were categorized into visible light VIs (RGB-VIs) and spectral VIs (S-VIs). RGB-VIs were derived from the reflectance values of the R, G, and B bands, while S-VIs were calculated using all five bands: R, G, B, NIR, and RE. Typically, RGB-VIs provided a more intuitive representation of the surface color distribution of crops in the field, whereas S-VIs offered more detailed insights into the nutritional conditions of crops. These two types of VI features complemented each other.
In this study, 13 S-VIs and 10 RGB-VIs related to rice growth conditions were calculated from the field multispectral imagery and used as features. The values of each feature, comprising the 23 VIs and the RGB bands, were treated as separate bands and merged to create a new TIFF image. This process transformed the original multispectral imagery into a new image containing 26 band values, which was then used for subsequent image extraction and dataset creation. The specific VIs and their calculation formulas are listed in Table 2.
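As a minimal illustration, the band arithmetic behind a few of the indices named in this paper can be sketched as follows (a non-authoritative example assuming the five reflectance bands are available as NumPy arrays; function and variable names are hypothetical, and only indices with well-known standard formulas are shown):

```python
import numpy as np

def compute_example_vis(r, g, b, nir, re):
    """r, g, b, nir, re: 2-D reflectance arrays from the stitched multispectral TIFF."""
    eps = 1e-8  # guard against division by zero in unvegetated pixels
    # Spectral VIs (S-VIs) combine visible bands with the NIR/RE bands.
    ndvi = (nir - r) / (nir + r + eps)    # Normalized Difference Vegetation Index
    gndvi = (nir - g) / (nir + g + eps)   # Green NDVI
    cire = nir / (re + eps) - 1.0         # Chlorophyll Index Red Edge
    # RGB-VIs use only the visible bands.
    ngrdi = (g - r) / (g + r + eps)       # Normalized Green-Red Difference Index
    return {"NDVI": ndvi, "GNDVI": gndvi, "CIRE": cire, "NGRDI": ngrdi}
```

Each index map computed this way can then be stacked with the original bands (e.g., with np.stack) to form the 26-band image described above.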
ROIs (regions of interest) were manually selected to delineate the areas to be analyzed. In this study, two ROIs were extracted for each variety within each fertilization treatment area, ensuring that the ROIs covered the entire rice planting region. After extraction, the boundaries of the rice planting area and the field pathways within each ROI were removed. Following this procedure, 48 valid ROIs were extracted from the field multispectral imagery during each of the S1 and S2 periods, covering the 6 varieties and 4 fertilization treatment areas. Each ROI image was then randomly cropped to obtain 125 sub-images, with each sub-image containing the 26 features mentioned earlier. A dataset containing 1000 sub-images was created for each variety and used for subsequent training. This process is illustrated in Figure 3C.
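The random-cropping step can be sketched as follows (a minimal example assuming each ROI is stored as a 26-band NumPy array; the crop size is a placeholder, since the sub-image pixel dimensions are not stated here):

```python
import numpy as np

rng = np.random.default_rng(0)

def random_crops(roi, n_crops=125, size=64):
    """roi: (26, H, W) array for one ROI; returns n_crops sub-images of shape (26, size, size)."""
    _, h, w = roi.shape
    crops = []
    for _ in range(n_crops):
        top = int(rng.integers(0, h - size + 1))
        left = int(rng.integers(0, w - size + 1))
        crops.append(roi[:, top:top + size, left:left + size])
    return crops

# Per variety and period: 8 ROIs (4 treatments x 2 ROIs) x 125 crops = 1000 sub-images.
```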
2.2.3. Significance Analysis of Vegetation Index Features
A large number of VI features could hinder the accurate and efficient construction of classification models. To identify more relevant VI feature combinations, reduce data dimensionality, minimize interference from irrelevant data, and enhance the sensitivity of the UAV platform to imagery of different nutrient-deficient areas in the field, this study employed the Pearson correlation coefficient method. Formula (1) was used to calculate the correlation coefficient between each VI feature and the nutrient deficiency categories of field rice. Statistical significance was then tested using p-values, with p < 0.05 considered statistically significant. The correlation coefficients and p-values of each VI feature were statistically analyzed, and VI features with |r| < 0.2 were excluded from the significant VI feature combinations. Finally, by combining data from both the S1 and S2 periods, the intersection of highly correlated VI features was taken to obtain a VI feature combination applicable to both periods.

$$ r = \frac{\sum_{i=1}^{n}\left(X_i - \bar{X}\right)\left(Y_i - \bar{Y}\right)}{\sqrt{\sum_{i=1}^{n}\left(X_i - \bar{X}\right)^2}\sqrt{\sum_{i=1}^{n}\left(Y_i - \bar{Y}\right)^2}} \quad (1) $$

where r represents the correlation coefficient between variables X and Y: if r > 0, X and Y are positively correlated; if r < 0, X and Y are negatively correlated; and if r = 0, the two are uncorrelated. X_i is the i-th measured value of variable X, and $\bar{X}$ is the mean value of X, which in this study corresponds to the feature values of the VIs. Y_i is the measured value at the i-th position corresponding to X_i, and $\bar{Y}$ is the mean value of Y, which in this study corresponds to the label values of the different nutrient-deficient areas. n is the number of samples of X and Y, which in this study is the number of experimental plots.
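This screening procedure can be summarized in code roughly as follows (a sketch assuming the plot-level VI values and deficiency labels are held in NumPy arrays; variable names are hypothetical):

```python
import numpy as np
from scipy.stats import pearsonr

def significant_vis(X, y, names, r_min=0.2, alpha=0.05):
    """X: (n_plots, n_features) VI values; y: (n_plots,) deficiency labels."""
    kept = set()
    for j, name in enumerate(names):
        r, p = pearsonr(X[:, j], y)
        if p < alpha and abs(r) >= r_min:  # keep significant, non-weak features
            kept.add(name)
    return kept

# Features retained for both periods: the intersection of the S1 and S2 screens.
# selected = significant_vis(X_s1, y_s1, vi_names) & significant_vis(X_s2, y_s2, vi_names)
```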
2.3. Deep Feature Extraction from Imagery Based on VGG16
VGG16, a classic convolutional neural network (CNN) architecture, was proposed by the Visual Geometry Group at the University of Oxford in 2014. Using VGG16 as the backbone for feature extraction in image classification tasks has demonstrated strong performance in land cover classification, multi-scene classification, and phenological change detection, indicating its applicability to agricultural remote sensing imagery classification. Additionally, this study focused on classifying nutrient deficiency in different areas of field rice based on imagery captured by a UAV platform, which aligns with the application scenarios of VGG16. Therefore, VGG16 was employed to train the dataset, and deep features were extracted using the optimal model obtained from the training process.
As shown in Figure 4a, the original architecture of the VGG16 network passed input imagery through multiple convolutional layers for feature extraction. The feature vectors were then flattened and fed into two fully connected layers, each containing 4096 neurons (FC-4096), followed by a final output layer for classification into 1000 categories. In this study, since the final classification task involved only 4 categories, using the original FC-4096 fully connected structure would lead to overfitting due to its excessive parameters. This was evidenced by classification accuracies exceeding 99% during training for nutrient deficiency in each rice variety across both the S1 and S2 periods. In addition to overfitting, the original fully connected structure also failed to reduce the image features to the required dimensionality. Therefore, dimensionality reduction was applied to the fully connected part of the original network. The mathematical description of the fully connected layers in the original VGG16 network is shown in Equation (2):

$$ y_{4096} = W_1 x_{25088} + b_1, \quad y'_{4096} = W_2 y_{4096} + b_2, \quad y_{1000} = W_3 y'_{4096} + b_3 \quad (2) $$

where x is the input vector and y is the output vector, with their subscripts representing the number of elements in the vectors; W is the weight matrix and b is the bias vector, with their subscripts representing the parameters of the different layers.
Max pooling selects the maximum value within the pooling window as the output, making it suitable for applications that require highlighting prominent features. Average pooling, in contrast, outputs the mean of all pixel values within the pooling window, which smooths the feature map. Since the target scenario of this study leaned toward reflecting the overall characteristics of a specific area of the field, the max pooling layers in the VGG16 network were replaced with average pooling layers. For an input feature map x with a pooling window size of k × k, the max pooling and average pooling operations are given by Equations (3) and (4), respectively:

$$ y(i, j) = \max_{1 \le m,\, n \le k} x(i + m - 1,\; j + n - 1) \quad (3) $$

$$ y(i, j) = \frac{1}{k^2} \sum_{m=1}^{k} \sum_{n=1}^{k} x(i + m - 1,\; j + n - 1) \quad (4) $$

where x(i + m − 1, j + n − 1) represents a pixel value within the pooling window of the input feature map, and y(i, j) represents the pixel value of the output feature map.
As shown in Figure 4b–d, each network variant included a fully connected layer with 15 neurons (FC-15) before the final 4-class classification layer, designed to reduce the image features to an appropriate dimensionality. The value of 15 was chosen because it is higher than the number of visible light VI features (10) and multispectral VI features (13) mentioned earlier but smaller than their combined total (23). Before the FC-15 layer, three configurations were implemented: Figure 4b retains only one of the two original FC-4096 layers, followed immediately by the FC-15 layer for feature dimensionality reduction; Figure 4c retains both original FC-4096 layers, followed immediately by the FC-15 layer; and Figure 4d retains both original FC-4096 layers, followed by an FC-50 layer for preliminary dimensionality reduction and then the FC-15 layer to further reduce the features to the specified number. A sketch of one of these variants is given below.
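The following is a minimal PyTorch sketch of the Figure 4b variant (an interpretation of the description rather than the authors' code; details such as the use of ReLU and Dropout around the retained FC-4096 layer are assumptions):

```python
import torch.nn as nn
from torchvision import models

def modified_vgg16(num_classes=4, feat_dim=15):
    net = models.vgg16(weights=None)
    # Swap every max pooling layer for average pooling (Equations (3) and (4)).
    for i, layer in enumerate(net.features):
        if isinstance(layer, nn.MaxPool2d):
            net.features[i] = nn.AvgPool2d(kernel_size=2, stride=2)
    # Figure 4b: one retained FC-4096 layer, then FC-15, then the 4-class output.
    net.classifier = nn.Sequential(
        nn.Linear(512 * 7 * 7, 4096), nn.ReLU(inplace=True), nn.Dropout(),
        nn.Linear(4096, feat_dim),          # FC-15: the deep-feature layer
        nn.Linear(feat_dim, num_classes),   # final 4-class classification layer
    )
    return net
```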
Finally, the dataset for each variety was divided into training and testing sets in a 1:1 ratio. The modified network was then trained to obtain the optimal classification model, which served as the deep image feature extractor for the corresponding variety. In this study, model performance was evaluated using classification accuracy and average precision, with their calculation formulas provided in Equations (5) and (6). Subsequently, the trained model was loaded, the final classification layer of the network was removed, and imagery was input for feature extraction, yielding 15-dimensional deep image features denoted as Features-Deep; a minimal sketch of this extraction step follows the metric definitions below.
$$ Acc. = \frac{\sum_{i=1}^{C} TP_i}{N} \quad (5) $$

Here, Acc. represents the value of classification accuracy, C is the total number of categories (in this study, C = 4), TP_i is the number of correctly predicted samples for the i-th category, and N is the total number of samples in the test set.

$$ Precision = \frac{1}{C} \sum_{i=1}^{C} \frac{TP_i}{TP_i + FP_i} \quad (6) $$

Here, Precision represents the value of average precision, and FP_i is the number of samples from other categories that are incorrectly predicted as the i-th category.
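The extraction of Features-Deep from the trained model can be sketched as follows (assuming the hypothetical modified_vgg16 helper above; dropping the final classification layer leaves the 15-dimensional FC-15 output as the feature vector):

```python
import torch

def extract_features_deep(model, images):
    """images: (B, 3, 224, 224) tensor of sub-images; returns (B, 15) Features-Deep."""
    model.eval()
    # All classifier layers except the final 4-class layer.
    head = torch.nn.Sequential(*list(model.classifier.children())[:-1])
    with torch.no_grad():
        x = model.features(images)   # convolutional feature extraction
        x = model.avgpool(x)
        x = torch.flatten(x, 1)
        return head(x)               # 15-dimensional deep image features
```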
2.4. Machine Learning Classification Based on Fusion Features
As described earlier, this study extracted three types of features: RGB-VIs, S-VIs, and Features-Deep, denoted in subsequent usage as R, S, and F, respectively. The mathematical expressions for these three feature vectors are provided in Equation (7):

$$ R = (r_1, r_2, \ldots, r_{10}), \quad S = (s_1, s_2, \ldots, s_{13}), \quad F = (f_1, f_2, \ldots, f_{15}) \quad (7) $$

Here, r_1–r_10 represent the feature values of the 10 RGB-VIs, s_1–s_13 represent the feature values of the 13 S-VIs, and f_1–f_15 represent the values of the 15-dimensional Features-Deep.
Using the trained Modified-VGG16 model, the forward propagation function was employed to extract the output of the final fully connected layer as the deep image features (represented as a set of one-dimensional vectors). The VI features were then fused with the deep image features into a unified set of one-dimensional feature vectors, followed by normalization to complete the feature fusion. By combining these feature vectors, four additional feature vector combinations were obtained: RS, FR, FS, and FRS. The feature combinations that did not include F (i.e., R, S, and RS) were referred to as original features, while those that included F (i.e., FR, FS, and FRS) were referred to as fusion features. The mathematical expression for the fusion feature vectors is given by Equation (8):

$$ F_{fusion} = \left[ F,\; F_{ori} \right] \quad (8) $$

Here, F_fusion represents the fused feature vector, F_ori represents the feature vector composed of original features, and the square brackets denote vector concatenation.
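A minimal sketch of this fusion step (interpreting Equation (8) as concatenation followed by per-feature min-max normalization; names are hypothetical):

```python
import numpy as np

def fuse_features(F_deep, F_ori):
    """F_deep: (n, 15) Features-Deep; F_ori: (n, d) original features (R, S, or RS)."""
    fused = np.concatenate([F_deep, F_ori], axis=1)  # e.g., FRS: 15 + 23 = 38 dimensions
    lo, hi = fused.min(axis=0), fused.max(axis=0)
    return (fused - lo) / np.where(hi > lo, hi - lo, 1.0)  # min-max normalization
```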
In this study, three classifiers—support vector machine (SVM), random forest (RF), and extreme gradient boosting (XGBoost)—were used to construct machine learning models for classifying nutrient deficiency in field rice. SVM is a classification algorithm based on margin maximization. It handles nonlinear data through kernel functions, performs well on high-dimensional data, and has strong generalization capabilities. However, SVM is sensitive to the choice of hyperparameters and kernel functions, requiring multiple optimizations during model construction. RF is an ensemble learning algorithm based on decision trees. It improves performance by constructing multiple trees and voting or averaging their results. RF is robust, insensitive to noise and outliers, interpretable, and fast to train, making it suitable for large-scale data. However, RF models are complex, with high storage and inference costs, and they generally perform poorly on high-dimensional sparse data. XGBoost is an efficient ensemble learning algorithm based on gradient-boosted trees. It iteratively constructs decision trees to optimize the loss function, achieving high performance on structured data. XGBoost supports customizable loss functions and regularization, and it accelerates training through parallel computing and sparse data processing. However, it is sensitive to hyperparameters, requires complex tuning, and demands significant training time. To better evaluate the machine learning model’s sensitivity and specificity, this study calculated the classification model’s recall rate and F1-score, as shown in Equations (9) and (10), respectively.
$$ Recall = \frac{1}{C} \sum_{i=1}^{C} \frac{TP_i}{TP_i + FN_i} \quad (9) $$

$$ F1 = \frac{2 \times Precision \times Recall}{Precision + Recall} \quad (10) $$

Here, Recall represents the model's recall rate, FN_i indicates the number of false negatives (samples that are actually positive but were predicted as negative) for the i-th category, and F1 is the model's F1-score.
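All four evaluation metrics in Equations (5), (6), (9), and (10) can be computed from a single C × C confusion matrix, as in the following sketch (macro-averaging over categories is an interpretation consistent with the definitions above):

```python
import numpy as np

def evaluation_metrics(M):
    """M: C x C confusion matrix, M[i, j] = samples of true class i predicted as class j."""
    tp = np.diag(M).astype(float)   # TP_i
    fp = M.sum(axis=0) - tp         # FP_i: other classes predicted as class i
    fn = M.sum(axis=1) - tp         # FN_i: class i predicted as other classes
    acc = tp.sum() / M.sum()                             # Equation (5)
    precision = np.mean(tp / (tp + fp))                  # Equation (6)
    recall = np.mean(tp / (tp + fn))                     # Equation (9)
    f1 = 2 * precision * recall / (precision + recall)   # Equation (10)
    return acc, precision, recall, f1
```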
The three original feature vectors (R, S, and RS) and the three fused feature vectors (FR, FS, and FRS) were used as input features for constructing machine learning classification models. These six feature vectors were input into the SVM, RF, and XGBoost classifiers to compare the performance differences between original and fused feature combinations in classifying nutrient deficiency in field rice. The best classification model was determined based on classification accuracy.
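This comparison can be organized roughly as follows (a sketch with placeholder hyperparameters, since the tuned settings are not reported here; scikit-learn and the xgboost package are assumed):

```python
from sklearn.svm import SVC
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score
from xgboost import XGBClassifier

CLASSIFIERS = {
    "SVM": SVC(kernel="rbf"),
    "RF": RandomForestClassifier(n_estimators=200),
    "XGBoost": XGBClassifier(n_estimators=200),
}

def compare_feature_sets(feature_sets, y_train, y_test):
    """feature_sets: dict mapping 'R', 'S', 'RS', 'FR', 'FS', 'FRS' to (X_train, X_test)."""
    for feat_name, (X_train, X_test) in feature_sets.items():
        for clf_name, clf in CLASSIFIERS.items():
            clf.fit(X_train, y_train)  # refit the classifier on each feature set
            acc = accuracy_score(y_test, clf.predict(X_test))
            print(f"{feat_name} + {clf_name}: accuracy = {acc:.4f}")
```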
To further verify the effectiveness of the proposed classification approach, nutrient deficiency classification based on the six feature vectors was conducted during both the S1 and S2 periods. The classification model constructed for the S2 period was used to guide the topdressing strategies for the ratoon season.
2.5. Classification of Nutrient Deficiency and Topdressing Strategies in the Field
In this study, six varieties (V1–V6) were tested. Three of these varieties (V3, V4, and V6) were randomly selected, and their feature data were used to train and construct machine learning classification models. The remaining three varieties (V1, V2, and V5) were used to evaluate the actual nutrient deficiency in the field by applying the trained classification model and formulating corresponding topdressing strategies.
The feature vector data from the S2 period (before ratoon season fertilization) were used. The datasets for V3, V4, and V6 were combined and split into training and testing sets in a 7:3 ratio for model training and construction. The optimal model was then determined. Next, the multispectral imagery of the field areas for V1, V2, and V5 was evenly divided. Each variety in each fertilization treatment area was divided into four subplots, resulting in 16 subplots per variety across four treatment areas. The fused feature data for each subplot were extracted and input into the optimal classification model to determine the actual nutrient deficiency distribution. The deficiency levels were categorized as 75%, 50%, 25%, or no deficiency.
After determining the actual nutrient deficiency in the treatment areas for V1, V2, and V5, additional topdressing was applied before the ratoon season tillering fertilizer application. Specifically, 25%, 50%, and 75% incremental topdressing was applied to areas with corresponding deficiency levels to compensate for the nutrient shortage in the main season. Then, the standard 100% fertilization rate was applied to the N1, N2, and N4 treatment areas during the ratoon season, while the N3 treatment area received only 75% of the standard rate without incremental topdressing. This was done to create a control group for evaluating the effectiveness of the topdressing strategy.
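The mapping from predicted deficiency class to supplementary fertilization can be expressed as a simple lookup (a sketch following the rates described in this subsection; class coding and names are hypothetical):

```python
# Predicted deficiency level -> supplementary fraction of the main season rate.
TOPDRESS_FRACTION = {"75%": 0.75, "50%": 0.50, "25%": 0.25, "none": 0.0}

def grid_prescription(grid_classes, standard_rate, grids_per_zone=4):
    """grid_classes: dict grid_id -> predicted deficiency level.
    Each grid receives its fraction of 1/4 of the standard main season rate."""
    per_grid_rate = standard_rate / grids_per_zone
    return {g: TOPDRESS_FRACTION[c] * per_grid_rate for g, c in grid_classes.items()}
```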
Finally, the effectiveness of the topdressing strategy was evaluated based on four traits that reflect rice nutrient utilization: tiller number (growth dimension), thousand-grain weight and seed setting rate (yield dimension), and grain area (quality dimension). Before harvesting in both the main and ratoon seasons, 20 plants were randomly sampled from each variety in each treatment area. The specific values of these traits were measured through manual evaluation. The changes in these traits between the main and ratoon seasons were analyzed, and the improvement in the four traits during the ratoon season was used as the evaluation metric for the topdressing strategy’s effectiveness.
3. Results and Discussion
3.1. Significant VI Features and Corresponding Field Multispectral Imagery
The results of the correlation analysis between VI features and nutrient deficiency conditions are illustrated in Figure 5. Specifically, Figure 5a displays the corresponding heatmap for the RGB-VIs, and Figure 5b presents the heatmap for the S-VIs, both during the two distinct periods S1 and S2. Additionally, based on the p-value method, the 10 RGB-VIs and 13 S-VIs selected in this study were all found to be significant factors (p < 0.05) in evaluating nutrient deficiency in field rice.
Therefore, to reduce feature dimensionality and minimize the interference of weakly correlated features in the construction of the classification model, this study screened the VI features based on the absolute correlation coefficient values (|r|) between the features and nutrient deficiency conditions. Features with weak correlations (|r| < 0.2) were excluded, and only strongly correlated features were retained. As shown in Figure 5a, among the 10 RGB-VI features, two features (NGRDI and VEG) had |r| values below 0.2 during either S1 or S2, while the |r| values of the remaining features were all greater than 0.2. Similarly, as shown in Figure 5b, among the 13 S-VI features, two features (EVI and CIRE) had |r| values below 0.2 during either S1 or S2, while the |r| values of the remaining features were all greater than 0.2. The results of the VI feature screening are shown in Table 3. Ultimately, 8 of the 10 RGB-VI features and 11 of the 13 S-VI features were selected as the feature combinations for this study.
The selected VI feature combinations were used in the subsequent construction of the classification model. Since the S2 period fell within the fertilization window for the ratoon season, its classification results were used to guide specific topdressing measures. The intuitive color distributions of the field VI features during the S2 period are shown in Figure 6, where Figure 6a illustrates the distribution of the visible light VIs, and Figure 6b illustrates the distribution of the multispectral VIs.
3.2. Results of Deep Learning Training and Deep Feature Extraction from Imagery
To validate the effectiveness of the three network structures described in Section 2.3 for deep feature extraction, 100 rounds of model pre-training were conducted using the same dataset for each structure. The average classification accuracy on the rice variety imagery was used as the evaluation metric, and the results are shown in Table 4. The classification accuracy of the network structure shown in Figure 4d was significantly lower than that of the other two structures, indicating that information was lost during feature dimensionality reduction and that two rounds of reduction compounded this loss, decreasing classification accuracy. The network structures shown in Figure 4b,c showed no significant difference in classification accuracy, indicating that the parameter size of a single FC-4096 structure was fully sufficient for the four-class classification task and did not require additional stacking. In conclusion, considering both the effectiveness of the feature extraction model and the parameter size, this study selected the network structure shown in Figure 4b for training the feature extraction model.
During training, the dataset and hyperparameters constructed earlier were used for 150 rounds of formal training. The training results for each variety during the two periods are shown in Figure 7, and the confusion matrices for the average precision of nutrient deficiency classification in each region are shown in Figure 8. During the S1 period, the average classification accuracy for nutrient deficiency across the six rice varieties was 88.78%, while the optimal classification result using a single VI feature (either R or S alone) during the same period was 87.50%, obtained with the RF classifier. During the S2 period, the average classification accuracy across the six varieties was 84.56%, while the optimal single-VI-feature result during the same period was 90.28%, again with the RF classifier. Compared to machine learning classification based on single VI features, image classification based on the deep learning model achieved similarly high accuracy on the same dataset, indicating that both feature types had already demonstrated good performance in classifying rice nutrient deficiency.

However, VI features and deep image features represent significantly different crop physiological mechanisms and belong to distinct feature types. This is evidenced in Figure 8, where the deep learning approach demonstrated superior performance in both the low-fertilization (N1) and high-fertilization (N4) zones, an advantage attributable to the more pronounced manifestation of color features under extreme fertilization conditions. In contrast, greater classification errors were observed in the medium-fertilization zones (N2 and N3), particularly in the N3 experimental area. Therefore, this study proposed that combining the two feature types could theoretically achieve improved classification accuracy. Using the pre-trained deep neural network described earlier, the parameters computed after the FC-15 layer were saved as a one-dimensional vector containing 15 elements, serving as the deep image features.
3.3. Classification Results Based on VI Features and Fusion Features
During the two periods S1 and S2, the VI features (R, S, and RS) and the fused features (FR, FS, and FRS) were used as feature vectors and input into the machine learning classifiers SVM, RF, and XGBoost to construct classification models. The effectiveness of the fused features relative to the original features was evaluated based on the changes in classification accuracy for nutrient deficiency in the field. The classification accuracy of each classifier is shown in Table 5. The results indicated that the fused features improved classification accuracy to some extent in the models constructed by all three classifiers, although the overall performance of the models varied significantly.
3.3.1. Performance of the SVM, RF, and XGBoost Classifiers
The SVM-based model achieved the lowest accuracy among the three models for most feature combinations during both periods, except for the FR feature combination in S1. Its highest accuracy was 95.83% (S1) and 88.89% (S2) with FR, though these did not surpass the study’s overall best accuracy. The SVM model exhibited lower sensitivity to multispectral VI features, as shown by (1) higher accuracy on R than S in S1, with no improvement from RS; (2) only a 3.61% accuracy increase on S compared to R in S2; and (3) substantial accuracy improvements only with FR (11.39% in S1 and 13.89% in S2), while FS and FRS showed minimal gains (approximately 1%).
The RF model consistently achieved the highest accuracy across most feature combinations, except for FR, where it slightly underperformed (94.72% in S1 and 90.72% in S2). Its sensitivity was aligned with crop growth mechanisms, as follows: (1) In S1, high accuracy on R (86.11%) and FR (94.72%) reflected visible light color differences during vigorous growth, while the model outperformed SVM on multispectral VI features (S and RS). (2) In S2, as visible light color characteristics weakened and multispectral VI features strengthened, the RF model's accuracy on R and S was 78.33% and 90.72%, respectively, showing a significant gap compared to SVM. (3) Deep image fusion features (FR, FS, and FRS) significantly improved accuracy, with the improvements most significant for FR (8.61% in S1 and 12.39% in S2), followed by FS (8.89% in S1 and 3.61% in S2) and FRS (6.67% in S1 and 4.34% in S2).
The XGBoost model performed similarly to RF with fused features, achieving notable accuracy improvements in S1 (FR: 8.33%, FS: 14.58%, FRS: 11.66%) and S2 (FR: 17.08%, FS: 6.5%, FRS: 5.42%). However, its accuracy on non-fused features was lower than RF's in both S1 (by 3.19% on R, 5.83% on S, and 5.41% on RS) and S2 (by 3.33% on R, 4.70% on S, and 3.47% on RS). With fused features, XGBoost nearly matched RF, with FS and FRS accuracies within approximately 2% during S2.
The F1-score analysis demonstrated that the RF model consistently outperformed SVM and XGBoost across most feature combinations, achieving the highest scores (F1-score: 0.808–0.976). Although XGBoost exhibited comparable performance (F1-score: 0.732–0.974), it underperformed with simpler features and showed greater performance variability. SVM achieved the least favorable results overall (F1-score: 0.708–0.949), demonstrating significant limitations when handling complex feature sets. Consequently, RF exhibited superior robustness in managing diverse feature combinations, establishing it as the optimal classification model for this study.
3.3.2. Comprehensive Comparison of the Three Machine Learning Classifiers and Determination of the Optimal Model
The results are visually represented in Figure 9, illustrating the performance differences among the three machine learning models across the various feature combinations. As shown in the bar charts (Figure 9a,c), incorporating deep image features improved classification accuracy for nutrient deficiency, with an average increase of 7.52% across both periods; excluding the SVM model, the average improvement with fused features rose to 9.01%. This trend is further reflected in the radar charts (Figure 9b,d), where the points representing fused feature accuracy clustered closer to the best classification accuracy (blue border) than those of the non-fused features.
The RF and XGBoost models exhibited similar overall performance, with RF slightly outperforming XGBoost and both surpassing SVM. This is evident in the radar charts, where the areas enclosed by RF and XGBoost accuracy points were larger and closer to the best accuracy region than those of SVM.
The proposed method, combining VI and deep image features, achieved optimal performance during both S1 and S2 periods. The highest classification accuracy was observed on the FRS feature, reaching 97.50% (RF, S1) and 96.56% (RF, S2). Consequently, the RF model was selected for subsequent field nutrient deficiency identification and topdressing strategy formulation during the ratoon season.
3.4. Field Nutrient Deficiency Diagnosis and Topdressing Strategies Based on Optimal Classification Model
In the previous sections, the dataset and classification models were constructed using ratoon rice varieties V3, V4, and V6, as shown in the white dashed box in Figure 10a. The optimal RF model, trained in Section 3.3, was applied to classify the nutrient deficiency conditions of varieties V1, V2, and V5. Although the experimental design divided the field into four fertilization zones (N1–N4), the actual nutrient deficiency distribution deviated from this design due to factors such as uneven light exposure, elevation differences, and fertilizer mixing. The classification results therefore provided a critical basis for precise topdressing strategies.
The nutrient deficiency classification results for V1, V2, and V5 are shown in Figure 10a, with colors representing deficiency levels: red (75%), orange (50%), yellow (25%), and blue (no deficiency). Variety V5 aligned well with the original fertilization gradients, demonstrating superior fertilizer utilization efficiency by outperforming the original nutrient conditions in multiple grid areas. In contrast, V1 and V2 showed less alignment with the experimental design but still exhibited a trend of decreasing deficiency from N1 to N4; both varieties displayed poorer fertilizer utilization efficiency, falling behind the original nutrient conditions in more grid areas.
The topdressing results for V1, V2, and V5 are shown in Figure 10b, with fertilization amounts (75%, 50%, and 25% of the main season rate) corresponding to the deficiency levels. To emphasize the topdressing effects, no additional fertilization was applied to the N3 zone. Each treatment zone was divided into four grid areas, with supplementary fertilization calculated as 1/4 of the standard main season rate. The ratoon season bud-promoting fertilizer was applied at the standard rate to the N1, N2, and N4 zones, while N3 received 75% of the standard rate, consistent with the main season strategy. The supplementary amounts were mixed with the bud-promoting fertilizer and applied accordingly (Table 6).
3.5. Evaluation of Topdressing Effects
After applying the topdressing strategy described above to the rice varieties in the N1, N2, and N4 treatment zones, field sampling was conducted before the ratoon season harvest to measure four traits: tiller number, seed setting rate, thousand-grain weight, and grain area. These values were then compared with the trait values measured at the main season harvest. The specific changes in trait values for each variety across the two seasons are shown in Figure 11a–d; within each panel, from left to right, the plots display the relevant information for varieties V1, V2, and V5, with solid lines representing the main season and dash-dot lines representing the ratoon season.
From the main season perspective, the four traits (tiller number, grain area, seed setting rate, and thousand-grain weight) exhibited consistent sensitivity to nutrient deficiency across all varieties. Tiller number showed the highest sensitivity, with significant variation across the N1–N4 zones, followed by grain area, which displayed a clear but less-pronounced trend. Seed setting rate and thousand-grain weight were the least sensitive, with no significant differences across zones. Despite these variations, all traits generally increased from N1 to N4, indicating that greater nutrient deficiency corresponded to poorer growth.
The topdressing strategy demonstrated significant effectiveness for variety V5. The N1 and N2 zones, which received additional fertilization, showed substantial trait improvements compared to the non-fertilized N3 zone, as detailed in Table 7. The N4 zone, which received standard fertilization without additional topdressing, achieved the highest values in all four traits, reflecting its efficient fertilizer utilization. Variety V5 in N4 also showed more significant improvement relative to the main season than in N3.
Variety V2 exhibited topdressing effectiveness in three traits: tiller number, thousand-grain weight, and grain area, while seed setting rate remained unchanged. Owing to its lower fertilizer utilization efficiency compared to V5, V2 showed significant improvements in tiller number and grain area under limited fertility, aligning with the main season sensitivity trend; however, only thousand-grain weight improved among the yield-related traits. The N4 zone did not achieve the highest values in all traits, as insufficient original fertility prevented optimal growth and it lacked the additional fertilization applied to N1 and N2. Nevertheless, it still outperformed N3 relative to the main season.
Variety V1 showed topdressing effectiveness only in grain area, with N1 and N2 underperforming compared to the non-fertilized N3 in the other three traits. V1’s poor fertilizer utilization efficiency indicated that conventional topdressing could not effectively correct nutrient deficiency under extremely insufficient fertility, as field conditions could not be restored quickly through additional fertilization.
Additionally, the improvement rates of varieties V1, V2, and V5 in the four traits during the main and ratoon seasons relative to the non-fertilized N3 zone are listed in Table 7, including the average improvement rates for the N1–N2 zones and the improvement rate for the N4 zone relative to N3. An improvement rate in the ratoon season higher than that in the main season indicated the effectiveness of the topdressing strategy. Among the 24 comparison groups, 16 showed higher improvement rates in the ratoon season than in the main season. Excluding variety V1, 13 of the 16 remaining comparison groups showed higher improvement rates in the ratoon season; among these 13 groups, 12 showed improvements of more than 5%, 9 of more than 10%, and 6 of more than 20%.