Article

Hyperspectral Imaging Combined with Deep Learning for the Early Detection of Strawberry Leaf Gray Mold Disease

College of Artificial Intelligence, Nanjing Agricultural University, Nanjing 210095, China
*
Author to whom correspondence should be addressed.
These authors contributed equally to this work.
Agronomy 2024, 14(11), 2694; https://doi.org/10.3390/agronomy14112694
Submission received: 14 October 2024 / Revised: 2 November 2024 / Accepted: 8 November 2024 / Published: 15 November 2024
(This article belongs to the Special Issue AI, Sensors and Robotics for Smart Agriculture—2nd Edition)

Abstract:
The presence of gray mold can seriously affect the yield and quality of strawberries. Given the susceptibility of strawberries and the rapid spread of this disease, it is important to develop early, accurate, rapid, and non-destructive disease identification strategies. In this study, the early detection of strawberry leaf diseases was performed using hyperspectral imaging combined with multi-dimensional features such as spectral fingerprints and vegetation indices. Firstly, hyperspectral images of healthy and early affected leaves (24 h) were acquired using a hyperspectral imaging system. Then, spectral reflectance (616 bands) and vegetation indices (40) were extracted. Next, the CARS algorithm was used to extract spectral fingerprint features (17). Pearson correlation analysis combined with the SPA method was used to select five significant vegetation indices. Finally, we used five deep learning methods (LSTM, CNN, BPNN, RF, and KNN) to build disease detection models for strawberries based on individual and fused features. The results showed that the accuracy of the recognition models based on fused features ranged from 88.9% to 96.6%. The CNN recognition model based on fused features performed best, with a recognition accuracy of 96.6%. Overall, the fused feature-based model can reduce the dimensionality of the classification data and effectively improve the prediction accuracy and precision of the classification algorithm.

1. Introduction

Strawberries are a sweet, nutritious, and economically important fruit that is widely grown around the world. Gray mold is one of the most common diseases in strawberry production; it is characterized by rapid onset and easy spread among strawberry plants. Therefore, it is necessary to develop a diagnostic method that is accurate and efficient in the early stage of the disease without causing damage to strawberries.
The traditional method of strawberry disease detection is visual identification by growers, which is laborious and time-consuming and makes it difficult to accurately predict disease severity on a large scale. Laboratory-based detection methods, meanwhile, require strict detection conditions and cannot provide early warning of disease in strawberry plants. Therefore, the non-destructive detection of early gray mold disease is very important for the development of the strawberry planting industry.
Hyperspectral imaging combined with chemometrics is an efficient and rapid non-destructive testing technique. Hyperspectral imaging also offers many narrow spectral bands and a large amount of data. As a result, this technology is extensively employed in the precise detection of diseases in agriculture.
When crops are invaded by diseases, their physiological characteristics, structure, and water content change, and these changes affect the spectral reflectance [1]. The early identification and prevention of multiple diseases can be effectively realized by observing the hyperspectral characteristic bands of the leaves of healthy and diseased plants [2]. The changes in spectral reflectance and chlorophyll content in different regions of the spectrum (the green, yellow, and near-infrared regions) can adequately reflect the difference between the diseased and healthy areas of a plant [3]. In addition, previous work has shown that spectral characteristics change regularly with the severity of the disease [4].
The accuracy of disease detection can be improved by combining hyperspectral techniques with feature extraction and fusion. In recent years, competitive adaptive reweighted sampling (CARS) has been applied to plant disease detection. Zhou et al. (2022) [5] used two band screening methods, CARS and the successive projections algorithm (SPA), to select characteristic spectral bands, and the prediction accuracy reached 87.88%. Xie et al. (2016) [6] established a CARS-KNN model by using the CARS algorithm to select feature bands; the recognition accuracy of the CARS-selected wavelength model was 95.83%, higher than that of the full-wavelength model (94%). These experiments demonstrate that feature extraction and fusion can improve detection accuracy. A.-K. Mahlein et al. (2013) [7] used the RELIEF-F algorithm to extract the most relevant wavelengths; the classification accuracies for leaf spot, rust, and powdery mildew were 92%, 87%, and 85%, respectively. Meng et al. (2020) [8] used the RELIEF-F algorithm to select the most discriminating characteristic wavelengths in the hyperspectral reflectance of southern corn rust and used them as the input of a recognition model, obtaining an identification accuracy of 87%.
As machine learning continues to advance, hyperspectral imaging technology combined with deep learning algorithms is also beginning to be applied in the field of disease detection. Nandita Mandal et al. (2023) [9] used the random forest (RF) algorithm and a deep neural network (DNN) to detect rice blast disease by combining 26 vegetation indices, and the accuracy of the random forest algorithm reached 0.90. Cao et al. (2022) [10] used the spectral dilated convolution three-dimensional Convolutional Neural Network (SDC-3DCNN) to detect bacterial leaf blight in rice; principal component analysis (PCA) and random forest (RF) were used to select characteristic bands as model inputs, and the accuracy reached 95.44%. Bao et al. (2024) [11] combined hyperspectral imaging and deep learning to achieve the early detection of sugarcane black scab and sugarcane mosaic disease; their detection model fed image blocks into ResNet34 and generated predicted labels for each block. Lee et al. (2022) [12] used drone images to automatically detect BSR disease, introducing a multi-layer perceptron model to learn spectral signatures at different stages of infection; hyperspectral image preprocessing is followed by artificial neural network disease detection, which improves time efficiency and automates the detection process. Ju et al. (2023) [13] used the K-means algorithm (KA) and a genetic algorithm (GA) to screen vegetation indices, constructed a backpropagation neural network (BPNN) model with these as inputs, and obtained an accuracy of 0.780.
These results show that the combination of spectral technology, CARS feature extraction, and deep learning algorithms is feasible for disease detection. However, research on early disease detection in strawberry fields is still lacking. Therefore, the main purpose of this study is to combine hyperspectral technology and deep learning algorithms to detect early disease in strawberries. The goals are to (1) acquire hyperspectral images of healthy and early diseased leaves (24 h) using a hyperspectral imaging system; (2) extract the spectral reflectance (616 bands) and vegetation indices (40) from the preprocessed hyperspectral images; (3) use the CARS algorithm to extract spectral fingerprint features (17) and select the highest-scoring vegetation indices using Pearson correlation analysis combined with SPA; and (4) apply deep learning algorithms (LSTM, CNN, BPNN, RF, and KNN) to construct strawberry disease recognition models and evaluate the detection properties of the different models.

2. Materials and Methods

2.1. Strawberry Leaf Cultivation and Pathogen Inoculation

All strawberry leaves used in this study were sourced from the Jiangsu Agricultural Expo Garden in China. Initially, 240 leaves without visual defects (such as breakage, withering, or spots) and with a similar, good nutritional status were selected as samples. Following this, 120 leaves were randomly chosen for inoculation with the gray mold pathogen, Botrytis cinerea, which was obtained from the Jiangsu Academy of Agricultural Science. The remaining 120 leaves served as the control group. The gray mold pathogen was manually inoculated onto the 120 healthy leaves. Before inoculation, all leaf samples were sprayed with a clean water mist to facilitate successful pathogen inoculation. Subsequently, circular mycelium plugs with a diameter of approximately 5 mm were manually applied to the surface of each leaf using a sterile toothpick. The infected (120 leaves) and healthy (120 leaves) samples were then cultivated separately in two growth chambers, each with a 12 h light/dark cycle, under identical environmental conditions of 90.0% relative humidity and a temperature of 20 °C, which simulated the natural growth environment of gray mold on strawberry leaves. For this reason, the changes in physiological structure (pigment, structure, physiology, and water content) were similar to those of leaves grown naturally on the plant. According to the spectral characteristics of green plants, a vegetation index is a linear or nonlinear combination of visible and near-infrared bands; because its spectral characteristics change with changes in physiological structure, the vegetation index can be used in the detection of diseases. Hyperspectral images of diseased and healthy leaves were collected 24 h after inoculation.

2.2. Hyperspectral Image Collection and Processing

2.2.1. Hyperspectral Imaging System

Spectral images were obtained using a hyperspectral imaging (HSI) system with a spectral range spanning from 400 nm to 1050 nm. The HSI system configuration employed in this study is depicted in Figure 1. It comprises an illumination system consisting of two 150 W halogen lamps, adjustable to a height of 40 cm and angled at approximately 45 degrees, together with an area camera composed of a CCD camera and a movable loading platform. For the specific hardware configuration of the HSI system, reference can be made to Zhang's article (2015) [14]. The software (Isuzu Optics Corp, Taiwan, China) used in this design allows for the adjustment of various HSI parameters, including the motor speed, detection position, and acquisition parameters.

2.2.2. Image Acquisition and Calibration

The hyperspectral imaging system scanned all leaf samples line by line at a rate of 0.8 mm/s, with an exposure time of 2.6 ms and a vertical distance of 50.0 cm between the camera and the strawberry leaves. A total of 240 hyperspectral images were collected, as displayed in Figure 2. Due to the non-uniform illumination intensity across spectral bands and the presence of internal dark currents in the camera, significant noise exists in the weaker spectral bands, leading to substantial interference. Consequently, prior to use, the original hyperspectral images require correction using the following formula.
R_i = (R_S − R_D) / (R_W − R_D) × 100%
where R_i represents the corrected spectral reflectance image, and R_S, R_D, and R_W denote the intensity values of the same pixel in the sample image, the dark reference image, and the white reference image, respectively. The white reference image (R_W) can be obtained by measuring a spectral image of a Teflon whiteboard with 99.9% reflectance, while the dark reference image (R_D) can be acquired by covering the camera lens with a non-reflective, opaque black cap.
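As an illustration, the calibration formula above can be applied per pixel and per band with NumPy. This is a minimal sketch on synthetic arrays, not the acquisition software's actual routine; the image dimensions and intensity values are stand-ins.

```python
import numpy as np

def calibrate(raw, dark, white):
    """Radiometric calibration of a hyperspectral cube.

    Implements R = (R_S - R_D) / (R_W - R_D) * 100%, applied per pixel
    and per band. Inputs are arrays of identical shape (rows, cols, bands).
    """
    raw = raw.astype(float)
    dark = dark.astype(float)
    white = white.astype(float)
    # Guard against division by zero in dead pixels or bands.
    denom = np.clip(white - dark, 1e-6, None)
    return (raw - dark) / denom * 100.0

# Toy example: a 2 x 2 image with 3 bands.
raw = np.full((2, 2, 3), 60.0)
dark = np.full((2, 2, 3), 10.0)
white = np.full((2, 2, 3), 110.0)
print(calibrate(raw, dark, white)[0, 0])  # 50% reflectance in every band
```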
Original hyperspectral data typically contain a considerable amount of irrelevant information and noise. To mitigate their impact, it is necessary to preprocess the hyperspectral data. Savitzky-Golay smoothing was employed for preprocessing in this study.
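A Savitzky-Golay filter of this kind is available in SciPy. The sketch below smooths a synthetic 616-band spectrum; the window length and polynomial order are illustrative assumptions, since the paper does not report the exact parameters used.

```python
import numpy as np
from scipy.signal import savgol_filter

# Synthetic noisy reflectance spectrum over the instrument's 616 bands.
rng = np.random.default_rng(0)
bands = np.linspace(359, 1020, 616)
clean = np.exp(-((bands - 700) / 120) ** 2)          # smooth underlying signal
spectrum = clean + rng.normal(0, 0.02, bands.size)   # plus measurement noise

# Window length 11 and polynomial order 2 are assumed values for illustration.
smoothed = savgol_filter(spectrum, window_length=11, polyorder=2)
print(smoothed.shape)  # (616,)
```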

2.3. Data Extraction and Selection

2.3.1. Extraction of Spectral Data

The changes in spectral absorption are closely tied to the molecular structure of substances. Following the onset of gray mold infection in strawberry leaves, there is a distinct alteration in the internal molecular arrangement, and a significant disparity exists in the spectral reflectance between healthy and infected leaves. Consequently, leveraging these differences in spectral reflectance presents a viable approach for identifying and detecting diseases in strawberry foliage. Spectral data are localized within regions of interest (ROIs), and appropriately defined ROIs can mitigate the impact of data non-uniformity, thus enhancing the accuracy of identification. In this study, a 20 × 20-pixel ROI encapsulating the infected area and its surroundings was manually selected for each sample, with the average spectral reflectance of all pixels within each ROI serving as the spectral data for that sample. The entire process was conducted using the HSI Analyzer software (Isuzu Optics Corp, Zhubei City, Taiwan). The spectral bands spanned the range of 359–1020 nm, encompassing a total of 616 bands. The Kennard-Stone (KS) algorithm partitioned the samples into training and prediction sets in a 3:1 ratio. By employing the Euclidean distance, the KS algorithm selects the samples that are most distant from each other, resulting in a more evenly distributed calibration subset [15]. The advantage of this method lies in ensuring a uniform spatial distribution of samples within the training set. Consequently, the training set comprises 180 samples (90 each of gray mold-infected and healthy specimens), while the prediction set encompasses 60 samples (30 each of gray mold-infected and healthy specimens).
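The Kennard-Stone split described above can be sketched as follows. This is a minimal illustrative implementation on synthetic data, not the software actually used; the sample and feature counts mirror the study's 240-sample, 3:1 setup.

```python
import numpy as np

def kennard_stone(X, n_train):
    """Kennard-Stone sample selection (illustrative sketch).

    Starts from the two most distant samples, then repeatedly adds the
    sample whose minimum Euclidean distance to the already-selected set
    is largest, yielding a training set that covers the data space evenly.
    """
    d = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=-1)
    i, j = np.unravel_index(np.argmax(d), d.shape)
    selected = [int(i), int(j)]
    remaining = [k for k in range(len(X)) if k not in selected]
    while len(selected) < n_train:
        # Distance of each remaining sample to its nearest selected sample.
        min_d = d[np.ix_(remaining, selected)].min(axis=1)
        pick = remaining[int(np.argmax(min_d))]
        selected.append(pick)
        remaining.remove(pick)
    return np.array(selected), np.array(remaining)

rng = np.random.default_rng(1)
X = rng.normal(size=(240, 17))               # 240 samples, e.g. 17 features
train_idx, test_idx = kennard_stone(X, 180)  # 3:1 split as in the paper
print(len(train_idx), len(test_idx))         # 180 60
```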

2.3.2. Selection of Spectral Fingerprint Features

Hyperspectral images contain abundant spectral information characterized by numerous wavelengths, high-dimensional data, and redundancy, leading to slow recognition speeds and low-detection model accuracy. Therefore, it is necessary to perform dimensionality reduction on the original spectral data, extracting the most discriminative fingerprint features. Common effective variable selection methods include CARS, the genetic algorithm (GA), and principal component analysis (PCA), among others. The CARS algorithm excels in both eliminating redundant wavelength variables and selecting effective ones while also tackling the issue of exponential data expansion due to excessive spectra.
Competitive adaptive reweighted sampling (CARS), inspired by Darwin's "survival of the fittest" principle and based on adaptive reweighted sampling (ARS) [5], is a widely employed algorithm in spectral analysis. By intertwining Monte Carlo sampling with the regression coefficients derived from a partial-least-squares (PLS) model, CARS identifies the most influential wavelengths within the spectral dataset, which is essential for constructing robust and precise calibration models [16].

2.3.3. Extraction of Vegetation Indices

When strawberry leaves are infected with gray mold, the spectral reflectance varies with changes in the leaf’s physiological structure. Spectral vegetation indices are derived from linear or nonlinear combinations of reflectance at two or more wavelengths. Utilizing spectral vegetation indices for crop disease identification has shown promising results in numerous studies. In this study, 40 vegetation indices related to pigment, structure, physiology, and water content were preliminarily selected from the remote sensing literature for the early detection of strawberry gray mold. The detailed calculation methods are presented in Table 1, where R800 represents the spectral reflectance at 800 nm wavelength, and so forth.
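For illustration, indices such as NDVI and PRI can be computed directly from a reflectance spectrum sampled on the instrument's band grid. The spectrum below is synthetic, and the two index definitions are the standard forms; Table 1 in the paper gives the exact formulas used in the study.

```python
import numpy as np

# Synthetic red-edge-like reflectance curve on the 359-1020 nm, 616-band grid.
bands = np.linspace(359, 1020, 616)
reflectance = 0.1 + 0.5 / (1 + np.exp(-(bands - 720) / 25))

def R(wavelength_nm):
    """Reflectance at the band closest to the requested wavelength."""
    return reflectance[np.argmin(np.abs(bands - wavelength_nm))]

# Standard definitions of two indices discussed in the text.
ndvi = (R(800) - R(670)) / (R(800) + R(670))   # normalized difference VI
pri = (R(531) - R(570)) / (R(531) + R(570))    # photochemical reflectance index
print(round(ndvi, 3), round(pri, 3))
```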

2.3.4. Selection of Significant Vegetation Indices

Selecting the optimal vegetation index can effectively reduce the number of wavelength points and significantly improve the model’s accuracy. Therefore, this study first conducted Pearson correlation analysis on the 40 calculated vegetation indices to filter out 21 indices with lower correlations. The Pearson correlation coefficient method involves ranking two variables and computing their correlation coefficient, which ranges from −1 to 1; values close to 1 indicate a strong correlation, while 0 indicates no correlation.
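A correlation-based redundancy filter of this kind can be sketched as follows. The paper does not fully specify its filtering rule, so the greedy keep-if-uncorrelated strategy and the 0.9 threshold below are assumptions for illustration, applied to synthetic index columns.

```python
import numpy as np

def low_redundancy_indices(V, threshold=0.9):
    """Greedy Pearson-correlation filter (illustrative sketch).

    V has one column per vegetation index. A column is kept only if its
    absolute Pearson correlation with every previously kept column stays
    below the threshold, discarding highly redundant indices.
    """
    corr = np.abs(np.corrcoef(V, rowvar=False))
    kept = []
    for j in range(V.shape[1]):
        if all(corr[j, k] < threshold for k in kept):
            kept.append(j)
    return kept

rng = np.random.default_rng(4)
base = rng.normal(size=(240, 1))
V = np.hstack([base,
               base * 2 + 0.01 * rng.normal(size=(240, 1)),  # near-duplicate
               rng.normal(size=(240, 2))])                   # independent
print(low_redundancy_indices(V))  # the near-duplicate column 1 is dropped
```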
Subsequently, the successive projections algorithm (SPA) was utilized to evaluate the 21 selected vegetation indices and identify the best ones. SPA is a forward-iterative search technique designed to minimize collinearity within the vector space [47] and is extensively employed in hyperspectral data analysis. SPA iteratively selects wavelengths with minimal redundancy and collinearity by projecting each wavelength onto the others [48], storing those with the largest projection vectors in a set, and establishing optimal band combinations based on multiple linear regression analysis that minimizes RMSE values [49].
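The projection step at the heart of SPA can be sketched in a few lines of NumPy. This is a minimal sketch on synthetic columns (omitting the RMSE-based choice of starting variable and subset size), selecting at each step the column with the largest residual after projection onto the complement of the already-selected set.

```python
import numpy as np

def spa(X, n_select, first=0):
    """Successive projections algorithm (minimal sketch).

    Starting from one column, repeatedly projects all columns onto the
    orthogonal complement of the already-selected columns and picks the
    column with the largest residual norm, minimizing collinearity among
    the selected variables.
    """
    Xc = X - X.mean(axis=0)
    selected = [first]
    for _ in range(n_select - 1):
        # Projector onto the complement of the selected columns' span.
        S = Xc[:, selected]
        P = np.eye(X.shape[0]) - S @ np.linalg.pinv(S)
        residual_norms = np.linalg.norm(P @ Xc, axis=0)
        residual_norms[selected] = -1.0          # never re-pick a column
        selected.append(int(np.argmax(residual_norms)))
    return selected

rng = np.random.default_rng(5)
A = rng.normal(size=(240, 3))
X = np.hstack([A, A[:, :1] * 1.001])  # column 3 is collinear with column 0
sel = spa(X, 3)
print(sel)  # the collinear column 3 is never chosen
```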

2.4. Development of the Recognition Model for Strawberry Disease

The combination of hyperspectral imaging systems and deep learning methods presents an efficient and rapid non-destructive detection technique widely employed in the identification of crop diseases [50]. In our study, the spectral data totaled 240 × 616 and the vegetation index data totaled 240 × 40, which is sufficient for neural network model learning. Five deep learning algorithms, LSTM, CNN, BPNN, RF, and KNN, were utilized, integrating spectral fingerprint features, vegetation indices, and fused features (combining fingerprint features with vegetation indices) to construct an early identification model for strawberry gray mold disease.
The RNN, as a significant branch of neural networks, distinguishes itself from the CNN primarily by its recurrent units, which possess memory capabilities and effectively leverage implicit dependencies among inputs. Long Short-Term Memory (LSTM) networks, a special type of recurrent neural network (RNN), address the vanishing and exploding gradients of traditional RNN structures by incorporating three gates (the input gate, forget gate, and output gate) [51]. These gates regulate the flow of information [52], controlling the input, retention, and output of information, respectively [51]. The LSTM model built in our study was trained with the Adam optimizer for 1000 iterations with an initial learning rate of 0.001. It includes an input layer, an LSTM layer, a ReLU activation layer, a fully connected layer, and an output layer.
Deep learning, notably with Convolutional Neural Networks (CNNs), has become prevalent in hyperspectral disease detection. CNNs excel at extracting nonlinear and discriminative features from hyperspectral images, reducing complexity by mapping high-dimensional data to lower-dimensional spaces through convolutional, pooling, fully connected, and output layers [53]. CNNs leverage local correlations and weight sharing for efficient feature extraction and dimensionality reduction, thus simplifying models. The CNN model built in this design was trained with the Adam optimizer for 500 iterations with an initial learning rate of 0.001. The convolution layer uses 16 convolution kernels of size 2 × 1, the pooling layer size is 2 × 1, and the output layer size is 2.
The backpropagation neural network (BPNN) is a feedforward learning algorithm consisting of forward propagation and error backpropagation phases. Thanks to their robust nonlinear mapping capability, BPNN models can effectively address problems with intricate internal mechanisms [13] and are characterized by their simplicity, fast computation speed, and suitability for applications such as plant disease detection [54]. The BPNN model built in this study contains two hidden layers (hidden layer 1 with 6 nodes and hidden layer 2 with 5 nodes), 1000 training iterations, and a learning rate of 0.01.
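The paper does not state the implementation framework, so the sketch below uses scikit-learn's MLPClassifier with the reported layer sizes, learning rate, and iteration count; the 240 × 22 feature matrix and its labels are synthetic stand-ins for the fused features.

```python
import numpy as np
from sklearn.neural_network import MLPClassifier

# Synthetic stand-in for the fused-feature matrix (240 samples x 22 features).
rng = np.random.default_rng(6)
X = rng.normal(size=(240, 22))
y = (X[:, 0] + 0.5 * X[:, 1] > 0).astype(int)  # two classes: healthy/diseased

# Two hidden layers with 6 and 5 nodes, learning rate 0.01, 1000 iterations,
# mirroring the BPNN configuration reported in the text.
bpnn = MLPClassifier(hidden_layer_sizes=(6, 5), learning_rate_init=0.01,
                     max_iter=1000, random_state=0)
bpnn.fit(X[:180], y[:180])        # 3:1 train/test split as in the study
print(round(bpnn.score(X[180:], y[180:]), 3))
```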
The random forest (RF) algorithm is an effective classifier capable of multi-class classification on high-dimensional data with high accuracy [55,56]. RF addresses issues such as the instability and overfitting encountered in decision tree construction, demonstrating significantly improved classification accuracy compared to traditional decision trees. Moreover, RF offers a fast classification speed and handles high-dimensional data well, distinguishing it from other mainstream classification algorithms, and it is robust against noise and outliers, mitigating overfitting concerns. In this design, the selected feature vector was input into the random forest classifier for training, and the number of decision trees (a hyperparameter) was set according to the input: 500 decision trees for the full wavelength set (616) and for the selected fingerprint features (17), 230 for the full vegetation index set (40), 300 for the selected vegetation indices (5), and 614 for the fused fingerprint and vegetation index features (22). The minimum leaf size was 1 in all cases.
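As a sketch of the fusion-feature configuration, the snippet below fits a scikit-learn random forest with 614 trees and a minimum leaf size of 1 on synthetic 22-dimensional data; the data and labels are illustrative stand-ins, not the study's measurements.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(7)
X = rng.normal(size=(240, 22))            # stand-in for the fused features
y = (X[:, 0] - X[:, 2] > 0).astype(int)

# 614 trees and a minimum leaf size of 1, matching the hyperparameters
# reported for the 22-dimensional fusion feature.
rf = RandomForestClassifier(n_estimators=614, min_samples_leaf=1,
                            random_state=0)
rf.fit(X[:180], y[:180])
print(round(rf.score(X[180:], y[180:]), 3))
```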
The K-nearest neighbors (KNN) algorithm is a straightforward approach to data classification [57]. It calculates the distance between each sample in the training set and a given sample, then selects the nearest neighbors based on feature similarity. KNN is advantageous for its fast training speed, practicality, and direct use of real data, and it is widely used in disease detection with hyperspectral data. The KNN model constructed in this design classifies test samples by computing the Spearman distance between the test set and the training set, with K set to 7.
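Scikit-learn's KNN accepts a custom distance callable, so a Spearman-based KNN with K = 7 can be sketched as below. The 1 − rank-correlation distance is a common construction but the study's exact definition is not given; the two-class data with contrasting feature-rank profiles is synthetic.

```python
import numpy as np
from scipy.stats import spearmanr
from sklearn.neighbors import KNeighborsClassifier

def spearman_distance(a, b):
    """1 minus the Spearman rank correlation, used as a dissimilarity."""
    rho, _ = spearmanr(a, b)
    return 1.0 - rho

# Synthetic two-class data: ascending vs descending feature profiles.
rng = np.random.default_rng(8)
proto = {0: np.arange(22, dtype=float), 1: np.arange(22, dtype=float)[::-1]}
y = rng.integers(0, 2, size=120)
X = np.stack([proto[int(c)] + rng.normal(0, 2.0, 22) for c in y])

# K = 7 neighbours with a Spearman-based distance, as described in the text.
knn = KNeighborsClassifier(n_neighbors=7, metric=spearman_distance)
knn.fit(X[:90], y[:90])
print(round(knn.score(X[90:], y[90:]), 3))
```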

2.5. Flowchart of the Work

The specific process of the overall work is shown as follows in Figure 3:

3. Results and Discussion

3.1. Spectral Analysis and Modeling

3.1.1. Spectral Behaviors

A total of 616 spectral reflectance values were extracted as the spectral data for the strawberry leaf samples. The wavelengths range from 359 to 1020 nm; the 400–780 nm bands lie in the VIS region, and the 780–1000 nm bands lie in the NIR region. The spectral curves are displayed in Figure 4. Figure 4b,c reveal the spectral curves of the 120 healthy samples and the 120 samples infected with gray mold for 24 h, respectively. Figure 4d shows a comparison of the original spectra of healthy and diseased leaves. It is evident that the spectral curve trends of healthy and diseased leaves are similar, but the reflectance differs. This is because spectral reflection is related to the physiological structure of plants, and the physiological parameters of strawberry leaves changed little after 24 h of infection [58]. There is a valley at 392 nm and a reflectance peak at 552 nm, the strong green reflectance peak associated with chlorophyll [59]. Around 680 nm, chlorophyll in the strawberry leaves absorbs strongly, creating an absorption valley [5]. The reflectance rises significantly from 700 nm to 760 nm and then plateaus. In the NIR region, the diseased leaves reflected less light because the chlorophyll content was reduced in the affected area [60].

3.1.2. Modeling Based on Spectral Features

In this design, five deep learning models (LSTM, CNN, BPNN, RF, and KNN) were first established based on the full spectral wavelength range (359–1020 nm). The spectral reflectance values of the 616 bands and the sample types were used as the X and Y variables, respectively. Table 2 shows the detection results of the five deep learning models based on the optimal parameters. All five models achieved test-set accuracies above 80%. Among them, BPNN performed best, reaching an overall detection accuracy of 85.4% on both the training set and the test set. CNN and LSTM both achieved an overall detection accuracy of 81.7% on the test set. RF and KNN performed worst among the five models, with an accuracy of 80% on the test set. These results indicate that the full spectrum can effectively identify early gray mold in strawberry leaves.
Excessive variables complicate the model and reduce classification performance. Hence, the CARS algorithm was employed to select significant fingerprint features. In this study, 200 repeated Monte Carlo sampling runs resulted in seventeen optimal wavelengths (469 nm, 495 nm, 523 nm, 536 nm, 537 nm, 569 nm, 593 nm, 653 nm, 698 nm, 733 nm, 760 nm, 816 nm, 845 nm, 856 nm, 866 nm, 921 nm, and 929 nm), which were selected as the fingerprint features for the follow-up experiments. The process of selecting the significant wavelengths with CARS and the distribution of the selected wavelengths are shown in Figure 5.
The 17 wavelengths selected by the CARS algorithm can represent the full spectrum while enabling simpler and more stable models. Table 3 displays the performance of the five deep learning models. On the test set, four models (CNN, LSTM, KNN, and RF) achieved over 90% classification accuracy; the accuracy of CNN and LSTM reached 91.67%, while the recognition accuracy of BPNN was the lowest at 86.7%. For all five models, the test-set detection accuracy based on the 17 selected fingerprint features was higher than that based on the full wavelength range. These encouraging results indicate that the original full wavelength range can be effectively replaced by the 17 selected fingerprint features. At the same time, the number of input variables for the five deep learning classifiers was reduced, thereby simplifying the models and improving detection speed.

3.2. Vegetation Indices Analysis and Modeling

3.2.1. Vegetation Indices

Many studies have shown that vegetation indices, formed by combinations of visible and near-infrared bands, are a highly effective means of detecting plant disease [61]. In this study, 40 vegetation indices related to pigment, structure, physiology, and water content were selected from the most relevant articles and employed to detect early strawberry leaf gray mold disease. Among them, the normalized difference vegetation index (NDVI) is one of the most prevalent and widely used features and can assess the overall health of crops [62]. In addition, the photochemical reflectance index (PRI), which is sensitive to changing carotenoid pigments, was computed in this study [63]. The red edge index (REI) captures the rising gradient of the reflectance curve between the red and near-infrared bands [64]. The center bands of PhRI, NBNDVI, and RVSI are adjacent to the near-infrared, yellow, and red edge bands [65].

3.2.2. Modeling Based on Vegetation Indices

Table 4 reveals the performance of the five identification models based on the 40 vegetation indices. The recognition accuracies of the BPNN, CNN, LSTM, KNN, and RF models on the prediction set were 83.3%, 84.4%, 80%, 86.67%, and 90%, respectively. The CNN and RF models based on the vegetation indices performed better than those based on full-spectrum data: the detection accuracy of CNN was 2.7% higher than with the full wavelength range, and that of RF was 10% higher [66,67]. This is because vegetation indices enhance the characteristics of the sample by combining reflectance at two or more wavelengths, amplifying the spectral differences between healthy and diseased leaves. The detection accuracies of the BPNN, LSTM, and KNN models based on the vegetation indices differ little from those based on the full wavelength range, indicating that vegetation indices can effectively identify early gray mold on strawberry leaves.
However, calculating the 40 vegetation indices requires extracting the spectral reflectance at more than 40 wavelengths, and the 40 indices may contain redundant or invalid information [68]. Therefore, it is essential to further reduce the number of vegetation index inputs. Firstly, Pearson correlation analysis was utilized to evaluate the correlations between the 40 vegetation indices. Figure 6 and Figure 7 display the correlation coefficients among the different vegetation indices. It is clear that some vegetation indices are strongly correlated while others are not. For example, the correlations between Pssb1 and RGI, RARSa, RARSb, PSSRa, RARSc, PRI, SIPI, and NPCI were −0.5307, −0.6551, 0.9224, 0.9587, 0.9745, −0.1495, 0.9434, and 0.4729, respectively. The correlation between RARSc and PRI was low (−0.292), while that between RARSc and SIPI was high (0.9568), and NPCI showed no correlation with PRI or RGI (less than 0.2). The correlation analysis of the different vegetation indices provides a theoretical basis for further extracting the most effective vegetation indices and improves the robustness of the detection model.
In this study, 21 vegetation indices with correlations less than 0.05 were selected by the Pearson correlation analysis for subsequent use. Then, the SPA algorithm was employed to select the five vegetation indices (RARSb, AntGitelson, ARI, FRI1, and FRI3) with higher COSS values (above 75) among the 21 vegetation indices. Figure 7 reveals the COSS of the 21 vegetation indices. Table 5 shows the detection accuracy of the five deep learning recognition models constructed with the five extracted vegetation indices. The recognition accuracies of the five models on the test set were 81.6%, 83.3%, 83.3%, 85%, and 84.4%, respectively, a slight decrease compared with the previous 40 vegetation indices. This is because some useful information was also removed when the five most important vegetation indices were selected. However, the number of required spectral reflectance values was reduced while the recognition accuracy was maintained. Compared with the model based on 40 vegetation indices, the model based on the 5 significant vegetation indices requires less computation and offers higher disease detection efficiency. This also suggests that vegetation index features have great advantages in the detection of plant disease.

3.3. Modeling Based on Fusion Features and Comparison of Different Features in Modeling

In this study, five deep learning recognition models (LSTM, CNN, BPNN, RF, and KNN) were established with hyperspectral fingerprint features combined with the vegetation indices as inputs to identify early (24 h) gray mold on strawberry leaves. Table 6 shows the performance of the five models based on the fused features. Figure 8 summarizes the detection results of the five models based on the fingerprint features alone, the vegetation indices alone, and the fused features (fingerprint features combined with vegetation indices). The detection accuracies of the five deep learning models based on the fused features were 93.3%, 96.6%, 88.9%, 90%, and 91.8%, respectively. Compared with the single fingerprint features, the fused input improved the accuracy of all models except KNN and LSTM. Compared with the high-score vegetation indices alone, the detection accuracy changed by 6.6%, 4.93%, −2.77%, 0%, and 1.6%, and the contrast differences were 11.7%, 13.3%, 5.6%, 5%, and 7.4%, respectively. This is because the combined features contain a great deal of disease-related information and reveal different aspects of the differences between healthy and diseased leaves. The CNN model based on the fused features performed best, with a classification accuracy as high as 96.6%. The confusion matrices obtained by evaluating the BPNN, CNN, LSTM, RF, and KNN models on the test set with the fused features are shown in Figure 9. The BPNN model had an accuracy of 0.867 and a recall of 0.851. The CNN model showed excellent performance, with an accuracy of 0.900 and a recall of 0.963. The LSTM model had an accuracy of 0.867 and a recall of 0.889, the RF model an accuracy of 0.833 and a recall of 0.889, and the KNN model an accuracy of 0.833 and a recall of 0.925.
These results show that the CNN model outperforms the other models on both metrics, highlighting its effectiveness in correctly identifying samples in the test dataset. In addition, building the detection model on the fingerprint features combined with the vegetation indices improves the stability and accuracy of classification. We therefore have reason to believe that the combination of fingerprint features and vegetation indices is conducive to the early identification of diseases.
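As a quick sanity check on such figures, per-class precision, recall, and overall accuracy can all be derived from a binary confusion matrix. The sketch below uses hypothetical counts for a 57-leaf test set, not the paper's actual matrix:

```python
def binary_metrics(tn, fp, fn, tp):
    """Precision, recall and overall accuracy for the 'diseased' class
    of a 2x2 confusion matrix (tn, fp, fn, tp are raw counts)."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    accuracy = (tp + tn) / (tn + fp + fn + tp)
    return precision, recall, accuracy

# Hypothetical counts (30 healthy, 27 diseased leaves in the test set):
p, r, a = binary_metrics(tn=26, fp=4, fn=1, tp=26)
print(round(p, 3), round(r, 3), round(a, 3))  # 0.867 0.963 0.912
```

Note that precision and overall accuracy diverge whenever the two classes are misclassified at different rates, which is why a model's confusion-matrix precision can differ from its headline accuracy.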
The training time of each model is displayed in Table 7.

4. Conclusions

This study demonstrated the feasibility of using hyperspectral imaging combined with multi-dimensional feature fusion (spectral fingerprint features and vegetation indices) and deep learning algorithms for the early detection of gray mold in strawberry leaves. The spectral reflectance at 616 wavelengths in the visible and near-infrared range (359–1020 nm) was extracted from hyperspectral image regions of interest, and 40 vegetation indices related to leaf physiological characteristics were calculated. To reduce the model input and improve detection speed, the CARS algorithm was employed to reduce the dimensionality of the 616 reflectance variables, yielding 17 fingerprint features, and the SPA algorithm was used to select five significant vegetation indices. Five recognition models (LSTM, CNN, BPNN, RF, and KNN) were then established based on the fingerprint features, the vegetation indices, and their fusion to detect early (within 24 h) gray mold in strawberry leaves. The results obtained are encouraging and hold great promise for future research. The CNN model based on the fusion features achieved the highest recognition accuracy, 96.6%. Moreover, fusing the spectral fingerprint features and the vegetation indices can effectively overcome the sensitivity and uncertainty of any single feature type, improving fault tolerance, robustness, and generalizability in practical applications. There is reason to believe that the combination of fingerprint features and vegetation indices is conducive to the early identification of diseases.
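The fused input described above is simply the concatenation of the 17 CARS-selected band reflectances with the 5 SPA-selected vegetation indices, giving a 22-dimensional vector per leaf. A minimal sketch of this fusion and of a plain k-nearest-neighbour classifier of the kind used in the study follows; the prototype samples and reflectance values are invented for illustration, not measured data:

```python
import math

def fuse(fingerprint, vis):
    """Concatenate 17 CARS-selected reflectances with 5 SPA-selected indices."""
    assert len(fingerprint) == 17 and len(vis) == 5
    return list(fingerprint) + list(vis)  # 22-dimensional fused vector

def knn_predict(train, labels, x, k=3):
    """Plain k-nearest-neighbour majority vote by Euclidean distance."""
    order = sorted(range(len(train)), key=lambda i: math.dist(train[i], x))
    votes = [labels[i] for i in order[:k]]
    return max(set(votes), key=votes.count)

# Invented prototypes (healthy leaves here reflect more than diseased ones):
train = [fuse([0.60] * 17, [0.80] * 5), fuse([0.58] * 17, [0.78] * 5),
         fuse([0.45] * 17, [0.55] * 5), fuse([0.47] * 17, [0.57] * 5)]
labels = ["healthy", "healthy", "diseased", "diseased"]

query = fuse([0.59] * 17, [0.79] * 5)
print(knn_predict(train, labels, query))  # healthy
```

The deep models (BPNN, CNN, LSTM) consume the same 22-dimensional fused vector; only the classifier on top of the fusion changes.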
This design presents significant innovations by integrating multi-dimensional feature fusion and employing diverse deep learning algorithms for precise disease detection in strawberries. Specifically, this approach enhances disease recognition model accuracy and stability by fusing spectral fingerprint characteristics with vegetation indices. In addressing the nonlinear relationships across varied dimensions, multiple deep learning algorithms were utilized to create a detection model capable of accurately identifying strawberry diseases. These results provide scientific support for strawberry disease diagnosis, reducing costs associated with delayed traditional diagnostics. The findings further offer theoretical guidance for subsequent disease prevention and field management and serve as an essential reference for future agricultural management.
Future research will prioritize exploring additional effective hyperspectral leaf features to improve feature fusion; for example, texture or color features could be added as inputs to enable multi-dimensional disease detection across spectral and spatial information, thereby enhancing robustness in practical applications. Moreover, expanding the sample dataset to include diverse disease types and severity levels will enable more rigorous testing of algorithm effectiveness and reliability, with models potentially capable of identifying multiple strawberry diseases (such as anthracnose and gray mold) at varying infection stages (e.g., 12, 24, 48, and 72 h). Enhanced deep learning frameworks such as 3D-CNN and VGG16 will be incorporated to allow for more rapid and efficient early-stage disease detection in strawberry leaves. The model will ultimately be made field-deployable through a portable strawberry leaf disease detection device equipped with a micro-spectrometer for on-site hyperspectral image acquisition and real-time data analysis. Such an application would enable quick, precise disease assessments, provide scientific guidance for disease management and fertilization practices, and improve economic returns in strawberry cultivation, contributing to the automation of agricultural disease management and accelerating the progress of modern agricultural practices.

Author Contributions

Conceptualization, J.Y. and Y.O.; methodology, Z.L. and Y.O.; validation, Y.O. and J.Y.; formal analysis, J.Y. and Y.O.; writing—original draft preparation, Y.O., J.Y. and Z.L.; writing—review and editing, J.Y. and Y.O.; supervision, B.Z.; project administration, B.Z. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by the National Natural Science Foundation of China (Project no. 32472010), the Jiangsu Agricultural Science and Technology Innovation Fund (JASTIF) (Grant no. CX(24)3027), and the Natural Science Foundation of Jiangsu Province (Grant no. BK20231478).

Data Availability Statement

Raw data supporting the conclusions of this paper will be provided upon request.

Acknowledgments

Thanks for the support of the National Natural Science Foundation of China (Project no. 32472010), the Jiangsu Agricultural Science and Technology Innovation Fund (JASTIF) (Grant no. CX(24)3027), and the Natural Science Foundation of Jiangsu Province (Grant no. BK20231478).

Conflicts of Interest

The authors declare no conflicts of interest.

References

  1. Xie, C.; Shao, Y.; Li, X.; He, Y. Detection of early blight and late blight diseases on tomato leaves using hyperspectral imaging. Sci. Rep. 2015, 5, 1–11. [Google Scholar] [CrossRef] [PubMed]
  2. Meena, S.V.; Dhaka, V.S.; Sinwar, D. Exploring the role of Vegetation indices in Plant diseases Identification. In Proceedings of the 2020 IEEE Sixth International Conference on Parallel, Distributed and Grid Computing (PDGC), Waknaghat, Solan, India, 6–8 November 2020; pp. 372–377. [Google Scholar] [CrossRef]
  3. Huang, M.Y.; Wang, J.H.; Huang, W.J.; Huang, Y.D.; Zhao, C.J.; Wan, A.M. Hyperspectral character of stripe rust on winter wheat and monitoring by remote sensing. Trans. Chin. Soc. Agric. Eng. 2003, 19, 154–158. [Google Scholar] [CrossRef]
  4. Chen, B.; Li, S.; Wang, K. Spectrum characteristics of cotton single leaf infected by verticillium wilt and estimation on severity level of disease. Sci. Agric. Sin. 2007, 40, 2709–2715. [Google Scholar] [CrossRef]
  5. Zhou, Y.; Chen, J.; Ma, J.; Han, X.; Chen, B.; Li, G.; Xiong, Z.; Huang, F. Early warning and diagnostic visualization of Sclerotinia infected tomato based on hyperspectral imaging. Sci. Rep. 2022, 12, 21140. [Google Scholar] [CrossRef]
  6. Xie, C.; Yang, C.; He, Y. Detection of grey mold disease on tomato leaves at different infected stages using hyperspectral imaging. In Proceedings of the 2016 ASABE Annual International Meeting, Orlando, FL, USA, 17–20 July 2016; p. 162462686. [Google Scholar] [CrossRef]
  7. Mahlein, A.-K.; Rumpf, T.; Welke, P.; Dehne, H.-W.; Plümer, L.; Steiner, U.; Oerke, E.-C. Development of spectral indices for detecting and identifying plant diseases. Remote Sens. Environ. 2013, 128, 21–30. [Google Scholar] [CrossRef]
  8. Meng, R.; Lv, Z.; Yan, J.; Chen, G.; Zhao, F.; Zeng, L.; Xu, B. Development of Spectral Disease Indices for Southern Corn Rust Detection and Severity Classification. Remote Sens. 2020, 12, 3233. [Google Scholar] [CrossRef]
  9. Mandal, N.; Adak, S.; Das, D.K.; Sahoo, R.N.; Mukherjee, J.; Kumar, A.; Chinnusamy, V.; Das, B.; Mukhopadhyay, A.; Rajashekara, H.; et al. Spectral characterization and severity assessment of rice blast disease using univariate and multivariate models. Front. Plant Sci. 2023, 14, 1067189. [Google Scholar] [CrossRef]
  10. Cao, Y.; Yuan, P.; Xu, H.; Martínez-Ortega, J.; Feng, J.; Zhai, Z. Detecting Asymptomatic Infections of Rice Bacterial Leaf Blight Using Hyperspectral Imaging and 3-Dimensional Convolutional Neural Network with Spectral Dilated Convolution. Front. Plant Sci. 2022, 13, 963170. [Google Scholar] [CrossRef]
  11. Bao, D.; Zhou, J.; Bhuiyan, S.A.; Adhikari, P.; Tuxworth, G.; Ford, R.; Gao, Y. Early detection of sugarcane smut and mosaic diseases via hyperspectral imaging and spectral-spatial attention deep neural networks. J. Agric. Food Res. 2024, 18, 101369. [Google Scholar] [CrossRef]
  12. Lee, C.C.; Koo, V.C.; Lim, T.S.; Lee, Y.P.; Abidin, H. A multi-layer perceptron-based approach for early detection of BSR disease in oil palm trees using hyperspectral images. Heliyon 2022, 8, e09252. [Google Scholar] [CrossRef]
  13. Ju, C.; Chen, C.; Li, R.; Zhao, Y.; Zhong, X.; Sun, R.; Liu, T.; Sun, C. Remote sensing monitoring of wheat leaf rust based on UAV multispectral imagery and the BPNN method. Food Energy Secur. 2023, 12, e477. [Google Scholar] [CrossRef]
  14. Zhang, B.; Li, J.; Fan, S.; Huang, W.; Zhao, C.; Liu, C.; Huang, D. Hyperspectral imaging combined with multivariate analysis and band math for detection of common defects on peaches (Prunus persica). Comput. Electron. Agric. 2015, 114, 14–24. [Google Scholar] [CrossRef]
  15. Ferreira, R.D.A.; Teixeira, G.; Peternelli, L.A. Kennard-Stone method outperforms the Random Sampling in the selection of calibration samples in SNPs and NIR data. Ciência Rural 2021, 52, e20201072. [Google Scholar] [CrossRef]
  16. Feng, Z.H.; Wang, L.Y.; Yang, Z.Q.; Zhang, Y.Y.; Li, X.; Song, L.; He, L.; Duan, J.Z.; Feng, W. Hyperspectral monitoring of powdery mildew disease severity in wheat based on machine learning. Front. Plant Sci. 2022, 13, 828454. [Google Scholar] [CrossRef]
  17. Blackburn, G.A. Quantifying chlorophylls and caroteniods at leaf and canopy scales: An evaluation of some hyperspectral approaches. Remote Sens. Environ. 1998, 66, 273–285. [Google Scholar] [CrossRef]
  18. Albetis, J.; Jacquin, A.; Goulard, M.; Poilvé, H.; Rousseau, J.; Clenet, H.; Dedieu, G.; Duthoit, S. On the potentiality of UAV multispectral imagery to detect Flavescence dorée and Grapevine Trunk Diseases. Remote Sens. 2018, 11, 23. [Google Scholar] [CrossRef]
  19. Chappelle, E.W.; Kim, M.S.; McMurtrey, J.E. III. Ratio analysis of reflectance spectra (RARS): An algorithm for the remote estimation of the concentrations of chlorophyll a, chlorophyll b, and carotenoids in soybean leaves. Remote Sens. Environ. 1992, 39, 239–247. [Google Scholar] [CrossRef]
  20. Gamon, J.A.; Penuelas, J.; Field, C.B. A narrow-waveband spectral index that tracks diurnal changes in photosynthetic efficiency. Remote Sens. Environ. 1992, 41, 35–44. [Google Scholar] [CrossRef]
  21. Penuelas, J.; Baret, F.; Filella, I. Semi-empirical indices to assess carotenoids/chlorophyll a ratio from leaf spectral reflectance. Photosynthetica 1995, 31, 221–230. [Google Scholar]
  22. Filella, I.; Serrano, L.; Serra, J.; Peñuelas, J. Evaluating wheat nitrogen status with canopy reflectance indices and discriminant analysis. Crop Sci. 1995, 35, 1400–1405. [Google Scholar] [CrossRef]
  23. Wang, Z.J.; Wang, J.H.; Liu, L.Y.; Huang, W.J.; Zhao, C.J.; Wang, C.Z. Prediction of grain protein content in winter wheat (Triticum aestivum L.) using plant pigment ratio (PPR). Field Crops Res. 2004, 90, 311–321. [Google Scholar] [CrossRef]
  24. Rondeaux, G.; Steven, M.; Baret, F. Optimization of soil-adjusted vegetation indices. Remote Sens. Environ. 1996, 55, 95–107. [Google Scholar] [CrossRef]
  25. Wikantika, K.; Ghazali, M.F.; Dwivany, F.M.; Susantoro, T.M.; Yayusman, L.F.; Sunarwati, D.; Sutanto, A. A Study on the Distribution Pattern of Banana Blood Disease (BBD) and Fusarium Wilt Using Multispectral Aerial Photos and a Handheld Spectrometer in Subang, Indonesia. Diversity 2023, 15, 1046. [Google Scholar] [CrossRef]
  26. Gitelson, A.A.; Gritz, Y.; Merzlyak, M.N. Relationships between leaf chlorophyll content and spectral reflectance and algorithms for non-destructive chlorophyll assessment in higher plant leaves. J. Plant Physiol. 2003, 160, 271–282. [Google Scholar] [CrossRef]
  27. Merzlyak, M.N.; Gitelson, A.A.; Chivkunova, O.B.; Rakitin, V.Y. Non-destructive optical detection of pigment changes during leaf senescence and fruit ripening. Physiol. Plant. 1999, 106, 135–141. [Google Scholar] [CrossRef]
  28. Lewis, H.G.; Brown, M. A generalized confusion matrix for assessing area estimates from remotely sensed data. Int. J. Remote Sens. 2001, 22, 3223–3235. [Google Scholar] [CrossRef]
  29. Birth, G.S.; McVey, G.R. Measuring the color of growing turf with a reflectance spectrophotometer 1. Agron. J. 1968, 60, 640–643. [Google Scholar] [CrossRef]
  30. Zarco-Tejada, P.J.; Berjón, A.; López-Lozano, R.; Miller, J.R.; Martín, P.; Cachorro, V.; González, M.R.; De Frutos, A. Assessing vineyard condition with hyperspectral indices: Leaf and canopy reflectance simulation in a row-structured discontinuous canopy. Remote Sens. Environ. 2005, 99, 271–287. [Google Scholar] [CrossRef]
  31. Thenkabail, P.S.; Smith, R.B.; De Pauw, E. Hyperspectral vegetation indices and their relationships with agricultural crop characteristics. Remote Sens. Environ. 2000, 71, 158–182. [Google Scholar] [CrossRef]
  32. Chandel, A.K.; Khot, L.R.; Sallato, B. Apple powdery mildew infestation detection and mapping using high-resolution visible and multispectral aerial imaging technique. Sci. Hortic. 2021, 287, 110228. [Google Scholar] [CrossRef]
  33. Barnes, E.M.; Clarke, T.R.; Richards, S.E.; Colaizzi, P.D.; Haberland, J.; Kostrzewski, M.; Waller, P.; Choi, C.; Riley, E.; Thompson, T.; et al. Coincident detection of crop water stress, nitrogen status and canopy density using ground based multispectral data. In Proceedings of the Fifth International Conference on Precision Agriculture, Bloomington, MN, USA, 16–19 July 2000. [Google Scholar]
  34. Merton, R.; Huntington, J. Early simulation results of the ARIES-1 satellite sensor for multi-temporal vegetation research derived from AVIRIS. In Proceedings of the Eighth Annual JPL Airborne Earth Science Workshop, Pasadena, CA, USA, 9–11 February 1999. [Google Scholar]
  35. Haboudane, D.; Miller, J.R.; Pattey, E.; Zarco-Tejada, P.J.; Strachan, I.B. Hyperspectral vegetation indices and novel algorithms for predicting green LAI of crop canopies: Modeling and validation in the context of precision agriculture. Remote Sens. Environ. 2004, 90, 337–352. [Google Scholar] [CrossRef]
  36. Baloloy, A.B.; Blanco, A.C.; Candido, C.G.; Argamosa, R.J.L.; Dumalag, J.B.L.C.; Dimapilis, L.L.C.; Paringit, E.C. Estimation of mangrove forest aboveground biomass using multispectral bands, vegetation indices and biophysical variables derived from optical satellite imageries: Rapideye, planetscope and sentinel-2. ISPRS Ann. Photogramm. Remote Sens. Spat. Inf. Sci. 2018, 4, 29–36. [Google Scholar] [CrossRef]
  37. Chen, B.; Wang, K.; Li, S.; Wang, J.; Bai, J.; Xiao, C.; Lai, J. Spectrum characteristics of cotton canopy infected with verticillium wilt and inversion of severity level. In Computer and Computing Technologies in Agriculture, Volume II, Proceedings of the First IFIP TC 12 International Conference on Computer and Computing Technologies in Agriculture (CCTA 2007), Wuyishan, China, 18–20 August 2007; Springer Nature: Boston, MA, USA, 2008; pp. 1169–1180. [Google Scholar]
  38. Broge, N.H.; Mortensen, J.V. Deriving green crop area index and canopy chlorophyll density of winter wheat from spectral reflectance data. Remote Sens. Environ. 2002, 81, 45–57. [Google Scholar] [CrossRef]
  39. Zarco-Tejada, P.J.; Miller, J.R.; Mohammed, G.H.; Noland, T.L. Chlorophyll fluorescence effects on vegetation apparent reflectance: I. Leaf-level measurements and model simulation. Remote Sens. Environ. 2000, 74, 582–595. [Google Scholar] [CrossRef]
  40. Dobrowski, S.Z.; Pushnik, J.C.; Zarco-Tejada, P.J.; Ustin, S.L. Simple reflectance indices track heat and water stress-induced changes in steady-state chlorophyll fluorescence at the canopy scale. Remote Sens. Environ. 2005, 97, 403–414. [Google Scholar] [CrossRef]
  41. Sims, D.A.; Gamon, J.A. Relationships between leaf pigment content and spectral reflectance across a wide range of species, leaf structures and developmental stages. Remote Sens. Environ. 2002, 81, 337–354. [Google Scholar] [CrossRef]
  42. Barnes, J.D.; Balaguer, L.; Manrique, E.; Elvira, S.; Davison, A.W. A reappraisal of the use of DMSO for the extraction and determination of chlorophylls a and b in lichens and higher plants. Environ. Exp. Bot. 1992, 32, 85–100. [Google Scholar] [CrossRef]
  43. Merton, R. Monitoring community hysteresis using spectral shift analysis and the red-edge vegetation stress index. In Proceedings of the Seventh Annual JPL Airborne Earth Science Workshop, Pasadena, CA, USA, 12–16 January 1998. [Google Scholar]
  44. Klein, D.; Menz, G. Monitoring of seasonal vegetation response to rainfall variation and land use in East Africa using ENVISAT MERIS data. In Proceedings of the 2005 IEEE International Geoscience and Remote Sensing Symposium, 2005. IGARSS’05, Seoul, Republic of Korea, 25–29 July 2005; IEEE: New York, NY, USA, 2005; Volume 4, pp. 2884–2887. [Google Scholar]
  45. Peñuelas, J.; Pinol, J.; Ogaya, R.; Filella, I. Estimation of plant water concentration by the reflectance water index WI (R900/R970). Int. J. Remote Sens. 1997, 18, 2869–2875. [Google Scholar] [CrossRef]
  46. Babar, M.A.; Reynolds, M.P.; Van Ginkel, M.; Klatt, A.R.; Raun, W.R.; Stone, M.L. Spectral reflectance to estimate genetic variation for in-season biomass, leaf chlorophyll, and canopy temperature in wheat. Crop Sci. 2006, 46, 1046–1057. [Google Scholar] [CrossRef]
  47. Liu, X.; Wang, H.; Cao, Y.; Yang, Y.; Sun, X.; Sun, K.; Li, Y.; Zhang, J.; Pei, Z. Comprehensive growth index monitoring of desert steppe grassland vegetation based on UAV hyperspectral. Front. Plant Sci. 2023, 13, 1050999. [Google Scholar] [CrossRef]
  48. Liu, B.; Fernandez, M.A.; Liu, T.M.; Ding, S. Investigation of Using Hyperspectral Vegetation Indices to Assess Brassica Downy Mildew. Sensors 2024, 24, 1916. [Google Scholar] [CrossRef] [PubMed]
  49. Chen, X.; Lv, X.; Ma, L.; Chen, A.; Zhang, Q.; Zhang, Z. Optimization and validation of hyperspectral estimation capability of cotton leaf nitrogen based on SPA and RF. Remote Sens. 2022, 14, 5201. [Google Scholar] [CrossRef]
  50. Pan, T.-T.; Sun, D.-W.; Cheng, J.-H.; Pu, H. Regression Algorithms in Hyperspectral Data Analysis for Meat Quality Detection and Evaluation. Compr. Rev. Food Sci. Food Saf. 2016, 15, 529–541. [Google Scholar] [CrossRef] [PubMed]
  51. Yin, J.; Qi, C.; Chen, Q.; Qu, J. Spatial-spectral network for hyperspectral image classification: A 3-D CNN and Bi-LSTM framework. Remote Sens. 2021, 13, 2353. [Google Scholar] [CrossRef]
  52. Bai, X.; Zhou, Y.; Feng, X.; Tao, M.; Zhang, J.; Deng, S.; Lou, B.; Yang, G.; Wu, Q.; Yu, L.; et al. Evaluation of rice bacterial blight severity from lab to field with hyperspectral imaging technique. Front. Plant Sci. 2022, 13, 1037774. [Google Scholar] [CrossRef]
  53. Yan, T.; Xu, W.; Lin, J.; Duan, L.; Gao, P.; Zhang, C.; Lv, X. Combining Multi-Dimensional Convolutional Neural Network (CNN) With Visualization Method for Detection of Aphis gossypii Glover Infection in Cotton Leaves Using Hyperspectral Imaging. Front. Plant Sci. 2021, 12, 604510. [Google Scholar] [CrossRef]
  54. Wu, Y.; Li, L.; Liu, L.; Liu, Y. Nondestructive measurement of internal quality attributes of apple fruit by using NIR spectroscopy. Multimed. Tools Appl. 2019, 78, 4179–4195. [Google Scholar] [CrossRef]
  55. Pereira, L.F.S.; Barbon, S., Jr.; Valous, N.A.; Barbin, D.F. Predicting the ripening of papaya fruit with digital imaging and random forests. Comput. Electron. Agric. 2018, 145, 76–82. [Google Scholar] [CrossRef]
  56. Weng, S.; Qiu, M.; Dong, R.; Wang, F.; Huang, L.; Zhang, D.; Zhao, J. Fast detection of fenthion on fruit and vegetable peel using dynamic surface-enhanced Raman spectroscopy and random forests with variable selection. Spectrochim. Acta Part A Mol. Biomol. Spectrosc. 2018, 200, 20–25. [Google Scholar] [CrossRef]
  57. Abdulridha, J.; Batuman, O.; Ampatzidis, Y. UAV-based remote sensing technique to detect citrus canker disease utilizing hyperspectral imaging and machine learning. Remote Sens. 2019, 11, 1373. [Google Scholar] [CrossRef]
  58. Khan, I.H.; Liu, H.; Li, W.; Cao, A.; Wang, X.; Liu, H.; Yao, X. Early detection of powdery mildew disease and accurate quantification of its severity using hyperspectral images in wheat. Remote Sens. 2021, 13, 3612. [Google Scholar] [CrossRef]
  59. Xie, C.; Yang, C.; He, Y. Hyperspectral imaging for classification of healthy and gray mold diseased tomato leaves with different infection severities. Comput. Electron. Agric. 2017, 135, 154–162. [Google Scholar] [CrossRef]
  60. Lacotte, V.; Peignier, S.; Raynal, M.; Demeaux, I.; Delmotte, F.; da Silva, P. Spatial–Spectral Analysis of Hyperspectral Images Reveals Early Detection of Downy Mildew on Grapevine Leaves. Int. J. Mol. Sci. 2022, 23, 10012. [Google Scholar] [CrossRef]
  61. Wan, L.; Li, H.; Li, C.; Wang, A.; Yang, Y.; Wang, P. Hyperspectral Sensing of Plant Diseases: Principle and Methods. Agronomy 2022, 12, 1451. [Google Scholar] [CrossRef]
  62. Lowe, A.; Harrison, N.; French, A.P. Hyperspectral image analysis techniques for the detection and classification of the early onset of plant disease and stress. Plant Methods 2017, 13, 80. [Google Scholar] [CrossRef]
  63. Behmann, J.; Bohnenkamp, D.; Paulus, S.; Mahlein, A.-K. Spatial Referencing of Hyperspectral Images for Tracing of Plant Disease Symptoms. J. Imaging 2018, 4, 143. [Google Scholar] [CrossRef]
  64. Ray, S.S.; Jain, N.; Arora, R.K.; Chavan, S.; Panigrahy, S. Utility of Hyperspectral Data for Potato Late Blight Disease Detection. J. Indian Soc. Remote Sens. 2011, 39, 161–169. [Google Scholar] [CrossRef]
  65. Lu, J.; Zhou, M.; Gao, Y.; Jiang, H. Using hyperspectral imaging to discriminate yellow leaf curl disease in tomato leaves. Precis. Agric. 2018, 19, 379–394. [Google Scholar] [CrossRef]
  66. Wei, X.; Zhang, J.; Conrad, A.O.; Flower, C.E.; Pinchot, C.C.; Hayes-Plazolles, N.; Chen, Z.; Song, Z.; Fei, S.; Jin, J. Machine learning-based spectral and spatial analysis of hyper-and multi-spectral leaf images for Dutch elm disease detection and resistance screening. Artif. Intell. Agric. 2023, 10, 26–34. [Google Scholar] [CrossRef]
  67. Omaye, J.D.; Ogbuju, E.; Ataguba, G.; Jaiyeoba, O.; Aneke, J.; Oladipo, F. Cross-comparative review of Machine learning for plant disease detection: Apple, cassava, cotton and potato plants. Artif. Intell. Agric. 2024, 12, 127–151. [Google Scholar] [CrossRef]
  68. Javidan, S.M.; Banakar, A.; Vakilian, K.A.; Ampatzidis, Y.; Rahnama, K. Early detection and spectral signature identification of tomato fungal diseases (Alternaria alternata, Alternaria solani, Botrytis cinerea, and Fusarium oxysporum) by RGB and hyperspectral image analysis and machine learning. Heliyon 2024, 10, e38017. [Google Scholar] [CrossRef] [PubMed]
Figure 1. The schematic diagram of the hyperspectral imaging system.
Figure 2. Healthy and gray mold-infected strawberry leaves.
Figure 3. Flowchart of the work.
Figure 4. Spectral behaviors of different types of strawberry leaves: (a) the hyperspectral cube of a gray mold-infected strawberry leaf; (b) spectra of gray mold-infected strawberry leaf samples; (c) spectra of healthy strawberry leaf samples; and (d) the comparison of original spectra of healthy and diseased leaves.
Figure 5. (a) Regression coefficients of each variable; (b) spectral fingerprint feature distribution.
Figure 6. (a) The correlation coefficients diagram of 40 vegetation indices; (b) the detail of the correlation coefficients diagram.
Figure 7. The COSS of 21 VIs obtained by SPA.
Figure 8. Classification accuracy comparison of various machine learning models based on different input features. (a) Full wavelength and fingerprint features; (b) full wavelength and significant vegetation index; (c) full wavelength and full vegetation index; and (d) fingerprint feature, significant vegetation index, and fusion feature.
Figure 9. Confusion matrices of the five models based on the fused features.
Table 1. Vegetation indices used in this study.

| No. | Category | Vegetation Index | Acronym | Equation | Reference |
|---|---|---|---|---|---|
| 1 | Pigment | Pigment-specific simple ratio | PSSRa | R800/R680 | [17] |
| 2 | | Pigment-specific simple ratio | PSSRb | R800/R635 | [17] |
| 3 | | Red-green index | RGI | R690/R550 | [18] |
| 4 | | Ratio analysis of reflection of spectral chlorophyll a | RARSa | R675/R700 | [19] |
| 5 | | Ratio analysis of reflection of spectral chlorophyll b | RARSb | R675/(R700 × R650) | [19] |
| 6 | | Ratio analysis of reflection of spectral chlorophyll c | RARSc | R760/R500 | [19] |
| 7 | | Photochemical reflectance index | PRI | (R531 − R570)/(R531 + R570) | [20] |
| 8 | | Structure-insensitive pigment index | SIPI | (R800 − R445)/(R800 + R680) | [21] |
| 9 | | Normalized pigment chlorophyll index | NPCI | (R680 − R430)/(R680 + R430) | [21] |
| 10 | | Nitrogen reflectance index | NRI | (R570 − R670)/(R570 + R670) | [22] |
| 11 | | Normalized chlorophyll pigment ratio index | NCPI | (R670 − R450)/(R670 + R450) | [21] |
| 12 | | Plant pigment ratio | PPR | (R550 − R450)/(R550 + R450) | [23] |
| 13 | | Optimized soil-adjusted vegetation index | OSAVI | (1 + 0.16) × (R800 − R670)/(R800 + R670 + 0.16) | [24] |
| 14 | | Modified chlorophyll absorption ratio index | MCARI | ((R750 − R705) − 0.2 × (R750 − R550)) × (R750/R705) | [25] |
| 15 | | Anthocyanin (Gitelson) | AntGitelson | (1/R550 − 1/R700) × R780 | [26] |
| 16 | | Plant senescence reflectance index | PSRI | (R660 − R510)/R760 | [27] |
| 17 | | Anthocyanin reflectance index | ARI | 1/R550 − 1/R700 | [28] |
| 18 | Structure | Simple ratio | SR | R900/R680 | [29] |
| 19 | | Greenness index | GI | R554/R667 | [30] |
| 20 | | Narrow-band normalized difference vegetation index | NBNDVI | (R850 − R680)/(R850 + R680) | [31] |
| 21 | | Normalized difference vegetation index | NDVI | (R800 − R670)/(R800 + R670) | [32] |
| 22 | | Red-edge NDVI | RNDVI | (R750 − R705)/(R750 + R705) | [33] |
| 23 | | Ratio vegetation structure index | RVSI | (R712 + R752)/2 − R732 | [34] |
| 24 | | Modified triangular vegetation index | MTVI | 1.2 × (1.2 × (R800 − R550) − 2.5 × (R670 − R550)) | [35] |
| 25 | | Green NDVI | GNDVI | (R750 − R540 + R570)/(R750 + R540 − R570) | [36] |
| 26 | | Modified simple ratio | MSR | (R800/R670 − 1)/(R800/R670 + 1) | [37] |
| 27 | | Triangular vegetation index | TVI | 0.5 × (120 × (R750 − R550) − 200 × (R670 − R550)) | [38] |
| 28 | Physiology | Fluorescence ratio index 1 | FRI1 | R690/R630 | [39] |
| 29 | | Fluorescence ratio index 2 | FRI2 | R750/R800 | [39] |
| 30 | | Fluorescence ratio index 3 | FRI3 | R690/R600 | [40] |
| 31 | | Fluorescence ratio index 4 | FRI4 | R740/R800 | [40] |
| 32 | | Physiological reflectance index | PhRI | (R550 − R531)/(R550 + R531) | [20] |
| 33 | | Modified red-edge simple ratio index | mRESR | (R750 − R445)/(R705 + R445) | [41] |
| 34 | | Normalized pheophytization index | NPQI | (R415 − R435)/(R415 + R435) | [42] |
| 35 | | Red-edge vegetation stress index 1 | RVS1 | (R651 + R750)/2 − R733 | [43] |
| 36 | | Red-edge vegetation stress index 2 | RVS2 | (R651 + R750)/2 − R751 | [43] |
| 37 | | Fluorescence curvature index | FCI | R683² / (R675 × R691) | [39] |
| 38 | | Red edge position | RRE | (R670 + R780)/2 | [44] |
| 39 | Moisture content | Water band index | WBI | R900/R970 | [45] |
| 40 | | Water stress and canopy temperature | WSCT | (R970 − R850)/(R970 + R850) | [46] |
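Each index in Table 1 is a simple arithmetic combination of reflectance values at fixed wavelengths. A minimal sketch of how a few of them could be computed from a reflectance spectrum keyed by wavelength follows; the reflectance values used in the example are illustrative, not measured data:

```python
def vi_from_spectrum(R):
    """Compute a few Table 1 indices from reflectance R keyed by wavelength (nm)."""
    return {
        "NDVI":  (R[800] - R[670]) / (R[800] + R[670]),
        "PRI":   (R[531] - R[570]) / (R[531] + R[570]),
        "PSSRa": R[800] / R[680],
        "WBI":   R[900] / R[970],
    }

# Illustrative reflectance values for a healthy-looking leaf:
R = {531: 0.12, 570: 0.14, 670: 0.05, 680: 0.05,
     800: 0.50, 900: 0.48, 970: 0.45}
vis = vi_from_spectrum(R)
print(round(vis["NDVI"], 3))  # 0.818
```

In practice the hyperspectral camera samples wavelengths on a fine grid, so each R[λ] would be taken from the band nearest to the nominal wavelength in the index definition.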
Table 2. Classification results of the identification models based on 616 spectral features.

| Input Features | No. of Variables | Model | Calibration Accuracy (Healthy / Disease / Overall) | Prediction Accuracy (Healthy / Disease / Overall) |
|---|---|---|---|---|
| Full wavelengths | 616 | BPNN | 0.955 / 0.956 / 0.956 | 0.844 / 0.857 / 0.854 |
| | | CNN | 1 / 1 / 1 | 0.857 / 0.781 / 0.817 |
| | | LSTM | 0.912 / 0.933 / 0.922 | 0.812 / 0.821 / 0.817 |
| | | KNN | 1 / 1 / 1 | 0.7879 / 0.8148 / 0.800 |
| | | RF | 1 / 1 / 1 | 0.826 / 0.818 / 0.800 |
Table 3. Classification results of the identification models based on 17 spectral features.

| Input Features | No. of Variables | Model | Calibration Accuracy (Healthy / Disease / Overall) | Prediction Accuracy (Healthy / Disease / Overall) |
|---|---|---|---|---|
| Fingerprint feature | 17 | BPNN | 0.953 / 0.926 / 0.934 | 0.926 / 0.818 / 0.867 |
| | | CNN | 1 / 1 / 1 | 0.906 / 0.929 / 0.9167 |
| | | LSTM | 0.955 / 0.956 / 0.972 | 0.903 / 0.893 / 0.9167 |
| | | KNN | 1 / 1 / 1 | 0.879 / 0.8 / 0.9 |
| | | RF | 1 / 1 / 1 | 0.871 / 0.932 / 0.902 |
Table 4. Classification results of the identification models based on 40 vegetation indices.

| Input Features | No. of Variables | Model | Calibration Accuracy (Healthy / Disease / Overall) | Prediction Accuracy (Healthy / Disease / Overall) |
|---|---|---|---|---|
| Vegetation index | 40 | BPNN | 0.987 / 0.985 / 0.96 | 0.829 / 0.837 / 0.833 |
| | | CNN | 1 / 1 / 1 | 0.929 / 0.925 / 0.844 |
| | | LSTM | 0.987 / 0.986 / 0.994 | 0.811 / 0.826 / 0.8 |
| | | KNN | 1 / 1 / 1 | 0.8788 / 0.8519 / 0.8667 |
| | | RF | 1 / 1 / 1 | 0.909 / 0.909 / 0.9 |
Table 5. Classification results of the identification models based on five vegetation indices.

| Input Features | No. of Variables | Model | Calibration Accuracy (Healthy / Disease / Overall) | Prediction Accuracy (Healthy / Disease / Overall) |
|---|---|---|---|---|
| Significant vegetation index | 5 | BPNN | 0.989 / 0.988 / 0.9667 | 0.808 / 0.824 / 0.816 |
| | | CNN | 0.978 / 0.976 / 0.9667 | 0.815 / 0.848 / 0.833 |
| | | LSTM | 0.935 / 0.929 / 0.9167 | 0.84 / 0.879 / 0.833 |
| | | KNN | 1 / 1 / 1 | 0.819 / 0.889 / 0.85 |
| | | RF | 1 / 1 / 1 | 0.8667 / 0.9 / 0.844 |
Table 6. Classification results of the identification models based on fused features.

| Input Features | No. of Variables | Model | Calibration Accuracy (Healthy / Disease / Overall) | Prediction Accuracy (Healthy / Disease / Overall) |
|---|---|---|---|---|
| Fingerprint feature + significant vegetation index | 17 + 5 | BPNN | 0.968 / 0.968 / 0.9667 | 0.926 / 0.933 / 0.933 |
| | | CNN | 1 / 1 / 1 | 0.963 / 0.966 / 0.966 |
| | | LSTM | 0.978 / 0.978 / 0.978 | 0.889 / 0.893 / 0.889 |
| | | KNN | 1 / 1 / 1 | 0.8485 / 0.9630 / 0.900 |
| | | RF | 1 / 1 / 1 | 0.926 / 0.92 / 0.918 |
Table 7. Training time of each model.

| No. | Procedure Name | Training/Processing Time |
|---|---|---|
| 1 | Savitzky–Golay smoothing preprocessing | 1.78 s |
| 2 | CARS fingerprint feature extraction | 1851.943 s |
| 3 | Pearson correlation analysis and SPA | 4.186 s |
| 4 | BPNN | 32.294 s |
| 5 | CNN | 29.870 s |
| 6 | KNN | 75.193 s |
| 7 | RF | 18.536 s |
| 8 | LSTM | 26.289 s |
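Timings such as those in Table 7 can be reproduced for any pipeline step with a simple wall-clock wrapper. The sketch below times a moving-average filter, which stands in here for the actual Savitzky-Golay preprocessing; the spectrum is a dummy 616-band vector:

```python
import time

def timed(fn, *args):
    """Run fn(*args) and return its result plus elapsed wall-clock seconds."""
    t0 = time.perf_counter()
    out = fn(*args)
    return out, time.perf_counter() - t0

# Stand-in smoother (5-point moving average), NOT the paper's Savitzky-Golay:
def smooth(spec):
    return [sum(spec[max(0, i - 2):i + 3]) / len(spec[max(0, i - 2):i + 3])
            for i in range(len(spec))]

result, secs = timed(smooth, [0.5] * 616)  # dummy 616-band spectrum
```

`time.perf_counter()` is monotonic and high-resolution, so it is the appropriate stdlib clock for benchmarking individual processing steps.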

Share and Cite

MDPI and ACS Style

Ou, Y.; Yan, J.; Liang, Z.; Zhang, B. Hyperspectral Imaging Combined with Deep Learning for the Early Detection of Strawberry Leaf Gray Mold Disease. Agronomy 2024, 14, 2694. https://doi.org/10.3390/agronomy14112694
