Article

Detection Model and Spectral Disease Indices for Poplar (Populus L.) Anthracnose Based on Hyperspectral Reflectance

1
College of Mechanical and Electronic Engineering, Nanjing Forestry University, Nanjing 210037, China
2
College of Science, Nanjing Forestry University, Nanjing 210037, China
*
Author to whom correspondence should be addressed.
Forests 2024, 15(8), 1309; https://doi.org/10.3390/f15081309 (registering DOI)
Submission received: 14 June 2024 / Revised: 20 July 2024 / Accepted: 24 July 2024 / Published: 26 July 2024
(This article belongs to the Special Issue Artificial Intelligence and Machine Learning Applications in Forestry)

Abstract

Poplar (Populus L.) anthracnose is an infectious disease that seriously affects the growth and yields of poplar trees, and large-scale poplar infections have led to huge economic losses in the Chinese poplar industry. To efficiently and accurately detect poplar anthracnose for improved prevention and control, this study collected hyperspectral data from the leaves of four types of poplar trees, namely healthy trees and those with black spot disease, early-stage anthracnose, and late-stage anthracnose, and constructed a poplar anthracnose detection model based on machine learning and deep learning. We then comprehensively analyzed poplar anthracnose using advanced hyperspectral-based plant disease detection methodologies. Our research focused on establishing a detection model for poplar anthracnose based on small samples, employing the Design of Experiments (DoE)-based entropy weight method to obtain the best preprocessing combination to improve the detection model’s overall performance. We also analyzed the spectral characteristics of poplar anthracnose by comparing typical feature extraction methods (principal component analysis (PCA), variable combination population analysis (VCPA), and the successive projection algorithm (SPA)) with the vegetation index (VI) method (spectral disease indices (SDIs)) for data dimensionality reduction. The results showed notable improvements in the SDI-based model, which achieved 89.86% accuracy. However, this was inferior to the model based on typical feature extraction methods. Nevertheless, it achieved 100% accuracy for both early-stage anthracnose and black spot disease in a controlled environment. We conclude that the SDI-based model is suitable for low-cost detection tasks and is the best poplar anthracnose detection model. These findings contribute to the timely detection of poplar growth and will greatly facilitate the forestry sector’s development.

1. Introduction

Poplar (Populus L.) is widely used in the production of various bioproducts, including lubricants, polymers, pharmaceuticals, and biomass energy [1,2]. In particular, poplar leaves and bark contain biologically active compounds such as salicylates and flavonoids, which act as anti-inflammatory and anti-coagulant agents in the human body [3,4]. Poplar leaves can contain up to 15% salicylic acid, the source of the widely used analgesic aspirin [5,6]. Poplar trees have a short rotation period, fast payback, and great returns [7]. In addition to providing direct economic benefits, poplars play a critical role in carbon storage and nitrogen fixation in the soil [7,8]. Poplars can absorb up to 25 tons of CO2 per hectare, with a theoretical value of approximately USD 1000 [9]. According to the Ninth National Forest Resources Inventory Report, China’s poplar plantations cover an area of 7.57 million hectares, with the accumulation of 546 million cubic meters, ranking first in the world [10].
Anthracnose is an infectious disease that has a detrimental impact on the growth of poplar trees; it begins to manifest in May, peaks in July, and then gradually subsides in October. It is caused by Colletotrichum gloeosporioides, which forms adherent cells that affect the metabolism of substances in the host cells and ultimately swell to pierce the epidermis of the leaf, allowing the pathogen to invade the subepidermal cells [11,12,13,14]. The growth conditions of poplar trees affect the content of essential chemical components, such as active ingredients [15]. Gordon et al. described a potential interaction between the salicinoid pathway and growth regulatory processes. In the early stage of anthracnose, greenish watery spots appear on the leaf surface and then gradually expand into brown spots of indeterminate shape, which develop faster at the leaf margins and veins [6]. In the late stage of the disease, the spots form black, nearly round, small, dot-like conidial discs [16]. The growth of infected poplar trees is adversely affected by the early loss of leaves and the drying up of the branch tips, resulting in a decline in productivity [17,18]. Marssonina brunnea is the main pathogen that causes poplar black spot disease [19]. Brown spots may appear on the surfaces of leaves that are susceptible to black spot disease. As the disease progresses, the spots expand, and the leaves turn black [20]. Black spot disease and anthracnose have similar symptoms and onset times, so the two are easily confused [14,21], which makes it difficult to detect anthracnose in poplar.
Hyperspectral technology is extensively used for non-destructive crop disease detection and has achieved excellent results [22,23,24,25,26,27,28,29]. The VIS–NIR division is a popular band region division method in hyperspectral detection research [30,31,32,33]. Dian et al. collected the VIS–NIR information of five peach varieties for identification. They built a CNN model, which achieved 100% accuracy on the verification dataset and 94.4% on the test dataset [34]. Meng et al. collected the visible–near-infrared spectra of 96 samples from six tree species for the prediction of their carbon content, combined the spectral data with PLS regression, and selected appropriate pretreatment methods to achieve the efficient and non-destructive detection and analysis of tree species’ carbon content [35].
Preprocessing techniques are widely used to denoise redundant hyperspectral data. These techniques are commonly employed in constructing models to detect plant diseases and plants’ physiological indicators [36,37]. Noise has various causative factors, such as signal attenuation due to scattering, which affect the model’s prediction accuracy. Preprocessing methods can remove unwanted changes caused by noise [38]. Shen et al. collected hyperspectral data on winter wheat at various fertility stages and the corresponding chlorophyll content. Denoising, data form transformation, and dimensionality reduction were performed sequentially to obtain 20 preprocessing combinations. The effects of the different combinations on the model were compared. The results indicated that the combination of wavelet packet denoising, a first-order derivative transform, and principal component analysis improved the accuracy and stability of the model [39]. However, using preprocessing techniques targeted at specific types of noise can have additional positive effects [40,41]. As a result, research has focused on the fusion of multiple pretreatment methods. This approach allows the strengths of different methods to be combined while offsetting their individual weaknesses, and it provides a reasonable framework for selecting the most appropriate methods [41]. Bian et al. used the Design of Experiments (DoE) strategy to combine various pretreatment methods. They tested and compared the designed pretreatment combinations using maize, blood, and edible blended oil samples. The results showed that the selective integrated pretreatment method provided stable and accurate results compared to traditional pretreatment methods [42].
Feature engineering can improve the model’s performance, and appropriate feature engineering can also enhance the interpretability of the model [43,44]. Spectral disease indices (SDIs) offer a more efficient feature engineering technique than traditional downscaling algorithms; they involve using a small number of bands to construct new exponential features, allowing the model to achieve a faster response with high accuracy [45]. Previous researchers have used SDIs to achieve excellent spectral data classification and sensitive band extraction on crops such as corn [46].
In addition, feature engineering can facilitate the development of spectral detection devices at specific wavelengths, reducing the instrumentation and research costs (UAVs with thermal imagers cost between USD 5000 and USD 15,000) [47]. Cucho-Padin et al. developed a low-cost UAV remote sensing system for agriculture, including a UAV that could compute specified vegetation indices (VIs) with a multispectral camera, and proposed image processing technology based on open-source software, enabling farmers who cannot afford expensive commercially available hardware and software services to implement a reasonable solution according to their budgets and needs [48].
Precision agriculture often utilizes machine learning and deep learning techniques. The typical algorithms include RF, SVM, the long short-term memory network (LSTM), and the 1D convolutional neural network (1DCNN). The recurrent neural network (RNN) has also been investigated [49,50]. The LSTM is a type of RNN. In a study by He et al., the ability of RF-based and A-LSTM-based models to recognize the winter wheat area using the raw surface reflectance of MODIS remote sensing images was evaluated. The F1 scores of the two models were 0.72 and 0.71, respectively [51]. Recently, the 1DCNN, a variant of CNNs with the potential for processing 1D signals and time series, has gained attention in research [52,53]. Lei Pang et al. achieved 90.11% accuracy on raw spectral data when using hyperspectral imaging and the 1DCNN, proving the potential of the 1DCNN for maize seed viability detection applications [54].
Machine learning can achieve excellent detection results with good interpretability for low-dimensional data inputs. In contrast, deep learning automatically mines the features required for the corresponding task, a capability that conventional machine learning lacks [55]. However, the need for large, extensively labeled datasets poses a serious obstacle when screening both types of learning algorithm to identify the most applicable one for the task [56]. Although large datasets help to obtain more credible experimental results, the high cost of data collection hinders the development and testing of applied models for small- and medium-scale growers [46].
Using machine learning, particularly SDIs and deep learning, in developing detection models may be effective for diagnosing poplar anthracnose. However, there are currently no reports on the spectral features of poplar anthracnose, the development of SDIs for poplar anthracnose detection, or the use of machine learning and deep learning models for poplar anthracnose detection. Therefore, this study focuses on poplar anthracnose. Information-rich bands were acquired, including the visible spectrum (400–760 nm), the near-infrared spectrum (761–2400 nm), and all spectra. The spectral features of poplar anthracnose were analyzed within the 400–2400 nm range. A systematic approach using the Design of Experiments (DoE) method was employed to design preprocessing combination schemes and identify the most suitable one. SDIs were developed to detect poplar anthracnose and were compared with traditional feature extraction methods (PCA, SPA, VCPA). To develop a classification model for poplar anthracnose and to achieve its prompt detection, this study also evaluated and analyzed the differences in detection effectiveness among different models. The aims of this study were (1) to analyze the spectral features of poplar anthracnose, (2) to determine the most relevant single and normalized band differences and construct an SDI for the detection of poplar anthracnose, and (3) to establish a poplar anthracnose detection model based on machine learning and deep learning using an SDI and other typical feature extraction methods. The model was then evaluated to select the most suitable poplar anthracnose detection tools for the desired application.

2. Materials and Methods

2.1. Study Area and Plant Material

As shown in Figure 1, the experimental area was in Qixia District, Nanjing City, Jiangsu Province, China (119.317 E, 32.250 N). The region falls within a subtropical monsoon climate zone, with obvious climate variations, sizeable annual temperature fluctuations, and abundant annual rainfall. The average annual temperature is around 18 °C, with the maximum annual summer temperature reaching 39.7 °C. The annual rainfall can reach 1106 mm. During June–September, anthracnose and black spot disease among poplar are highly prevalent, as observed after years of follow-up sampling.
Poplar leaves in four different stages were collected between June and September from 9:30 to 11:30 a.m. During this period, there is sufficient sunlight, and the leaves have high water content, which is conducive to collecting and preserving fresh samples. The samples collected were healthy leaves and those with black spot disease, early-stage anthracnose, and late-stage anthracnose. To avoid the influence of irrelevant factors such as sunlight and growth position on the experiment, we randomly collected four leaf samples from different positions on each sample tree. Each sampled tree was marked and its GPS position recorded for sample tracking. The collected leaf samples were vacuum-sealed, labeled, and refrigerated in airtight containers at −20 °C. The time interval between the field sampling of the poplar leaves and the laboratory collection of hyperspectral data was kept within 4 h. Poplar leaf samples were selected and classified by laboratory PCR pathogen detection. Please refer to Figure 2 and Table 1 for details.

2.2. Data Acquisition

This study utilized a hyperspectral data acquisition system to gather hyperspectral reflectance data from poplar leaves. The system comprised a halogen lamp, experimental table, calibrating panel, spectrometer, power supply, workstation, and control software, as illustrated in Figure 3. The halogen lamp had a power output of 75 W. The experimental table was covered with clean, solid black velvet. The calibrating panel was utilized to correct the light intensity, reduce the noise in the spectral data, and minimize the impact of circuit noise on the results. The hyperspectral instrument used was an ASD FieldSpec 3 spectroradiometer (ASD, Inc., Falls Church, VA, USA), which covered the spectrum from 350 to 2500 nm and involved 2151 spectral bands. The spectral coverage area encompassed hyperspectral data in the visible and infrared bands. The spectrometer was operated using the ViewSpecPro 6.20 control software for acquisition.
The hyperspectral data were collected in a dark room environment. The spectrometer was powered and warmed up for 15 min. Then, the acquisition probe of the spectrometer was aligned with the calibrating panel and covered the lens field of view under the illumination of a halogen lamp. The control software corrected the light intensity and optimized the spectrometer’s sensitivity. The field of view (FOV) of the probe was 15 degrees. Based on the FOV ($\alpha$) and the average of the leaf areas ($A$), the vertical distance ($d$) between the probe and the sample center was estimated to be 4 cm (based on [57], Formula (1), $A \approx 0.87\ \mathrm{cm}^2$), and the probe was fixed with a bracket.
$$A = \pi \cdot \left(\tan\frac{\alpha}{2} \cdot d\right)^{2} \tag{1}$$
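As a quick check of Formula (1), the following minimal sketch evaluates the viewed area for the stated FOV and probe distance. The function name `spot_area_cm2` is a hypothetical helper introduced here for illustration; it is not part of the original study.

```python
import math


def spot_area_cm2(fov_deg: float, distance_cm: float) -> float:
    """Area of the circular spot viewed by a probe with full field of view fov_deg."""
    # Radius of the viewed spot: tan(alpha / 2) * d
    radius = math.tan(math.radians(fov_deg) / 2.0) * distance_cm
    return math.pi * radius ** 2


# Values stated in the text: FOV = 15 degrees, d = 4 cm
area = spot_area_cm2(15.0, 4.0)
print(round(area, 2))  # 0.87
```

With these inputs the result agrees with the leaf area of about 0.87 cm² quoted above.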
Before scanning, we used a black and white board for calibration. We recorded the reflectance of the white board ($W$) and the black board ($B$), as well as the original reflectance ($X$) obtained by scanning the leaf, and finally calculated the reflectance ($R$) of the sample after correction using Formula (2). Each leaf was scanned 5 times, and the average of 5 scans was used as the reflectance value of the sample.
$$R = \frac{X - B}{W - B} \tag{2}$$
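The black and white board correction in Formula (2), followed by averaging of the repeated scans, can be sketched as below. This is an illustrative sketch only: the function names and the two-band toy values are assumptions, and the band-wise application of the correction is assumed rather than stated in the text.

```python
def corrected_reflectance(raw, white, black):
    """Apply R = (X - B) / (W - B) band by band."""
    return [(x - b) / (w - b) for x, w, b in zip(raw, white, black)]


def mean_spectrum(scans):
    """Average several scans of the same leaf, band by band."""
    n = len(scans)
    return [sum(values) / n for values in zip(*scans)]


# Toy two-band example (values are made up for illustration)
white = [0.90, 0.90]
black = [0.10, 0.10]
raw_scans = [[0.50, 0.30], [0.52, 0.28], [0.48, 0.32], [0.50, 0.30], [0.50, 0.30]]

leaf_reflectance = mean_spectrum(
    [corrected_reflectance(scan, white, black) for scan in raw_scans]
)
print([round(r, 3) for r in leaf_reflectance])
```

Because the correction is linear in $X$ for fixed $W$ and $B$, correcting each scan and then averaging gives the same result as averaging first.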

2.3. Data Analysis

This study classified poplar leaves into four stages using hyperspectral reflectance data. The processing steps included band segmentation, data preprocessing, feature engineering, and the construction of classifiers, as shown in Figure 4. (1) A reflection curve analysis was used to calculate the average reflectance in continuous bands and to observe and analyze the spectral features of different samples. (2) The DoE method was used to preprocess the data with various combinations of smoothing, baseline correction, scattering correction, and scaling. The effects of the different preprocessing combinations on model performance were analyzed, and a comprehensive assessment index was used to evaluate them, which identified the optimal preprocessing combination (OPC) and the corresponding preprocessed data. (3) The collected spectral data were segmented into three band regions: the visible region of 400–760 nm (VIS), the near-infrared region of 761–2400 nm (NIR), and all spectra within 400–2400 nm (ALL). The model classification results were compared for the three regions. (4) Feature extraction was performed using four schemes (PCA, SDI, SPA, and VCPA), including the construction of SDIs for poplar anthracnose, to reduce data dimensionality and improve the model’s detection performance. (5) Machine learning classification was performed, including RF and SVM. (6) Deep learning classification, including LSTM and 1DCNN, was performed. (7) Model evaluation was performed, using the overall accuracy and the recognition accuracy for each category to evaluate the detection performance of the classification models.

2.3.1. Reflection Curve Analysis

This study utilized the mean spectral reflectance to plot continuous band graphs to visualize the poplar leaves’ spectral variations and interclass differences. The mean spectral reflectance was the average of the reflectance of all samples within a set of samples of the same class [58].
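A minimal sketch of the class-wise mean spectrum described above is given below. The function name `class_mean_spectra` and the toy spectra are assumptions for illustration; real spectra would contain one value per retained band.

```python
def class_mean_spectra(spectra, labels):
    """Return {class label: band-wise mean reflectance over all samples of that class}."""
    sums, counts = {}, {}
    for spectrum, label in zip(spectra, labels):
        if label not in sums:
            sums[label] = [0.0] * len(spectrum)
            counts[label] = 0
        sums[label] = [s + v for s, v in zip(sums[label], spectrum)]
        counts[label] += 1
    return {
        label: [s / counts[label] for s in total]
        for label, total in sums.items()
    }


# Toy example with two bands per spectrum (made-up values)
spectra = [[0.1, 0.2], [0.3, 0.4], [0.5, 0.6]]
labels = ["Healthy", "Healthy", "Late"]
means = class_mean_spectra(spectra, labels)
print(means)
```

Plotting each entry of `means` against wavelength would give one curve per class, as in a mean reflectance figure.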

2.3.2. Preprocessing

A combination of preprocessing methods was designed based on the DoE approach. These methods include smoothing, removing baseline drift, scattering correction, and scaling hyperspectral data [40]. The effectiveness of these preprocessing methods in denoising and retaining adequate information in hyperspectral data was analyzed using the entropy weighting method. This method measures the weights of the corresponding indicators of the classification model.
(1)
Smoothing
Savitzky–Golay (SG) applies a least-squares polynomial fit to a five-element sliding window of neighboring data points, and the fitted result replaces the original signal.
The Gaussian-weighted moving average (Gaussian) is applied to each data point, with the weights determined by a Gaussian distribution centered on the data point. The window size of the Gaussian-weighted moving average is 5.
(2)
Baseline correction
The airPLS method is an adaptive iterative approach combining penalized least squares and asymmetric weighting to accurately estimate and remove baseline components from spectral data.
The Continuous Wavelet Transform (CWT) decomposes a signal into wavelet coefficients representing signals at different scales and locations in the time and frequency domains. This decomposition is achieved by convolving the signal with wavelet basis functions, each with a specific scale and frequency.
(3)
Scatter correction
Variable Sorting for Normalization (VSN) automatically generates a weighting function for a given set of multivariate signals. This function favors signal variables only affected by additive and multiplicative effects, independent of the response of interest, which ensures objectivity and precision in the normalization process [59].
Multiplicative Scatter Correction (MSC) removes spectral differences caused by differences in scattering levels during spectral measurements (e.g., different measurement positions, different lighting).
(4)
Scaling
Mean centering is a data processing method that scales values to the interval [−1, 1] and centers the mean value at zero.
After mean centering, the subsequent results are unaffected by scale differences between data points; the method is suitable when the maximum and minimum values remain unchanged. The impact of scaling the data on the model was relatively small, so we did not compare multiple data scaling methods. All preprocessing combinations, as shown in Table 2, used mean centering.
(5)
Optimal preprocessing combination (OPC)
The impact of each preprocessed spectral dataset on the classification models’ recognition performance was assessed using each model’s accuracy and F1 score, with higher values indicating better preprocessing performance. The entropy weighting method was used to assign a weight to each indicator across the classification models. The score of each preprocessing method was then calculated based on the indicator weights.
The entropy weighting method begins by normalizing (nondimensionalizing) the two positive indicators, the overall accuracy and the F1 score:
$$y_{ij} = \frac{x_{ij} - \min}{\max - \min} \tag{3}$$
where $x_{ij}$ represents the value of index $j$ of preprocessing method $i$, and $\max$ and $\min$ represent the maximum and minimum values of index $j$ in all preprocessing methods, respectively.
We then calculated the information entropy of indicator $j$, $e_j$:
$$z_{ij} = \frac{y_{ij}}{\sum_{i=1}^{n} y_{ij}} \tag{4}$$
$$e_j = -\frac{\sum_{i=1}^{n} z_{ij} \ln z_{ij}}{\ln n} \tag{5}$$
where $z_{ij}$ represents the proportion of the value of index $j$ of preprocessing method $i$ among all values of index $j$, and $n$ represents the number of preprocessing methods.
Next, we calculated the information utility value of indicator $j$, $d_j$:
$$d_j = 1 - e_j \tag{6}$$
Then, we calculated the weight of indicator $j$, $w_j$:
$$w_j = \frac{d_j}{\sum_{j=1}^{m} d_j} \tag{7}$$
where $m$ indicates the number of indicators.
Finally, we calculated the final score for pretreatment method $i$:
$$PP_{\mathrm{score}_i} = \sum_{j=1}^{m} w_j x_{ij} \tag{8}$$
Subsequently, the pretreatment method with the highest score was selected for analysis.
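The entropy weighting steps above can be sketched in a few lines. This is a hedged illustration rather than the authors' MATLAB implementation: the function name, the toy indicator matrix, and the convention that a zero proportion contributes nothing to the entropy (0 · ln 0 = 0) are assumptions. The sketch also assumes at least two methods and that each indicator takes more than one distinct value.

```python
import math


def entropy_weight_scores(x):
    """x[i][j]: value of positive indicator j for preprocessing method i.

    Returns (weights per indicator, score per preprocessing method).
    """
    n, m = len(x), len(x[0])
    cols = [list(col) for col in zip(*x)]

    weights_raw = []
    for j in range(m):
        lo, hi = min(cols[j]), max(cols[j])
        # Normalize the indicator to [0, 1]
        y = [(v - lo) / (hi - lo) for v in cols[j]]
        # Proportion of each method within the indicator
        total = sum(y)
        z = [v / total for v in y]
        # Information entropy, treating 0 * ln(0) as 0 (assumption)
        e = -sum(v * math.log(v) for v in z if v > 0) / math.log(n)
        # Information utility value
        weights_raw.append(1.0 - e)

    # Normalize utility values into weights that sum to 1
    w = [d / sum(weights_raw) for d in weights_raw]
    # Weighted sum of the original indicator values
    scores = [sum(w[j] * x[i][j] for j in range(m)) for i in range(n)]
    return w, scores


# Toy matrix: 3 preprocessing methods x 2 indicators (made-up values)
x = [[0.90, 0.88], [0.80, 0.85], [0.60, 0.50]]
w, scores = entropy_weight_scores(x)
best = scores.index(max(scores))
print(w, scores, best)
```

In the study, the matrix would have one row per preprocessing combination and one column per model and metric pair.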

2.3.3. Feature Extraction

(1)
Principal component analysis (PCA) reduces the dimensionality of datasets by identifying the principal components, i.e., the directions in which the data change the most in the original feature space. Only the leading principal components, whose cumulative contribution rate (explained variance) exceeds 95%, are retained to construct a new feature dataset.
(2)
The essence of the successive projection algorithm (SPA) is to perform forward feature selection for multiple features to reduce the collinearity in the vector space [60]. After a limited number of iterations of the SPA, the set of feature bands with the lowest RMSE is selected.
(3)
Variable combination population analysis (VCPA) uses the exponential decreasing function (EDF) to determine the space of the feature subset. At the same time, binary matrix sampling (BMS) analyzes the interactions between features in the randomly combined subset to select important features. Finally, the optimal subset with the lowest RMSECV value is obtained via model population analysis (MPA) and PLS regression [61].
(4)
The construction of the spectral disease indices (SDIs) consists of two steps. First, the RELIEF-F algorithm is used to estimate the discrimination ability of each feature based on how well it separates samples of different classes that lie near each other [46,62,63]: (1) for a given sample, find the nearest neighbor of the same class and the nearest neighbor of a different class in the sample set, and record them as the nearest hit and the nearest miss, respectively; (2) calculate the sum of the Euclidean distances of the nearest hit and the nearest miss for the feature to represent the weight of the feature.
Then, an SDI for poplar anthracnose detection was constructed; Formula (9) was used to set up the SDI as follows:
$$SDI = \frac{A + B}{A - B} + C \times D \tag{9}$$
The eigenband sets $A$, $B$, and $C$ were obtained through RELIEF-F screening. The coefficient $D$ was varied within [−1, 1]. The SDI was determined by exhaustively searching all combinations of wavelengths and coefficients for the optimal one, with a search step of 0.5.
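For the PCA scheme in item (1), the rule of retaining components until the cumulative contribution rate exceeds 95% can be sketched as follows. The function name `components_to_keep` is hypothetical, and the example ratios are the PC1 and PC2 contributions reported for the visible region in Section 3.3; the remaining components are omitted because they are not reported.

```python
def components_to_keep(explained_ratio, threshold=0.95):
    """Smallest number of leading components whose cumulative ratio exceeds threshold."""
    cumulative = 0.0
    for k, ratio in enumerate(explained_ratio, start=1):
        cumulative += ratio
        if cumulative > threshold:
            return k
    # Threshold never exceeded: keep every component supplied
    return len(explained_ratio)


# PC1 and PC2 contributions reported for the visible region (400-760 nm)
vis_ratios = [0.9268, 0.0588]
k_vis = components_to_keep(vis_ratios)
print(k_vis)  # 2
```

The explained ratios themselves would come from the eigenvalues of the covariance matrix of the preprocessed spectra.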

2.3.4. Classification Algorithm

This study employed four supervised classification algorithms, each described below, to model the detection of anthracnose on poplar leaves. The algorithms’ performance was compared using a fixed seed of a random number generator, which ensured the repeatability of the experiments.
(1)
Random forest (RF) is an ensemble learning classification model that creates many decision trees, each trained on a random subset of the training data. The final classification result is obtained by aggregating the outputs of the individual decision trees.
(2)
Support vector machine (SVM) is a suitable method for classification tasks involving small-sample, high-dimensional feature datasets [64]. It has been shown to perform well in hyperspectral classification research. SVM separates different sample classes by constructing an optimal hyperplane. Building the hyperplane involves mapping the data to a high-dimensional space using a kernel function and maximizing the margin in that space, i.e., the distance between the nearest data points of the two classes (these points are known as the support vectors). Using kernel functions, SVM can handle both linearly and non-linearly separable data.
(3)
Long short-term memory (LSTM) is a type of recurrent neural network (RNN) that is particularly effective in dealing with long-term dependencies, as opposed to a traditional RNN. The key idea of LSTM is that it can remember information over long periods using a memory unit. The memory unit acts as a storage unit, allowing the model to selectively add, delete, and update information as it processes the input sequence. The LSTM layer is configured with 30 memory units, and Adam is used as the optimization algorithm during training [65,66].
(4)
A typical CNN consists of a convolutional layer, a pooling layer, and a fully connected layer, with the neurons in each layer connected by an activation function. The convolutional layer extracts data features, while the pooling layer reduces the data dimensionality. The non-linear activation function allows the network to learn complex and abstract features and patterns in the data. The fully connected layer completes the classification task. We constructed a 1DCNN based on the working principle of the CNN to classify hyperspectral data. Each layer specifies its configuration, indicating how the data are processed and transformed throughout the network in the sequence classification task.

2.3.5. Model Evaluation

The confusion matrix definitions for the four-class sample classification models used in this study are shown in Table 3.
The model’s performance was evaluated by calculating the overall accuracy (OA) and stage-specific accuracy on the test set through a confusion matrix. The OA was used to evaluate the overall detection performance of the model; the stage-specific accuracy was used to evaluate the performance of the model on healthy leaf samples ($Acc_{\mathrm{Healthy}}$), leaf samples with black spot disease ($Acc_{\mathrm{Black\ Spot}}$), and leaf samples with early-stage ($Acc_{\mathrm{Early}}$) and late-stage anthracnose ($Acc_{\mathrm{Late}}$). The performance evaluation metrics were defined as follows.
$$OA = \frac{\sum_{i} T_{i}}{\sum_{i} T_{i} + \sum_{i \neq j} F_{i,j}} \times 100\%, \quad i, j \in \{\mathrm{Healthy}, \mathrm{Early}, \mathrm{Late}, \mathrm{Black\ Spot}\},\ i \neq j \tag{10}$$
$$Acc_{\mathrm{Early}} = \frac{T_{\mathrm{Early}}}{T_{\mathrm{Early}} + F_{\mathrm{Early},\mathrm{Late}} + F_{\mathrm{Early},\mathrm{Healthy}} + F_{\mathrm{Early},\mathrm{Black}}} \times 100\% \tag{11}$$
$$Acc_{\mathrm{Late}} = \frac{T_{\mathrm{Late}}}{F_{\mathrm{Late},\mathrm{Early}} + T_{\mathrm{Late}} + F_{\mathrm{Late},\mathrm{Healthy}} + F_{\mathrm{Late},\mathrm{Black}}} \times 100\% \tag{12}$$
$$Acc_{\mathrm{Healthy}} = \frac{T_{\mathrm{Healthy}}}{F_{\mathrm{Healthy},\mathrm{Early}} + F_{\mathrm{Healthy},\mathrm{Late}} + T_{\mathrm{Healthy}} + F_{\mathrm{Healthy},\mathrm{Black}}} \times 100\% \tag{13}$$
$$Acc_{\mathrm{Black\ Spot}} = \frac{T_{\mathrm{Black}}}{F_{\mathrm{Black},\mathrm{Early}} + F_{\mathrm{Black},\mathrm{Late}} + F_{\mathrm{Black},\mathrm{Healthy}} + T_{\mathrm{Black}}} \times 100\% \tag{14}$$
All model results were obtained using MATLAB R2022b, and ten-fold cross-validation was used to evaluate the classifier’s performance and optimization results. Specifically, the dataset was randomly divided into a training set (70%) and a test set (30%). During the classifier training process, 10% of each batch of training sets was used as a validation set, and the classifier’s performance was evaluated using the mean of 10 repetitions.
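The overall and stage-specific accuracies defined above can be computed directly from a confusion matrix, as in the following sketch. It assumes that rows hold the true class and columns the predicted class, which is an interpretation of Table 3 rather than something stated here, and the matrix values are made up for illustration.

```python
CLASSES = ["Healthy", "Early", "Late", "Black Spot"]


def overall_accuracy(cm):
    """Percentage of all samples on the diagonal of the confusion matrix."""
    correct = sum(cm[i][i] for i in range(len(cm)))
    total = sum(sum(row) for row in cm)
    return 100.0 * correct / total


def stage_accuracy(cm):
    """Per-class accuracy: correct samples divided by all samples of that true class."""
    return {CLASSES[i]: 100.0 * cm[i][i] / sum(cm[i]) for i in range(len(cm))}


# Illustrative confusion matrix (rows: true class, columns: predicted class)
cm = [
    [8, 1, 1, 0],
    [0, 10, 0, 0],
    [1, 1, 8, 0],
    [0, 0, 0, 10],
]
oa = overall_accuracy(cm)
acc = stage_accuracy(cm)
print(oa, acc)
```

Under this row convention, each stage-specific accuracy is the fraction of samples of one true class that the model labels correctly.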

3. Results

3.1. Spectral Reflectance Curve Analysis

Figure 5 shows the samples’ average reflectance curve in different physiological stages. The average reflectance of the four stages showed a similar trend in the continuous band across the entire spectrum but differed obviously between physiological stages. The maximum reflectance was observed for black spot disease and the minimum for anthracnose in the visible light range of 400–700 nm. At 540–560 nm (which belongs to the green band region), a sudden peak was observed for the four stages of the leaves. In the near-infrared band, an average reflectance difference was observed between anthracnose and black spot disease at 883–1126 nm. Anthracnose had higher average reflectance than black spot disease, and this difference increased with the severity of the disease. The average reflectance value of healthy leaves was between these two stages. Between 1391 and 2400 nm, the mean reflectance values of black spot disease were greater than those of anthracnose, with an obvious difference. However, the difference in the mean reflectance of healthy leaves and those with late-stage anthracnose was slight. Obvious horizontal differences in the spectral characteristics of the four stages appeared between 1600 and 1750 nm. In particular, at 1650 nm, the mean reflectance value of black spot disease had a peak, while the other stages had troughs.

3.2. Preferred Pretreatment Combinations

The original spectrum underwent preprocessing to minimize the impact of noise. In Figure 6, SG and Gaussian smoothing render the reflectance curve smoother while maintaining the shape and trend of the original spectral curves. After CWT processing, the spectral curve has more peaks and troughs. airPLS increases the differences in the reflectance levels of the four stages at 780–1300 nm, while VSN and MSC render the peaks and troughs sharper. These preprocessing methods can impact the final classification results, and these effects are reflected in the final model evaluation.
From the preprocessed data, we modeled and evaluated the accuracy and F1 score. A total of 108 models were evaluated using eight metrics across 27 preprocessed datasets. This is shown in the Supplementary File Table S2.
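The counts above are consistent with a three-level full factorial design over the smoothing, baseline correction, and scatter correction steps, with "no processing" as one level of each. The sketch below enumerates such a design; treating the design as a full factorial is an assumption inferred from the counts and from the combination names reported in this section, not a statement of the authors' exact DoE procedure.

```python
from itertools import product

# Assumed levels for each preprocessing step ("None" = no processing)
SMOOTHING = ["None", "SG", "Gaussian"]
BASELINE = ["None", "airPLS", "CWT"]
SCATTER = ["None", "VSN", "MSC"]

MODELS = ["RF", "SVM", "LSTM", "1DCNN"]
METRICS = ["accuracy", "F1"]

# Every combination of one level per step; mean centering is applied to all of them
combinations = list(product(SMOOTHING, BASELINE, SCATTER))
n_models = len(combinations) * len(MODELS)
n_indicators = len(MODELS) * len(METRICS)

print(len(combinations))  # 27
print(n_models)           # 108
print(n_indicators)       # 8
```

Each of the 27 datasets is then scored on the eight model and metric indicators, which form the input to the entropy weighting method.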
Based on the results of the preprocessing combination, an ablation study was performed. As shown in Table 4 (a–c), the results indicate that preprocessing can improve the model performance; Table 4 (a,e) show the importance of selecting a reasonable preprocessing combination to improve the model accuracy; and Table 4 (b–d) show that the Gaussian + CWT preprocessing combination can improve the model performance.
Therefore, considering the varying impacts of the 27 preprocessing combinations on the classification performance of the four models, the most suitable preprocessing combination for the algorithms was selected through a comprehensive analysis using the entropy weighting method, which involved assigning weights to indicators that reflected the impacts.
As shown in Table 5, the entropy method was used to calculate the variables’ weights (importance), where the maximum value of the indicator weights was the F1 score of SVM (13.893%), and the minimum value was the F1 score of the 1DCNN (9.979%).
Figure 7 shows that Gaussian + CWT + VSN achieved the highest comprehensive score of 0.9531, followed by Gaussian + no processing + MSC, with a score of 0.9052. No processing + airPLS + no processing received the lowest score of 0.3208. Therefore, Gaussian + CWT + VSN was the optimal preprocessing method and was selected for the subsequent stages of the study. This optimal preprocessing combination is referred to as the OPC.
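The entropy weighting used above can be sketched as follows. This is a minimal version of the textbook entropy weight method, assuming min-max scaling, benefit-type indicators (larger is better), and a weighted sum as the comprehensive score; the exact scaling and scoring used in this study may differ, and the function name and toy values are hypothetical.

```python
import math


def entropy_weight_scores(matrix):
    """Return (weights, scores) for a matrix with one row per alternative
    and one column per benefit-type indicator (larger is better)."""
    m = len(matrix)
    # Min-max scale every indicator column to [0, 1].
    scaled_cols = []
    for col in zip(*matrix):
        lo, hi = min(col), max(col)
        scaled_cols.append([(v - lo) / (hi - lo) if hi > lo else 0.0 for v in col])

    # Information entropy of each indicator: a low entropy means the indicator
    # separates the alternatives well, so it receives a larger weight.
    divergences = []
    for col in scaled_cols:
        total = sum(col)
        if total == 0:
            divergences.append(0.0)  # a constant column carries no information
            continue
        probs = [v / total for v in col]
        entropy = -sum(p * math.log(p) for p in probs if p > 0) / math.log(m)
        divergences.append(1.0 - entropy)

    weights = [d / sum(divergences) for d in divergences]
    scores = [
        sum(w * col[i] for w, col in zip(weights, scaled_cols)) for i in range(m)
    ]
    return weights, scores


# Hypothetical indicator values (rows: three preprocessing combinations,
# columns: for example, accuracy and F1 score of two classifiers).
toy = [
    [0.95, 0.94, 0.90, 0.91],
    [0.90, 0.91, 0.60, 0.65],
    [0.85, 0.80, 0.30, 0.20],
]
weights, scores = entropy_weight_scores(toy)
print([round(w, 3) for w in weights])
print([round(s, 3) for s in scores])
```

In this sketch, a combination that is best on every indicator scores 1 and one that is worst on every indicator scores 0, so the scores are directly comparable across combinations.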

3.3. Feature Extraction

We applied PCA, SDI, VCPA, and SPA to the dataset that had not yet undergone feature extraction. Feature extraction was analyzed separately for the VIS, NIR, and ALL band intervals. All evaluations were objective and free from bias.
(1)
Principal component analysis (PCA)
Figure 8 shows the results of the PCA processing. In the visible spectrum (400–760 nm), PC1 (92.68%) and PC2 (5.88%), including the regions 705–720 nm, 725–744 nm, and 757–760 nm, explained 98.56% of the data variation. In the near-infrared spectrum (761–2400 nm), PC1 (63.75%) and PC2 (19.80%), including the regions 761–780 nm and 1653–1670 nm, explained 83.55% of the data variation. For ALL, PC1 (71.69%) and PC2 (9.13%), including the regions 712–716 nm, 725–744 nm, and 762–776 nm, explained 80.82% of the data variation. Figure 8 also shows the scatter distribution of the four sample types in the two-dimensional space of PC1 and PC2. The confidence ellipse of black spot disease does not overlap with that of anthracnose, and there is a clear positional difference, which helps the model to distinguish between black spot disease and anthracnose. In contrast, the confidence ellipse of healthy samples partially overlaps with those of all other classes, which may cause the model to misclassify healthy leaves.
(2)
Spectral disease indices (SDIs)
We developed four representative SDIs for the four classifications, which were defined as follows: (1) the healthy index (HI) represented the healthy stage; (2) the black spot index (BI) represented black spot disease; (3) the early-stage anthracnose index (ESAI) represented early-stage anthracnose; and (4) the late-stage anthracnose index (LSAI) represented late-stage anthracnose. Each sample had four index values, resulting in a reduction in the data’s dimensionality.
The RELIEF-F algorithm can be used to assess the degree of wavelength correlation in a given dataset by separating two classes of samples. The preprocessed sample set is divided into four categories based on the corresponding labels: healthy vs. other, black spot disease vs. other, early-stage anthracnose vs. other, and late-stage anthracnose vs. other. The RELIEF-F algorithm was used to assess the degree of correlation between the single bands and the sample categories, as shown in Figure 9. In the VIS spectrum, the wavelengths highly correlated with the healthy state, black spot disease, and anthracnose (both early-stage and late-stage) were 433 nm, 438 nm, and 710 nm, respectively. In the NIR spectrum, the wavelengths highly correlated with the healthy state, black spot disease, early-stage anthracnose, and late-stage anthracnose were 1357 nm, 2114 nm, 1892 nm, and 1893 nm, respectively. For ALL, the wavelengths highly correlated with the healthy state, black spot disease, and anthracnose (both early-stage and late-stage) were 1698 nm, 2088 nm, and 1893 nm, respectively.
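The one-vs-rest relevance scoring can be illustrated with a small sketch. Note that this implements the basic Relief scheme (one nearest hit and one nearest miss per sample, binary labels), not the full RELIEF-F variant with k neighbors, and the band values and labels are toy data rather than measurements from this study.

```python
def relief_scores(X, y):
    """Basic Relief weights for binary labels (one nearest hit and miss per sample)."""
    n, d = len(X), len(X[0])
    # Scale feature differences by each feature's range so weights are comparable.
    ranges = []
    for j in range(d):
        col = [row[j] for row in X]
        ranges.append((max(col) - min(col)) or 1.0)

    def diff(a, b, j):
        return abs(a[j] - b[j]) / ranges[j]

    def dist(a, b):
        return sum(diff(a, b, j) for j in range(d))

    weights = [0.0] * d
    for i in range(n):
        hits = [k for k in range(n) if k != i and y[k] == y[i]]
        misses = [k for k in range(n) if y[k] != y[i]]
        hit = min(hits, key=lambda k: dist(X[i], X[k]))
        miss = min(misses, key=lambda k: dist(X[i], X[k]))
        # A relevant band differs across classes (miss) but not within a class (hit).
        for j in range(d):
            weights[j] += (diff(X[i], X[miss], j) - diff(X[i], X[hit], j)) / n
    return weights


# Toy reflectance at three hypothetical bands; only the first band separates the classes.
X = [
    [0.10, 0.50, 0.30],
    [0.12, 0.20, 0.31],
    [0.11, 0.80, 0.29],
    [0.90, 0.45, 0.30],
    [0.88, 0.25, 0.32],
    [0.91, 0.75, 0.28],
]
labels = ["healthy", "healthy", "healthy", "late", "early", "black_spot"]
y = [label == "healthy" for label in labels]  # healthy vs. other
relief_weights = relief_scores(X, y)
best_band = max(range(len(relief_weights)), key=lambda j: relief_weights[j])
print(best_band, [round(w, 3) for w in relief_weights])
```

Repeating the call with `y` built for each of the other three classes would give one relevance curve per one-vs-rest task.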
All possible normalized wavelength differences within the sample set were calculated using the following formula, and their relevance to each category of samples was assessed using the RELIEF-F algorithm (Figure 10).
The RELIEF-F algorithm yielded the eight single bands and eight normalized wavelength differences with the highest relevance to the four stages. The four index values for each sample were calculated by combining the single bands and normalized wavelength differences. Ten-fold cross-validation was used to identify the combination of single bands and normalized wavelength differences that gave the minimum RMSECV. This combination was ultimately used to construct the suitable SDIs (Table 6 and Table 7).
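As an illustration of the normalized wavelength differences, the sketch below enumerates all band pairs for one spectrum. It assumes the usual normalized-difference form, (R1 − R2)/(R1 + R2), which is an assumption here rather than a formula confirmed by the text; the reflectance values are invented, and only the three wavelengths are borrowed from the bands named above.

```python
from itertools import combinations


def normalized_differences(spectrum, wavelengths):
    """Map each wavelength pair (w1, w2) to (R1 - R2) / (R1 + R2)."""
    out = {}
    for (w1, r1), (w2, r2) in combinations(list(zip(wavelengths, spectrum)), 2):
        denom = r1 + r2
        out[(w1, w2)] = (r1 - r2) / denom if denom else 0.0
    return out


wavelengths = [433, 710, 1893]   # nm
reflectance = [0.05, 0.20, 0.35]  # toy values for a single leaf
nd = normalized_differences(reflectance, wavelengths)
for pair, value in nd.items():
    print(pair, round(value, 3))
```

For n bands this produces n(n − 1)/2 pairs, so in practice the candidate list grows quickly and a relevance filter is needed to keep only a few pairs per class.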
(3)
VCPA
The VCPA run with the smallest RMSECV was selected to determine the best combination of feature bands. For all three band intervals, the band combination with the smallest RMSECV value could be obtained within a limited number of runs. The results are presented in Table 8 and Figure 11.
Table 8 shows that dimensionality reduction of the VIS, NIR, and ALL data was achieved using VCPA. The RMSECV of the VIS feature bands obtained after cross-validation was larger than that for NIR and ALL, which may lead to poorer performance in the subsequent classification task. The feature bands selected for NIR and ALL overlapped, primarily at 1000–1170 nm, 1600 nm, and 1900 nm.
(4)
SPA
Figure 12 shows the feature bands extracted by SPA. As the number of variables increases, the RMSECV value tends to decrease. However, beyond a certain number of selected variables, the decrease is no longer obvious. Selecting only the necessary variables keeps the RMSECV small while improving the model’s response speed and classification performance.
As shown in Table 9, the feature bands of VIS are mainly distributed in the 405–593 nm range, with a sporadic distribution within 610–760 nm. The SPA-processed NIR values are concentrated within the range of 2162–2398 nm, while the feature bands of ALL are mainly distributed within the range of 2158–2400 nm. The two sets of feature bands overlap.
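The projection step at the core of SPA can be sketched in a few lines. This minimal version assumes a fixed starting band and a fixed number of bands to select; a complete SPA run would additionally try different starting bands and subset sizes and keep the one with the smallest RMSECV, which is omitted here. The function name and the toy matrix are illustrative.

```python
def spa_select(X, start, n_select):
    """Pick n_select column indices of X by successive orthogonal projections."""

    def dot(a, b):
        return sum(x * y for x, y in zip(a, b))

    # Mean-centre each band (column) across samples.
    cols = []
    for col in zip(*X):
        mean = sum(col) / len(col)
        cols.append([v - mean for v in col])

    selected = [start]
    while len(selected) < n_select:
        ref = cols[selected[-1]]
        ref_ss = dot(ref, ref)
        remaining = [j for j in range(len(cols)) if j not in selected]
        # Strip from every unselected band its component along the last pick.
        for j in remaining:
            coef = dot(cols[j], ref) / ref_ss if ref_ss else 0.0
            cols[j] = [v - coef * r for v, r in zip(cols[j], ref)]
        # The band with the largest residual norm is the least redundant one.
        selected.append(max(remaining, key=lambda j: dot(cols[j], cols[j])))
    return selected


# Toy matrix (rows: samples, columns: bands); band 1 is an exact multiple of band 0.
X = [
    [1.0, 2.0, 0.5, 0.9],
    [2.0, 4.0, 0.1, 0.2],
    [3.0, 6.0, 0.9, 0.4],
    [4.0, 8.0, 0.3, 0.8],
]
selected = spa_select(X, start=0, n_select=3)
print(selected)
```

Because band 1 is collinear with the starting band, its residual is zero after the first projection and it is never selected, which is the redundancy-avoiding behavior SPA is designed for.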

3.4. Machine Learning

(1)
The visible spectrum
Table 10 shows that the SVM classification models based on the OPC and VCPA have overall accuracy above 95% (VCPA-SVM: 97.10% > OPC-SVM: 95.65%). There is no obvious difference in classification accuracy between the two. However, VCPA uses far fewer discriminative wavelengths than the OPC dataset. Unlike with PCA and SDI, the SVM models built on the OPC, SPA, and VCPA datasets exhibit higher overall accuracy than the corresponding RF models.
(2)
Near-infrared spectrum
Table 11 shows that the SPA-RF model has the highest overall accuracy of 100% in the NIR band interval. Other classification models that use feature extraction algorithms achieve data dimensionality reduction, but this can result in missing information, which may be why their detection accuracy is lower than that of the OPC dataset.
(3)
All spectra
Table 12 shows that SPA-SVM achieves 100% classification accuracy in the whole band interval, while OPC-SVM has a slightly lower accuracy of 98.55%. However, SPA-SVM uses fewer discriminative wavelengths than OPC-SVM, resulting in higher classification accuracy and a shorter response time, suggesting that SPA is a reliable feature extraction method.
This study’s results indicate that the NIR-SPA-RF and ALL-SPA-SVM models perform the best among all the machine learning models tested, achieving classification accuracy of 100%.
Among the SDI-based models for the three band intervals (VIS, NIR, and ALL), only the ALL-SDI-RF model achieves high accuracy (89.86%). The other SDI-based classification models exhibit a sharp drop in accuracy. However, each SDI correlates strongly with a single class, as demonstrated by the 100% classification accuracy achieved for healthy poplar leaves with NIR-SDI-RF, for early-stage anthracnose leaves with ALL-SDI-RF and ALL-SDI-SVM, and for black spot leaves with ALL-SDI-RF. The results indicate that the SDI has practical application value.
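The difference between overall accuracy and single-class accuracy can be seen in a short sketch. It assumes that per-class accuracy means the fraction of a class's samples that are labeled correctly; the labels and predictions are toy values, not results from this study.

```python
def accuracy_report(y_true, y_pred):
    """Return overall accuracy and the per-class fraction of correctly labeled samples."""
    overall = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
    per_class = {}
    for c in sorted(set(y_true)):
        idx = [i for i, t in enumerate(y_true) if t == c]
        per_class[c] = sum(y_pred[i] == c for i in idx) / len(idx)
    return overall, per_class


# Toy example: one late-stage sample is mistaken for a healthy leaf.
y_true = ["healthy", "healthy", "black_spot", "black_spot", "early", "early", "late", "late"]
y_pred = ["healthy", "healthy", "black_spot", "black_spot", "early", "early", "late", "healthy"]
overall, per_class = accuracy_report(y_true, y_pred)
print(overall)    # prints: 0.875
print(per_class)
```

A model can therefore reach 100% on individual classes while its overall accuracy stays lower, because errors concentrated in one class pull the overall figure down.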

3.5. Deep Learning

As shown in Table 13, the LSTM models based on NIR and ALL had the best overall accuracy (OA = 100%). The LSTM model demonstrated higher overall accuracy than the 1DCNN model across all three band intervals: VIS, NIR, and ALL. Additionally, the classification accuracy of the models constructed based on NIR and ALL was higher than that of the model based on VIS.

4. Discussion and Future Work

Pathogens can affect leaves’ physiological indicators and biochemical responses, which can be observed in spectral characteristics [35,67]. As shown in Figure 6, the spectral curves of the four stages showed obvious differences in NIR. Salehi et al. studied the average reflectance curves of five wheat genotypes in the visible and near-infrared regions and found that the differences in the reflectance values among the genotypes were greater in the near-infrared region than in the visible region [32]. NIR can be used to determine the chemical composition of a sample, including the chlorophyll and carotenoid content, the elemental content of carbon, nitrogen, and phosphorus, and the ¹³C and ¹⁵N levels [3,68,69,70,71,72]. In particular, the differences between anthracnose and black spot disease at 883–1126 nm may be related to the leaf water content [73]. This is reasonable for stressed leaves, where changes in water absorption wavelengths may occur due to the collapse of internal cell structures or absorption by other substances [74]. The mean reflectance curves of the four stages differed greatly at 1600–1750 nm, especially at 1650 nm, which is considered the atmospheric water vapor absorption region [75].
The model results are attributable to the joint effect of the steps in a preprocessing combination rather than to a simple linear superposition of their individual effects [40]. The results showed obvious improvements when using Gaussian + CWT + VSN, Gaussian + CWT + MSC, and Gaussian + CWT + No processing, which illustrates that the combination of Gaussian and CWT can have a positive effect on raw spectral data. The study by Ojo et al. empirically investigated the applicability of various preprocessing methods to perform noise reduction and disease-associated anomaly enhancement in plant images [76].
The accuracy of NIR-SDI-RF for healthy leaves, ALL-SDI-RF and ALL-SDI-SVM for early-stage anthracnose, and ALL-SDI-RF for black spot disease was 100%. According to Al-Saddik et al., common spectral vegetation indices (SVIs) are not disease-specific, and it appears beneficial to design a specific index (SDI) for each infection [77]. In particular, ALL-SDI-RF achieved an overall accuracy of 89.86%, which is comparable to that reported in similar studies [46,62,78,79]. Al-Saddik et al. developed SDIs for ‘Flavescence Dorée’ grapevine disease identification, and the accuracy of most models based on the SDIs was over 90% [77]. In a study by Meng et al., the SDI-based models achieved overall accuracies of 87% and 70% in southern corn rust detection and severity classification, respectively [46]. Further research on SDIs could reduce growers’ spending on detection equipment, since drones or portable spectrometers equipped with multispectral cameras need to capture only the specific wavelengths required to calculate the SDIs for a given plant disease.
Among the machine-learning-based classification models, those based on SVM algorithms generally outperform those based on RF algorithms, except for SDI models with fewer features. SVM has a clear advantage in classifying small-sample, high-dimensional data [64,80]. Detecting disease in field settings may require multiple sensors to obtain multiple features, such as the spectrum, texture, vegetation index, etc. SVM may still apply to such high-dimensional feature data.
The LSTM model generally outperformed the 1DCNN model due to its superior feature extraction and classification decision capabilities. The NIR-LSTM and ALL-LSTM models achieved classification accuracy of 100%. Tang et al. presented a CNN and bi-directional long short-term memory (Bi-LSTM)-based deep learning method (Deep6mAPred), which reached an accuracy of 0.9556 on an independent rice dataset [81]. Turkoglu proposed the multi-model LSTM-based pre-trained convolutional neural networks (MLP-CNNs) [82]. Combinations of CNNs and LSTM have application value in this area and could be explored in future studies [83,84].
In this study, the best preprocessing combination was Gaussian + CWT + VSN, the best machine learning models based on this combination were NIR-SPA-RF and ALL-SPA-SVM, and the best deep learning detection models based on this combination were NIR-LSTM and ALL-LSTM. Future work will focus on studying timely detection techniques for poplar and overcoming the limitations of existing methods by obtaining more samples of different poplar stages from various regions. Additionally, the spectral database on poplars will be expanded, and the model will be further refined.

5. Conclusions

This study used hyperspectral reflectance data to analyze the hyperspectral characteristics of healthy poplar leaves, poplar leaves with black spot disease, and poplar leaves with early-stage and late-stage anthracnose. The optimal pretreatment combination selected was Gaussian + CWT + VSN. A detection model for poplar anthracnose was developed based on the SDI (HI, BI, ESAI, and LSAI), and the ALL-SDI-RF model performed well, with an OA value of 89.86%. Then, models constructed using typical feature extraction methods such as PCA, SPA, and VCPA were analyzed. The best detection models for poplar anthracnose (OA = 100%) were NIR-SPA-RF and ALL-SPA-SVM, based on machine learning, and NIR-LSTM and ALL-LSTM, based on deep learning. This study presents a spectral detection technology for poplar anthracnose that is efficient, non-invasive, rapid, and accurate. This study also achieved the non-invasive detection of poplar anthracnose in a controlled environment. In future studies, the sample categories will be expanded, and comparative experiments on UAV/airborne data will be carried out to evaluate the model’s generalization and to promote forestry disease detection research and forestry industry development.

Supplementary Materials

The following supporting information can be downloaded at: https://www.mdpi.com/article/10.3390/f15081309/s1, Figure S1: Performance of models constructed based on 27 preprocessing combinations: (a) accuracy of all models; (b) F1-score of all models. The X-axis 1–27 represents 27 preprocessing combinations; Figure S2: Structure of 1DCNN: the reflectance of all samples is used as the input of 1DCNN. Through convolution, pooling and full connection, four types of responses are finally obtained to achieve four-category classification; Figure S3: Average reflectance and sensitivity curves of the samples in different physiological states; Figure S4: Cumulative principal component contributions; Table S1: Preprocessing Methods for Hyperspectral Data Analysis in Detecting Poplar Tree Diseases; Table S2: Accuracy and F1-score for 108 models. The results of the best model are shown in red; Table S3: Interpreted bands of PC1 and PC2 after PCA treatment.

Author Contributions

Conceptualization, Z.J.; data curation, Z.J.; formal analysis, Z.J. and Q.D.; funding acquisition, Z.J.; investigation, Q.D. and Y.W.; methodology, Z.J., Q.D. and K.W.; project administration, Z.J. and H.J.; resources, Z.J.; software, Z.J. and Q.D.; supervision, Z.J.; validation, Z.J.; visualization, Q.D.; writing—original draft, Z.J., Q.D. and Y.W.; writing—review and editing, Z.J., Q.D. and Y.W. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by the Modern Agricultural Machinery Equipment and Technology Demonstration and Promotion Project of Jiangsu Province [grant number NJ2022-12]; the Primary Research and Development Plan of Jiangsu Province [grant number BE2022374]; and the Jiangsu Agricultural Science and Technology Innovation Fund [grant number CX (19)3075].

Data Availability Statement

The raw data supporting the conclusions of this article will be made available by the authors on request.

Conflicts of Interest

The authors declare no conflicts of interest.

References

  1. Wang, G.; Dong, Y.; Liu, X.; Yao, G.; Yu, X.; Yang, M. The current status and development of insect-resistant genetically engineered poplar in China. Front. Plant Sci. 2018, 9, 328449.
  2. Hu, J.; Wang, L.; Yan, D.; Lu, M.-Z. Research and application of transgenic poplar in China. In Challenges and Opportunities for the World’s Forests in the 21st Century; Springer: Dordrecht, The Netherlands, 2014; pp. 567–584.
  3. Mazurek, S.; Wlodarczyk, M.; Pielorz, S.; Okinczyc, P.; Kus, P.M.; Dlugosz, G.; Vidal-Yanez, D.; Szostak, R. Quantification of Salicylates and Flavonoids in Poplar Bark and Leaves Based on IR, NIR, and Raman Spectra. Molecules 2022, 27, 3954.
  4. Meshkova, V.; Zhupinska, K.; Borysenko, O.; Zinchenko, O.; Skrylnyk, Y.; Vysotska, N. Possible Factors of Poplar Susceptibility to Large Poplar Borer Infestation. Forests 2024, 15, 882.
  5. Tyśkiewicz, K.; Konkol, M.; Kowalski, R.; Rój, E.; Warmiński, K.; Krzyżaniak, M.; Gil, Ł.; Stolarski, M.J. Characterization of bioactive compounds in the biomass of black locust, poplar and willow. Trees 2019, 33, 1235–1263.
  6. Gordon, H.; Fellenberg, C.; Lackus, N.D.; Archinuk, F.; Sproule, A.; Nakamura, Y.; Köllner, T.G.; Gershenzon, J.; Overy, D.P.; Constabel, C.P. CRISPR/Cas9 disruption of UGT71L1 in poplar connects salicinoid and salicylic acid metabolism and alters growth and morphology. Plant Cell 2022, 34, 2925–2947.
  7. Wang, Z.; Yan, W.; Peng, Y.; Wan, M.; Farooq, T.H.; Fan, W.; Lei, J.; Yuan, C.; Wang, W.; Qi, Y. Biomass production and carbon stocks in poplar-crop agroforestry chronosequence in subtropical central China. Plants 2023, 12, 2451.
  8. Huang, H.; Tian, C.; Huang, Y.; Huang, H. Biological control of poplar anthracnose caused by Colletotrichum gloeosporioides (Penz.) Penz. & Sacc. Egypt. J. Biol. Pest Control 2020, 30, 1–9.
  9. Popp, M.; Nalley, L.; Fortin, C.; Smith, A.; Brye, K. Estimating net carbon emissions and agricultural response to potential carbon offset policies. Agron. J. 2011, 103, 1132–1143.
  10. National Forestry and Grassland Administration of China. The Ninth National Forest Resources Inventory Report. 2019. Available online: http://digitalpaper.stdaily.com/http_www.kjrb.com/kjwzb/html/2023-06/16/content_554941.htm (accessed on 16 June 2023).
  11. Wang, X.; Lu, D.; Tian, C. Mucin Msb2 cooperates with the transmembrane protein Sho1 in various plant surface signal sensing and pathogenic processes in the poplar anthracnose fungus Colletotrichum gloeosporioides. Mol. Plant Pathol. 2021, 22, 1553–1573.
  12. Liu, N.; Meng, F.; Tian, C. Transcriptional network in Colletotrichum gloeosporioides mutants lacking Msb2 or Msb2 and Sho1. J. Fungi 2022, 8, 207.
  13. Li, X.; Sun, H.; Pei, J.; Dong, Y.; Wang, F.; Chen, H.; Sun, Y.; Wang, N.; Li, H.; Li, Y. De novo sequencing and comparative analysis of the blueberry transcriptome to discover putative genes related to antioxidants. Gene 2012, 511, 54–61.
  14. Qin, X.; Tian, C.; Meng, F. Comparative Transcriptome Analysis Reveals the Effect of the DHN Melanin Biosynthesis Pathway on the Appressorium Turgor Pressure of the Poplar Anthracnose-Causing Fungus Colletotrichum gloeosporioides. Int. J. Mol. Sci. 2023, 24, 7411.
  15. Pobłocka-Olech, L.; Głód, D.; Jesionek, A.; Łuczkiewicz, M.; Krauze-Baranowska, M. Studies on the polyphenolic composition and the antioxidant properties of the leaves of poplar (Populus spp.) various species and hybrids. Chem. Biodivers. 2021, 18, e2100227.
  16. Huang, H. Research on Antimicrobial Activity of Antagonistic Endophytic Bacterium against Colletotrichum gloeosporioides. Ph.D. Thesis, Beijing Forestry University, Beijing, China, 2020.
  17. Zhang, L.; Bao, H.; Meng, F.; Ren, Y.; Tian, C. Transcriptome and metabolome reveal the role of flavonoids in poplar resistance to poplar anthracnose. Ind. Crops Prod. 2023, 197, 116537.
  18. Wang, X.; Lu, D.; Tian, C. CgEnd3 regulates endocytosis, appressorium formation, and virulence in the poplar anthracnose fungus Colletotrichum gloeosporioides. Int. J. Mol. Sci. 2021, 22, 4029.
  19. Zhang, Y.; He, W.; Yan, D.-H. Histopathologic characterization of the process of Marssonina brunnea infection in poplar leaves. Can. J. For. Res. 2018, 48, 1302–1310.
  20. Wang, J. The occurrence pattern and comprehensive prevention and control techniques of poplar black spot disease. Contemp. Hortic. 2023, 46, 121–123.
  21. Cellerino, G.P. Review of Fungal Diseases in Poplar; AC492/E; Food and Agriculture Organization of The United Nations: Rome, Italy, 1999.
  22. Wang, F.; Zhao, C.; Yang, H.; Jiang, H.; Li, L.; Yang, G. Non-destructive and in-site estimation of apple quality and maturity by hyperspectral imaging. Comput. Electron. Agric. 2022, 195, 106843.
  23. Zhang, Z.; Yin, X.; Ma, C. Development of simplified models for the nondestructive testing of rice with husk starch content using hyperspectral imaging technology. Anal. Methods 2019, 11, 5910–5918.
  24. Jia, B.; Wang, W.; Ni, X.; Lawrence, K.C.; Zhuang, H.; Yoon, S.-C.; Gao, Z. Essential processing methods of hyperspectral images of agricultural and food products. Chemom. Intell. Lab. Syst. 2020, 198, 103936.
  25. Zhang, S.; Yin, Y.; Liu, C.; Li, J.; Sun, X.; Wu, J. Discrimination of wheat flour grade based on PSO-SVM of hyperspectral technique. Spectrochim. Acta Part A Mol. Biomol. Spectrosc. 2023, 302, 123050.
  26. Siripatrawan, U.; Makino, Y. Hyperspectral imaging coupled with machine learning for classification of anthracnose infection on mango fruit. Spectrochim. Acta Part A Mol. Biomol. Spectrosc. 2024, 309, 123825.
  27. Kim, S.-R.; Lee, W.-K.; Lim, C.-H.; Kim, M.; Kafatos, M.C.; Lee, S.-H.; Lee, S.-S. Hyperspectral analysis of pine wilt disease to determine an optimal detection index. Forests 2018, 9, 115.
  28. Xie, C.; He, Y. Spectrum and image texture features analysis for early blight disease detection on eggplant leaves. Sensors 2016, 16, 676.
  29. Śmigaj, M. Hyperspectral, Thermal and LiDAR Remote Sensing for Red Band Needle Blight Detection in Pine Plantation Forests; Newcastle University: Newcastle upon Tyne, UK, 2018.
  30. Zahir, S.A.D.M.; Omar, A.F.; Jamlos, M.F.; Azmi, M.A.M.; Muncan, J. A review of visible and near-infrared (Vis-NIR) spectroscopy application in plant stress detection. Sens. Actuators A Phys. 2022, 338, 113468.
  31. Morellos, A.; Tziotzios, G.; Orfanidou, C.; Pantazi, X.E.; Sarantaris, C.; Maliogka, V.; Alexandridis, T.K.; Moshou, D. Non-destructive early detection and quantitative severity stage classification of Tomato Chlorosis Virus (ToCV) infection in young tomato plants using vis–NIR Spectroscopy. Remote Sens. 2020, 12, 1920.
  32. Salehi, B.; Mireei, S.A.; Jafari, M.; Hemmat, A.; Majidi, M.M. Integrating in-field Vis-NIR leaf spectroscopy and deep learning feature extraction for growth-stage dependent and independent genotyping of wheat plants. Biosyst. Eng. 2024, 238, 188–199.
  33. Hossain, M.A. UV–Visible–NIR camouflage textiles with natural plant based natural dyes on natural fibre against woodland combat background for defence protection. Sci. Rep. 2023, 13, 5021.
  34. Rong, D.; Wang, H.; Ying, Y.; Zhang, Z.; Zhang, Y. Peach variety detection using VIS-NIR spectroscopy and deep learning. Comput. Electron. Agric. 2020, 175, 105553.
  35. Meng, Y.; Zhang, Y.; Li, C.; Zhao, J.; Wang, Z.; Wang, C.; Li, Y. Prediction of the Carbon Content of Six Tree Species from Visible-Near-Infrared Spectroscopy. Forests 2021, 12, 1233.
  36. Shen, L.; Gao, M.; Yan, J.; Li, Z.-L.; Leng, P.; Yang, Q.; Duan, S.-B. Hyperspectral estimation of soil organic matter content using different spectral preprocessing techniques and PLSR method. Remote Sens. 2020, 12, 1206.
  37. Bian, X. Spectral preprocessing methods. In Chemometric Methods in Analytical Spectroscopy Technology; Springer: Berlin/Heidelberg, Germany, 2022; pp. 111–168.
  38. Helin, R.; Indahl, U.G.; Tomic, O.; Liland, K.H. On the possible benefits of deep learning for spectral preprocessing. J. Chemom. 2022, 36, e3374.
  39. Shen, L.; Gao, M.; Yan, J.; Wang, Q.; Shen, H. Winter Wheat SPAD Value Inversion Based on Multiple Pretreatment Methods. Remote Sens. 2022, 14, 4660.
  40. Gerretzen, J.; Szymańska, E.; Bart, J.; Davies, A.N.; van Manen, H.-J.; van den Heuvel, E.R.; Jansen, J.J.; Buydens, L.M. Boosting model performance and interpretation by entangling preprocessing selection and variable selection. Anal. Chim. Acta 2016, 938, 44–52.
  41. Mishra, P.; Biancolillo, A.; Roger, J.M.; Marini, F.; Rutledge, D.N. New data preprocessing trends based on ensemble of multiple preprocessing techniques. TrAC Trends Anal. Chem. 2020, 132, 116045.
  42. Bian, X.; Wang, K.; Tan, E.; Diwu, P.; Zhang, F.; Guo, Y. A selective ensemble preprocessing strategy for near-infrared spectral quantitative analysis of complex samples. Chemom. Intell. Lab. Syst. 2020, 197, 103916.
  43. Khalid, S.; Khalil, T.; Nasreen, S. A survey of feature selection and feature extraction techniques in machine learning. In Proceedings of the 2014 Science and Information Conference, London, UK, 27–29 August 2014; pp. 372–378.
  44. Cen, H.; Lu, R.; Zhu, Q.; Mendoza, F. Nondestructive detection of chilling injury in cucumber fruit using hyperspectral imaging with feature selection and supervised classification. Postharvest Biol. Technol. 2016, 111, 352–361.
  45. Golhani, K.; Balasundram, S.K.; Vadamalai, G.; Pradhan, B. A review of neural networks in plant disease detection using hyperspectral data. Inf. Process. Agric. 2018, 5, 354–371.
  46. Meng, R.; Lv, Z.; Yan, J.; Chen, G.; Zhao, F.; Zeng, L.; Xu, B. Development of spectral disease indices for southern corn rust detection and severity classification. Remote Sens. 2020, 12, 3233.
  47. Kouadio, L.; El Jarroudi, M.; Belabess, Z.; Laasli, S.-E.; Roni, M.Z.K.; Amine, I.D.I.; Mokhtari, N.; Mokrini, F.; Junk, J.; Lahlali, R. A Review on UAV-Based Applications for Plant Disease Detection and Monitoring. Remote Sens. 2023, 15, 4273.
  48. Cucho-Padin, G.; Loayza, H.; Palacios, S.; Balcazar, M.; Carbajal, M.; Quiroz, R. Development of low-cost remote sensing tools and methods for supporting smallholder agriculture. Appl. Geomat. 2020, 12, 247–263.
  49. Sherstinsky, A. Fundamentals of recurrent neural network (RNN) and long short-term memory (LSTM) network. Phys. D Nonlinear Phenom. 2020, 404, 132306.
  50. Selvin, S.; Vinayakumar, R.; Gopalakrishnan, E.; Menon, V.K.; Soman, K. Stock price prediction using LSTM, RNN and CNN-sliding window model. In Proceedings of the 2017 International Conference on Advances in Computing, Communications and Informatics (ICACCI), Udupi, India, 13–16 September 2017; pp. 1643–1647.
  51. He, T.; Xie, C.; Liu, Q.; Guan, S.; Liu, G. Evaluation and comparison of random forest and A-LSTM networks for large-scale winter wheat identification. Remote Sens. 2019, 11, 1665.
  52. LeCun, Y.; Bengio, Y.; Hinton, G. Deep learning. Nature 2015, 521, 436–444.
  53. Mozaffari, M.H.; Tay, L.-L. Overfitting One-Dimensional convolutional neural networks for Raman spectra identification. Spectrochim. Acta Part A Mol. Biomol. Spectrosc. 2022, 272, 120961.
  54. Pang, L.; Men, S.; Yan, L.; Xiao, J. Rapid vitality estimation and prediction of corn seeds based on spectra and images using deep learning and hyperspectral imaging techniques. IEEE Access 2020, 8, 123026–123036.
  55. Janiesch, C.; Zschech, P.; Heinrich, K. Machine learning and deep learning. Electron. Mark. 2021, 31, 685–695.
  56. Salman, Z.; Muhammad, A.; Piran, M.J.; Han, D. Crop-saving with AI: Latest trends in deep learning techniques for plant pathology. Front. Plant Sci. 2023, 14, 1224709.
  57. Danner, M.; Locherer, M.; Hank, T.; Richter, K. Spectral Sampling with the ASD FIELDSPEC 4. 2015. Available online: https://gfzpublic.gfz-potsdam.de/rest/items/item_1388298/component/file_1388299/content (accessed on 16 June 2023).
  58. Abdulridha, J.; Ampatzidis, Y.; Kakarla, S.C.; Roberts, P. Detection of target spot and bacterial spot diseases in tomato using UAV-based and benchtop-based hyperspectral imaging techniques. Precis. Agric. 2020, 21, 955–978.
  59. Rabatel, G.; Marini, F.; Walczak, B.; Roger, J.M. VSN: Variable sorting for normalization. J. Chemom. 2019, 34, e3164.
  60. Yao, Z.; Lei, Y.; He, D. Early visual detection of wheat stripe rust using visible/near-infrared hyperspectral imaging. Sensors 2019, 19, 952.
  61. Ma, L.; Zhang, Y.; Zhang, Y.; Wang, J.; Li, J.; Gao, Y.; Wang, X.; Wu, L. Rapid Nondestructive Detection of Chlorophyll Content in Muskmelon Leaves under Different Light Quality Treatments. Agronomy 2022, 12, 3223.
  62. Mahlein, A.-K.; Rumpf, T.; Welke, P.; Dehne, H.-W.; Plümer, L.; Steiner, U.; Oerke, E.-C. Development of spectral indices for detecting and identifying plant diseases. Remote Sens. Environ. 2013, 128, 21–30.
  63. Wu, G.; Fang, Y.; Jiang, Q.; Cui, M.; Li, N.; Ou, Y.; Diao, Z.; Zhang, B. Early identification of strawberry leaves disease utilizing hyperspectral imaging combing with spectral features, multiple vegetation indices and textural features. Comput. Electron. Agric. 2023, 204, 107553.
  64. Liu, X.; Gao, C.; Li, P. A comparative analysis of support vector machines and extreme learning machines. Neural Netw. 2012, 33, 58–66.
  65. Soydaner, D. A comparison of optimization algorithms for deep learning. Int. J. Pattern Recognit. Artif. Intell. 2020, 34, 2052013.
  66. Saleem, M.H.; Potgieter, J.; Arif, K.M. Plant disease classification: A comparative evaluation of convolutional neural networks and deep learning optimizers. Plants 2020, 9, 1319.
  67. Zhang, J.; Huang, Y.; Pu, R.; Gonzalez-Moreno, P.; Yuan, L.; Wu, K.; Huang, W. Monitoring plant diseases and pests through remote sensing technology: A review. Comput. Electron. Agric. 2019, 165, 104943.
  68. Gillon, D.; Houssard, C.; Joffre, R. Using near-infrared reflectance spectroscopy to predict carbon, nitrogen and phosphorus content in heterogeneous plant material. Oecologia 1999, 118, 173–182.
  69. Petisco, C.; García-Criado, B.; Mediavilla, S.; Vázquez de Aldana, B.R.; Zabalgogeazcoa, I.; García-Ciudad, A. Near-infrared reflectance spectroscopy as a fast and non-destructive tool to predict foliar organic constituents of several woody species. Anal. Bioanal. Chem. 2006, 386, 1823–1833.
  70. Asner, G.P.; Martin, R.E. Spectral and chemical analysis of tropical forests: Scaling from leaf to canopy levels. Remote Sens. Environ. 2008, 112, 3958–3970.
  71. Kleinebecker, T.; Schmidt, S.R.; Fritz, C.; Smolders, A.J.; Hölzel, N. Prediction of δ13C and δ15N in plant tissues with near-infrared reflectance spectroscopy. New Phytol. 2009, 184, 732–739.
  72. Martín-Tornero, E.; de Jorge Páscoa, R.N.M.; Espinosa-Mansilla, A.; Martín-Merás, I.D.; Lopes, J.A. Comparative quantification of chlorophyll and polyphenol levels in grapevine leaves sampled from different geographical locations. Sci. Rep. 2020, 10, 6246.
  73. Ceccato, P.; Flasse, S.; Gregoire, J.-M. Designing a spectral index to estimate vegetation water content from remote sensing data: Part 2. Validation and applications. Remote Sens. Environ. 2002, 82, 198–207.
  74. Sinha, R.; Khot, L.R.; Rathnayake, A.P.; Gao, Z.; Naidu, R.A. Visible-near infrared spectroradiometry-based detection of grapevine leafroll-associated virus 3 in a red-fruited wine grape cultivar. Comput. Electron. Agric. 2019, 162, 165–173.
  75. Rasheed, F.; Delagrange, S.; Lorenzetti, F. Detection of plant water stress using leaf spectral responses in three poplar hybrids prior to the onset of physiological effects. Int. J. Remote Sens. 2020, 41, 5127–5146.
  76. Ojo, M.O.; Zahid, A. Improving deep learning classifiers performance via preprocessing and class imbalance approaches in a plant disease detection pipeline. Agronomy 2023, 13, 887.
  77. Al-Saddik, H.; Simon, J.-C.; Cointault, F. Development of spectral disease indices for ‘Flavescence Dorée’grapevine disease identification. Sensors 2017, 17, 2772. [Google Scholar] [CrossRef] [PubMed]
  78. Guo, A.; Huang, W.; Ye, H.; Dong, Y.; Ma, H.; Ren, Y.; Ruan, C. Identification of wheat yellow rust using spectral and texture features of hyperspectral images. Remote Sens. 2020, 12, 1419. [Google Scholar] [CrossRef]
  79. Zhang, J.; Pu, R.; Huang, W.; Yuan, L.; Luo, J.; Wang, J. Using in-situ hyperspectral data for detecting and discriminating yellow rust disease from nutrient stresses. Field Crops Res. 2012, 134, 165–174. [Google Scholar] [CrossRef]
  80. Xu, K.; Sun, L.-L.; Wang, J.; Liu, S.-X.; Yang, H.-W.; Xu, N.; Zhang, H.-J.; Wang, J.-X. Potassium deficiency diagnosis method of apple leaves based on MLR-LDA-SVM. Front. Plant Sci. 2023, 14, 1271933. [Google Scholar] [CrossRef] [PubMed]
  81. Tang, X.; Zheng, P.; Li, X.; Wu, H.; Wei, D.-Q.; Liu, Y.; Huang, G. Deep6mAPred: A CNN and Bi-LSTM-based deep learning method for predicting DNA N6-methyladenosine sites across plant species. Methods 2022, 204, 142–150. [Google Scholar] [CrossRef] [PubMed]
  82. Turkoglu, M.; Hanbay, D.; Sengur, A. Multi-model LSTM-based convolutional neural networks for detection of apple diseases and pests. J. Ambient Intell. Humaniz. Comput. 2022, 13, 3335–3345. [Google Scholar] [CrossRef]
  83. Crisóstomo de Castro Filho, H.; Abílio de Carvalho Júnior, O.; Ferreira de Carvalho, O.L.; Pozzobon de Bem, P.; dos Santos de Moura, R.; Olino de Albuquerque, A.; Rosa Silva, C.; Guimaraes Ferreira, P.H.; Fontes Guimarães, R.; Trancoso Gomes, R.A. Rice crop detection using LSTM, Bi-LSTM, and machine learning models from Sentinel-1 time series. Remote Sens. 2020, 12, 2655. [Google Scholar] [CrossRef]
  84. Padmavathi, B.; BhagyaLakshmi, A.; Vishnupriya, G.; Datchanamoorthy, K. IoT-based prediction and classification framework for smart farming using adaptive multi-scale deep networks. Expert Syst. Appl. 2024, 254, 124318. [Google Scholar] [CrossRef]
Figure 1. Experimental area. Poplar (Populus L.) forest in Qixia District, Nanjing City, Jiangsu Province, China. (a) UAV aerial view and (b,c) live images: large area, lush poplar trees, no other tree species, and no human intervention in the growth cycle.
Figure 2. Four stages of poplar leaves: (a) healthy; (b) black spot disease; (c) early-stage anthracnose; (d) late-stage anthracnose. The numbers on the lower right side show the scale of the infected area on the leaf.
Figure 3. Hyperspectral acquisition laboratory: (a) hyperspectral data acquisition system: halogen lamp, calibrating panel, spectrometer, workstation, experimental table; (b) hyperspectral reflectance data for all samples: green shading highlights the visible spectrum region (400–760 nm), and orange shading highlights the near-infrared spectrum region (761–2400 nm).
Figure 4. Methodology flowchart: the blue arrows indicate the order of processing.
Figure 5. Mean reflectance curves of poplar leaves in four stages.
Figure 6. Reflectance curves after different preprocessing: (a) SG; (b) Gaussian; (c) CWT; (d) airPLS; (e) MSC; and (f) VSN.
Figure 7. Comprehensive scores for all preprocessing methods; the circle size indicates the comprehensive score’s value. Gaussian + CWT + VSN had the highest comprehensive score, which was 0.9531.
Figure 8. Distributions of PC1 and PC2 with 95% confidence ellipses: (a–c) show the distributions of the four stages, each outlined by its 95% confidence ellipse, for the visible, near-infrared, and all spectra, respectively.
Figure 9. Relevance of single wavelengths for healthy, black spot disease, early-stage anthracnose, and late-stage anthracnose according to the RELIEF-F algorithm: (a–d) relevance of single wavelengths for the four stages within VIS; (e–h) relevance of single wavelengths for the four stages within NIR; (i–l) relevance of single wavelengths for the four stages within ALL. The bands boxed in red are the eight bands with the highest relevance values, one of which is selected as an element of the SDI.
Figure 10. Relevance of all possible normalized wavelength differences for healthy, black spot disease, early-stage anthracnose, and late-stage anthracnose according to the RELIEF-F algorithm: (a–d) relevance of all normalized wavelength differences for the four stages in VIS; (e–h) relevance of all normalized wavelength differences for the four stages in NIR; (i–l) relevance of all normalized wavelength differences for the four stages in ALL. The color bar on the right gives the relevance value represented by each color.
Figure 11. RMSECV values obtained from 50 VCPA treatments of the dataset and the distribution of the obtained eigenbands: (a–c) are the RMSECV values of VIS, NIR, and ALL after each VCPA treatment; (d–f) denote the distribution of the VIS, NIR, and ALL eigenbands after VCPA treatment. The red boxes in (d–f) show the feature bands obtained by VCPA processing.
Figure 12. RMSECV values of the dataset after SPA processing and the distribution of the obtained eigenbands: (a,c,e) show the RMSECV values for each generation within a finite number of iterations; (b,d,f) show the distribution of the eigenbands of VIS, NIR, ALL after SPA processing. The red boxes in (a,c,e) show the number of variables selected. The red boxes in (b,d,f) show the feature bands obtained by SPA processing.
Table 1. Sample size for different stages.
| Poplar Sample Category | Sample Size |
| --- | --- |
| Healthy | 55 |
| Black spot disease | 55 |
| Early-stage anthracnose | 54 |
| Late-stage anthracnose | 65 |
Table 2. Preprocessing combinations. Smoothing uses SG or Gaussian, baseline correction uses airPLS or CWT, scatter correction uses VSN or MSC, and ‘no processing’ means that the corresponding step (smoothing, baseline correction, or scatter correction) is skipped. The combinations follow the DoE method.
| Experiment | Smoothing | Baseline Correction | Scatter Correction |
| --- | --- | --- | --- |
| 1 | SG | airPLS | VSN |
| 2 | SG | airPLS | MSC |
| 3 | SG | airPLS | No processing |
| 4 | SG | CWT | VSN |
| 5 | SG | CWT | MSC |
| 6 | SG | CWT | No processing |
| 7 | SG | No processing | VSN |
| 8 | SG | No processing | MSC |
| 9 | SG | No processing | No processing |
| 10 | Gaussian | airPLS | VSN |
| 11 | Gaussian | airPLS | MSC |
| 12 | Gaussian | airPLS | No processing |
| 13 | Gaussian | CWT | VSN |
| 14 | Gaussian | CWT | MSC |
| 15 | Gaussian | CWT | No processing |
| 16 | Gaussian | No processing | VSN |
| 17 | Gaussian | No processing | MSC |
| 18 | Gaussian | No processing | No processing |
| 19 | No processing | airPLS | VSN |
| 20 | No processing | airPLS | MSC |
| 21 | No processing | airPLS | No processing |
| 22 | No processing | CWT | VSN |
| 23 | No processing | CWT | MSC |
| 24 | No processing | CWT | No processing |
| 25 | No processing | No processing | VSN |
| 26 | No processing | No processing | MSC |
| 27 | No processing | No processing | No processing |
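The 27 experiments in Table 2 are the full 3 × 3 × 3 crossing of the three preprocessing steps. As a minimal sketch, the same grid can be enumerated in Python. The function name `preprocessing_grid` and the constant names are illustrative assumptions, not identifiers from the study, and the sketch only lists the combinations without applying any filter.

```python
from itertools import product

# Options for each preprocessing step, in the order used by Table 2.
SMOOTHING = ["SG", "Gaussian", "No processing"]
BASELINE = ["airPLS", "CWT", "No processing"]
SCATTER = ["VSN", "MSC", "No processing"]


def preprocessing_grid():
    """Return (experiment number, smoothing, baseline, scatter) tuples.

    itertools.product varies the last factor fastest, which reproduces
    the row order of Table 2 (hypothetical helper, for illustration only).
    """
    combos = product(SMOOTHING, BASELINE, SCATTER)
    return [(i, s, b, c) for i, (s, b, c) in enumerate(combos, start=1)]


if __name__ == "__main__":
    for row in preprocessing_grid():
        print(row)
```

Under this ordering, experiment 13 is Gaussian + CWT + VSN, the combination that Figure 7 reports as having the highest comprehensive score.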
Table 3. Definition of the confusion matrix (rows: actual class; columns: predicted class).
| Actual Class \ Predicted Class | Early-Stage Anthracnose | Late-Stage Anthracnose | Healthy | Black Spot |
| --- | --- | --- | --- | --- |
| Early-stage Anthracnose | T_Early | F_Early-Late | F_Early-Healthy | F_Early-Black |
| Late-stage Anthracnose | F_Late-Early | T_Late | F_Late-Healthy | F_Late-Black |
| Healthy | F_Healthy-Early | F_Healthy-Late | T_Healthy | F_Healthy-Black |
| Black Spot Disease | F_Black-Early | F_Black-Late | F_Black-Healthy | T_Black |
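The following sketch shows one way to derive summary metrics from a confusion matrix laid out as in Table 3. It assumes that OA is the overall accuracy (diagonal sum over all samples) and that each per-class accuracy is the diagonal entry divided by its row total. The exact per-class definition used in the study is not stated in this section and may differ. The counts in the example are invented for illustration and are not results from the paper.

```python
# Class order follows the rows and columns of Table 3.
CLASSES = ["Early", "Late", "Healthy", "Black"]


def overall_accuracy(cm):
    """Share of all samples on the diagonal (T_Early + T_Late + ...)."""
    total = sum(sum(row) for row in cm)
    correct = sum(cm[i][i] for i in range(len(cm)))
    return correct / total


def per_class_accuracy(cm):
    """Diagonal entry divided by its row total (assumed definition)."""
    result = {}
    for i, name in enumerate(CLASSES):
        row_total = sum(cm[i])
        result[name] = cm[i][i] / row_total if row_total else float("nan")
    return result


if __name__ == "__main__":
    # Hypothetical counts: rows are actual classes, columns are predictions.
    example = [
        [15, 1, 0, 0],
        [1, 24, 0, 1],
        [0, 0, 16, 0],
        [0, 1, 0, 10],
    ]
    print(round(100 * overall_accuracy(example), 2))
    print(per_class_accuracy(example))
```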
Table 4. Ablation study of several specific preprocessing combinations: combination (a) means no preprocessing is performed on the original data; combinations (b–e) perform the corresponding processing in sequence. Mean Value is the average of the accuracy and F1-score obtained by all models based on the preprocessing combination. “√” shows that the method is used, and “×” shows that the method is not used.
| Preprocessing | Gaussian | airPLS | CWT | VSN | MSC | Mean Value |
| --- | --- | --- | --- | --- | --- | --- |
| (a) No processing | × | × | × | × | × | 0.7040 |
| (b) Gaussian + CWT + VSN | √ | × | √ | √ | × | 0.9497 |
| (c) Gaussian + CWT + MSC | √ | × | √ | × | √ | 0.9392 |
| (d) Gaussian + CWT | √ | × | √ | × | × | 0.8954 |
| (e) with only airPLS | × | √ | × | × | × | 0.6892 |
Table 5. Weights of different indicators.
| Indicator | e_j | d_j | w_j (%) |
| --- | --- | --- | --- |
| RF-ACC | 0.959 | 0.041 | 13.596 |
| RF-F1-score | 0.961 | 0.039 | 12.906 |
| SVM-ACC | 0.959 | 0.041 | 13.658 |
| SVM-F1-score | 0.958 | 0.042 | 13.893 |
| RNN-ACC | 0.961 | 0.039 | 12.754 |
| RNN-F1-score | 0.962 | 0.038 | 12.593 |
| 1DCNN-ACC | 0.968 | 0.032 | 10.621 |
| 1DCNN-F1-score | 0.970 | 0.030 | 9.979 |
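The columns of Table 5 match the usual quantities of the entropy weight method: an information entropy e_j per indicator, a divergence d_j = 1 − e_j, and a weight w_j obtained by normalizing the divergences (the listed weights sum to 100%). The sketch below follows that textbook formulation. It assumes indicator values that are already non-negative and benefit-type (such as accuracy and F1-score), so no min-max normalization is applied. The study's exact normalization is not given in this section, and the function names are illustrative.

```python
import math


def entropy_weights(matrix):
    """Entropy weight method on a matrix (rows: alternatives, cols: indicators).

    Returns (entropies e_j, divergences d_j, weights w_j). Needs at least
    two rows and non-negative values. Textbook formulation, assumed here.
    """
    m, n = len(matrix), len(matrix[0])
    k = 1.0 / math.log(m)
    entropies = []
    for j in range(n):
        column = [row[j] for row in matrix]
        total = sum(column)
        e = 0.0
        for x in column:
            p = x / total
            if p > 0:  # the limit of p * ln(p) as p -> 0 is 0
                e -= k * p * math.log(p)
        entropies.append(e)
    divergences = [1.0 - e for e in entropies]
    norm = sum(divergences)
    if norm == 0:
        raise ValueError("all indicators are uniform; weights are undefined")
    weights = [d / norm for d in divergences]
    return entropies, divergences, weights


def comprehensive_scores(matrix, weights):
    """Weighted sum of the indicator values for each alternative."""
    return [sum(w * x for w, x in zip(weights, row)) for row in matrix]
```

An indicator that takes the same value for every preprocessing combination has entropy 1 and therefore weight 0, which is why the method favors indicators that discriminate between combinations.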
Table 6. Formula for calculating the healthy index (HI) and the black spot index (BI).
| Data | Healthy Index (HI) | Black Spot Index (BI) |
| --- | --- | --- |
| VIS | HI = (R633 − R432)/(R633 + R432) + R432 | BI = (R539 − R438)/(R539 + R438) + R440 |
| NIR | HI = (R823 − R1832)/(R823 + R1832) + R769 | BI = (R2007 − R2192)/(R2007 + R2192) + R947 |
| ALL | HI = (R1188 − R2244)/(R1188 + R2244) + R1056 | BI = (R1691 − R2385)/(R1691 + R2385) − 0.4∙R1114 |
Table 7. Formula for calculating the early-stage anthracnose index (ESAI) and the late-stage anthracnose index (LSAI).
| Data | Early-Stage Anthracnose Index (ESAI) | Late-Stage Anthracnose Index (LSAI) |
| --- | --- | --- |
| VIS | ESAI = (R760 − R713)/(R760 + R713) − R608 | LSAI = (R696 − R706)/(R696 + R706) − R710 |
| NIR | ESAI = (R856 − R818)/(R856 + R818) − R806 | LSAI = (R906 − R1892)/(R906 + R1892) + R905 |
| ALL | ESAI = (R2170 − R1121)/(R2170 + R1121) − R1170 | LSAI = (R1259 − R2303)/(R1259 + R2303) + R1260 |
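Every index in Tables 6 and 7 has the same shape: a normalized difference of two bands plus a weighted single band, SDI = (Ra − Rb)/(Ra + Rb) + w∙Rc. The sketch below evaluates that form for the VIS indices. The band triplets and weights are copied from the tables. The function names, the nearest-band lookup, and the input layout (parallel lists of wavelengths in nm and reflectance values) are assumptions made for illustration.

```python
# (a, b, c, weight) for SDI = (Ra - Rb) / (Ra + Rb) + weight * Rc,
# taken from the VIS rows of Tables 6 and 7.
VIS_INDICES = {
    "HI": (633, 432, 432, 1.0),
    "BI": (539, 438, 440, 1.0),
    "ESAI": (760, 713, 608, -1.0),
    "LSAI": (696, 706, 710, -1.0),
}


def nearest_band(wavelengths, target):
    """Index of the sampled wavelength closest to the target (nm)."""
    return min(range(len(wavelengths)), key=lambda i: abs(wavelengths[i] - target))


def spectral_disease_index(wavelengths, reflectance, a, b, c, weight):
    """Normalized difference of bands a and b plus a weighted band c."""
    ra = reflectance[nearest_band(wavelengths, a)]
    rb = reflectance[nearest_band(wavelengths, b)]
    rc = reflectance[nearest_band(wavelengths, c)]
    return (ra - rb) / (ra + rb) + weight * rc


if __name__ == "__main__":
    # Synthetic, linearly increasing spectrum; not measured data.
    wl = list(range(400, 761))
    refl = [0.1 + 0.001 * (w - 400) for w in wl]
    for name, params in VIS_INDICES.items():
        print(name, spectral_disease_index(wl, refl, *params))
```

The NIR and ALL indices follow by swapping in their band triplets, for example `(1691, 2385, 1114, -0.4)` for the ALL black spot index.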
Table 8. Number of bands, wavelength, and RMSECV values of dataset.
| Dataset | Number | Wavelength (nm) | RMSECV |
| --- | --- | --- | --- |
| VIS | 11 | 412, 444, 445, 457, 459, 469, 470, 512, 574, 595, 728 | 0.4860 |
| NIR | 10 | 1008, 1020, 1053, 1092, 1161, 1177, 1601, 1634, 1675, 1905 | 0.3358 |
| ALL | 10 | 693, 1053, 1084, 1094, 1114, 1116, 1161, 1633, 1905, 1906 | 0.3189 |
Table 9. Number of bands and RMSECV values of dataset.
| Dataset | Number | RMSECV |
| --- | --- | --- |
| VIS | 49 | 0.44826 |
| NIR | 50 | 0.33442 |
| ALL | 82 | 0.35839 |
Table 10. OA and Acc_Early, Acc_Late, Acc_Healthy, Acc_Black Spot of Machine Learning Models on the VIS Dataset.
| Data | Classifier | OA (%) | Acc_Early (%) | Acc_Late (%) | Acc_Healthy (%) | Acc_Black Spot (%) |
| --- | --- | --- | --- | --- | --- | --- |
| OPC | RF | 92.75 | 100.00 | 96.15 | 100.00 | 77.78 |
| OPC | SVM | 95.65 | 90.91 | 96.15 | 94.12 | 100.00 |
| PCA | RF | 86.96 | 100.00 | 80.77 | 77.78 | 88.89 |
| PCA | SVM | 79.71 | 90.91 | 80.77 | 70.59 | 80.00 |
| SDI | RF | 71.01 | 93.75 | 57.69 | 77.78 | 66.67 |
| SDI | SVM | 66.67 | 90.91 | 53.85 | 76.47 | 60.00 |
| SPA | RF | 89.86 | 100.00 | 92.31 | 100.00 | 72.22 |
| SPA | SVM | 92.75 | 90.91 | 92.31 | 94.12 | 93.33 |
| VCPA | RF | 84.06 | 87.50 | 84.62 | 88.89 | 77.78 |
| VCPA | SVM | 97.10 | 100.00 | 92.31 | 100.00 | 100.00 |
Table 11. OA and Acc_Early, Acc_Late, Acc_Healthy, Acc_Black Spot of Machine Learning Models on the NIR Dataset.
| Data | Classifier | OA (%) | Acc_Early (%) | Acc_Late (%) | Acc_Healthy (%) | Acc_Black Spot (%) |
| --- | --- | --- | --- | --- | --- | --- |
| OPC | RF | 95.65 | 93.75 | 92.31 | 100.00 | 100.00 |
| OPC | SVM | 98.55 | 90.91 | 100.00 | 100.00 | 100.00 |
| PCA | RF | 92.75 | 93.75 | 96.15 | 88.89 | 88.89 |
| PCA | SVM | 89.86 | 90.91 | 88.46 | 82.35 | 100.00 |
| SDI | RF | 69.57 | 75.00 | 53.85 | 100.00 | 72.22 |
| SDI | SVM | 63.77 | 54.55 | 57.69 | 76.47 | 66.67 |
| SPA | RF | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 |
| SPA | SVM | 94.20 | 80.00 | 96.15 | 100.00 | 100.00 |
| VCPA | RF | 94.20 | 100.00 | 84.62 | 100.00 | 100.00 |
| VCPA | SVM | 94.20 | 81.82 | 92.31 | 100.00 | 100.00 |
Table 12. OA and Acc_Early, Acc_Late, Acc_Healthy, Acc_Black Spot of Machine Learning Models on the ALL Dataset.
| Data | Classifier | OA (%) | Acc_Early (%) | Acc_Late (%) | Acc_Healthy (%) | Acc_Black Spot (%) |
| --- | --- | --- | --- | --- | --- | --- |
| OPC | RF | 95.65 | 93.75 | 92.31 | 100.00 | 100.00 |
| OPC | SVM | 98.55 | 100.00 | 96.15 | 100.00 | 100.00 |
| PCA | RF | 88.41 | 87.50 | 84.62 | 100.00 | 88.89 |
| PCA | SVM | 95.65 | 100.00 | 88.46 | 100.00 | 100.00 |
| SDI | RF | 89.86 | 100.00 | 76.92 | 88.89 | 100.00 |
| SDI | SVM | 68.12 | 100.00 | 50.00 | 64.71 | 80.00 |
| SPA | RF | 94.20 | 100.00 | 85.19 | 100.00 | 100.00 |
| SPA | SVM | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 |
| VCPA | RF | 95.65 | 93.75 | 92.31 | 100.00 | 100.00 |
| VCPA | SVM | 89.86 | 90.91 | 84.62 | 88.24 | 100.00 |
Table 13. OA and Acc_Early, Acc_Late, Acc_Healthy, Acc_Black Spot of Deep Learning Models.
| Data | Classifier | OA (%) | Acc_Early (%) | Acc_Late (%) | Acc_Healthy (%) | Acc_Black Spot (%) |
| --- | --- | --- | --- | --- | --- | --- |
| VIS | LSTM | 92.75 | 90.91 | 92.31 | 88.24 | 100.00 |
| VIS | CNN | 75.36 | 80.00 | 80.77 | 40.00 | 100.00 |
| NIR | LSTM | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 |
| NIR | CNN | 91.30 | 80.00 | 100.00 | 80.00 | 100.00 |
| ALL | LSTM | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 |
| ALL | CNN | 84.06 | 80.00 | 84.62 | 80.00 | 92.31 |
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

Share and Cite

MDPI and ACS Style

Jia, Z.; Duan, Q.; Wang, Y.; Wu, K.; Jiang, H. Detection Model and Spectral Disease Indices for Poplar (Populus L.) Anthracnose Based on Hyperspectral Reflectance. Forests 2024, 15, 1309. https://doi.org/10.3390/f15081309

