Article

Hyperspectral Imaging and Machine Learning: A Promising Tool for the Early Detection of Tetranychus urticae Koch Infestation in Cotton

by Mariana Yamada *, Leonardo Vinicius Thiesen, Fernando Henrique Iost Filho and Pedro Takao Yamamoto

Department of Entomology and Acarology, University of São Paulo, Piracicaba 13418-900, Brazil

* Author to whom correspondence should be addressed.
Agriculture 2024, 14(9), 1573; https://doi.org/10.3390/agriculture14091573
Submission received: 24 July 2024 / Revised: 24 August 2024 / Accepted: 29 August 2024 / Published: 10 September 2024
(This article belongs to the Section Digital Agriculture)

Abstract

Monitoring Tetranychus urticae Koch in cotton crops is challenging due to the vast crop areas and clustered mite attacks, hindering early infestation detection. Hyperspectral imaging offers a solution to such a challenge by capturing detailed spectral information for more accurate pest detection. This study evaluated machine learning models for classifying T. urticae infestation levels in cotton using proximal hyperspectral remote sensing. Leaf reflection data were collected over 21 days, covering various infestation levels: no infestation (0 mites/leaf), low (1–10), medium (11–30), and high (>30). Data were preprocessed, and spectral bands were selected to train six machine learning models, including Random Forest (RF), Principal Component Analysis–Linear Discriminant Analysis (PCA-LDA), Feedforward Neural Network (FNN), Support Vector Machine (SVM), k-Nearest Neighbor (kNN), and Partial Least Squares (PLS). Our analysis identified 31 out of 281 wavelengths in the near-infrared (NIR) region (817–941 nm) that achieved accuracies between 80% and 100% across 21 assessment days using Random Forest and Feedforward Neural Network models to distinguish infestation levels. The PCA loadings highlighted 907.69 nm as the most significant wavelength for differentiating levels of two-spotted mite infestation. These findings are significant for developing novel monitoring methodologies for T. urticae in cotton, offering insights for early detection, potential cost savings in cotton production, and the validation of the spectral signature of T. urticae damage, thus enabling more efficient monitoring methods.

1. Introduction

In recent years, Brazil has become the fifth-largest global producer of cotton due to advancements in technology used in cotton cultivation and an increase in cultivated areas, estimated at 1674 thousand hectares for the 2023/24 crop season [1]. However, the continuous cultivation of cotton in Brazil, combined with the country's unique climatic and geographical conditions, has led to frequent pest infestations that result in productivity losses and quality alterations in both seeds and fibers [1,2].
One of the pests that pose a significant threat to cotton plants in Brazil is the two-spotted spider mite Tetranychus urticae Koch (Acari: Tetranychidae). This polyphagous and sucking pest occurs in cotton plants from emergence to boll opening [3]. Initially, mite infestations occur in localized patches, with a preference for the underside of leaves, where they lay their eggs and feed [4,5,6]. The symptoms of infestation begin with chlorosis of the attacked leaves, which later progresses to reddish spots between veins. Infested plants also experience a shortened life cycle, producing smaller bolls and altered fiber quality [7,8,9].
The short life cycle, haplodiploid sex determination, and high reproductive capacity of T. urticae contribute to the fast selection of populations resistant to chemical control methods, consequently leading to increased production costs for cotton producers [10,11,12]. Therefore, the effective management of T. urticae in cotton fields remains a significant challenge because of the difficulties in monitoring this pest, including the extensive nature of cotton fields, clustered mite attacks, and the difficulty of identifying infestations early [13].
Currently, T. urticae monitoring is conducted through the visual observation of the middle part of the plant by checking for signs of infestation, webs, and mites using a pocket magnifying lens. The recommended threshold for control is when 30% of the inspected plants [14] or 10% of the inspected leaves show signs of mite presence [15]. However, this monitoring method has limitations, including the need for trained personnel, subjectivity, time-consuming sampling, and high labor costs [16].
Given the limitations of traditional pest detection methods, remote sensing has emerged as a promising tool for monitoring insect pests and the associated damage, especially in large agricultural areas. Remote sensing aims to improve pesticide use efficiency and minimize productivity losses by detecting changes in the morphological and physiological characteristics of plants caused by pest attacks, changes that alter the spectral reflectance of the plants [17,18,19]. Studies on the application of remote sensing for pest monitoring in various crops, including cotton, have shown significant potential [18,20,21,22,23].
Remote sensing data are primarily acquired using multispectral and hyperspectral sensors. Multispectral sensors capture and record reflected radiation in a few noncontiguous bands, typically ranging from 3 to 12. In contrast, hyperspectral sensors capture hundreds of contiguous bands, allowing for the identification of the spectral signature of the target [24]. A key objective of remote sensing is to identify unique spectral signatures for different types of subtle changes such as vegetation vigor [25]. Leaf changes caused by two-spotted spider mite attacks can be detected through visual observation and spectral analysis in the visible range (VIS; 400–700 nm) [26]. Enhanced precision and efficiency in determining spectral changes caused by pest damage in plants, including the spectral signature of symptoms resulting from pest attacks, can be achieved by using hyperspectral sensors. This advancement may lead to simplified sensors that can precisely capture signals in specific spectral bands [27,28,29].
Previous studies have demonstrated the feasibility of using sensors to detect two-spotted spider mites on cotton. Reisig and Godfrey [25] used a portable spectrometer and an integrating sphere to identify the differences and occurrences of damage caused by Aphis gossypii (Hemiptera: Aphididae) and T. urticae in cotton. However, they could not distinguish a unique spectral signature for mite and aphid infestations on cotton leaves. Martin et al. [30] evaluated a multispectral sensor for monitoring T. urticae in cotton plants and concluded that it showed promise as a tool for pest detection.
In contrast to conventional images, which only cover visible wavelengths, hyperspectral images can capture both spectral and spatial information beyond human perception. Consequently, these images can effectively detect and identify pest infestations [31]. A hyperspectral image comprises three dimensions: two spatial dimensions (x-lines and y-columns) and a spectral dimension (wavelength). It can acquire spectral data from hundreds of electromagnetic spectrum bands, including near-infrared and shortwave infrared bands [32]. Recent studies have demonstrated the applicability of hyperspectral images for pest detection in various crops, such as soybeans [33], cotton [34], rice [35], peanuts [23], and apples [36]. However, spectral signatures for T. urticae damage in cotton have not yet been defined.
Nevertheless, the use of sensors for data collection results in large volumes of information, which pose challenges for collection, storage, and analysis [22]. In response, combining hyperspectral imaging with spectroscopy and machine learning techniques to analyze databases can significantly improve pest monitoring through remote sensing. One challenge regarding the analysis of hyperspectral data is the high dimensionality and redundancy generated by the large number and sequence of wavelengths. To address this issue, it is necessary to use models that can work with these characteristics (large amounts of data and redundancy) and employ variable selection techniques to mitigate the problems [37,38].
By utilizing appropriate machine learning techniques on the information extracted from hyperspectral images, it is possible to develop predictive models and conduct studies on changes in population demography or plant physiology [39,40]. Several machine learning models, as reviewed by Maxwell et al. [41], have been used to calibrate hyperspectral data classification models. This approach offers the potential to accurately and efficiently classify remote sensing images, excelling in its ability to process complex, high-dimensional data and identify detailed features across different classes.
Machine learning models, such as the Support Vector Machine (SVM), are renowned for their proficiency in handling high-dimensional data, such as hyperspectral images, as reviewed by Ghamisi et al. [32]. Wang et al. [42] confirmed SVM’s effectiveness in analyzing hyperspectral images, with an accuracy above 96% for identifying changes in the vitality of waxy corn. Similarly, Cao et al. [43] demonstrated the good accuracy of models such as Random Forest (RF, >85%) and Linear Discriminant Analysis (LDA, 85%) in identifying mangrove species using hyperspectral images. Furthermore, Neural Networks have shown promising results, achieving accuracies greater than 80% in classifying Spodoptera eridania infestations using hyperspectral data [44].
Although hyperspectral imaging technology is currently utilized in laboratory settings due to its high cost and complexity, the data generated provide valuable insights for developing practical solutions for pest monitoring in the field [21]. The identification of specific spectral signatures associated with damage from T. urticae infestation enables the use of more affordable sensors that can be integrated into drones or portable equipment. Furthermore, machine learning algorithms developed for hyperspectral data analysis can be adapted to process information collected by these field sensors, enhancing data analysis and increasing the accessibility of monitoring technologies [17]. Thus, hyperspectral research not only contributes to a detailed understanding of infestations but also improves practical pest management in the field for mite infestations and reduces agricultural losses.
We tested the hypothesis that it is possible to determine the spectral signature of different levels of T. urticae infestation in cotton, which is detectable by hyperspectral images associated with machine learning. The objective of this study was to evaluate the effectiveness of several machine learning models commonly employed in hyperspectral proximal remote sensing to distinguish between different levels of two-spotted spider mite infestation in cotton crops. By combining hyperspectral data collection and analysis using machine learning approaches, we anticipate the immediate identification of pests, thus minimizing financial losses in cotton production.

2. Materials and Methods

2.1. Establishment and Maintenance of T. urticae Colonies in the Laboratory

The initial population of T. urticae was obtained from the Acarology Laboratory at Esalq/USP, and colonies were established and maintained in the Integrated Pest Management Laboratory. The adapted methodology described by Ferreira et al. [45] was used to maintain the colonies, where mites were reared on jack bean plants, Canavalia ensiformis (L.), which were grown in 5 L plastic pots in an agricultural greenhouse with daily manual irrigation.

2.2. Bioassay

In a greenhouse, conventional cotton plants of the FM 954GLT variety were grown in 5 L plastic pots containing soil collected from the Department of Entomology and Acarology experimental area, with three plants maintained per pot. The fertilizer recommended for planting was applied to the soil. Because mite infestations typically occur 40 to 50 days after plant emergence, the plants were exposed to infestation 50 days after emergence. For this exposure, a metal structure (120 cm high and 25 cm in diameter) wrapped in voile fabric was used as a cage to prevent the entry of unwanted arthropods and the escape of mites. The experimental design was completely randomized, with ten replications of four treatments. Densities of 0, 30, 50, and 100 T. urticae females were released into each pot containing three plants, totaling 40 experimental units. On each evaluation day, the numbers of adult and immature mites present were recorded. These counts of immature and adult T. urticae individuals were grouped into four infestation levels, namely, no infestation (0 mites/leaf), low infestation (1 to 10 mites/leaf), medium infestation (11 to 30 mites/leaf), and high infestation (>30 mites/leaf) (Figure 1).
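As a minimal illustration of this grouping step, the sketch below (in R, the language used for the analyses in Section 2.6) bins hypothetical per-leaf mite counts into the four infestation levels used as classification targets; the counts and column names are illustrative assumptions, and only the break points follow the text.

```r
# Hypothetical per-leaf mite counts; only the level break points follow the text.
counts <- data.frame(leaf_id = 1:6,
                     mites   = c(0, 4, 12, 35, 8, 60))

# Bin counts into the four infestation levels used as classification targets
counts$level <- cut(counts$mites,
                    breaks = c(-Inf, 0, 10, 30, Inf),
                    labels = c("none", "low", "medium", "high"))
table(counts$level)
```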

2.3. Hyperspectral Image Acquisition

Hyperspectral data were acquired using a benchtop scanning system (PIKA L, Resonon Inc., Bozeman, MT, USA) operating in the visible-to-near-infrared (Vis-NIR) region (Figure 2). The camera had spectral coverage from 400 to 1000 nm in 281 spectral channels, with a spectral resolution of 3.3 nm and a spectral bandwidth of 2.1 nm, and captured 900 pixels per line. The imaging system included a tower with four 15 W, 12 V LED lamps, a stable power source, and control software (PIKA L, Resonon Inc., Bozeman, MT, USA; https://resonon.com/Pika-L, accessed on 23 July 2024). Before capturing the images, two calibrations were performed: one with a black cap covering the lens and the other with a white polyethylene plate (Type 822, Spectronon Pro, Resonon, Bozeman, MT, USA) (Figure 2).
At 3, 9, 12, and 21 days after the initial infestation, leaves were collected from the middle part of the plants in each pot and placed on a flat surface under the sensor, which moved automatically, capturing information as controlled by the Spectronon software (Version 3.4.5, Resonon Inc., Bozeman, MT, USA) to obtain reflectance values. Each leaf was placed on the platform with the upper side facing the sensor, and the acquired data were saved as a datacube (.bil) and images (.tiff).

2.4. Spectral Extraction

The hyperspectral data corresponding to the spectral sample of each leaf were manually extracted from each dataset. The same manually selected region of interest was designated for each leaf. This process involved adjusting the contrast using the toolbar of SpectrononPro software (Version 3.4.5, Resonon Inc., Bozeman, MT, USA), selecting the region of interest, and generating average reflectance data for each leaf band. Spectral curves were generated based on the average reflectance values of the original variables.

2.5. Data Preprocessing

The electromagnetic spectrum of the sensor initially covered a wavelength range of 384.73–1021.76 nm, consisting of 300 spectral channels. However, a spectral range of 400–1000 nm, comprising 281 channels, was adopted for the analyses on all dates. This adjustment eliminated noise in the dataset, which was concentrated in the wavelengths from 384.73 to 398.77 nm and from 1001.47 to 1021.76 nm. To express the reflectance factor, the reflectance values obtained in the different spectral bands were divided by a constant of 10,000.
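A minimal sketch of this trimming and scaling step is given below, using a simulated matrix of raw sensor counts; the wavelength grid, object names, and values are illustrative assumptions, with only the 400–1000 nm window and the 10,000 divisor taken from the text.

```r
set.seed(1)
wl  <- seq(384.73, 1021.76, length.out = 300)        # approximate channel centres (nm)
raw <- matrix(runif(40 * 300, 0, 10000), nrow = 40)   # hypothetical raw counts for 40 leaves

keep <- wl >= 400 & wl <= 1000                        # drop the noisy edge channels
refl <- raw[, keep] / 10000                           # convert counts to reflectance factor
dim(refl)
```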
After the reflectance of each leaf/pot was obtained, the same leaves were examined under a stereoscopic microscope, the mites were counted, and the leaves were classified by infestation level (Section 2.2). These levels were used as targets for the classification by the tested machine learning models and for the construction of the PCA used to evaluate the variable (wavelength) selection, while the exact number of individuals present and the day of evaluation, together with the reflectance at each wavelength, were used as predictor variables for the models.
The spectral data of the 281 wavelengths were preprocessed using the Savitzky–Golay smoothing (differentiation order (m) = 0, polynomial order (p) = 2, window size (w) = 11) and derivative (m = 1, p = 1, w = 3) methods [46,47] with the “prospectr” package [48]. The 35 wavelengths most significant for classifying mite infestation were selected using the “Boruta” package [49]. The Boruta algorithm is a feature-selection method built on the Random Forest classifier that distinguishes truly important features from irrelevant ones. It extends Random Forest's feature-importance evaluation by introducing “shadow” attributes, which are shuffled copies of the original features; comparing the importance of the real features against these randomized copies provides a benchmark for assessing significance. Importance is measured by the decrease in classification accuracy when feature values are permuted, with a Z score used to account for variation in accuracy loss. Because the distribution of the Z score does not always correspond to formal statistical significance, it requires careful interpretation. By repeating this process, Boruta filters out noise and irrelevant features while retaining those with a genuine impact on model performance. Despite its computational cost and the approximate nature of its significance testing, Boruta's combination of ensemble randomness and comparison with shadow attributes offers a robust approach to identifying important features [49]. Wavelength selection was performed with 500 trees and 300 iterations for each evaluation day and for the data from all days combined (with “day” included as a variable). After preprocessing, the variables selected for the entire dataset were used to build a PCA to facilitate the visualization and understanding of the data, and the loadings were extracted to identify the wavelengths contributing most to the principal components.
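The sketch below illustrates, under the same toy-data assumptions as the previous sketch (simulated 'refl' matrix and hypothetical infestation labels), how this preprocessing chain could be expressed with the “prospectr” and “Boruta” packages cited above; the Savitzky–Golay parameters and the 300 Boruta iterations follow the text, while everything else (labels, the fallback to the 35 highest-ranked bands) is illustrative.

```r
library(prospectr)
library(Boruta)

# Savitzky-Golay smoothing and first derivative, with the parameters given in the text
sg <- savitzkyGolay(refl, m = 0, p = 2, w = 11)
d1 <- savitzkyGolay(refl, m = 1, p = 1, w = 3)

# Hypothetical infestation labels for the simulated leaves (assumption, not study data)
level <- factor(sample(c("none", "low", "medium", "high"), nrow(refl), replace = TRUE))

# Boruta ranking of the smoothed bands; keep the 35 with the highest mean importance
dat <- data.frame(level = level, sg)
bor <- Boruta(level ~ ., data = dat, maxRuns = 300)
imp <- attStats(bor)
top <- head(rownames(imp)[order(imp$meanImp, decreasing = TRUE)], 35)

# PCA on the selected bands; loadings show each band's contribution to the components
pca <- prcomp(dat[, top], center = TRUE, scale. = TRUE)
head(sort(abs(pca$rotation[, 1]), decreasing = TRUE))
```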

2.6. Data Analysis

The reflectance dataset was analyzed using different machine learning techniques. All classification models for the different infestation levels were implemented using the “mlr” package [50] in R version 4.3.0 [51]. The preprocessed and smoothed data were used as the database for the machine learning techniques, including Random Forest (RF), Principal Component Analysis–Linear Discriminant Analysis (PCA-LDA), Feedforward Neural Network (FNN), Support Vector Machine (SVM), k-Nearest Neighbor (kNN), and Partial Least Squares (PLS) (Table 1).
For the SVM, we employed a radial basis function (RBF) kernel (degree = 3) and a grid search to optimize the cost and gamma parameters, varying them across a wide range of values using logarithmic transformations. The cost parameter adjusts the trade-off between margin width and classification error, while gamma controls the range of influence of the samples in the model. This approach ensured an efficient search for a configuration that balanced model performance and maximized generalization capability. The Feedforward Neural Network in this study consisted of an input layer, a hidden layer, and an output layer, with the number of hidden neurons optimized between 1 and 10. The logistic sigmoid activation function in the hidden layer captured nonlinear relationships. Additionally, L2 regularization (decay parameter) was applied, ranging from 0.001 to 0.1, to control model complexity and prevent overfitting. The maximum number of iterations was set between 10 and 500 to ensure stable convergence during training.
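Continuing the earlier toy-data sketches, the block below shows one way these search spaces could be declared with the “mlr” package named above; the FNN parameter ranges follow the text, while the cost/gamma bounds, the random search control, and the learner names are illustrative assumptions rather than the authors' exact settings.

```r
library(mlr)

# Classification task built from the Boruta-selected bands of the previous sketch
task <- makeClassifTask(data = dat[, c("level", top)], target = "level")

# SVM with an RBF kernel: cost and gamma searched on a logarithmic scale (bounds assumed)
svm_ps <- makeParamSet(
  makeNumericParam("cost",  lower = -5,  upper = 10, trafo = function(x) 2^x),
  makeNumericParam("gamma", lower = -10, upper = 5,  trafo = function(x) 2^x)
)
svm_lrn <- makeLearner("classif.svm", kernel = "radial")

# Single-hidden-layer FNN: 1-10 neurons, decay 0.001-0.1, 10-500 iterations (from the text)
fnn_ps <- makeParamSet(
  makeIntegerParam("size",  lower = 1,     upper = 10),
  makeNumericParam("decay", lower = 0.001, upper = 0.1),
  makeIntegerParam("maxit", lower = 10,    upper = 500)
)
fnn_lrn <- makeLearner("classif.nnet")

ctrl <- makeTuneControlRandom(maxit = 25)   # random search over the declared space
cv10 <- makeResampleDesc("CV", iters = 10)
svm_tuned <- tuneParams(svm_lrn, task, cv10, measures = acc,
                        par.set = svm_ps, control = ctrl)
```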
These models were chosen because they offer complementary characteristics. The Support Vector Machine (SVM) was chosen for its effectiveness in handling high-dimensional data and its ability to find optimal decision margins, while Random Forest (RF) was used for its robustness against overfitting and its capability to identify variable importance. To reduce dimensionality and capture the most discriminative variables, we employed Principal Component Analysis–Linear Discriminant Analysis (PCA-LDA). The Feedforward Neural Network (FNN) was included to model complex nonlinear relationships within the data, and k-Nearest Neighbor (kNN) was selected for its simplicity and efficiency in classifying data with nonlinear distributions. Finally, Partial Least Squares (PLS) was used to address collinearity among the predictor variables, ensuring that the nuances of the data were adequately captured. This combination of methods allowed us to explore different aspects of the hyperspectral data, maximizing the accuracy of infestation-level classification. After the models were trained, their classification accuracies were compared. One of our goals was to find an algorithm capable of handling a large amount of data with high accuracy in classifying the different levels of two-spotted spider mite infestation, while requiring reduced training and classification time. All selected models have previously been reported as algorithms for data classification [52,53,54,55,56,57].
Model tuning was performed using tenfold cross-validation with internal and external resampling. Internal resampling was employed for model calibration and parameter adjustment/optimization (70% of the data), whereas external resampling was used to evaluate model performance on the test dataset (30% of the data). The trained and tuned models were then used to classify a random sample of the raw data, and the classification successes and failures were used to build a Generalized Linear Model (GLM) with a binomial distribution. The mean accuracies on the test dataset for each model were compared using Tukey’s test (p < 0.05). Subsequently, the outcomes of the best-trained model were used to create a confusion matrix.
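Under the same illustrative setup, the sketch below outlines one way to implement this evaluation scheme: an inner cross-validation for tuning wrapped inside an outer 70/30 split, per-prediction success coded for a binomial GLM, and Tukey-style contrasts via the “multcomp” package (one possible implementation; the paper does not name the package used for the Tukey test, and only two learners are shown for brevity).

```r
library(mlr)
library(multcomp)

inner <- makeResampleDesc("CV", iters = 10)            # internal resampling: tuning
outer <- makeResampleDesc("Holdout", split = 0.7)      # external: 70% train / 30% test

svm_wrapped <- makeTuneWrapper(svm_lrn, resampling = inner, measures = acc,
                               par.set = svm_ps, control = ctrl)
rf_lrn <- makeLearner("classif.randomForest")

res_svm <- resample(svm_wrapped, task, outer, measures = acc)
res_rf  <- resample(rf_lrn,      task, outer, measures = acc)

# Code each test prediction as success/failure and compare models with a binomial GLM
hits <- rbind(
  data.frame(model = "SVM",
             hit = as.integer(res_svm$pred$data$truth == res_svm$pred$data$response)),
  data.frame(model = "RF",
             hit = as.integer(res_rf$pred$data$truth == res_rf$pred$data$response))
)
hits$model <- factor(hits$model)
fit <- glm(hit ~ model, family = binomial, data = hits)
summary(glht(fit, linfct = mcp(model = "Tukey")))      # pairwise (Tukey-style) contrasts
```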

3. Results

Damage caused by T. urticae resulted in alterations in the reflectance patterns of cotton leaves (Figure 3). There were no discernible differences among the infestation levels in any spectral region on the third day. On the ninth day, plants with higher mite infestation displayed increased reflectance in the red edge (730–760 nm) and near-infrared (NIR, 700–1000 nm) regions, which are related to the structure of the cell wall [58]. On the 21st day of evaluation, differences in the visible region (400–700 nm) were also observed (Figure 3).
To classify T. urticae infestation levels, PCA was performed before constructing the classification models, and the results revealed that the first two PCs explained more than 91% of the total variance. The “no infestation” and “low infestation” levels showed an overlap of ellipses, while it was possible to observe distinct ellipses between “high infestation”, “moderate infestation”, and “no infestation” (Figure 4), motivating the construction of machine learning models for infestation classification. The PCA loadings showed that the wavelength of 907.69 nm made the largest contribution to the components.
Applying the Boruta algorithm for feature selection greatly reduced the number of variables, from the original 281 wavelengths to approximately 30 to 40, of which the 35 variables with the highest importance scores assigned by the Boruta algorithm were used. In addition to the wavelengths, the number of T. urticae individuals and the day of evaluation were considered important for classifying the different infestation levels. The experimental design allowed different infestation levels (no, low, moderate, and high) to be observed on each evaluation day, including the initial evaluation at 3 days, allowing the model to identify the reflectance patterns of each infestation level at different plant ages and times since the beginning of the mite infestation. On the third day, most of the plants still presented low or no infestation, even in pots inoculated with 100 mites per three plants. This result is due to the movement of mites throughout the plant, since the infestation levels were defined according to the number of individuals present on each leaf, which was randomly removed from one of the plants. With this methodology, all the optimized algorithms classified infestation with an accuracy greater than 50% on all evaluation days (Figure 5a).
Overall, the Random Forest (RF) and Feedforward Neural Network (FNN) algorithms stood out for their high accuracy in classifying two-spotted spider mite infestations in cotton, both for the data from each evaluation day and for the complete dataset combining all evaluations (All). The model optimized by Random Forest had classification accuracies ranging from 80 to 100%, indicating high precision in differentiating mite infestation, and required less than 20 s of processing time for the data from each individual day and less than 40 s to train on the complete database (Figure 5b). The high classification accuracy and short training time give Random Forest an advantage over the other models tested; in addition, this algorithm can be adjusted and tuned more easily in different programming languages than other models tested here, such as Neural Networks and the Support Vector Machine (SVM) (Figure 6).
Due to the high accuracy and low variance obtained with the complete (“All”) dataset and our objective of finding a model that can accurately identify mite infestation regardless of when the mite enters the area, we conducted the remaining analyses using the complete dataset. The Boruta algorithm indicated that the 33 wavelengths most important for classifying two-spotted spider mite infestations in cotton fell into three spectral groups: the first and smallest group lay between 578.84 and 580.92 nm, the second group comprised 10 bands from 817.6 to 837.25 nm, and the third and largest group comprised 21 bands between 896.63 and 941.01 nm (Figure 7). The first group fell within the visible region (VIS: 400–700 nm), while the second and third groups fell within the near-infrared region (NIR: 700–1000 nm). Our results indicate that most of the selected bands belonged to the NIR region, underscoring its significance in classifying two-spotted spider mite infestations in cotton. Furthermore, the PCA loadings revealed that the wavelength of 907.69 nm contributed the most to distinguishing the different infestation levels.
Our results showed that the algorithms optimized by RF and FNN performed best in classifying the different levels of two-spotted spider mite infestation in cotton at the different evaluation times, regardless of how long the pest had colonized the plants. These models presented an accuracy close to 100% in the exact classification of the infestation level (Figure 8).
The findings of this study revealed that, in the near-infrared region, it is possible to accurately distinguish different levels of spider mite infestation in cotton. The confusion matrix constructed from the validation results of the Random Forest model (Table 2) showed that infestation levels were predicted with high accuracy, with high sensitivity and specificity for all infestation levels.
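As a small illustration, continuing the earlier toy-data sketches, a confusion matrix of this kind can be produced in mlr from the resampled Random Forest predictions with calculateConfusionMatrix(); per-class sensitivity and specificity can then be derived from its rows and columns (the data here are simulated, not the study's).

```r
# Confusion matrix from the Random Forest predictions of the previous sketch
cm <- calculateConfusionMatrix(res_rf$pred, relative = TRUE)
print(cm)
```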

4. Discussion

Recently, hyperspectral remote sensing has been widely researched for pest detection [34,59,60,61]. Notably, Pandey et al. [62] demonstrated the application of hyperspectral imaging in identifying stress factors in crops, underscoring its potential for detecting subtle physiological changes in plants due to pest infestations. Their study also highlighted significant alterations in the red edge region as a key indicator of stress. Similarly, the results obtained from our study demonstrated that the spectral curve of cotton leaves not infested with T. urticae exhibited lower reflectance compared to that of infested leaves at different infestation levels. This difference is attributed to physiological changes in the plant, such as damage to chlorophyll, variations in water content, and cellular injuries, which can influence its spectral response [63]. Hence, the plant’s spectral signature changes because of biotic and abiotic stresses, such as water deficiency, diseases, or insect infestations.
The two-spotted spider mite pierces mesophyll cells to feed, which decreases leaf chlorophyll content [3,64] and changes the physical and chemical properties of the leaf. Healthy plants reflect less light in the visible range (400–700 nm) because photosynthetic pigments strongly absorb these wavelengths [24,65]. The red edge (680–760 nm) and NIR (700–1000 nm) regions are sensitive to chlorophyll concentration, and leaf changes caused by T. urticae feeding can be identified in these spectral bands. In our study, at three days after infestation with different densities of T. urticae, we observed an increase in reflectance in the red edge (680–760 nm) and NIR (700–1000 nm) regions, in which healthy and infested plants have consistently been distinguished [66,67] and which reflect chlorophyll concentration and therefore plant health [68].
Hyperspectral sensors generate high-dimensional data, and feature selection is an essential step in preprocessing these data [37]. Furthermore, preprocessing plays a crucial role in improving the results of classification models, thereby contributing to significant advances in remote sensing. In our work, we used the Boruta algorithm, which introduces randomness into the system and consolidates the results from multiple random samples to minimize the influence of fluctuations and random correlations. This additional randomness aids in identifying truly significant attributes more clearly [49]. In this study, the Boruta algorithm effectively reduced the dimensionality of the hyperspectral dataset, proving efficient in selecting the most important bands, which considerably increased the accuracy and precision of the classification models. Other studies have likewise reported that this algorithm considerably reduces the number of variables: Adam et al. [69], Poona and Ismail [70], Agjee et al. [71], and Liu et al. [72] found that reducing hyperspectral data and selecting relevant bands with the Boruta algorithm improved the classification performance of the reduced hyperspectral dataset.
The combination of hyperspectral images with machine learning methods has been successfully employed to detect insect pest infestations through plant changes with high precision [34,44,59,72,73,74,75,76,77,78]. These models must have a high level of predictive capacity and be trained efficiently in a short time to minimize the impact on the processing of newly acquired data. Although hyperspectral technology combined with machine learning has been applied in various domains, its application to monitoring mite infestation is yet to be extensively explored.
Considering that the control thresholds for the two-spotted spider mite in cotton are 10% of leaves [15] or 30% of plants infested [14], our work demonstrated that control measures could be adopted even before these thresholds are reached, because hyperspectral data combined with machine learning can identify low infestations (1–10 mites/leaf). The spectral signatures obtained in our study refer to the range of 817–941 nm, covering 31 wavelengths. Lan et al. [79] identified the wavelengths of 550, 560, 680, and 740 nm as important for detecting spectral differences between cotton plants infested by the two-spotted spider mite and treated with different doses of acaricide, and concluded that reducing the amount of acaricide could be effective, since the application of half the dose affected spectral reflectance similarly to the full dose in infested plants. Hermann et al. [28] reported that the reflectance spectrum in the 550–650 nm region differed significantly among damage levels, enabling the early identification of T. urticae infestation in pepper leaves. Nguyen et al. [78] collected hyperspectral data on healthy bok choy plants and plants infested with mustard aphids, vegetable thrips, and two-spotted spider mites. They identified the spectral regions of 422–440 nm, 500–520 nm, and 720–800 nm as important for distinguishing between infested and healthy plants when all the species were present. However, when the plants were infested only with the two-spotted spider mite, they exhibited spectral characteristics similar to those of the non-infested plants. This demonstrates the need for future work with hyperspectral data in cotton involving other variables, including other pests, nutritional deficiencies, and diseases, to confirm the spectral signature of two-spotted spider mite damage in the crop so that the pest can be managed before it reaches the currently recommended control level.
Our research findings demonstrate the significance of testing various approaches to machine learning models. The effectiveness of these models in classifying the level of two-spotted spider mite infestation can vary depending on the database used. Our research highlights that both Feedforward Neural Network (FNN) and Random Forest (RF) exhibit distinct capabilities, each with its advantages and challenges. Random Forest, an ensemble learning method, aggregates results from multiple decision trees to enhance accuracy and reduce overfitting. It is particularly effective in handling high-dimensional and multicollinear data, which are common in hyperspectral imaging [52]. Previous studies have highlighted RF’s robustness and efficiency; for instance, Ekramirad et al. [59] and Furuya et al. [60] demonstrated its high classification accuracy, often exceeding 95%, for pest damage detection and herbivory in various crops. Its ability to process extensive datasets and manage noise makes RF a solid choice for T. urticae infestation classification. On the other hand, Feedforward Neural Network (FNN) is known for its capacity to model complex and nonlinear relationships in the data. Although FNN requires significantly longer training times and fine-tuning, it can offer superior precision in specific contexts [80]. This extended training allows FNN to capture detailed patterns of infestation that simpler models might miss. For example, Forstmaier et al. [80] achieved 92.5% accuracy in mapping the spatial distribution of Eucalyptus using FNN, demonstrating the method’s effectiveness in diverse applications. Similarly, Wang et al. [81] showed high performance in monitoring large yellow croaker fillets using FNN, with accuracies of 98.1% and 91.6% for different parameters. The effectiveness of RF in our study, providing high accuracy in classifying infestation levels, alongside FNN’s ability to capture complex patterns, suggests that both approaches have their distinct advantages. The choice between RF and FNN should be guided by the specific problem requirements, computational resources, and the desired balance between precision and efficiency. Integrating these methods with advanced preprocessing and feature-selection techniques, such as the Boruta algorithm, could further enhance the detection of T. urticae and improve overall monitoring effectiveness.
Despite the promising results of using hyperspectral imaging combined with machine learning for detecting T. urticae infestations, limitations and potential areas for future research should be addressed. One notable limitation is the dependence on high-quality hyperspectral data, which can be costly and technically demanding to acquire. The current system relies on lab-based hyperspectral sensors that may not be easily deployable in field settings. Future research could focus on developing and validating compact, cost-effective hyperspectral sensors that can be mounted on drones or other field equipment, thereby making real-time monitoring more accessible [81]. Another limitation is the sensitivity of machine learning models to the quality and diversity of training data. As the current study relies on data from specific infestation scenarios, expanding datasets to include a wider range of environmental conditions, pest densities, and plant varieties could enhance model robustness and generalizability [59].
Our study suggests that the adoption of the RF model, together with the spectral signature of T. urticae damage, is a promising starting point for future research. The data from the near-infrared region obtained in our study and processed by machine learning identified specific bands that can be detected in the future using smaller sensors attached to drones, providing more accurate information for monitoring T. urticae. Furthermore, we recommend that future studies use hyperspectral sensors to analyze the physiological responses of cotton plants subjected to various stresses in conjunction with spider mite infestations. This will provide a more comprehensive understanding of the impact of multiple stress factors on cotton plants, which represents a gap in the utilization of remote sensing in the field.

5. Conclusions

This study concluded that hyperspectral sensors are highly effective in detecting spectral changes in cotton plants induced by varying levels of T. urticae infestations ranging from none to low, moderate, and high levels. The enormous volume of data generated by hyperspectral remote sensing poses challenges; however, the integration of machine learning techniques proves promising for accurately predicting infestation levels, as some tested models achieved accuracy rates approaching 100%. Our results identified 31 out of 281 wavelengths in the near-infrared (NIR) region (817–941 nm) that achieved accuracies ranging from 80% to 100% in distinguishing infestation levels over different assessment days using Random Forest and Feedforward Neural Network models. PCA loadings revealed that the wavelength of 907.69 nm contributed the most to differentiating levels of two-spotted mite infestation. By combining hyperspectral sensing with machine learning, this study represents a significant advancement in overcoming the limitations of visual inspections for spider mite monitoring. Future research should continue to employ these technologies to distinguish mite infestation levels from other biotic and abiotic factors affecting cotton plant reflectance. This approach will help validate the spectral signature of T. urticae damage and pave the way for developing more efficient and cost-effective monitoring tools, such as smaller, targeted sensors. Ultimately, these advancements will contribute to more accurate information and enhance integrated pest management programs in cotton fields.

Author Contributions

Conceptualization, M.Y., F.H.I.F. and P.T.Y.; methodology, M.Y., L.V.T., F.H.I.F. and P.T.Y.; validation, M.Y. and L.V.T.; formal analysis, L.V.T.; investigation, M.Y., L.V.T. and F.H.I.F.; resources, P.T.Y.; data curation, L.V.T.; writing—original draft preparation, M.Y. and L.V.T.; writing—review and editing, M.Y., L.V.T., F.H.I.F. and P.T.Y.; supervision, P.T.Y. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by the São Paulo Advanced Research Center in Biological Control (SPARCBIO) FAPESP-Koppert, 2018/02317-5 and by the Coordenação de Aperfeiçoamento de Pessoal de Nível Superior (CAPES), grant number 88887.513328/2020-00.

Institutional Review Board Statement

Not applicable.

Data Availability Statement

No new data were created or analyzed in this study. Data sharing is not applicable to this article.

Acknowledgments

We would like to thank the funding agency mentioned above. We would like to thank Fagner Goes da Conceição for the help with the spectral curve graph.

Conflicts of Interest

The authors declare no conflicts of interest.

References

  1. Companhia Nacional De Abastecimento—CONAB. Algodão. Available online: https://www.conab.gov.br/info-agro/safras/serie-historica-das-safras/itemlist/category/898-algodao (accessed on 4 February 2024).
  2. Severino, L.S.; Rodrigues, S.M.M.; Chitarra, L.G.; Joaquim; Filho, L.; Contini, E.; Mota, M.; Marra, R.; Araújo, A. Algodão: Caracterização e Desafios Tecnológicos. Campina Grande: Embrapa Algodão. 2019. Available online: https://ainfo.cnptia.embrapa.br/digital/bitstream/item/198192/1/SerieDesafiosAgronegocioBrasileiroNT3Algodao.pdf (accessed on 7 February 2024).
  3. Moraes, G.J.; Flechtmann, C.H.W. Manual de Acarologia: Acarologia Básica e Ácaros de Plantas Cultivadas no Brasil; Holos: Ribeirão Preto, Brasil, 2008; p. 308. [Google Scholar]
  4. Saito, Y. The concept of life types in Tetranychinae. An attempt to classify the spinning behaviour of Tetranychinae. Acarologia 1983, 24, 377–391. [Google Scholar]
  5. Brandenburg, R.L.; Kennedy, G.G. Ecological and agricultural considerations in the management of twospotted spider mite (Tetranychus urticae Koch). Agric. Zool. Rev. 1987, 2, 185–236. [Google Scholar]
  6. Osakabe, M.; Hongo, K.; Funayama, K.; Osumi, S. Amensalism via webs causes unidirectional shifts of dominance in spider mite communities. Oecologia 2006, 150, 496–505. [Google Scholar] [CrossRef]
  7. Wilson, L.J.; Morton, L.K. Spider mites (Acari: Tetranychidae) affect yield and fiber quality of cotton. J. Econ. Entomol. 1993, 86, 566–585. [Google Scholar] [CrossRef]
  8. Miyazaki, J.; Wilson, L.J.; Stiller, W.N. Fitness of twospotted spider mites is more affected by constitutive than induced resistance traits in cotton (Gossypium spp.). Pest Manag. Sci. 2013, 69, 1187–1197. [Google Scholar] [CrossRef]
  9. Scott, W.S.; Catchot, A.; Gore, J.; Musser, F.; Cook, D. Impact of two-spotted spider mite (Acari: Tetranychidae) duration of infestation on cotton seedlings. J. Econ. Entomol. 2013, 106, 862–865. [Google Scholar] [CrossRef]
  10. Van Leeuwen, T.; Vontas, J.; Tsagkarakou, A.; Dermauw, W.; Tirry, L. Acaricide resistance mechanisms in the two-spotted spider mite Tetranychus urticae and other important Acari: A review. Insect Biochem. Mol. Biol. 2010, 40, 563–572. [Google Scholar] [CrossRef]
  11. Sato, M.E.; Veronez, B.; Stocco, R.S.M.; Queiroz, M.C.V.; Gallego, R. Spiromesifen resistance in Tetranychus urticae (Acari: Tetranychidae): Selection, stability, and monitoring. Crop Prot. 2016, 89, 278–283. [Google Scholar] [CrossRef]
  12. Wang, L.; Zhang, Y.; Xie, W.; Wu, Q.; Wang, S. Sublethal effects of spinetoram on the two-spotted spider mite, Tetranychus urticae (Acari: Tetranychidae). Pestic. Biochem. Physiol. 2016, 132, 102–107. [Google Scholar] [CrossRef]
  13. Peixoto, M.F.; Barbosa, R.V.; Oliveira, R.R.C.; Fernandes, P.M.; Costa, R.B. Amostragem do ácaro rajado Tetranychus urticae Koch (Acari: Tetranychidae) e eficiência de acaricidas no seu controle na cultura do algodoeiro irrigado. Biosci. J. 2009, 25, 24–32. Available online: https://seer.ufu.br/index.php/biosciencejournal/article/view/6873 (accessed on 17 January 2024).
  14. Miranda, J.E. Manejo Integrado de Pragas do Algodoeiro no Cerrado Brasileiro. EMBRAPA CNPA, Campina Grande. 2006. Available online: https://www.infoteca.cnptia.embrapa.br/infoteca/bitstream/doc/274817/1/CIRTEC98.pdf (accessed on 4 February 2024).
  15. Degrande, P.E. Guia Prático de Controle das Pragas do Algodoeiro; UFMS: Campo Grande, Brazil, 1998; p. 60. [Google Scholar]
  16. Raphael, M.M.; Maheswari, R. Automatic Monitoring of Pest Trap. Int. J. Adv. Res. Electr. Electron. Instrum. Eng. 2016, 5, 2470–2473. [Google Scholar] [CrossRef]
  17. Behmann, J.; Mahlein, A.-K.; Rumpf, T.; Römer, C.; Plümer, L. A review of advanced machine learning methods for the detection of biotic stress in precision crop protection. Precis. Agric. 2015, 16, 239–260. [Google Scholar] [CrossRef]
  18. Nansen, C.; Elliot, N. Remote sensing and reflectance profiling in entomology. Annu. Rev. Entomol. 2016, 61, 139–158. [Google Scholar] [CrossRef]
  19. Martin, D.E.; Latheef, M.A. Aerial application methods control spider mites on corn in Kansas, USA. Exp. Appl. Acarol. 2019, 77, 571–582. [Google Scholar] [CrossRef]
  20. Prabhakar, M.; Prasad, Y.G.; Thirupathi, M.; Sreedevi, G.; Dharajothi, B.; Venkateswarlu, B. Use of ground based hyperspectral remote sensing for detection of stress in cotton caused by leafhopper (Hemiptera: Cicadellidae). Comput. Electron. Agric. 2011, 79, 189–198. [Google Scholar] [CrossRef]
  21. El-Ghany, N.M.A.; El-Aziz, S.E.A.; Marei, S.S. A review: Application of remote sensing as a promising strategy for insect pests and diseases management. Environ. Sci. Pollut. Res. 2020, 27, 33503–33515. [Google Scholar] [CrossRef]
  22. Filho, F.H.I.; Heldens, W.B.; Kong, Z.; de Lange, E.S. Drones: Innovative technology for use in precision pest management. J. Econ. Entomol. 2020, 113, 1–25. [Google Scholar] [CrossRef]
  23. Pinto, J.; Powell, S.; Peterson, R.; Rosalen, D.; Fernandes, O. Detection of defoliation injury in peanut with hyperspectral proximal remote sensing. Remote Sens. 2020, 12, 3828. [Google Scholar] [CrossRef]
  24. Moreira, M.A. Fundamentos do Sensoriamento Remoto e Metodologias de Aplicação; UFV: Viçosa, Brazil, 2012; p. 442. [Google Scholar]
  25. Reisig, D.; Godfrey, L. Spectral response of cotton aphid-(Homoptera: Aphididae) and spider mite-(Acari: Tetranychidae) infested cotton: Controlled studies. Environ. Entomol. 2007, 36, 1466–1474. [Google Scholar] [CrossRef]
  26. Hermann, I.; Berenstein, M.; Paz-Kagan, T.; Sade, A.; Karnieli, A. Spectral assessment of two-spotted spider mite damage levels in the leaves of greenhouse-grown pepper and bean. Biosyst. Eng. 2017, 157, 72–85. [Google Scholar] [CrossRef]
  27. Fitzgerald, G.J.; Maas, S.J.; Detar, W.R. Spider mite detection and canopy component mapping in cotton using hyperspectral imagery and spectral mixture analysis. Precis. Agric. 2004, 5, 275–289. [Google Scholar] [CrossRef]
  28. Hermann, I.; Berenstein, M.; Sade, A.; Karnieli, A.; Bonfil, D.J.; Weintraub, P.G. Spectral monitoring of two-spotted spider mite damage to pepper leaves. Remote Sens. Lett. 2012, 3, 277–283. [Google Scholar] [CrossRef]
  29. Nansen, C.; Murdock, M.; Purington, R.; Marshall, S. Early infestations by arthropod pests induce unique changes in plant compositional traits and leaf reflectance. Pest Manag. Sci. 2021, 77, 5158–5169. [Google Scholar] [CrossRef]
  30. Martin, D.E.; Latheef, M.A.; López, J.D. Evaluation of selected acaricides against twospotted spider mite (Acari: Tetranychidae) on greenhouse cotton using multispectral data. Exp. Appl. Acarol. 2015, 66, 227–245. [Google Scholar] [CrossRef]
  31. Xiao, Z.; Yin, K.; Geng, L.; Wu, J.; Zhang, F.; Liu, Y. Pest identification via hyperspectral image and deep learning. Signal Image Video Process. 2022, 16, 873–880. [Google Scholar] [CrossRef]
  32. Ghamisi, P.; Plaza, J.; Chen, Y.; Li, J.; Plaza, A.J. Advanced spectral classifiers for hyperspectral images: A review. IEEE Geosci. Remote Sens. Mag. 2017, 5, 8–32. [Google Scholar] [CrossRef]
  33. Gui, J.; Xu, H.; Fei, J. Non-destructive detection of soybean pest based on hyperspectral image and attention-resnet meta-learning model. Sensors 2023, 23, 678. [Google Scholar] [CrossRef]
  34. Yan, T.; Xu, W.; Lin, J.; Duan, L.; Gao, P.; Zhang, C.; Lv, X. Combining multi-dimensional convolutional neural network (CNN) with visualization method for detection of Aphis gossypii glover infection in cotton leaves using hyperspectral imaging. Front. Plant Sci. 2021, 12, 604510. [Google Scholar] [CrossRef]
  35. Srivastava, S.; Mishra, H.N. Detection of insect damaged rice grains using visible and near infrared hyperspectral imaging technique. Chemom. Intell. Lab. Syst. 2022, 221, 104489. [Google Scholar] [CrossRef]
  36. Ekramirad, N.; Khaled, A.Y.; Donohue, K.D.; Villanueva, R.T.; Adedeji, A.A. Classification of codling moth-infested apples using sensor data fusion of acoustic and hyperspectral features coupled with machine learning. Agriculture 2023, 13, 839. [Google Scholar] [CrossRef]
  37. Reddy, G.T.; Reddy, M.P.K.; Lakshmanna, K.; Kaluri, R.; Rajput, D.S.; Srivastava, G.; Baker, T. Analysis of dimensionality reduction techniques on big data. IEEE Access 2020, 8, 54776–54788. [Google Scholar] [CrossRef]
  38. Kok, Z.H.; Mohamed Shariff, A.R.; Alfatni, M.S.M.; Khairunniza-Bejo, S. Support Vector Machine in precision agriculture: A review. Comput. Electron. Agric. 2021, 191, 106546. [Google Scholar] [CrossRef]
  39. Vance, C.K.; Tolleson, D.R.; Kinoshita, K.; Rodriguez, J.; Foley, W.J. Near infrared spectroscopy in wildlife and biodiversity. J. Near Infrared Spectrosc. 2016, 24, 1–25. [Google Scholar] [CrossRef]
  40. Helser, T.E.; Benson, I.M.; Barnett, B.K. Proceedings of the research workshop on the rapid estimation of fish age using Fourier Transform Near Infrared Spectroscopy (FT-NIRS). AFSC Process. Rep. 2019, 06, 195. [Google Scholar] [CrossRef]
  41. Maxwell, A.E.; Warner, T.A.; Fang, F. Implementation of machine-learning classification in remote sensing: An applied review. Int. J. Remote Sens. 2018, 39, 2784–2817. [Google Scholar] [CrossRef]
  42. Wang, J.; Yan, L.; Wang, F.; Qi, S. SVM classification method of waxy corn seeds with different vitality levels based on hyperspectral imaging. J. Sens. 2022, 2022, 4379317. [Google Scholar] [CrossRef]
  43. Cao, J.; Liu, K.; Liu, L.; Zhu, Y.; Li, J.; He, Z. Identifying mangrove species using field close-range snapshot hyperspectral imaging and machine-learning techniques. Remote Sens. 2018, 10, 2047. [Google Scholar] [CrossRef]
  44. Iost Filho, F.H.; Pazini, J.D.B.; de Medeiros, A.D.; Rosalen, D.L.; Yamamoto, P.T. Assessment of injury by four major pests in soybean plants using hyperspectral proximal imaging. Agronomy 2022, 12, 1516. [Google Scholar] [CrossRef]
  45. Ferreira, F.C.; Moraes, N.M.; Shinoda, S.; Sato, M.E.; Morini, M.S.C. Manual Para Criação de Ácaros Predadores; Canal 6: Bauru, Brasil, 2018; p. 64. [Google Scholar]
  46. Savitzky, A.; Golay, M.J.E. Smoothing and differentiation of data by simplified least squares procedures. Anal. Chem. 1964, 36, 1627–1639. [Google Scholar] [CrossRef]
  47. Schafer, R.W. What is a savitzky-golay filter? IEEE Signal Process. Mag. 2011, 28, 111–117. [Google Scholar] [CrossRef]
  48. Stevens, A.; Ramirez-Lopez, L. An Introduction to the Prospectr Package. GitHub. 2014. Available online: https://antoinestevens.github.io/prospectr/#An_Introduction_to_the_prospectr_package (accessed on 15 February 2024).
  49. Kursa, M.B.; Rudnicki, W.R. Feature selection with the boruta package. J. Stat. Softw. 2010, 36, 1–13. [Google Scholar] [CrossRef]
  50. Bischl, B.; Lang, M.; Kotthoff, L.; Schiffner, J.; Richter, J.; Studerus, E.; Casalicchio, G.; Jones, Z.M. mlr: Machine Learning in R. J. Mach. Learn. Res. 2016, 17, 1–5. [Google Scholar]
  51. R Core Team. The R Project for Statistical Computing. 2024. Available online: https://www.r-project.org/ (accessed on 10 February 2024).
  52. Belgiu, M.; Drăgu, L. Random forest in remote sensing: A review of applications and future directions. ISPRS J. Photogramm. Remote Sens. 2016, 114, 24–31. [Google Scholar] [CrossRef]
  53. Chen, M.; Wang, Q.; Li, X. Discriminant analysis with graph learning for hyperspectral image classification. Remote Sens. 2018, 10, 836. [Google Scholar] [CrossRef]
  54. Guo, Y.; Cao, H.; Han, S.; Sun, Y.; Bai, Y. Spectral-spatial hyperspectral image classification with k-nearest neighbor and guided filter. IEEE Access 2018, 6, 18582–18591. [Google Scholar] [CrossRef]
  55. Pathak, D.K.; Kalita, S.K.; Bhattacharya, D.K. Hyperspectral image classification using support vector machine: A spectral spatial feature based approach. Evol. Intell. 2022, 15, 1809–1823. [Google Scholar] [CrossRef]
  56. Zhou, X.; Liu, H.; Shi, C.; Liu, J. The basics of deep learning. In Deep Learning on Edge Computing Devices; Zhou, X., Liu, H., Shi, C., Liu, J., Eds.; Elsevier: Amsterdam, The Netherlands, 2022; pp. 19–36. [Google Scholar] [CrossRef]
  57. Gasela, M.; Kganyago, M.; De Jager, G. Using resampled nSight-2 hyperspectral data and various machine learning classifiers for discriminating wetland plant species in a Ramsar Wetland site, South Africa. Appl. Geomat. 2024, 16, 429–440. [Google Scholar] [CrossRef]
  58. Shirzadifar, A.; Bajwa, S.; Nowatzki, J.; Shojaeiarani, J. Development of spectral indices for identifying glyphosate-resistant weeds. Comput. Electron. Agric. 2020, 170, 105276. [Google Scholar] [CrossRef]
Figure 1. Cotton leaves with different levels of Tetranychus urticae infestation.
Figure 2. Benchtop system for hyperspectral image acquisition.
Figure 3. Spectral signatures of damage caused by different levels of Tetranychus urticae infestation in cotton plants at 3, 9, 12, and 21 days after infestation.
Figure 4. Principal Component Analysis (PCA) using the wavelength ranges selected by the Boruta algorithm for classifying mite infestation levels.
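For readers who wish to reproduce this kind of ordination, the sketch below shows how a PCA of Boruta-selected reflectance bands can be run and its loadings inspected in R; the data frame `refl` and its `infestation` column are hypothetical placeholders for the selected-band reflectance data, not objects from this study.

```r
# Minimal sketch: PCA on the Boruta-selected bands, assuming a hypothetical
# data frame 'refl' with one column per selected wavelength plus an
# 'infestation' factor column.
bands <- refl[, setdiff(names(refl), "infestation")]
pca   <- prcomp(bands, center = TRUE, scale. = TRUE)

summary(pca)                               # variance explained per component
head(sort(abs(pca$rotation[, 1]),          # bands with the largest PC1 loadings
          decreasing = TRUE))

scores <- data.frame(pca$x[, 1:2], infestation = refl$infestation)
plot(scores$PC1, scores$PC2, col = scores$infestation,
     xlab = "PC1", ylab = "PC2")           # ordination colored by infestation level
```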
Figure 5. Classification accuracy (a) and training time (b) of the different models used to identify Tetranychus urticae infestations in cotton. kNN = k-Nearest Neighbor, PCA-LDA = Principal Component Analysis–Linear Discriminant Analysis, FNN = Feedforward Neural Network, PLS = Partial Least Squares, RF = Random Forest, and SVM = Support Vector Machine.
Figure 6. Performance rank of each model trained on data from each day of the infestation assessment and on the complete dataset over time. Models with the best performance were assigned the lowest rank. kNN = k-Nearest Neighbor, PCA-LDA = Principal Component Analysis–Linear Discriminant Analysis, FNN = Feedforward Neural Network, PLS = Partial Least Squares, RF = Random Forest, and SVM = Support Vector Machine.
Figure 7. Wavelengths selected by the Boruta algorithm from all reflectance data and used to classify different Tetranychus urticae infestation levels.
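As an illustration of the selection procedure shown in Figure 7, the following is a minimal sketch using the Boruta R package; `refl` (reflectance per wavelength) and `infestation` (class factor) are hypothetical objects standing in for the study data, and the number of runs is an assumption.

```r
# Minimal sketch of Boruta-based band selection on hypothetical inputs.
library(Boruta)

set.seed(123)
sel <- Boruta(x = refl, y = infestation, maxRuns = 100)

print(sel)                                           # confirmed/tentative/rejected bands
selected_bands <- getSelectedAttributes(sel, withTentative = FALSE)
plot(sel, las = 2, cex.axis = 0.6)                   # importance of each wavelength
```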
Figure 8. Comparison of the mean accuracy (±SE) of the models for reflectance data from all days evaluated. Bars followed by different letters indicate significant differences between mean accuracies according to the Tukey test (p < 0.05). kNN = k-Nearest Neighbor, PCA-LDA = Principal Component Analysis–Linear Discriminant Analysis, FNN = Feedforward Neural Network, PLS = Partial Least Squares, RF = Random Forest, and SVM = Support Vector Machine.
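The comparison in Figure 8 can be reproduced in outline with a one-way ANOVA followed by Tukey's HSD in base R; the data frame `acc`, with columns `model` and `accuracy`, is a hypothetical stand-in for the per-resample accuracies, not the study's dataset.

```r
# Minimal sketch of the Figure 8 comparison on a hypothetical data frame 'acc'
# with a factor column 'model' and a numeric column 'accuracy'.
fit <- aov(accuracy ~ model, data = acc)
summary(fit)     # overall F test across models
TukeyHSD(fit)    # pairwise differences in mean accuracy (alpha = 0.05)
```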
Table 1. Tuning and control parameters of each model tested on the preprocessed reflectance data for classifying the different mite infestation levels using the mlr package.

Model | Parameter Search Space | Control
RF | ("mtry", lower = 1, upper = 25) | CR (maxit = 30)
PCA-LDA | ("ppc.pcaComp", lower = 1, upper = 50) | CR (maxit = 30)
FNN | ("size", lower = 1, upper = 10), ("decay", lower = 0.001, upper = 0.1), ("maxit", lower = 10, upper = 500) | CR (maxit = 100)
SVM | ("cost", lower = 0, upper = 6, trafo = function(x) 10^x), ("gamma", lower = −5, upper = 1, trafo = function(x) 10^x) | CMBO (budget = 60)
kNN | ("k", lower = 1, upper = 5) | CR (maxit = 100)
PLS | ("ncomp", lower = 1, upper = 15) | CR (maxit = 30)
RF = Random Forest, PCA-LDA = Principal Component Analysis–Linear Discriminant Analysis, FNN = Feedforward Neural Network, SVM = Support Vector Machine, kNN = k-Nearest Neighbor, PLS = Partial Least Squares, CR = Control Random, and CMBO = Control Model-Based Optimization.
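To make the Table 1 configuration concrete, the sketch below shows how one of these searches (Random Forest, mtry in [1, 25] with random control over 30 iterations) might be set up with the mlr package; `train_df` and its `infestation` target column are hypothetical names, and the cross-validation scheme shown is an assumption rather than the resampling used in the study.

```r
# Minimal sketch, assuming the mlr (not mlr3) package and a hypothetical
# preprocessed data frame 'train_df' whose target column is 'infestation'.
library(mlr)

task <- makeClassifTask(data = train_df, target = "infestation")
lrn  <- makeLearner("classif.randomForest")

# Search space and control for RF as listed in Table 1.
ps    <- makeParamSet(makeIntegerParam("mtry", lower = 1, upper = 25))
ctrl  <- makeTuneControlRandom(maxit = 30)
rdesc <- makeResampleDesc("CV", iters = 5)   # assumed resampling scheme

tuned <- tuneParams(lrn, task = task, resampling = rdesc,
                    par.set = ps, control = ctrl, measures = acc)
tuned$x   # best mtry found
tuned$y   # corresponding mean accuracy
```

Following Table 1, the SVM search would instead use model-based optimization (CMBO) for its control and apply the trafo = function(x) 10^x transformation to the cost and gamma parameters.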
Table 2. Confusion matrix for the Random Forest (RF) results on the testing dataset for all days of evaluation at different Tetranychus urticae infestation levels.

Prediction \ Reference | No Infestation | Low Infestation | Moderate Infestation | High Infestation
No infestation | 770 | 0 | 0 | 0
Low infestation | 0 | 460 | 0 | 0
Moderate infestation | 0 | 0 | 360 | 1
High infestation | 0 | 0 | 0 | 629
Sensitivity | 1.000 | 1.000 | 1.000 | 0.9984
Specificity | 1.000 | 1.000 | 0.9995 | 1.000
Accuracy (%) (95% CI) = 99.95 (99.75–100.00), Kappa = 0.994.
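The per-class sensitivity and specificity in Table 2 follow directly from the confusion matrix; the short base-R sketch below reproduces them from the counts above (rows are predictions, columns are the reference classes).

```r
# Confusion matrix from Table 2 (rows = predicted class, columns = reference class).
cm <- matrix(c(770,   0,   0,   0,
                 0, 460,   0,   0,
                 0,   0, 360,   1,
                 0,   0,   0, 629),
             nrow = 4, byrow = TRUE,
             dimnames = list(Prediction = c("None", "Low", "Moderate", "High"),
                             Reference  = c("None", "Low", "Moderate", "High")))

sensitivity <- diag(cm) / colSums(cm)        # TP / (TP + FN), per reference class
specificity <- sapply(1:4, function(i) {
  tn <- sum(cm[-i, -i])                      # correctly not assigned to class i
  fp <- sum(cm[i, -i])                       # assigned to class i but belonging elsewhere
  tn / (tn + fp)
})
accuracy <- sum(diag(cm)) / sum(cm)          # 2219/2220, i.e., ~99.95%

round(rbind(sensitivity, specificity), 4)
round(accuracy, 4)
```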