Article

Deep-Learning-Based Strawberry Leaf Pest Classification for Sustainable Smart Farms

1 Department of IT Distribution and Logistics, Soongsil University, Seoul 06978, Republic of Korea
2 Department of Industrial and Information Systems Engineering, Soongsil University, Seoul 06978, Republic of Korea
* Author to whom correspondence should be addressed.
Sustainability 2023, 15(10), 7931; https://doi.org/10.3390/su15107931
Submission received: 1 April 2023 / Revised: 6 May 2023 / Accepted: 11 May 2023 / Published: 12 May 2023

Abstract: This paper presents a deep-learning-based classification model that aims to detect diverse pest infections in strawberry plants. The proposed model enables the timely identification of pest symptoms, allowing for prompt and effective pest management in smart farms. The present research employed an actual dataset of strawberry leaf images collected from a smart farm test bed. To expand the dataset, open data from sources such as Kaggle were utilized, while diseased leaf images were obtained through web crawling with the aid of Python libraries. Subsequently, the expanded dataset was resized to a uniform image size, and Pseudo-Labeling was implemented to ensure stable learning for both the training and test datasets. The RegNet and EfficientNet models were selected as the primary CNN-based image network models for repetitive learning, and ensemble learning was employed to enhance prediction accuracy. The proposed model is anticipated to facilitate the early identification and treatment of pests on strawberry leaves during the seedling period, a pivotal phase in smart farm development. Furthermore, it is expected to boost production in the agricultural industry and strengthen its competitive edge.

1. Introduction

The concept of smart farming has gained considerable attention in the pursuit of sustainable agriculture. By utilizing modern technologies such as sensors, drones, and machine learning algorithms, farmers can optimize various aspects of crop production [1,2]. Through advanced techniques to monitor critical factors such as soil conditions, weather patterns, and crop health, farmers can minimize waste, reduce resource consumption, and enhance overall efficiency. Such practices not only benefit the environment by decreasing the impact of farming on ecosystems but also ensure the long-term viability of agricultural production by making it more economically feasible. As a result, smart farming has emerged as a crucial strategy for promoting sustainable agriculture and addressing the challenges of food security and climate change.
Deep learning is one of the key technologies for building sustainable smart farms [3]. Deep learning enables more accurate predictions and decision making. For example, deep learning can be used to optimize the growing environment of crops, the amount of pesticides used, and the timing of harvests. This allows for more crop production while managing crop production environments more efficiently. Such technology not only contributes significantly to increasing crop productivity but also plays a major role in realizing resource efficiency and environmentally friendly agriculture.
This paper presents a deep-learning-based classification model for detecting various pest infections in strawberry plants. By enabling the timely identification of pest symptoms, the proposed model supports effective pest management in smart farms, thereby contributing to improved crop yields and quality. As the healthy food industry and the vegan market continue to expand, there is increasing attention given to the importance of supplying high-quality horticultural crops, particularly in response to climate change and food crises. The emergence of smart farms has made it possible to produce crops of better quality than those grown in open fields or plastic greenhouses and to cope with extreme weather conditions.
Strawberries are a highly valuable crop, but they are susceptible to pests and diseases during the seedling stage. Although all crops face the threat of diseases and pests, strawberries are particularly vulnerable at this stage compared to other crops and times. Failing to prevent foliar diseases and pests during the strawberry seedling period can lead to significant yield and quality reductions, resulting in substantial losses not only for farmers but also for the national agricultural economy. Therefore, there is a pressing need to develop a system model that can swiftly and precisely diagnose various diseases and pests of strawberries to combat this issue.
The objective of this study is to design a research model based on deep learning that can facilitate the early detection and response to leaf pest symptoms for sustainable smart farms. Figure 1 demonstrates the structure of a sustainable strawberry smart farm. The proposed model aims to identify and prevent strawberry leaf pests during the seedling period at an early stage. This is expected to contribute to an increase in strawberry yield, thereby enhancing the competitiveness of the agricultural industry. By using advanced techniques to diagnose and prevent leaf pest symptoms, farmers can optimize crop production, minimize waste, reduce resource consumption, and improve overall efficiency. The successful implementation of this research model will be instrumental in promoting sustainable smart farming practices and addressing the challenges of food security and climate change.
The study’s details and procedures are as follows: First, a study was conducted based on strawberry leaf image data provided by a smart farm consulting company in Korea; Figure 1 shows a diagram of the strawberry smart farm provided by the company. Second, dataset classes were created, and transformations for data augmentation were defined to train the data effectively. The collected images were resized to a uniform size for the experiment, and test functions were defined for training and verification. The model was trained to determine whether leaves were normal or abnormal and to identify the pest or disease name using pre-labeled training datasets, and the Pseudo-Labeling technique was applied to enhance training stability. Third, training and validation loss and prediction accuracy were checked using RegNet and EfficientNet, and data training was performed through Pseudo-Labeling; ensemble learning was then conducted on the results of the two models. Fourth, the number of epochs was reduced to streamline training while improving accuracy, and post-processing was performed to identify noise-like images that were missed during pre-processing or that negatively affected training and prediction accuracy. After post-processing, the prediction accuracy of the model was averaged, and the final result was generated to complete the study.
This study has four distinctive features compared to previous research.
  • Our study investigates a wider range of image classifications by comparing four different types of pests with normal leaf images.
  • We implement a stable learning process using Pseudo-Labeling to enhance the robustness of our model.
  • Furthermore, to improve prediction accuracy, we leverage ensemble learning of two powerful ImageNet models.
  • By utilizing an efficient CNN image network model with an optimized number of epochs, our study achieves excellent performance in pest classification.
The rest of the paper is structured as follows: Section 2 provides a review of previous research on deep-learning-based image classification for sustainable smart farms. Section 3 describes the detailed procedure for the study and presents the overall framework based on deep learning. Section 4 presents the implementation results and expected outcomes, followed by a discussion of the results. Finally, Section 5 concludes the paper.

2. Related Work

This section presents an overview of previous research on deep-learning-based image classification for smart farming. First, several previous studies that employed the ImageNet model were examined. For instance, Ref. [4] suggested a technique for the early diagnosis of crop diseases by comparing and analyzing camera-sensing-based technologies. Ref. [5] collected tomato leaf data and compared and classified the characteristics using four different models, including EfficientNet. Ref. [6] proposed a method of classifying mushroom image data based on EfficientNet to prevent misunderstandings about poisonous mushrooms and implement services. Ref. [7] presented a method to classify malicious code files by type through imaging and applying them to the EfficientNet model. Ref. [8] achieved high accuracy in learning image data and predicting the characteristics of difficult-to-distinguish images using the EfficientNet model. Ref. [9] proposed a pest classification method using the superpixel technique and a CNN and built a model that, although slightly inferior in performance, was applicable in real environments. Ref. [10] used the EfficientNet model to detect cracks in wooden cultural assets and showed better performance than other models. Ref. [11] demonstrated good performance and improved image utilization using AlexNet as a strawberry pest image recognition model. Ref. [12] showed high accuracy and efficient training time using the ResNet model, despite learning only 1306 pieces of image data with a low number of epochs. Ref. [13] built a PlantVillage dataset consisting of 55,448 images of 39 classes and found that EfficientNet B5 and B4 models performed better than other models. Ref. [14] introduced the RegNet model as a new network design model, which showed up to five times faster speed on a GPU than EfficientNet under similar training settings and failure conditions. Ref. [15] proposed an effective method for classifying images by optimally matching the width, depth, and resolution of fruit image data.
Next, we provide an overview of prior research on deep-learning-based image classification for sustainable smart farming, with a focus on studies related to foliar diseases and pests. Ref. [16] presented a process for classifying tomato pests using image augmentation techniques consisting of Random Crop, Random Horizontal Flip, and Google AutoAugment. Ref. [17] proposed several specialized systems to diagnose strawberry pests; in that study, an expert system was proposed to prepare for pests by collecting the environmental conditions of cultivation, the affected parts of plants, and the symptoms revealed by pests. Ref. [18] examined how pest control affects plants through direct control experiments, measuring the degree of influence of pests and diseases. Ref. [19] proposed a method to improve the pest detection performance of a deep learning model by suggesting a segmented image dataset specialized for symptoms; referring to that method, this study partially applied it to our process. Information on diagnosing and controlling major pests and diseases of strawberries was taken from [20], which describes pest control during the seedling period in detail and provides a manual that was prepared and distributed for cases of infection. In [21], a model that can identify the presence of disease by training a deep learning model on strawberry leaf image data was proposed, which is the predecessor of this study.
Existing studies have achieved high accuracy by using powerful ImageNet models and have targeted a variety of prediction topics. With the recent emergence of numerous ImageNet models, the number of studies applying them has also increased significantly; although such models can be applied to many domains, applications to agricultural products are growing especially actively. The limitations of existing research include the limited diversity of datasets, limited consideration of real-world conditions, and limited consideration of economic feasibility.
This study stands out from previous research in the field of deep learning by utilizing several distinctive features. Firstly, a stable learning process utilizing Pseudo-Labeling is employed. Secondly, an ensemble learning method is implemented by combining two powerful ImageNet models, leading to a significant improvement in the model’s prediction accuracy. Thirdly, despite employing a small number of epochs, the study achieves excellent performance by utilizing an efficient CNN image network model. The academic significance of this study lies in its ability to diagnose, prevent, and mitigate the impact of pests by training a model with leaf image data and utilizing the aforementioned points of differentiation.
Previous studies on strawberry pest classification have focused primarily on enhancing image classification accuracy, specifically addressing issues related to normal and abnormal leaf images. In contrast, the current study distinguishes itself from these previous works by examining a more comprehensive range of image classifications. Specifically, the study compares four different types of pests (Fusarium wilt, Cotton aphid, Powdery mildew, and Mite) with normal leaf images that are commonly encountered in domestic fields. The study presents a novel deep-learning-based model that can automatically classify all five image types with a high degree of accuracy, thus demonstrating its potential in practical applications.

3. Proposed Methods

3.1. Overall Framework

The overall framework of this study is shown in Figure 2. Initially, additional strawberry leaf image data were obtained through the project, web crawling, and open data, and the entire dataset was built by dividing the strawberry leaf images into training and test datasets. RegNet and EfficientNet were applied to the Pseudo-Labeled image data, and ensemble learning was performed to derive the final model outputs, to verify the utilization plan and expected effects of the study, and to identify research limitations and future research directions. In addition, we utilized Grad-CAM to visually illustrate the learning results. Through the use of Grad-CAM, we were able to generate visual explanations of the model’s decision-making process, which can help to improve interpretability and transparency in machine learning applications.

3.2. Research Environment Setup

Optimizing the hyperparameters of a deep learning model requires careful tuning based on the performance of the model on a validation set rather than relying solely on the training phase. By integrating a validation set and discerning the optimal hyperparameters based on both training and validation, we were able to enhance the accuracy and efficacy of the proposed model. The optimizer was set to Lamb, the scheduler was set to cycle, the batch size was set to 8, weight decay was set to 1 × 10−3, and finally, the epochs were set to 30. Table 1 shows the hyperparameter settings used for this study.
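As an illustration only, the settings in Table 1 could be wired up roughly as follows. This is a minimal configuration sketch, not the authors’ training script: the stand-in linear model, the use of AdamW in place of Lamb (core PyTorch ships no Lamb; the timm library provides one as `timm.optim.Lamb`), and the interpretation of the “cycle” scheduler as a one-cycle policy are all assumptions.

```python
import torch

# Stand-in network so the sketch runs without the real RegNet/EfficientNet
# backbones; 5 outputs for the five leaf states.
model = torch.nn.Linear(512, 5)

# Settings from Table 1: batch size 8, weight decay 1e-3, 30 epochs.
# The paper names the Lamb optimizer; AdamW is used here only to keep the
# sketch within core PyTorch.
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-3, weight_decay=1e-3)

# "cycle" scheduler interpreted as OneCycleLR; with 8000 training images and
# batch size 8, each epoch has 1000 optimizer steps.
scheduler = torch.optim.lr_scheduler.OneCycleLR(
    optimizer, max_lr=1e-3, epochs=30, steps_per_epoch=8000 // 8
)
```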
In the previous study [21], the number of training epochs was small since that study only aimed to discriminate between normal and abnormal leaves. In this study, the number of epochs was increased because the model must discriminate among five classes: normal leaves and four types of abnormal leaves. It is important to find the optimal number of epochs: too few epochs yield efficient training time but low prediction accuracy, while too many can yield very high prediction accuracy but require a long training time and can cause overfitting. Overfitting refers to excessive learning of the training data, which decreases the error on the training data but increases the error on unseen data.
To prevent the bottlenecks and out-of-memory errors that typically occur in a standard Windows OS environment, the research was conducted in a Windows workstation environment instead. The system was built with an Intel Xeon Silver 4210R CPU, 32 GB of memory, and an Nvidia RTX A5000 GPU. The research was conducted using Python version 3.9.12 and PyTorch version 1.8.1.

3.3. Data Selection and Organization

This study is based on strawberry leaf data provided by a company specializing in smart farms in South Korea. Because the number of images was not sufficient for the study, and higher-quality image data were needed to achieve higher prediction accuracy, additional strawberry leaf images were collected using Beautifulsoup4, a Python library for extracting data from Hypertext Markup Language (HTML) documents, together with Selenium; ChromeDriver was used to automate the browser while saving the crawled leaf images. In addition, the required leaf images were augmented using AI HUB, which is operated by the National Information Society Agency (NIA), a government agency in Korea.
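The paper collects images with Beautifulsoup4 and Selenium; as a self-contained illustration of the parsing step only, the standard-library `html.parser` can extract image URLs from a fetched page. The class name and the sample HTML below are hypothetical.

```python
from html.parser import HTMLParser

class ImageSrcExtractor(HTMLParser):
    """Collect the src attribute of every <img> tag in an HTML page."""

    def __init__(self):
        super().__init__()
        self.image_urls = []

    def handle_starttag(self, tag, attrs):
        if tag == "img":
            for name, value in attrs:
                if name == "src" and value:
                    self.image_urls.append(value)

# Hypothetical page snippet; in practice the HTML would come from Selenium
# or an HTTP client, and each URL would then be downloaded to disk.
parser = ImageSrcExtractor()
parser.feed('<html><body><img src="leaf1.jpg"><img src="leaf2.jpg"></body></html>')
print(parser.image_urls)  # ['leaf1.jpg', 'leaf2.jpg']
```

BeautifulSoup’s `soup.find_all("img")` performs the equivalent extraction with less code; the stdlib version is shown only so the sketch runs without third-party dependencies.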
However, while the normal leaf image data were sufficiently augmented, the number of datasets for abnormal (pest) leaf image data was not sufficient for deep-learning-based research. Therefore, the disease-type image data were reinforced through the hand-labeling of the unlabeled raw image data. The program used for labeling was the Visual Object Tagging Tool (VoTT), developed by Microsoft (version 2.2.0). VoTT is an open-source annotation and labeling tool written in React using TypeScript. MS VoTT provides a pipeline for processing multiple steps in a data processing system and a learning system at once, including the ability to label images or videos (frame by frame), import datasets from local or cloud storage providers, and export labeled data to local or cloud storage providers. Figure 3 shows the data collection process.

3.4. Data Preprocessing

Our dataset consists of 10,200 strawberry leaf images collected from multiple sources. Specifically, we obtained 2300 images from the company, downloaded 2720 images from AI Hub, and collected 1100 images via web crawling. We applied various data augmentation techniques to the images acquired from AI Hub to increase the dataset size. After excluding 200 low-quality images, our dataset contains 10,000 images. Table 2 shows the composition of the collected and augmented dataset. The pest data are organized into four classes, each with an equal number of samples.
To address the data imbalance issue, we employed various image augmentation techniques, such as horizontal and vertical flipping, rotation, scaling, and brightness adjustment. These techniques helped to increase the diversity of our dataset and improve the generalization capability of our model. Figure 4 displays the four types of datasets used in this study, excluding the normal image dataset. Each dataset was organized based on the collected image data. The normal image dataset shows healthy strawberry leaves, and the pest image dataset shows strawberry leaf images infected with pests. The pest image dataset was further classified into four types, and the dataset was constructed accordingly. The dataset was also pre-processed to remove image data deemed unsuitable for data training or that could adversely affect prediction accuracy.
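As a loose illustration of the flip and rotation augmentations mentioned above, the operations can be shown on a plain pixel grid; an actual pipeline would apply the same transforms via a library such as `torchvision.transforms`, and the function names here are illustrative.

```python
def horizontal_flip(img):
    """Mirror a 2-D pixel grid left-right (each row reversed)."""
    return [row[::-1] for row in img]

def vertical_flip(img):
    """Mirror a 2-D pixel grid top-bottom (row order reversed)."""
    return img[::-1]

def rotate_90(img):
    """Rotate a 2-D pixel grid 90 degrees clockwise."""
    return [list(row) for row in zip(*img[::-1])]

# Toy 2x2 "image" to show each transform's effect.
grid = [[1, 2],
        [3, 4]]
print(horizontal_flip(grid))  # [[2, 1], [4, 3]]
print(rotate_90(grid))        # [[3, 1], [4, 2]]
```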
The collected data were converted into an appropriate dataset. The training dataset contained 8000 images and the test dataset 2000 images, an 8:2 ratio. Three-channel (RGB) images of 1024 × 1024 resolution were used, and each image was assigned one of five one-hot-encoded labels. The five labels represent the states of the leaf images in this study: Normal, Fusarium wilt, Cotton aphid, Powdery mildew, and Mite. This study aims to develop a classification model that can identify and classify strawberry leaves in these five states.
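The 8:2 split and the five one-hot labels described above can be sketched in a few lines; the helper names are illustrative, not the authors’ code.

```python
# The five leaf states named in the text, in a fixed order.
CLASSES = ["Normal", "Fusarium wilt", "Cotton aphid", "Powdery mildew", "Mite"]

def one_hot(label):
    """Encode a class name as a 5-element one-hot vector."""
    vec = [0] * len(CLASSES)
    vec[CLASSES.index(label)] = 1
    return vec

def split_8_2(items):
    """Split a list into an 80% training part and a 20% test part."""
    cut = int(len(items) * 0.8)
    return items[:cut], items[cut:]

image_ids = list(range(1, 10001))        # 10,000 images in total
train_ids, test_ids = split_8_2(image_ids)
assert len(train_ids) == 8000 and len(test_ids) == 2000
print(one_hot("Powdery mildew"))  # [0, 0, 0, 1, 0]
```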
Table 3 illustrates the arrangement of the training and experimental datasets used in this study. The meaning of the Image Array Shape numbers in the training dataset and the experimental dataset is as follows, from left to right: The first number represents the number of data used for data training. The second and third numbers represent the horizontal and vertical pixel sizes of the image data. The fourth number represents the number of image channels (RGB). The numerical meaning of the Label Array Shape is as follows from the left. The first number indicates the number of labeled data, and the second digit represents the image status that distinguishes the presence or absence of disease and the type of disease.
The format of the image data is set to jpg. The names of the image data were set from 1.jpg to 10,000.jpg. Of these, the training image data ranged from 1 to 8000, and the label array format was only half processed. The test image data ranged from 8001 to 10,000, and the label array format was fully processed.
The high resolution of the original images may cause a bottleneck when loading them. A bottleneck is a phenomenon in which the overall throughput or capacity of a system is limited by one or a few elements or resources; here, the performance of the entire pipeline would be limited by a small number of oversized images. To prevent this, the images were resized to an appropriate uniform size (1024 × 1024). Although 1024 × 1024 is still rather large, the images were deliberately not reduced further, so as to preserve detail for more accurate training and higher accuracy.
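The resizing step can be sketched with Pillow; this assumes Pillow is the tool used (the paper does not name one), and the function name is illustrative.

```python
from PIL import Image

def resize_to_square(img, size=1024):
    """Normalize an image to an RGB square of the given side length."""
    return img.convert("RGB").resize((size, size))

# Stand-in for a high-resolution photo (e.g. a phone-camera capture).
sample = Image.new("RGB", (3024, 4032))
resized = resize_to_square(sample)
print(resized.size)  # (1024, 1024)
```

Note that resizing a non-square photo to a square distorts the aspect ratio; padding to a square before resizing is a common alternative when that distortion matters.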

3.5. Data Modeling

3.5.1. Pseudo-Labeling

This study adopted a Pseudo-Labeling technique for more stable data learning. Pseudo-Labeling is a representative semi-supervised learning method and is a powerful technique frequently used in data analysis contests such as Kaggle and Dacon. Using the ImageNet model pre-trained through supervised learning with Pseudo-Labeling, prediction is performed on untagged image data. After Pseudo-Labeling, multiple training is performed using the expanded dataset.
The Pseudo-Labeling sequence applied in this study is as follows. First, the ImageNet model is trained with correctly labeled image data. Second, we predict labels for unlabeled data using this model and create new Pseudo-Labeled data from the predictions. Third, another ImageNet model is created and re-trained using both the Pseudo-Labeled data and the existing correctly labeled image data. Fourth, this process was repeated a total of five times to train the ImageNet model. This process can help improve model performance, particularly in cases where labeled data are scarce or expensive to obtain. Figure 5 shows the Pseudo-Labeling process.
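The four-step loop above can be sketched with a toy one-dimensional classifier in place of the ImageNet model; everything here (the threshold “model”, the function names, the five-round default) is a simplified stand-in chosen only to make the train → predict → merge → retrain cycle concrete.

```python
def train_threshold(xs, ys):
    """Toy 'model': choose the threshold that best separates labels 0/1."""
    best_t, best_acc = 0.0, -1.0
    for t in sorted(xs):
        acc = sum((x >= t) == bool(y) for x, y in zip(xs, ys)) / len(xs)
        if acc > best_acc:
            best_t, best_acc = t, acc
    return best_t

def pseudo_label_rounds(labeled_x, labeled_y, unlabeled_x, rounds=5):
    t = train_threshold(labeled_x, labeled_y)          # step 1: train on labeled data
    for _ in range(rounds):
        pseudo_y = [int(x >= t) for x in unlabeled_x]  # step 2: predict unlabeled data
        all_x = labeled_x + unlabeled_x                # step 3: merge real + pseudo labels
        all_y = labeled_y + pseudo_y
        t = train_threshold(all_x, all_y)              # step 4: retrain; repeat 5 times
    return t, pseudo_y

t, pseudo = pseudo_label_rounds([0.1, 0.2, 0.8, 0.9], [0, 0, 1, 1], [0.15, 0.85])
print(pseudo)  # [0, 1]
```

In the actual study, the “model” is a CNN, the predictions are class probabilities over five leaf states, and each round retrains the network on the enlarged dataset.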
When pseudo-labeling, multiple models can be used to compare performance and then select the best-performing one, as Tasin et al. presented in their paper on using machine learning to predict diabetes [22]. In this study, we opted to use a single high-accuracy ImageNet model (RegNet) for pseudo-labeling rather than a variety of models. Our aim was to propose a novel method and demonstrate its effectiveness, and we found that a single model was sufficient for achieving our objectives. However, we recognize the potential for improving our method by exploring various ImageNet models in future studies.
The performance of pseudo-labeling can be evaluated by comparing a model trained with pseudo-labeled data against one trained without it, typically on a validation dataset; alternatively, the different ImageNet models used in the pseudo-labeling process can be compared. In this study, the with-versus-without comparison was not performed; instead, we evaluated the model trained with pseudo-labeled data on a validation dataset, where it demonstrated high accuracy. This suggests that pseudo-labeling had a positive impact on the model’s performance.

3.5.2. ImageNet Model

The codes used in this study were written by Akash Haridas [23] and Tarun Paparaju [24] for the Kaggle Plant Pathology 2020 competition.
A previous study confirmed that these models are appropriate for image processing and achieve high accuracy at the preprocessing stage. Therefore, this study uses the EfficientNet and RegNet models. Specifically, the EfficientNet-B3 variant was used; considering model size, B3 was judged to be sufficient.
The RegNetY-064 model was used as another model in this study. Using the PyTorch image model library, the model was trained using RegNet, which has a strong generalization performance as a basic model. In this study, Pseudo-Labeling based on RegNetY-064 was repeatedly performed.
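Since the study uses the PyTorch image model (timm) library, the two backbones could be instantiated roughly as follows. This is a sketch under assumptions: the registry names `regnety_064` and `efficientnet_b3` follow timm’s naming conventions, and `num_classes=5` replaces each classifier head for the five leaf states; the authors’ exact code is not shown in the paper.

```python
import timm

# Hypothetical instantiation of the two backbones via timm's model registry.
# pretrained=True would load ImageNet weights for transfer learning;
# num_classes=5 swaps in a 5-way classifier head for the five leaf states.
regnet = timm.create_model("regnety_064", pretrained=False, num_classes=5)
effnet = timm.create_model("efficientnet_b3", pretrained=False, num_classes=5)
```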

3.5.3. Ensemble Learning

In this study, we used RegNet and EfficientNet to create a model directory for prediction and ensemble learning. Additionally, we used Pseudo-Labeling to create a data frame for training, repeating the process a total of five times. For the first Pseudo-Labeling round, we used optimized RegNet five-fold ensemble learning; we then performed four further Pseudo-Labeling update trainings with the same model configuration, for a total of five Pseudo-Labeling trainings.
We applied ensemble learning, combining RegNet and EfficientNet, in this study. Ensemble learning is a technique that leverages multiple machine learning models to find an optimal solution. It trains and learns data using multiple models and averages the predictions of all models to reduce errors and improve accuracy. Finally, we confirmed that the EfficientNet five-fold ensemble model showed the highest benchmark test. Figure 6 shows the structure of the five-fold ensemble learning used in this study.
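The averaging step described above (soft voting) can be sketched directly; the probability vectors below are made-up numbers for two models over the five leaf classes, and the function name is illustrative.

```python
def ensemble_average(prob_lists):
    """Average per-class probabilities from several models (soft voting)."""
    n = len(prob_lists)
    return [sum(p[i] for p in prob_lists) / n for i in range(len(prob_lists[0]))]

# Hypothetical softmax outputs over the five classes
# (Normal, Fusarium wilt, Cotton aphid, Powdery mildew, Mite).
regnet_probs = [0.70, 0.10, 0.10, 0.05, 0.05]
effnet_probs = [0.50, 0.30, 0.10, 0.05, 0.05]

avg = ensemble_average([regnet_probs, effnet_probs])
print(avg.index(max(avg)))  # 0 -> predicted class "Normal"
```

Averaging smooths out each model’s individual mistakes: a class that only one model over-scores is pulled back toward the consensus, which is why the combined prediction tends to be more accurate than either model alone.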

4. Research Results

4.1. Confusion Matrix and Learning Curve

Figure 7 is the confusion matrix of this study. A confusion matrix is a representative index used to measure and evaluate the performance of a model; it is a matrix that shows how predicted values correspond to actual observed values. In the confusion matrix, the color of each cell darkens in proportion to the count. Figure 7 confirms that the strawberry leaf classes other than Fusarium wilt are distinguished with high accuracy, whereas the wilt-infected leaves showed slightly lower accuracy than the other leaf classes.
Through the confusion matrix, the accuracy, recall, and precision of this study can be obtained. Accuracy is the ratio of correctly predicted data to the total amount of data, and it can be calculated using the formula shown in Equation (1). Recall is the ratio of true positives to all actual positives, and it can be calculated using the formula shown in Equation (2). Precision is the ratio of true positives to all predicted positives, and it reflects how accurately the model identifies positive instances. It can be calculated using the formula shown in Equation (3).
Accuracy = (TP + TN) / (TP + TN + FP + FN)   (1)
Recall = TP / (TP + FN)   (2)
Precision = TP / (TP + FP)   (3)
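Equations (1)–(3) translate directly into code; the counts below are made-up numbers chosen only to exercise the formulas.

```python
def metrics(tp, tn, fp, fn):
    """Compute accuracy, recall, and precision from confusion-matrix counts."""
    accuracy = (tp + tn) / (tp + tn + fp + fn)   # Eq. (1)
    recall = tp / (tp + fn)                      # Eq. (2)
    precision = tp / (tp + fp)                   # Eq. (3)
    return accuracy, recall, precision

# Hypothetical counts for one class (e.g. Fusarium wilt vs. the rest).
acc, rec, prec = metrics(tp=90, tn=85, fp=15, fn=10)
print(round(acc, 3), round(rec, 3), round(prec, 3))  # 0.875 0.9 0.857
```

For the five-class setting of this study, these per-class values are typically computed one class at a time (one-vs-rest) from the rows and columns of the confusion matrix in Figure 7.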
Figure 8 shows the learning curve of this study. As a result of the training process, the model achieved an accuracy of 0.8560, with training taking approximately one hour to complete all 30 epochs using an ensemble of RegNet and EfficientNet models pre-trained on the ImageNet dataset. In this study, we employed the Adam optimizer and the CrossEntropyLoss function as the optimization and loss functions, respectively. The learning curve graph shows the loss value and prediction accuracy for the training and validation datasets: the training loss was 0.0147 and the validation loss was 0.1420. The confusion matrix in Figure 7 can be used to calculate the recall and precision values via Equations (2) and (3). As the learning curve in Figure 8 shows, high prediction accuracy is achieved even at low epoch counts; accuracy is high from the beginning of training and continues to increase as the number of epochs grows. However, compared to the previous study [21], the prediction accuracy is somewhat lower.
The different accuracies across pest types are likely due to differences in the image quality and image characteristics of each type. It should be noted that we did not compare the performance of our model with existing studies on strawberry pest classification, as we could not find any studies that categorized multiple types of pests for strawberries. This study aims to fill this gap and provide a more comprehensive approach to pest classification in strawberry crops.
The high prediction accuracy for the five leaf states can be attributed to the use of powerful ImageNet models and effective data training through ensemble learning, as well as to the removal in advance, during preprocessing, of noisy images and leaf images that could hinder recognition.
In conclusion, the leaf image data of this study were trained through a series of preprocessing steps, completing a model that can distinguish five types of leaf images with high accuracy.

4.2. Utilization Scenarios

This section describes how the deep learning model proposed in this study can be used in a sustainable smart farm. Several uses of this study can be suggested. First, the trained ImageNet model can provide a service in the form of a personal digital assistant (PDA) or mobile app on strawberry farms, supporting both direct photography and drone photography. Utilizing this technology can help improve the efficiency and productivity of strawberry farming.
The second suggestion is that the service using the learned ImageNet model can be applied to all people in the horticultural crop industry, including beginners who are considering commercializing horticultural crops, such as those returning to farming or young farmers. A virtual service realization screen and use process are provided in Figure 9 to demonstrate how this technology can be applied. This suggests that the service can be used as a tool to support and educate those who are new to the industry, potentially leading to increased participation and success in horticultural crop production.
Third, the process of this study can be applied to other leaf image data. In related work, most research is conducted on tomato image data, presumably because tomato images are currently the most widely shared. If other image data are supplemented through shared data hubs in the future, the process of this study could be applied to other ImageNet model studies or image classification studies.
Lastly, if the process of this study is applied to facility gardening, it is expected to provide more smooth and useful services in building smart farms. Figure 10 is a blueprint for building a smart farm using the model of this study.
Two expected effects can be suggested using this study. First, it is possible to block the transmission route at an early stage and minimize damage from pests by identifying the exact symptoms and taking appropriate measures for the symptoms. As a representative example, in the case of damage caused by pests such as cotton aphids, when the damage is confirmed, pesticide prevention or spraying of chemicals should be performed around the area with a certain diameter. On the other hand, in the case of strawberry anthracnose, the seedling itself must be destroyed if the damage is confirmed. As such, it is expected that damage from pests can be minimized by taking correct measures through the application of this study because the size of the damage and coping methods are completely different depending on the symptoms of each pest.
Second, it is possible to increase the production of strawberries and other horticultural crops through proper countermeasures against pests. As a typical example, in the case of wilt disease, an average of 10 to 43% of plants were damaged in an environment of abnormally high temperatures after the rainy season. If the application point of this study is service planning, the number of plants damaged by pests can be reduced, and through this, it can lead to increased production and improved profitability of farms.

4.3. Discussion

In this section, we analyze the characteristics of the leaf image data used in the study and organize the factors that hinder the prediction accuracy of a deep-learning-based classification model. Additionally, we discuss the image pre-processing method used in this study to resolve the factors that hinder prediction accuracy and consider ways to further improve the results of Section 3.4.
The image data used in this study are based on the strawberry leaf image data provided by a smart farm company, as well as other image data generated through web crawling, open data, and data labeling. Since the image data were collected in various ways and was unstructured, it contained a lot of irregular noise. Identifying the image data from the seedling period to the growing period was not easy, and there were many images of complex pests. Furthermore, there was a possibility that the image data training may have been adversely affected due to noise and quality degradation of pixel values. As a result, we spent a lot of time refining the image data through pre-processing and post-processing.
As shown in Figure 7 in Section 4.1 above, the prediction accuracy for toxic wilt disease is lower than that of other pest crops. For wilt disease image data, the image quality was generally low, and there were image data with less pre-processing and noise in the pre-processing process, leading to lower accuracy. In the case of image data for other pests, image data could be easily converted due to the presence of redundant image data. However, not much image data were collected for wilt disease, so the image data itself could not be converted in the post-processing process because there were not enough data.
The accuracy of the learned model was 0.8560 as a result of training through a series of processes. However, compared to the high prediction accuracy derived in the previous study, the prediction accuracy in this study is rather low. The previous study distinguished between normal leaf image data and abnormal leaf image data, so there were not many variables applied. On the other hand, this study classifies leaf image data of five states of pests in addition to normal leaf image data, making it a more complex study, which may account for the relatively lower prediction accuracy.
To improve the prediction accuracy, we applied various types of EfficientNet models with RegNet and compared and analyzed the results of training. EfficientNet performed ensemble learning by applying models from B0 to B7 to RegNetY-064. While EfficientNet-B2 showed the highest prediction accuracy as a single ImageNet model, EfficientNet-B3 showed the highest prediction accuracy when RegNetY-064 and ensemble learning were applied.
Based on our experiments, we observed that the choice of the EfficientNet version is dependent on the specific environment and purpose of use. While higher versions of EfficientNet tend to perform better, their larger model size may lead to longer training times and higher memory usage. In our experiments, we found that EfficientNet-B2 performed the best when used alone, whereas EfficientNet-B3 showed the best performance when combined with RegNet (Refer to Figure 11). This can be attributed to the optimal combination of network width, height, depth, resolution, and learning method in EfficientNet-B2 and B3, given the number and quality of the image data.
In this study, we used Gradient-weighted Class Activation Mapping (Grad-CAM) to identify which parts of the strawberry leaf image the model mainly learned and determined the presence or absence of pests [25]. Grad-CAM is a technique that allows you to visually check which parts of the CNN model have been seen or recognized more visually. Grad-CAM allows you to see how the deep learning model proposed in this study determines which parts of the image are diseased. Figure 12 is a diagram that visually confirms the classification of this research model through Grad-CAM. From left to right, these are images of Fusarium wilt, Cotton aphid, Mite, and Powdery mildew. The figure illustrates the significance of different regions in an input image for a specific class prediction, where warmer colors (e.g., red, yellow) indicate the most critical areas for the prediction, while cooler colors (e.g., blue, green) represent the less important regions.

5. Conclusions

The proposed study aims to develop a deep-learning-based classification model that can detect early symptoms of strawberry leaf pest infections. This will contribute to the establishment of sustainable smart farms. This study proposed a stable learning method using Pseudo-Labeling as an effective pre-processing method, and various ImageNet models, including RegNet and EfficientNet, were used to classify foliar pests with high accuracy. Through ensemble learning, accurate and robust models can be proposed.
The study has significant implications for learning, including the early detection of crop diseases, the identification of patterns through data analysis, and the automation of human decisions. Additionally, the study can aid in collecting and analyzing strawberry leaf image data through web crawling, comparative analysis of accuracy with other models, and utilization of ensemble learning.
The industrial implications of the study are that it can help identify the marketability, scale, and added value of strawberries and horticultural crops, as well as strengthen the management and engineering grounds to increase competitiveness in the agricultural industry. By detecting and responding to crop diseases at an early stage, the study is expected to increase production and bring about a change in agricultural awareness by introducing the latest technology to the industry.
However, it should be noted that the study has certain limitations, mainly stemming from the paucity of pest leaf data and the frequent errors encountered during the experimental phase, which negatively impacted image learning training. In order to further enhance the model’s capabilities, future studies may consider implementing various image processing and augmentation methods. Additionally, it is imperative that further research be conducted to more accurately distinguish between different types and severity levels of strawberry leaf pests. To further advance this area of research, it is recommended that future studies focus on measuring the performance of pseudo-labeling techniques and comparing the efficacy of various ImageNet models in the pseudo-labeling process. By doing so, a more comprehensive understanding of the strengths and weaknesses of these techniques can be achieved, leading to further improvements in image classification and other related fields.
Overall, the study succeeded in deriving high prediction accuracy, and the proposed model can be utilized for artificial intelligence services and even commercialization in the future.

Author Contributions

Conceptualization, H.K. and D.K.; methodology, H.K. and D.K.; software, H.K.; validation, H.K. and D.K.; formal analysis, D.K.; resources, H.K.; data curation, H.K.; writing—original draft, H.K.; writing—review and editing, D.K.; visualization, H.K.; supervision, D.K.; funding acquisition, D.K. All authors have read and agreed to the published version of the manuscript.

Funding

This work is supported by the Korea Institute for Advancement of Technology (KIAT) grant funded by the Korea Government (MOTIE) (P0017123, The Competency Development Program for Industry Specialist).

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

Not applicable.

Conflicts of Interest

The authors declare no conflict of interest.

References

  1. Choi, S.-W.; Shin, Y.J. Role of Smart Farm as a Tool for Sustainable Economic Growth of Korean Agriculture: Using Input–Output Analysis. Sustainability 2023, 15, 3450. [Google Scholar] [CrossRef]
  2. Walter, A.; Finger, R.; Huber, R.; Buchmann, N. Smart farming is key to developing sustainable agriculture. Agric. Sci. 2017, 114, 6148–6150. [Google Scholar] [CrossRef]
  3. Durai, S.K.S.; Shamili, M.D. Smart farming using Machine Learning and Deep Learning technique. Decis. Anal. J. 2022, 3, 100041. [Google Scholar] [CrossRef]
  4. You, H.; Son, C.-H. Recognizing Apple Leaf Diseases via Segmentation-Aware Deep Convolutional Neural Networks for Smart Farm. J. Korean Inst. Inf. Technol. (JKIIT) 2017, 17, 197–201. [Google Scholar]
  5. Kim, Y. A Study on Feature Analysis of Tomato Pest Classification Systems; Jeonju University: Jeonju, Republic of Korea, 2019. [Google Scholar]
  6. Han, J.; Choi, E.; Kim, E.; Lee, Y.; Lee, M. Development of Enhanced Mushroom Image Classification Model based EfficientNet. In Proceeding of the 2021 General Conference of Korea Contents Association, Virtual, 15–19 November 2021; pp. 399–400. [Google Scholar]
  7. Kim, E. Classification of Malware Image Types Using EfficientNet; Ajou University: Suwon-si, Republic of Korea, 2021. [Google Scholar]
  8. Ji, B. Prediction of Rheological Properties of Asphalt Binders Through Transfer Learning of EfficientNet. J. Korean Recycl. Constr. Resour. Inst. 2021, 9, 348–355. [Google Scholar]
  9. Kim, M.-B.; Choi, C.-Y. Superpixel-based Apple Leaf Disease Classification using Convolutional Neural Network. J. Brodcast Eng. 2020, 25, 208–217. [Google Scholar]
  10. Kang, J.; Kim, I.; Lim, H.; Gwak, J. A Crack Detection of Wooden Cultural Assets using EfficientNet model. Korea Soc. Comput. Inf. 2021, 29, 125–127. [Google Scholar]
  11. Dong, C.; Zhang, Z.; Yue, J.; Zhou, L. Automatic recognition of strawberry diseases and pests using convolutional neural network. Smart Agric. Technol. 2021, 1. [Google Scholar] [CrossRef]
  12. Xiao, J.-R.; Chung, P.-C.; Wu, H.-Y.; Phan, Q.-H.; Yeh, J.-L.; Hou, T.-K. Detection of Strawberry Diseases Using a Convolutional Neural Network. Plants 2021, 10, 31. [Google Scholar] [CrossRef]
  13. Atila, U.; Ucar, M.; Akyol, L.; Ucar, K. Plant leaf disease classification using EfficientNet deep learning model. Ecol. Inform. 2021, 61, 101182. [Google Scholar] [CrossRef]
  14. Radosavovic, I.; Kosaraju, R.-P.; Girshick, R.; He, K.; Dollar, P. Designing Network Design Spaces; Conell University: Ithaca, NY, USA, 2020. [Google Scholar] [CrossRef]
  15. Chung, D.-T.-P.; Tai, D.-V. A fruits recognition system based on a modern deep learning technique. J. Phys. Conf. Ser. 2019, 1327, 012050. [Google Scholar] [CrossRef]
  16. Ham, H.-S.; Cho, H.-C. A Study on Improvement of Tomato Disease Classification Performance According to Various Image Augmentation. Trans Korean Inst. Elect. 2021, 70, 2000–2005. [Google Scholar] [CrossRef]
  17. Jeon, Y.-A.; Cha, M.-K.; Cho, Y.-Y. Expert System for Diagnosing Disease and Insects of Strawberry. Hortic. Abstr. 2015, 33, 101. [Google Scholar]
  18. Kim, J.-B. Pathogen, Insect and Weed Control Effects of Secondary Metabolites from Plants. J. Korean Soc. Appl. Biol. Chem. 2005, 48, 1–15. [Google Scholar]
  19. Choi, Y.-W.; Kim, N.-E.; Paudel, B.; Kim, H.-T. Strawberry Pests and Diseases Detection Technique Optimized for Symptoms Using Deep Learning Algorithm. J. Bio-Environ. Control. 2022, 31, 255–260. [Google Scholar] [CrossRef]
  20. Ko, H.-R.; Lee, J.-K. Diagnosis and Control of Major Parasitic Nematodes of Strawberries; National Institute of Agricultural Sciences: Jeonju, Republic of Korea, 2017; Available online: https://www.nl.go.kr/NL/contents/search.do?pageNum=1&pageSize=30&srchTarget=total&kwd=%EB%94%B8%EA%B8%B0+%EC%A3%BC%EC%9A%94+%EA%B8%B0%EC%83%9D%EC%84%A0%EC%B6%A9+%EC%A7%84%EB%8B%A8%EA%B3%BC+%EB%B0%A9%EC%A0%9C+ (accessed on 1 March 2023).
  21. Kim, H.; Kim, D. Early Detection of Strawberry Diseases and Pests using Deep Learning. ICIC Express Lett. Part B Appl. 2022, 13, 1069–1075. [Google Scholar]
  22. Tasin, I.; Nabil, T.U.; Islam, S.; Khan, R. Diabetes prediction using machine learning and explainable AI techniques. Healthc. Technol. Lett. 2023, 10, 1–10. [Google Scholar] [CrossRef]
  23. Kaggle-Plant Pathology 2020 in PyTorch. Available online: https://www.kaggle.com/code/akasharidas/plant-pathology-2020-in-pytorch (accessed on 1 March 2023).
  24. Kaggle-Plant Pathology 2020: EDA + Models. Available online: https://www.kaggle.com/code/tarunpaparaju/plant-pathology-2020-eda-models (accessed on 1 March 2023).
  25. Grad-CAM: Visual Explanations from Deep Networks via Gradient-Based Localization. arXiv 2016, arXiv:1610.02391.
Figure 1. Architecture of a sustainable smart farm.
Figure 1. Architecture of a sustainable smart farm.
Sustainability 15 07931 g001
Figure 2. Overall framework of proposed analysis method.
Figure 2. Overall framework of proposed analysis method.
Sustainability 15 07931 g002
Figure 3. Leaf image data collection process.
Figure 3. Leaf image data collection process.
Sustainability 15 07931 g003
Figure 4. Strawberry leaf image data samples.
Figure 4. Strawberry leaf image data samples.
Sustainability 15 07931 g004
Figure 5. Pseudo-Labeling process.
Figure 5. Pseudo-Labeling process.
Sustainability 15 07931 g005
Figure 6. 5-Fold ensemble learning.
Figure 6. 5-Fold ensemble learning.
Sustainability 15 07931 g006
Figure 7. Confusion matrix. In the confusion matrix, the color of each cell is darker in proportion to the size of the number.
Figure 7. Confusion matrix. In the confusion matrix, the color of each cell is darker in proportion to the size of the number.
Sustainability 15 07931 g007
Figure 8. Learning curve.
Figure 8. Learning curve.
Sustainability 15 07931 g008
Figure 9. Example of a service implementation scenario.
Figure 9. Example of a service implementation scenario.
Sustainability 15 07931 g009
Figure 10. Building a sustainable smart farm using the proposed model.
Figure 10. Building a sustainable smart farm using the proposed model.
Sustainability 15 07931 g010
Figure 11. Intercomparison analysis graph of predictive accuracy of ensemble models.
Figure 11. Intercomparison analysis graph of predictive accuracy of ensemble models.
Sustainability 15 07931 g011
Figure 12. EfficientNet verification with Grad-CAM visualization.
Figure 12. EfficientNet verification with Grad-CAM visualization.
Sustainability 15 07931 g012
Table 1. Hyperparameters settings.
Table 1. Hyperparameters settings.
Hyperparameters
OptimizerLamb
SchedulerCycle
Batch size8
Weight decay1 × 10−3
Epochs30
Table 2. Composition of collected and augmented dataset.
Table 2. Composition of collected and augmented dataset.
ClassificationNormal DataPest DataTotal
Company70016002300
AI Hub (Before Aug.)40023202720
AI Hub (After Aug.)100058006800
Web Crawling3008001100
Table 3. Training and test dataset.
Table 3. Training and test dataset.
ClassificationTraining DatasetTest Dataset
Image Array Shape(8000, 1024, 1024, 3)(2000, 1024, 1024, 3)
Label Array Shape(4000, 5)(2000, 5)
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

Share and Cite

MDPI and ACS Style

Kim, H.; Kim, D. Deep-Learning-Based Strawberry Leaf Pest Classification for Sustainable Smart Farms. Sustainability 2023, 15, 7931. https://doi.org/10.3390/su15107931

AMA Style

Kim H, Kim D. Deep-Learning-Based Strawberry Leaf Pest Classification for Sustainable Smart Farms. Sustainability. 2023; 15(10):7931. https://doi.org/10.3390/su15107931

Chicago/Turabian Style

Kim, Haram, and Dongsoo Kim. 2023. "Deep-Learning-Based Strawberry Leaf Pest Classification for Sustainable Smart Farms" Sustainability 15, no. 10: 7931. https://doi.org/10.3390/su15107931

Note that from the first issue of 2016, this journal uses article numbers instead of page numbers. See further details here.

Article Metrics

Back to TopTop