1. Introduction
Plant diseases significantly impact growers’ productivity [1,2,3]. Weather, the environment, microorganisms, viruses, and bacteria make plants vulnerable to various diseases during their growth. Among these diseases, leaf diseases are the most commonly encountered. However, relying solely on visual observation and empirical judgment often proves challenging for the timely and accurate identification of disease types due to the large number of diseases and their similarities. Consequently, delayed diagnosis exacerbates disease spread and leads to significant losses in crop yield and economic benefits. Therefore, it is imperative to swiftly and accurately determine disease types to facilitate effective plant disease control.
Traditional plant disease recognition algorithms [4,5,6] have significantly progressed in extracting and analyzing image features using conventional classification methods. For instance, Wu et al. [7] extracted features, including color, HSV, texture, and histograms of oriented gradients, from diseased grape leaf images. They employed principal component analysis (PCA) for dimensionality reduction and a multi-feature fusion approach for feature vector formation. Ultimately, the support vector machine (SVM) algorithm was utilized for disease recognition, achieving an accuracy of 92.5%. Mokhtar et al. [8] utilized an SVM algorithm with different kernel functions for the classification and identification of tomato mosaic disease, with an average accuracy of 92%. Despite the success of these approaches, the manual feature extraction process is intricate and subjective, making it difficult to determine an optimal and robust feature set. Furthermore, plant diseases often exhibit composite features encompassing texture, shape, and color, posing significant challenges to traditional plant leaf disease recognition algorithms and limiting improvements in recognition outcomes.
The advent of deep learning has introduced a novel approach to disease recognition. Plant leaf disease recognition methods based on convolutional neural networks (CNNs) [9,10,11] offer notable advantages, including independence from hand-crafted features and high recognition accuracy. For instance, Yang et al. [12] proposed a fine-grained classification model, LFC-Net, with a self-supervised mechanism to classify images of eight tomato diseases and healthy leaves with 99.7% accuracy. Sun et al. proposed a transfer-learning-based method for maize disease recognition, fine-tuning pre-trained Inception-series network models to improve recognition accuracy and reduce training time [13]. Yan et al. presented an enhanced model based on VGG16 for identifying apple leaf diseases, achieving an overall accuracy of 99.01% in apple leaf classification [14]. By replacing the first-layer convolutional kernel with three 3 × 3 convolutional kernels and adjusting the base channel number to 64, Wang et al. achieved a recognition accuracy of 90.22% on 276 real corn disease images using an improved version of the original ResNeXt101 model [15]. However, traditional enhanced CNNs and similar approaches do not analyze disease features further, and they exhibit numerous parameters and high computational complexity. Although these methods yield superior recognition results, they require abundant computational resources and extensive storage space, limiting their application on resource-constrained mobile devices. Consequently, there is a growing demand for lightweight network models that offer high performance at low computational cost for plant disease recognition in real-life scenarios. A typical lightweighting approach replaces the original convolutional layers with depthwise separable convolution. However, depthwise separable convolution may ignore or lose some vital information, reducing the model’s recognition accuracy [16,17,18,19]. In our initial experiments, we also tried depthwise separable convolution as a lightweighting strategy, which led to a decrease of about 2% in recognition accuracy even without shrinking the convolution kernels.
Therefore, building on depthwise separable convolution, we propose a new lightweighting method: we remove the pointwise convolution operation and set the number of groups equal to the number of input channels in the depthwise convolution, an approach we call extreme grouping convolution. Considering the advantages of ResNet18 [20], such as its small size, good performance, and extensibility through its modular design, we adopted it as the base model for our improvements.
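To make the savings concrete, the following sketch compares parameter counts for a standard 3 × 3 convolution, a depthwise separable convolution, and the depthwise-only extreme grouping convolution described above. The function names are ours for illustration; bias terms are omitted, and note that extreme grouping fixes the number of output channels to the number of input channels.

```python
# Parameter counts for a 3x3 convolutional layer with c_in input and
# c_out output channels (bias terms omitted).

def standard_conv_params(c_in, c_out, k=3):
    # Every output channel sees every input channel.
    return c_out * c_in * k * k

def depthwise_separable_params(c_in, c_out, k=3):
    # Depthwise k x k (one filter per input channel) + 1x1 pointwise.
    return c_in * k * k + c_out * c_in

def extreme_grouping_params(c_in, k=3):
    # Depthwise only: groups == input channels, pointwise removed,
    # so the output channel count equals the input channel count.
    return c_in * k * k

c = 64
print(standard_conv_params(c, c))        # 36864
print(depthwise_separable_params(c, c))  # 4672
print(extreme_grouping_params(c))        # 576
```

At 64 channels, removing the pointwise stage shrinks the layer by another order of magnitude relative to depthwise separable convolution, which is why the grouped model in this paper can be so much smaller than ResNet18.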
Our improvements are as follows:
- (1) We replaced the original model’s convolutional layers with extreme grouping convolution to reduce the number of parameters and the amount of computation, achieving lightweighting while maintaining recognition accuracy.
- (2) Because extreme grouping convolution allows little information interaction between groups, we added an SE [21] attention module with a squeeze ratio of 16 to each basic block. This promotes information interaction between groups while emphasizing critical disease features and suppressing the influence of irrelevant factors on the model.
- (3) Because plant lesions are small and similar to one another, we removed the downsampling in layers 3 and 4 of the model, increasing the feature map resolution to 4 times the original, and introduced dilated convolution [22] to maintain the size of the model’s receptive field.
- (4) We improved the residual connection of the original model so that it can learn the errors of two neighboring layers, further improving network performance.
- (5) Finally, by reducing the number of convolution kernels to (16, 32, 64, 128) and removing network redundancy, we constructed a lightweight residual network called Model_Lite.
4. Summary and Outlook
4.1. Summary
This study proposed a lightweight plant leaf disease recognition network model based on ResNet18, addressing the large number of parameters, extensive computation, and complexity of existing recognition models. The model was improved according to the characteristics of the disease recognition task:
First, a grouping convolution method was proposed in which the number of groups equals the number of channels, reducing the model size. The improved model had approximately 1/46 of the parameters and 1/11 of the operations of the original ResNet18 model, while its average recognition accuracy was 0.47% higher.
We introduced the SE attention module to address the challenge of recognizing diseases against complex backgrounds. This module enhanced information interaction between the grouped convolutional units and improved the model’s ability to extract crucial disease features. By incorporating the SE module into the extreme grouping model, the average recognition accuracy was improved by 0.48%.
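The SE recalibration step can be sketched in plain Python as squeeze (global average pooling), excitation (two fully connected layers with ReLU and sigmoid), and scale. The weights below are illustrative placeholders, not trained parameters, and the real module operates on convolutional feature tensors:

```python
import math

def se_block(feature_maps, w1, w2):
    """Squeeze-and-Excitation recalibration on a list of C channel maps."""
    # Squeeze: global average pooling, one scalar per channel.
    z = [sum(sum(row) for row in m) / (len(m) * len(m[0])) for m in feature_maps]
    # Excitation: FC (C -> C/r) with ReLU, then FC (C/r -> C) with sigmoid.
    h = [max(0.0, sum(w * zj for w, zj in zip(row, z))) for row in w1]
    s = [1.0 / (1.0 + math.exp(-sum(w * hj for w, hj in zip(row, h)))) for row in w2]
    # Scale: reweight every channel map by its attention score in (0, 1).
    return [[[v * s[i] for v in row] for row in m] for i, m in enumerate(feature_maps)]

# Toy example: C = 16 channels of 2x2 maps; a squeeze ratio of r = 16
# leaves a single hidden unit. Weights here are placeholders.
C = 16
maps = [[[float(i + 1)] * 2 for _ in range(2)] for i in range(C)]
w1 = [[0.01] * C]                # reduction weights, shape (C/r, C)
w2 = [[1.0] for _ in range(C)]   # expansion weights, shape (C, C/r)
out = se_block(maps, w1, w2)
```

Because the excitation path mixes statistics from all channels before rescaling each one, it restores a degree of cross-group information flow that extreme grouping convolution otherwise lacks.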
To tackle the loss of disease features caused by the low resolution of the feature map, the downsampling operations in layer 3 and layer 4 of the original network were removed, increasing the feature map resolution to 28 × 28. Furthermore, dilated convolution was incorporated to maintain the size of the network’s receptive field. These modifications significantly enhanced the model’s performance: the average recognition accuracy of the maximum grouping model improved by 0.53%, with only a slight increase in computational effort.
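The receptive-field bookkeeping behind this trade-off can be sketched as follows; the layer configurations are illustrative stand-ins, not the exact layers of the modified network:

```python
def effective_kernel(k, d):
    """Span covered by a k x k kernel with dilation rate d."""
    return k + (k - 1) * (d - 1)

def receptive_field(layers):
    """Receptive field after a stack of (kernel, stride, dilation) convs."""
    rf, jump = 1, 1
    for k, s, d in layers:
        rf += (effective_kernel(k, d) - 1) * jump
        jump *= s
    return rf

# A 3x3 kernel with dilation 2 spans 5 positions -- the same span as a
# 5x5 kernel, but with only 9 weights.
print(effective_kernel(3, 2))  # 5

# Replacing a stride-2 3x3 conv with a stride-1 conv of dilation 2 keeps
# the feature map at full resolution while preserving the receptive field.
print(receptive_field([(3, 2, 1), (3, 1, 1)]))  # 7: with downsampling
print(receptive_field([(3, 1, 1), (3, 1, 2)]))  # 7: downsampling removed
```

The dilated stack sees the same 7-pixel context as the strided one but produces a 4x larger feature map, which is why the computation rises even though the parameter count does not.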
Enhancing the residual connection made feature extraction more comprehensive and fine-grained, leading to further gains in the model’s performance. The average recognition accuracy of the maximum grouping model increased by 0.19% while keeping the number of parameters and the computational cost unchanged.
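The description of this connection is brief, so the following is only one possible reading, sketched in plain Python: each block keeps its usual identity shortcut, and the second of two neighboring blocks additionally reuses the input of the first, so the residual branches jointly fit the error across both layers. This formulation is an assumption, and `f1`/`f2` are scalar stand-ins for convolutional blocks:

```python
def two_step_residual(x, f1, f2):
    """Hypothetical residual pattern spanning two neighboring blocks."""
    h = f1(x) + x        # first block: standard shortcut y = F(x) + x
    y = f2(h) + h + x    # second block also adds the earlier input x
    return y

# Scalar stand-ins for the convolutional blocks (illustrative only).
out = two_step_residual(1.0, lambda v: 0.1 * v, lambda v: 0.1 * v)
```

Because the extra path is a parameter-free addition, such a scheme would leave the parameter and computation counts essentially unchanged, consistent with the result reported above.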
Finally, the network was rationally streamlined by reducing the number of convolutional kernels. The resulting model, Model_Lite, had 1/344 of the parameters and 1/35 of the computational cost of the original ResNet18 model. Despite a 0.34% decrease in average accuracy on the experimental dataset, Model_Lite still achieved the highest recognition accuracy, 91.21%, compared with lightweight networks such as MobileNet, ShuffleNet, SqueezeNet, and GhostNet, while requiring significantly fewer parameters and less computation than mainstream lightweight models. Deploying the model on mobile terminals could enable plant growers to identify diseases conveniently and minimize the economic losses caused by untimely diagnosis. The improved methodology also provides a reference for future research on lightweight network models.
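As a rough sanity check on why shrinking the kernel counts is so effective, the sketch below compares how parameters scale with stage widths under standard convolution (quadratic in channel count) versus the depthwise-style extreme grouping convolution (linear). It is a simplified model that ignores strides, shortcut projections, and the SE modules, so it does not reproduce the exact 1/344 figure:

```python
K = 3 * 3  # 3x3 kernels

def depthwise_params(widths):
    # Extreme grouping conv: one k x k filter per channel.
    return sum(c * K for c in widths)

def standard_params(widths):
    # Standard conv with c input and c output channels at each stage
    # (a rough per-stage model of a ResNet18-style backbone).
    return sum(c * c * K for c in widths)

resnet18_widths = (64, 128, 256, 512)
lite_widths = (16, 32, 64, 128)   # Model_Lite kernel counts

# Quartering every stage width cuts standard-conv parameters 16x,
# but depthwise-style parameters only 4x.
print(standard_params(resnet18_widths) // standard_params(lite_widths))    # 16
print(depthwise_params(resnet18_widths) // depthwise_params(lite_widths))  # 4
```

Combining the grouped convolution (roughly the 1/46 reduction reported earlier) with the narrower widths is what compounds into the very small final model.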
4.2. Outlook
Although this study made some breakthroughs, challenges and limitations remain. The following are a few core issues we identified and some future research directions:
Because the number of self-constructed complex-background apple disease samples is relatively small, the model’s learning potential has not yet been fully realized. We plan to expand the dataset further to improve the model’s performance more comprehensively.
As the resolution of the feature map increased, the recognition accuracy of the model improved, but so did its computational cost. Subsequent research will therefore focus on strategies for reducing the computational cost while maintaining accuracy.
Currently, our model has yet to be deployed on mobile platforms. In future work, we will focus on further lightweighting and on optimizing its performance for the mobile environment.