Article

Rice Disease Identification Method Based on Attention Mechanism and Deep Dense Network

1 College of Physics and Electronic Information Engineering, Zhejiang Normal University, Jinhua 321004, China
2 School of Electronics and Information Engineering, Taizhou University, Taizhou 318000, China
3 Hangzhou Hikvision Digital Technology Co., Ltd., Hangzhou 310051, China
* Author to whom correspondence should be addressed.
Electronics 2023, 12(3), 508; https://doi.org/10.3390/electronics12030508
Submission received: 11 November 2022 / Revised: 4 January 2023 / Accepted: 5 January 2023 / Published: 18 January 2023
(This article belongs to the Special Issue Artificial Intelligence (AI) for Image Processing)

Abstract

It is of great practical significance to quickly, accurately, and effectively identify rice diseases, which affect rice yield. This paper proposes a rice disease identification method based on an improved dense convolutional network (DenseNet). The method uses DenseNet as the baseline model and applies the squeeze-and-excitation channel attention mechanism to strengthen favorable features while suppressing unfavorable ones. Depthwise separable convolutions are then introduced to replace some of the standard convolutions in the dense network, improving parameter utilization and training speed. The AdaBound algorithm, combined with adaptive optimization, reduces the parameter tuning time. In experiments on a dataset of five rice diseases, the average classification accuracy of the proposed method is 99.4%, which is 13.8 percentage points higher than that of the original model. Compared with other existing recognition methods, such as ResNet, VGG, and Vision Transformer, the proposed method achieves higher recognition accuracy, realizes effective classification of rice disease images, and provides a new approach for the development of crop disease identification technology and smart agriculture.

1. Introduction

As one of the world's main food crops, rice is planted over a wide area, and its diseases are difficult to detect in the early stages. In 2020, China's rice planting area resumed growth, reaching 30.076 million hectares, an increase of 382,000 hectares over 2019. With the continuous expansion of the rice planting scale, the problem of rice diseases will become more prominent [1,2]. Rice disease is an important factor affecting rice yield. At present, traditional manual identification of rice diseases mainly depends on human observation in the field, which requires a great deal of time and manpower, is difficult to carry out on a large scale, and is limited by the observer's own experience, resulting in low accuracy. The timely detection of rice diseases and the accurate and rapid identification of disease types are of great significance for ensuring the safety of rice plants and controlling the spread of diseases [3]. Therefore, the accurate, rapid, and efficient identification of disease types is the key to taking effective measures and achieving precise spraying, which is of great significance for ensuring China's grain yield and food security. The rapid development of deep learning [4] has led to its application in many fields, for example, image segmentation [5,6,7], medical image processing [8,9,10], face recognition [11,12,13,14], and autonomous driving [15,16].
Meanwhile, more and more researchers have applied the combination of image recognition and deep learning to crop disease recognition. Ma et al. [17] used a deep convolutional network to identify cucumber disease categories: the disease images were first segmented to construct a disease image dataset, and then AlexNet and a DCNN were used for identification, with an accuracy rate of 93.4%. Almasoud et al. [18] proposed a rice disease fusion model based on efficient deep learning. The model is mainly based on median filtering and k-means to locate the disease spot features and uses the gray-level co-occurrence matrix and Inception to derive the features; finally, an FSVM is used for classification, and the classification accuracy reaches 96.17%. Chen et al. [19] proposed an optimized back-propagation neural network algorithm to identify and classify three common rice diseases, with recognition accuracies of 98.5%, 96%, and 92.5%, respectively. Chen et al. [20] adopted a transfer learning approach, using MobileNetV2 pre-trained on ImageNet as the backbone network and adding an attention mechanism to enhance the learning of disease spot features, with the average recognition accuracy of rice diseases reaching 98.48%. Qiu et al. [21] used a deep convolutional network to establish a rice disease recognition model, trained with the Keras deep learning framework, and set different convolution kernel sizes and pooling functions to study the classification and recognition of three rice diseases, with an accuracy of more than 90%. Krishnamoorthy et al. [22] proposed a transfer-learning-based InceptionResNetV2 model, which integrates features in the form of weights and fine-tunes hyperparameters to identify three rice diseases, achieving a recognition accuracy of 95.67%. Rahman et al. [23] fine-tuned VGG16 and InceptionV3 to detect rice diseases, with an accuracy of 93.3%. These studies show that deep learning combined with image processing can be used for crop disease detection and achieves good results.
Wang et al. [24] introduced a multi-scale feature extraction module on the basis of ResNet18, established a multi-scale residual network, changed the connection method of the residual layers, and performed grouped convolution operations, achieving an accuracy of 93.5% on self-collected disease images from real environments. Wu et al. [25] used a Bayesian algorithm to reduce the difficulty of training and added a residual module to the basic neural network, which can effectively identify tomato diseases. Waheed et al. [26] proposed an optimized DenseNet network with reduced parameters to identify maize leaf diseases. Zhou et al. [27] combined a residual network and a dense network, forming a hybrid network architecture for tomato disease identification by adjusting the hyperparameters. Cheng et al. [28] used a deep residual network to identify crop pest categories against complex backgrounds; the classification accuracy for images of ten types of crop diseases and insect pests against complex farmland backgrounds reached 98.67%.
The above literature shows that convolutional neural networks are suitable for crop disease identification, but the network models used at this stage tend to pursue higher identification accuracy while ignoring the over-fitting problem associated with vanishing gradients. Network models that alleviate the vanishing-gradient problem, such as DenseNet [29] and ResNet [30], have certainly been widely used in the field of crop disease identification. However, although the vanishing-gradient problem is alleviated to a certain extent [31], these models also bring problems such as low feature utilization and redundant features. Therefore, this paper takes the DenseNet network as the backbone and introduces the squeeze-and-excitation (SE) module, which has the advantage of adaptive feature weighting. Starting from the feature channels, features are fused to address the problems of low feature utilization and feature redundancy. In addition, using the AdaBound [32] algorithm and depthwise separable convolutions, a method for the identification and classification of rice leaf disease categories is constructed.
In summary, the main contributions of our work are as follows:
1. An improved DenseNet network-based rice disease identification method is proposed for identifying multiple rice diseases.
2. Depending on the location of the attention module embedded in the DenseNet network, we propose three variants of the SE-DenseNet network.
3. We introduced the AdaBound optimization algorithm to construct the AB-SE-DenseNet network.
The rest of the paper is organized as follows: Section 2 introduces our proposed method and its variants, followed by the dataset used and the data pre-processing methods; Section 3 presents the results and a discussion of the ablation and comparison experiments; and Section 4 concludes the paper and outlines future research directions.

2. Materials and Methods

2.1. Experimental Method

2.1.1. DenseNet

The DenseNet network is a dense convolutional neural network proposed in 2017. The main structure of the DenseNet network is the internal dense block (Dense Block). The inner dense module consists of batch normalization (BN), the ReLU activation function, and 3 × 3 and 1 × 1 convolutional layers (Conv), as shown in Figure 1. Each layer is connected not only to the layer immediately preceding it, but also to all subsequent layers; for a dense block containing L layers, there are L × (L + 1)/2 connections, and the output of the l-th layer is
$$x_l = H_l([x_0, x_1, \ldots, x_{l-1}])$$
where $H_l(\cdot)$ represents the convolution operation of layer $l$, and $x_l$ represents the output of layer $l$.
Generally, to avoid complex calculations, a bottleneck module, that is, a 1 × 1 convolutional layer, is added to the dense block to reduce the number of feature maps. Adjacent dense blocks are connected through transition layers to reduce the overall number of network parameters and improve computational efficiency. The transition layer is composed of a 1 × 1 convolutional layer and a 2 × 2 average pooling layer. The DenseNet network structure is shown in Figure 2.
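To make the dense connectivity and bottleneck structure concrete, the following is a minimal PyTorch sketch of one dense layer (BN–ReLU–1 × 1 Conv–BN–ReLU–3 × 3 Conv, with its output concatenated to its input) and of a transition layer; the growth-rate and bottleneck values are illustrative assumptions, not parameters reported in this paper.

```python
import torch
import torch.nn as nn

class DenseLayer(nn.Module):
    """One BN-ReLU-Conv(1x1)-BN-ReLU-Conv(3x3) unit of a dense block (illustrative sketch)."""
    def __init__(self, in_channels, growth_rate=32, bottleneck_factor=4):
        super().__init__()
        inter_channels = bottleneck_factor * growth_rate
        self.bottleneck = nn.Sequential(          # 1 x 1 bottleneck reduces the number of feature maps
            nn.BatchNorm2d(in_channels),
            nn.ReLU(inplace=True),
            nn.Conv2d(in_channels, inter_channels, kernel_size=1, bias=False),
        )
        self.conv = nn.Sequential(                # 3 x 3 convolution produces growth_rate new feature maps
            nn.BatchNorm2d(inter_channels),
            nn.ReLU(inplace=True),
            nn.Conv2d(inter_channels, growth_rate, kernel_size=3, padding=1, bias=False),
        )

    def forward(self, x):
        new_features = self.conv(self.bottleneck(x))
        # Dense connectivity x_l = H_l([x_0, ..., x_{l-1}]), realised as channel concatenation
        return torch.cat([x, new_features], dim=1)

class Transition(nn.Module):
    """Transition layer between dense blocks: 1 x 1 convolution followed by 2 x 2 average pooling."""
    def __init__(self, in_channels, out_channels):
        super().__init__()
        self.reduce = nn.Sequential(
            nn.BatchNorm2d(in_channels),
            nn.ReLU(inplace=True),
            nn.Conv2d(in_channels, out_channels, kernel_size=1, bias=False),
            nn.AvgPool2d(kernel_size=2, stride=2),
        )

    def forward(self, x):
        return self.reduce(x)
```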

2.1.2. SE Module

Using the DenseNet network for rice disease identification alleviates the vanishing-gradient problem to a certain extent. However, as can be seen from Figure 1, the connection between any two layers is an equally weighted fusion of outputs, with no connection weights. For the input of each layer, the output of the immediately preceding layer on the main path should be treated as the most important part of that input, and the combined outputs of earlier layers should be given correspondingly smaller proportions; nevertheless, features extracted by some earlier layers may still be used directly by deeper layers, and the transition layers output a large number of redundant features, resulting in low utilization of earlier output features by subsequent dense blocks [33].
This paper proposes an unequally weighted dense convolutional neural network with adaptive weights, adding the SE [34] module to the DenseNet network structure. A weight-adaptive method is adopted: the weight of each channel is allocated by exploiting the interdependence of the feature channels, enabling the neural network to learn important feature information and reducing the impact of feature redundancy.
The SE module mainly consists of global average pooling, two activation functions, and fully connected layers, and its operation is divided into squeeze and excitation steps. The squeeze operation uses a global pooling layer to compress the feature map of size C × H × W (W: the width of the feature map, H: the height of the feature map, and C: the number of feature channels) into a C × 1 × 1 descriptor, reducing the amount of subsequent computation without changing the channel dimension. For an input feature map of size C × H × W with input set U = [u1, u2, …, uC], the mapping relationship of the squeeze operation is
$$z_c = F_{sq}(u_c) = \frac{1}{H \times W} \sum_{i=1}^{H} \sum_{j=1}^{W} u_c(i, j)$$
where $c \in \{1, 2, \ldots, C\}$, $z_c$ represents the global information of the $c$-th feature map, and $F_{sq}$ represents the squeeze operation.
The excitation operation is composed of fully connected layers and a sigmoid activation function. The fully connected layers integrate all of the input feature information, and the sigmoid function maps the input to the interval [0, 1]. The mapping relationship of the excitation operation is
$$s = F_{ex}(z, W) = \sigma(g(z, W)) = \sigma(W_2\,\delta(W_1 z))$$
where $\sigma$ is the sigmoid activation function, $\delta$ is the ReLU activation function, $F_{ex}$ is the excitation operation, and $W_1$ and $W_2$ are the weight parameters of the fully connected layers.
Finally, the scale operation multiplies each input channel by the corresponding channel weight learned by the SE module, fusing the learned weights with the original features to obtain the recalibrated features. These recalibrated rice disease features then serve as the input to the subsequent network, reducing feature redundancy and improving network performance. The structure is shown in Figure 3.
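The squeeze, excitation, and scale steps described above can be summarized in the following PyTorch sketch; the reduction ratio of 16 is the common default from the SE paper [34] and is an assumption here rather than a value reported by the authors.

```python
import torch.nn as nn

class SEBlock(nn.Module):
    """Squeeze-and-excitation block: squeeze (global average pooling), excitation (FC-ReLU-FC-Sigmoid), scale."""
    def __init__(self, channels, reduction=16):
        super().__init__()
        self.squeeze = nn.AdaptiveAvgPool2d(1)                       # C x H x W -> C x 1 x 1, Equation (2)
        self.excitation = nn.Sequential(
            nn.Linear(channels, channels // reduction, bias=False),  # W1
            nn.ReLU(inplace=True),                                   # delta
            nn.Linear(channels // reduction, channels, bias=False),  # W2
            nn.Sigmoid(),                                            # sigma, maps weights into [0, 1], Equation (3)
        )

    def forward(self, x):
        b, c, _, _ = x.shape
        z = self.squeeze(x).view(b, c)            # per-channel global information z_c
        s = self.excitation(z).view(b, c, 1, 1)   # learned channel weights s
        return x * s                              # scale: reweight each input channel with its learned weight
```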

2.1.3. AB-SE-DenseNet

We use the DenseNet framework of Section 2.1.1 and the SE module of Section 2.1.2 to establish the SE-DenseNet network, whose frame structure parameters are shown in Table 1. The network uses the SE module to perform adaptive learning of the features, using global information to selectively enhance favorable features and suppress unfavorable ones, adjusting the feature channels adaptively, reducing the impact of the feature redundancy caused by DenseNet, and improving network performance. Depthwise separable convolutions replace part of the standard convolutions in the dense network: in the depthwise step, each input channel is convolved with its own convolution kernel, and all of the resulting feature maps are then integrated by a pointwise (1 × 1) convolution, reducing the number of network parameters.
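As a brief illustration of this replacement, the sketch below shows a 3 × 3 depthwise convolution (one kernel per input channel) followed by a 1 × 1 pointwise convolution; the channel sizes are placeholders.

```python
import torch.nn as nn

class DepthwiseSeparableConv(nn.Module):
    """3 x 3 depthwise convolution followed by a 1 x 1 pointwise convolution (illustrative sketch)."""
    def __init__(self, in_channels, out_channels):
        super().__init__()
        # groups=in_channels convolves each input channel with its own kernel
        self.depthwise = nn.Conv2d(in_channels, in_channels, kernel_size=3,
                                   padding=1, groups=in_channels, bias=False)
        # 1 x 1 pointwise convolution integrates all feature maps across channels
        self.pointwise = nn.Conv2d(in_channels, out_channels, kernel_size=1, bias=False)

    def forward(self, x):
        return self.pointwise(self.depthwise(x))

# Weights: 9*C_in + C_in*C_out, versus 9*C_in*C_out for a standard 3 x 3 convolution
```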
The SE module is embedded in different locations in DenseNet. This paper proposes three different network models: SE-DenseNet-1, SE-DenseNet-2, and SE-DenseNet-3, as shown in Figure 4. Among them, SE-DenseNet-1 embeds the SE module in the adjacent transition layer and the dense block in the DenseNet model, SE-DenseNet-2 embeds the SE module in the dense block of the DenseNet model, and SE-DenseNet-3 embeds the SE module in the transition layer and in the dense block of DenseNet simultaneously.
To accelerate the training process with better generalization ability and to improve the learning rate schedule and training results of the model, we use the AdaBound algorithm to dynamically tailor the learning rate and create the AB-SE-DenseNet network model; the structure is shown in Figure 5. AdaBound is used to speed up the model fitting. The AdaBound optimizer combines the advantages of the Stochastic Gradient Descent (SGD) and Adam optimizers by applying dynamic bounds on the learning rate, which vary with the descent gradient and become tighter over time, constraining the learning rate to a small final range so that the model becomes increasingly stable during training. The transition from an adaptive optimizer to the SGD optimizer is thus realized during training, which not only improves the convergence speed of the model but also prevents the model from easily falling into a local minimum.
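A minimal training-step sketch with the AdaBound optimizer is shown below, assuming the open-source adabound package released with [32]; the stand-in model and the final_lr value are illustrative placeholders, not the configuration used in this paper.

```python
import adabound                        # pip install adabound; reference implementation of [32]
import torch
import torch.nn as nn

model = nn.Sequential(                 # stand-in model; in practice this would be AB-SE-DenseNet
    nn.Conv2d(3, 8, kernel_size=3, padding=1), nn.ReLU(),
    nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(8, 5))   # 5 rice disease classes
criterion = nn.CrossEntropyLoss()      # cross-entropy loss, as in Section 2.2.1
optimizer = adabound.AdaBound(model.parameters(), lr=1e-3, final_lr=0.1)

images = torch.randn(4, 3, 224, 224)   # dummy batch of 224 x 224 images
labels = torch.randint(0, 5, (4,))
optimizer.zero_grad()
loss = criterion(model(images), labels)
loss.backward()
optimizer.step()                       # the dynamic bounds tighten over training, moving from Adam-like to SGD-like updates
```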

2.2. Experimental Materials and Evaluation Indicators

2.2.1. Experimental Materials and Experimental Environment

To evaluate our method’s performance, we used the experimental environment configuration shown in Table 2.
The data used in the experiments come from public datasets and contain images of five diseases: rice blast, blight, brownspot, sheath blight, and tungro, as shown in Figure 6. The original images comprised 298 rice blast, 283 blight, 292 brownspot, 236 sheath blight, and 244 tungro images. We expanded the dataset by changing image brightness to simulate sunny and cloudy days in the natural environment, applying horizontal flips, changing the viewing angle, and adjusting colors to simulate occlusions, obtaining a total of 4235 images. We divided the dataset into training, validation, and test sets in the ratio of 6:2:2 and preprocessed the images to a size of 224 × 224 pixels. Figure 7 shows an example of the data augmentation. We used cross-entropy loss as the loss function, with an initial learning rate of 0.001, 50 iterations, and a batch size of 32.
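The augmentations listed above can be expressed, for example, with a torchvision pipeline such as the following; the specific parameter ranges are assumptions for illustration and are not reported in the paper.

```python
from torchvision import transforms

# Illustrative augmentation pipeline; the ranges below are assumed values, not those used by the authors.
train_transform = transforms.Compose([
    transforms.Resize((224, 224)),              # preprocess images to 224 x 224 pixels
    transforms.RandomHorizontalFlip(p=0.5),     # horizontal flip
    transforms.RandomRotation(degrees=15),      # small viewing-angle change
    transforms.ColorJitter(brightness=0.4,      # simulate sunny and cloudy illumination
                           contrast=0.3,
                           saturation=0.3),     # color adjustment
    transforms.ToTensor(),
])
```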

2.2.2. Evaluation Criteria

A confusion matrix is an evaluation tool used to assess model performance in deep learning [35]; from it, the accuracy (Acc), precision (Pre), recall (Rec), and F1 value (F1) of the model are computed. The accuracy is the ratio of correctly classified samples to all samples. The precision is the ratio of the number of actually correct samples to the total number of samples classified as correct. The recall is the ratio of the number of samples classified as correct to the total number of samples that are actually correct. The F1 value is the harmonic mean of precision and recall. The calculation of all metrics is given in Equations (4)–(7). In these equations, TP is the number of positive samples correctly predicted as positive; FP is the number of negative samples incorrectly predicted as positive; TN is the number of negative samples correctly predicted as negative; and FN is the number of positive samples incorrectly predicted as negative.
$$Acc = \frac{TP + TN}{TP + TN + FP + FN}$$
$$Pre = \frac{TP}{TP + FP}$$
$$Rec = \frac{TP}{TP + FN}$$
$$F1 = \frac{2TP}{2TP + FP + FN}$$
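For reference, Equations (4)–(7) translate directly into the following helper, shown here with illustrative counts.

```python
def classification_metrics(tp: int, fp: int, tn: int, fn: int):
    """Compute accuracy, precision, recall, and F1 from confusion-matrix counts (Equations (4)-(7))."""
    acc = (tp + tn) / (tp + tn + fp + fn)
    pre = tp / (tp + fp)
    rec = tp / (tp + fn)
    f1 = 2 * tp / (2 * tp + fp + fn)    # equivalent to the harmonic mean of precision and recall
    return acc, pre, rec, f1

# Example with illustrative counts: 95 TP, 3 FP, 90 TN, 5 FN
print(classification_metrics(95, 3, 90, 5))
```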

3. Results and Discussion

3.1. Ablation Experiment

We performed ablation experiments on the AB-SE-DenseNet and SE-DenseNet models and on the original DenseNet model; the curves of training accuracy and training loss versus the number of iterations are shown in Figure 8. The comparison curves show that the training accuracy of the three models increased, while the prediction error decreased. The original DenseNet, trained with the Stochastic Gradient Descent (SGD) and Adam optimizers, converged the slowest, requiring about 45 epochs to converge. SE-DenseNet tended to converge in 30 epochs, with a final training accuracy rate of 92.62%. The improved AB-SE-DenseNet model tended to converge in 20 epochs, with the final training accuracy rate reaching 99.93%.
Table 3 shows that the accuracy of the three SE-DenseNet models was higher than that of the DenseNet model, with accuracy increases of between 6.54 and 10 percentage points. The SE-DenseNet models' other evaluation indicators were also better than those of the DenseNet model, indicating that combining the SE module with DenseNet was helpful for the model. Among the three models, SE-DenseNet-3 had the highest accuracy, reaching 95.59%, better than the accuracy of both SE-DenseNet-1 and SE-DenseNet-2. This shows that embedding the SE modules in both the transition layers and the dense blocks calibrated the feature channels and recalibrated the original features successfully, boosting the useful features and suppressing the less useful ones. The classification performance of the three AB-SE-DenseNet models was also better than that of the corresponding SE-DenseNet models, confirming that AdaBound dynamically tailored the learning rate successfully, improved the training results of the model, and thereby improved the classification accuracy. AB-SE-DenseNet-3 had the highest evaluation indices, with an accuracy rate of 99.4% and an F1 value of 0.9942.
To further understand the models' performance, we compared the AB-SE-DenseNet and SE-DenseNet models and measured their performance using confusion matrices. The classification and recognition confusion matrix of each model is shown in Figure 9, which shows that AB-SE-DenseNet-3 had the fewest misidentifications and the highest accuracy in identifying the five rice diseases.
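A per-class confusion matrix of the kind shown in Figure 9 can be computed, for instance, with scikit-learn; the label lists below are illustrative placeholders, not the actual predictions.

```python
from sklearn.metrics import ConfusionMatrixDisplay, confusion_matrix

classes = ["blast", "blight", "brownspot", "sheath_blight", "tungro"]
# Illustrative ground-truth and predicted labels; real evaluation would use the test-set outputs.
y_true = ["blast", "blight", "brownspot", "tungro", "blast"]
y_pred = ["blast", "blight", "brownspot", "tungro", "blight"]

cm = confusion_matrix(y_true, y_pred, labels=classes)   # rows: true class, columns: predicted class
ConfusionMatrixDisplay(cm, display_labels=classes).plot()
```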

3.2. Comparison of Different Models

To further verify the performance of our proposed model, we selected the AB-SE-DenseNet-3, ResNet50, DenseNet, VGG, and Vision Transformer models for comparison experiments. The confusion matrix comparison of different methods is shown in Figure 10.
The confusion matrices show that the model proposed here made misidentifications in only three disease classes, mostly misidentifying rice blast as blight or brownspot. The rice blast lesions are similar to the blight and brownspot lesions, making them difficult to distinguish. Compared with the 14, 10, 17, and 10 misidentifications by the ResNet, DenseNet, VGG, and Vision Transformer models, the recognition error rate of our method was low, and our model correctly classified more of the diseases than the other models.
Table 4 and Table 5 show the comparison of the classification accuracy of each disease class of different models. The classification accuracy of our proposed method was not the best at classifying blast. The Vision Transformer had a blast classification accuracy of 98.8%, which was 1.7 percentage points higher than that of our method. This occurred because the blast lesions are similar to those of other disease types, and our method was less suited to learning the extracted features from blast images. On the remaining four diseases, our method had the best classification and recognition rates at 100%, 99.4%, 100%, and 100%. Overall, the average recognition rate of our method was more accurate and stable.
To compare the prediction time of the model, we used the trained model parameters to time the test of a single image. The results are shown in Table 6. The recognition time of our method was 3.71 s, which is faster than that of ResNet, VGG, and Vision Transformer, and slightly slower than that of DenseNet. However, the recognition accuracy of our method was higher, which shows that our method was more suitable for rice disease recognition.
To better demonstrate the effectiveness of the attention mechanism we used, we compared it with the ECA and CBAM attention modules on the same dataset and under the same experimental settings; the results of the comparison are shown in Table 7. From Table 7, we can see that our proposed model outperforms the ECA-based DenseNet model in both accuracy and recognition time. Compared with the CBAM-based DenseNet model, our model's accuracy is slightly lower, by 0.4%, but the CBAM model's recognition time is more than 20 s longer than that of our method, so our proposed method has a better recognition speed.
In addition, we compared our proposed model with those identified in the literature in the field of rice diseases, as shown in Table 8. The accuracy rate of our model for rice disease identification reached 99.4%, which was higher than those reported in the literature, which further shows that the AB-SE-DenseNet model had better identification accuracy.

3.3. Feature Visualization

Figure 11 shows a visualization of the class activation heatmap of the sample input images to more intuitively illustrate the classification prediction effect of our proposed model. Compared with other models, our proposed model learned more comprehensive rice disease features and a variety of feature information, with better recognition and classification results.

4. Conclusions

In this study, we introduced a DenseNet-based channel attention mechanism for rice disease recognition and classification. The average recognition accuracy of the improved model for rice diseases is 99.4%. Compared with the DenseNet, ResNet, VGG, and Vision Transformer models, our proposed method has the highest average accuracy and a fast recognition time of 3.71 s. In addition, the recognition accuracy for each disease reaches more than 97%, and for some diseases even 100%, indicating that our proposed method is effective for rice disease identification. In future work, we will focus on disease severity, using a deep learning approach to evaluate the severity of rice diseases and deploying it on mobile devices so that diseases and their severity can be identified more quickly.

Author Contributions

Conceptualization, M.J. and C.F.; methodology, M.J.; software, C.F.; validation, Q.H. and C.F.; formal analysis, X.F.; investigation, Q.H.; resources, C.Z.; data curation, C.F.; writing—original draft preparation, C.F.; writing—review and editing, M.J.; visualization, X.S.; supervision, C.Z.; project administration, X.S. and X.F. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by the National Natural Science Foundation of China (41575046) and by the Talents Scheme in Zhejiang Province (2021R404065), as well as the General scientific research project of Zhejiang Provincial Department of Education (Y202045743).

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

If scholars need more specific data, they can send an email to the corresponding author or the first author.

Conflicts of Interest

The authors declare no conflict of interest.

References

  1. Zhou, Q.L. Research on the control of rice planting diseases and insect pests. Rural. Pract. Technol. 2021, 2021, 98–99. [Google Scholar]
  2. Xu, C.C.; Ji, L. Analysis of my country’s rice industry situation in 2020 and outlook for 2021. China Rice 2021, 27, 1–4. [Google Scholar]
  3. Zahid, I.; Attique, K.M.; Muhammad, S.; Hussain, S.J. An automated detection and classification of citrus plant diseases using image processing techniques: A review. Comput. Electron. Agric. 2018, 153, 12–32. [Google Scholar]
  4. Lecun, Y.; Bengio, Y.; Hinton, G. Deep learning. Nature 2015, 521, 436–444. [Google Scholar] [CrossRef] [PubMed]
  5. Khan, Z.; Yang, J. Bottom-up unsupervised image segmentation using FC-Dense u-net based deep representation clustering and multidimensional feature fusion based region merging. Image Vis. Comput. 2020, 94, 103871. [Google Scholar] [CrossRef]
  6. Chen, L.C.; Papandreou, G.; Kokkinos, I.; Murphy, K.; Yuille, A.L. DeepLab: Semantic Image Segmentation with Deep Convolutional Nets, Atrous Convolution, and Fully Connected CRFs. IEEE Trans. Pattern Anal. Mach. Intell. 2018, 40, 834–848. [Google Scholar] [CrossRef] [PubMed] [Green Version]
  7. Badrinarayanan, V.; Kendall, A.; Cipolla, R. SegNet: A Deep Convolutional Encoder-Decoder Architecture for Image Segmentation. IEEE Trans. Pattern Anal. Mach. Intell. 2017, 39, 2481–2495. [Google Scholar] [CrossRef] [PubMed]
  8. Farhat, H.; Sakr, G.E.; Kilany, R. Deep learning applications in pulmonary medical imaging: Recent updates and insights on COVID-19. Mach. Vis. Appl. 2020, 31, 42–53. [Google Scholar] [CrossRef]
  9. Chen, H.; Qi, X.; Yu, L.; Dou, Q.; Qin, J.; Heng, P.A. DCAN: Deep contour-aware networks for object instance segmentation from histology images. Med. Image Anal. 2017, 36, 135–146. [Google Scholar] [CrossRef]
  10. Cruz-Roa, A.A.; Arevalo Ovalle, J.E.; Madabhushi, A.; González Osorio, F.A. A Deep Learning Architecture for Image Representation, Visual Interpretability and Automated Basal-Cell Carcinoma Cancer Detection. In Proceedings of the 16th International Conference on Medical Image Computing and Computer Assisted Intervention (MICCAI), Nagoya, Japan, 22–26 September 2013; pp. 403–410. [Google Scholar]
  11. Li, Y.F.; Zeng, N.Y. Improving Deep Learning Feature with Facial Texture Feature for Face Recognition. Wirel. Pers. Commun. 2018, 103, 1195–1206. [Google Scholar] [CrossRef]
  12. Sharma, S.; Kumar, V. Voxel-based 3D face reconstruction and its application to face recognition using sequential deep learning. Multimed. Tools Appl. 2020, 79, 17303–17330. [Google Scholar] [CrossRef]
  13. Ranjan, R.; Sankaranarayanan, S.; Bansal, A.; Bodla, N.; Chen, J.C.; Patel, V.M.; Castillo, C.D.; Chellappa, R. Deep Learning for Understanding Faces Machines may be just as good, or better, than humans. IEEE Signal Process. Mag. 2018, 35, 66–83. [Google Scholar] [CrossRef]
  14. Adjabi, I.; Ouahabi, A.; Benzaoui, A.; Taleb-Ahmed, A. Past, Present, and Future of Face Recognition: A Review. Electronics 2020, 9, 1188. [Google Scholar] [CrossRef]
  15. Kocic, J.; Jovicic, N.; Drndarevic, V. An End-to-End Deep Neural Network for Autonomous Driving Designed for Embedded Automotive Platforms. Sensors 2019, 19, 2064. [Google Scholar] [CrossRef] [PubMed] [Green Version]
  16. Grigorescu, S.; Trasnea, B.; Cocias, T.; Macesanu, G. A survey of deep learning techniques for autonomous driving. J. Field Robot. 2020, 37, 362–386. [Google Scholar] [CrossRef] [Green Version]
  17. Ma, J.; Du, K.; Zheng, F.; Zhang, L.; Gong, Z.; Sun, Z. A recognition method for cucumber diseases using leaf symptom images based on deep convolutional neural network. Comput. Electron. Agric. 2018, 154, 18–24. [Google Scholar] [CrossRef]
  18. Almasoud, A.S.; Abdelmaboud, A.; Eisa, T.A.E.; Al Duhayyim, M.; Elnour, A.A.H.; Hamza, M.A.; Motwakel, A.; Zamani, A.S. Artificial Intelligence-Based Fusion Model for Paddy Leaf Disease Detection and Classification. Cmc Comput. Mater. Contin. 2022, 72, 1391–1407. [Google Scholar] [CrossRef]
  19. Chen, Y.; Guo, S.Z. Research on rice disease recognition algorithm based on optimized BP neural network. Appl. Electron. Technol. 2020, 46, 85–87+93. [Google Scholar]
  20. Chen, J.; Zhang, D.; Zeb, A.; Nanehkaran, Y.A. Identification of rice plant diseases using lightweight attention networks. Expert Syst. Appl. 2021, 169, 114514. [Google Scholar] [CrossRef]
  21. Qiu, J.; Liu, J.; Cao, Z.; Li, J.; Yang, Y. Research on rice disease image recognition based on convolutional neural network. J. Yunnan Agric. Univ. 2019, 34, 884–888. [Google Scholar]
  22. Krishnamoorthy, N.; Prasad, L.N.; Kumar, C.P.; Subedi, B.; Abraha, H.B.; Sathishkumar, V.E. Rice leaf diseases prediction using deep neural networks with transfer learning. Environ. Res. 2021, 198, 111275. [Google Scholar]
  23. Rahman, C.R.; Arko, P.S.; Ali, M.E.; Khan, M.A.I.; Apon, S.H.; Nowrin, F.; Wasif, A. Identification and recognition of rice diseases and pests using convolutional neural networks. Biosyst. Eng. 2020, 194, 112–120. [Google Scholar] [CrossRef] [Green Version]
  24. Wang, C.S.; Zhou, J.; Wu, H.R.; Teng, G.F.; Zaho, C.J.; Li, J.X. Improved Multi-scale ResNet for vegetable leaf disease identification. Trans. Chin. Soc. Agric. Eng. 2020, 36, 209–217. [Google Scholar]
  25. Wu, H.R. Tomato leaf disease identification method based on deep residual network. Smart Agric. 2019, 1, 42–49. [Google Scholar]
  26. Waheed, A.; Goyal, M.; Gupta, D.; Khanna, A.; Hassanien, A.E.; Pandey, H.M. An optimized dense convolutional neural network model for disease recognition and classification in corn leaf. Comput. Electron. Agric. 2020, 175, 105456. [Google Scholar] [CrossRef]
  27. Zhou, C.; Zhou, S.; Xing, J.; Song, J. Tomato Leaf Disease Identification by Restructured Deep Residual Dense Network. IEEE Access 2021, 9, 28822–28831. [Google Scholar] [CrossRef]
  28. Cheng, X.; Zhang, Y.; Chen, Y.; Wu, Y.; Yue, Y. Pest identification via deep residual learning in complex background. Comput. Electron. Agric. 2017, 141, 351–356. [Google Scholar] [CrossRef]
  29. Huang, G.; Liu, Z.; Laurens, V. Densely Connected Convolutional Networks. In Proceedings of the 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Honolulu, HI, USA, 21–26 July 2017; pp. 2261–2269. [Google Scholar]
  30. He, K.; Zhang, X.; Ren, S.; Sun, J. Deep Residual Learning for Image Recognition. In Proceedings of the 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Las Vegas, NV, USA, 27–30 June 2016; pp. 770–778. [Google Scholar]
  31. Tian, Y.; Yang, G.; Wang, Z.; Li, E.; Liang, Z. Detection of Apple Lesions in Orchards Based on Deep Learning Methods of CycleGAN and YOLOV3-Dense. J. Sens. 2019, 2019, 7630926. [Google Scholar] [CrossRef]
  32. Luo, L.C.; Xiong, Y.H.; Liu, Y.; Sun, X. Adaptive Gradient Methods with Dynamic Bound of Learning Rate. arXiv 2019, arXiv:1902.09843. [Google Scholar]
  33. Wang, J.H. PCB Bare Board Defect Detection Based on Improved ORB Image Registration and Deep Learning; Wuhan University of Technology: Hubei, China, 2020. [Google Scholar]
  34. Hu, J.; Shen, L.; Sun, G. Squeeze-and-Excitation Networks. IEEE Trans. Pattern Anal. Mach. Intell. 2020, 42, 2011–2023. [Google Scholar] [CrossRef] [Green Version]
  35. Taner, A.; Öztekin, Y.B.; Duran, H. Performance Analysis of Deep Learning CNN Models for Variety Classification in Hazelnut. Sustainability 2021, 13, 6527–6537. [Google Scholar] [CrossRef]
  36. Liu, F.W.; Yu, L.; Luo, J.X. A hybrid attention-enhanced DenseNet neural network model based on improved U-Net for rice leaf disease identification. Front. Plant Sci. 2022, 13, 922809. [Google Scholar] [CrossRef] [PubMed]
  37. Yang, B.; Zhang, L.N. Intelligent collection of rice disease images based on convolutional neural network and feature matching. J. Electron. Imaging 2022, 31, 051410. [Google Scholar] [CrossRef]
  38. Wang, Y.B.; Wang, H.F.; Peng, Z.H. Rice diseases detection and classification using attention based neural network and bayesian optimization. Expert Syst. Appl. 2021, 178, 114770. [Google Scholar] [CrossRef]
Figure 1. Structure of internal dense modules. BN: batch normalization; ReLU: activation function. Convolution layers are denoted as Conv, including 1 × 1 convolution and 3 × 3 convolution.
Figure 2. DenseNet network architecture.
Figure 3. SE module architecture. Fsq represents the squeeze operation in the SE module. Fex represents the excitation operation in the SE module. Scale represents multiplying channel weights.
Figure 4. SE-DenseNet model architecture. (a) SE-DenseNet-1; (b) SE-DenseNet-2; (c) SE-DenseNet-3.
Figure 5. AB-SE-DenseNet model architecture. The variants of the AB-SE-DenseNet network are all based on the three SE-DenseNet networks shown in Figure 4.
Figure 6. Images of rice disease: (a) rice blast; (b) blight; (c) brownspot; (d) sheath blight; and (e) tungro.
Figure 7. Data enhancement example: (a) original image; (b) flip; (c) rotation; (d) contrast enhancement; (e) brightness enhancement; and (f) color enhancement.
Figure 8. (a) Model training accuracy; (b) Model training loss.
Figure 9. Confusion matrix of each model: (a) SE-DenseNet-1; (b) SE-DenseNet-2; (c) SE-DenseNet-3; (d) AB-SE-DenseNet-1; (e) AB-SE-DenseNet-2; and (f) AB-SE-DenseNet-3.
Figure 10. Confusion matrix comparison of different methods: (a) DenseNet; (b) VGG; (c) ResNet; and (d) Vision Transformer.
Figure 11. Class activation heatmap of different models: (a) original image; (b) VGG; (c) DenseNet; (d) ResNet; and (e) AB-SE-DenseNet-3.
Table 1. SE-DenseNet model frame architecture.

Network | Parameter
Input Layer | 224 × 224 × 3
Convolution Layer | 3 × 3 DwsConv
Dense Block 1 | {3 × 3 DwsConv} × n
Transition Layer 1 | 1 × 1 Conv; 2 × 2 average pool, stride 2
SE Layer | Squeeze, Excitation
Dense Block 2 | {3 × 3 DwsConv} × n
Transition Layer 2 | 1 × 1 Conv; 2 × 2 average pool, stride 2
SE Layer | Squeeze, Excitation
Dense Block 3 | {3 × 3 DwsConv} × n
Classification Layer | 7 × 7 global average pool; fully connected layer
Table 2. Experimental environment configuration.

Operating system | 64 Bit Windows 10
Graphics card | GTX1050Ti (4 GB)
CPU | Intel i7-6700
Learning framework | Pytorch
RAM | 16 GB
Hard disk | 1 TB
Table 3. Recognition performance table of each model.

Models | Accuracy | Precision | Recall | F1 Score
DenseNet | 85.6% | 87.36% | 85.12% | 0.8623
SE-DenseNet-1 | 92.14% | 93.36% | 92.42% | 0.9289
SE-DenseNet-2 | 94.04% | 94.2% | 95.78% | 0.9498
SE-DenseNet-3 | 95.59% | 96% | 95.52% | 0.9576
AB-SE-DenseNet-1 | 94.4% | 94.82% | 94.46% | 0.9465
AB-SE-DenseNet-2 | 98.57% | 98.68% | 98.52% | 0.9860
AB-SE-DenseNet-3 | 99.4% | 99.4% | 99.44% | 0.9942
Table 4. Comparison of statistical results of different models (number of correctly identified images per model).

Disease | Total Images | AB-SE-DenseNet-3 | DenseNet | ResNet | VGG | Vision Transformer
Blast | 163 | 162 | 154 | 148 | 135 | 160
Blight | 183 | 181 | 169 | 175 | 164 | 180
Brownspot | 182 | 180 | 74 | 172 | 136 | 174
Sheath_blight | 149 | 149 | 148 | 116 | 130 | 143
Tungro | 163 | 163 | 157 | 110 | 11 | 160
Table 5. Disease recognition accuracy of different models.

Disease | AB-SE-DenseNet-3 | DenseNet | ResNet | VGG | Vision Transformer
Blast | 97.6% | 66.1% | 70.2% | 46.1% | 98.8%
Blight | 100% | 86.7% | 80.2% | 85.4% | 94.2%
Brownspot | 99.4% | 95.5% | 99.2% | 74.7% | 96.7%
Sheath_blight | 100% | 90.3% | 95.2% | 82.8% | 99.3%
Tungro | 100% | 98.2% | 80.7% | 68.8% | 98.2%
Table 6. Comparison of recognition time of different models.

Model | AB-SE-DenseNet | DenseNet | ResNet | VGG | Vision Transformer
Recognition time | 3.71 s | 3.40 s | 5.05 s | 11.73 s | 7.55 s
Table 7. Comparison with other attention mechanisms.

Model | Accuracy | Recognition Time
DenseNet + ECA | 99.2% | 8.67 s
DenseNet + CBAM | 99.8% | 29 s
Ours | 99.4% | 3.71 s
Table 8. Comparison with models proposed in the related literature.

Authors | Models | Accuracy
Liu [36] | Attention-enhanced DenseNet | 96%
Yang [37] | Improved ResNet | 97.4%
Wang [38] | ADSNN-BO | 94.65%
Krishnamoorthy [22] | InceptionResNetV2 | 95.67%
Ours | AB-SE-DenseNet | 99.4%
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

Share and Cite

MDPI and ACS Style

Jiang, M.; Feng, C.; Fang, X.; Huang, Q.; Zhang, C.; Shi, X. Rice Disease Identification Method Based on Attention Mechanism and Deep Dense Network. Electronics 2023, 12, 508. https://doi.org/10.3390/electronics12030508
