1. Introduction
In the last few years, the impacts of climate change on food production and human life have become more serious, driven by major changes in human lifestyles, urbanization, and shortages of natural resources. The large increase in population density has raised the demand for food security and safety. The need to increase food production and to mitigate or adapt to the impacts of climate change on agriculture has been the driving force for integrating smart systems into agricultural production. According to [1], 690 million people around the world suffer from hunger and more than 200 million are malnourished. Agri-food production systems are required to increase food production through sustainable practices in order to meet the Sustainable Development Goals (SDGs).
Smart agriculture utilizes technologies such as sensors, robotics, the IoT and artificial intelligence (AI). The rapid development in the field of IoT [2] has supported the revolution of smart agriculture and its deployment in both the open field and the greenhouse. A smart agriculture system collects data from sensors; the data are then diagnosed and analyzed by the system to identify anomalies. Based on the problems identified, the platform decides which action needs to be taken to solve them. There are many applications in the smart agriculture field, such as smart irrigation, agricultural machinery and disease detection in the open field, and micro-climate control, environmental control and the marketing chain in the smart greenhouse. The applications of smart agriculture are summarized in Figure 1, drawn from the Food and Agriculture Organization of the United Nations (FAO) Catalogue, 2021 [1].
Kodali et al. [3] studied the improvement of agricultural practices by providing a model of a smart greenhouse that eliminates manual inspection and monitors the plants' environmental conditions, which helped reduce water waste by 80% and provided climate control to ensure proper plant growth.
Wiangtong et al. [4] developed a controller that monitors data such as temperature and humidity and sends them to clients via the internet, while the hardware makes decisions to regulate temperature and humidity.
Awan et al. [5] proposed a time-driven trust management mechanism to secure the data transmitted from the sensors to the cloud by identifying malicious nodes that can compromise secure environments.
The Egyptian government has recently taken revolutionary steps to digitize most governmental service sectors, such as health, traffic and agriculture. It has started deploying IoT systems that use sensors to measure the humidity and moisture of the soil; the data are transmitted to farmers' phones via satellite signals so that a farmer can irrigate the land while staying at home. This initiative will help regulate the irrigation process, which will lead to a drastic reduction of water waste and an increase in crop productivity.
Climate change has severely impacted crop yield and quality in Egypt. The most destructive plant diseases (Potato Late Blight and Tomato Late Blight) have expanded over the last few decades in response to climate change [6]. The tomato is the most important vegetable in terms of world production and consumption [7]. Egypt is the fifth-largest producer of tomatoes worldwide, after India, the United States, China and Turkey; together, these countries account for 62% of the world's tomato production. The main reason for the decrease in tomato yield is the diseases affecting the plants, which start in the leaves and then spread to the entire plant [8]. To date, farmers in Egypt rely on human inspection to identify tomato leaf diseases, which wastes considerable time and carries a large probability of error. Hence the need for new technologies such as Artificial Intelligence (AI) and Computer Vision (CV) to improve the plant disease detection procedure.
The fields of artificial intelligence and computer vision have seen an immense surge due to their ability to perform image recognition and classification [9]. Machine learning and deep learning are both subcategories of AI [10]. Machine learning is the process of training machines to perform certain tasks without the need for explicit programming. Deep learning is a subset of machine learning [11] based on neural networks; it can rely on supervised or unsupervised learning [12]. Deep learning has become more popular recently due to its large variety of applications in computer vision and natural language processing. The word "deep" in deep learning refers to the large number of layers embedded in the models. Unlike machine learning models, deep learning models are able to extract features on their own, without the need for a human to adjust or choose these features. Among the many deep learning structures, the convolutional neural network (CNN) is a type of deep learning model widely used in image classification due to its ability to automatically and adaptively learn features [13]. A CNN normally consists of three types of layers: the convolutional layer, the pooling layer and the fully connected layer [14]. The first two, the convolutional and pooling layers, are responsible for extracting features from the images, while the fully connected layer transforms the extracted features into a classification. Image classification is the process of inspecting an image and predicting the class to which it belongs [15].
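The division of labor between the layer types can be illustrated with a minimal sketch in plain NumPy (illustrative only, not the models used in this study): a convolution and a max-pooling step extract a compact feature map, which a fully connected layer would then map to class scores.

```python
import numpy as np

def conv2d(image, kernel):
    """Valid 2-D convolution (cross-correlation) of a single-channel image."""
    kh, kw = kernel.shape
    oh = image.shape[0] - kh + 1
    ow = image.shape[1] - kw + 1
    out = np.zeros((oh, ow))
    for i in range(oh):
        for j in range(ow):
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out

def max_pool2d(fmap, size=2):
    """Non-overlapping max pooling that halves each spatial dimension."""
    h, w = fmap.shape[0] // size, fmap.shape[1] // size
    return fmap[:h * size, :w * size].reshape(h, size, w, size).max(axis=(1, 3))

image = np.arange(36, dtype=float).reshape(6, 6)    # toy 6x6 "image"
edge_kernel = np.array([[1., -1.], [1., -1.]])      # crude vertical-edge filter
features = max_pool2d(conv2d(image, edge_kernel))   # feature extraction stage
print(features.shape)  # → (2, 2): pooled feature map fed to the dense layer
```

In a real CNN, many such kernels are learned from data rather than hand-crafted, and the pooled feature maps are flattened before the fully connected classification layer.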
This research studies different deep learning models: ResNet50 [16], InceptionV3 [17], AlexNet [18], MobileNetV1 [19], MobileNetV2 [20] and finally MobileNetV3 [21]. The literature review revealed several gaps. First, MobileNetV3 had not been tested on and deployed for the tomato leaf disease dataset. Second, most studies did not discuss in detail the effect of different optimization techniques on the previously mentioned CNN models. Third, the details of the hardware deployment of these models were not disclosed. In this study, a workstation and a Raspberry Pi 4 were used to test the performance of several deep learning models, especially MobileNetV3, which to the best of our knowledge had not been tested on PlantVillage's tomato leaf disease dataset. Each CNN model was tested using different optimizers to achieve the highest possible accuracy. The Raspberry Pi 4 was chosen for model deployment due to its low cost and the absence of internet connectivity in most of Egypt's agricultural lands, which are located in rural areas.
This paper is divided into six sections.
Section 2 consists of a description of Deep Learning, Transfer Learning and the presentation of the dataset used in this research.
Section 3 is concerned with the evaluation and benchmarking of different CNN models using the accuracy and loss metrics. In
Section 4, the CNN models were deployed and tested on a workstation and a Raspberry Pi 4 Model B (Sony factory, Wales, UK) to evaluate their performance in real-time prediction. In
Section 5, the results based on the training and deployment of the models on both the workstation and the Raspberry Pi 4 are discussed and compared.
Section 6 concludes with a summary of the results of implementing the different models on a Raspberry Pi as a first step in this research toward building a handheld device capable of detecting tomato leaf diseases.
3. Results
3.1. Experimental Setup
In this study, the results are divided into two phases: training and testing. Both phases were run on a workstation consisting of an Intel Core i7-6800K CPU (Massachusetts, USA), an NVIDIA GTX 1080 GPU (Hsinchu, Taiwan), 32 GB of RAM and a 512 GB Samsung NVMe PCIe M.2 Solid State Drive (Hwaseong, South Korea). The environment was set up using Microsoft's Visual Studio Code and Python 3.7 (Delaware, United States) with TensorFlow 2.0 (an open-source artificial intelligence library).
Toward building a handheld device capable of tomato leaf disease detection, a Raspberry Pi 4 Model B was also used for evaluating and testing the models after training. It consists of a Broadcom BCM2711 quad-core Cortex-A72 (ARM v8) 1.5 GHz processor (San Jose, California), 2 GB of RAM and a 16 GB SD card for storage, running the Raspbian 64-bit operating system (Cambridge, UK).
3.2. Evaluation and Benchmark
In order to measure the performance and accuracy of the proposed CNN models, they were compared using the accuracy and loss metrics. Each training run was carried out for 50 epochs with a batch size of 32. The following models were trained using multiple optimizers (Adam, Adagrad, RMSProp and SGD with momentum):
InceptionV3;
ResNet50;
AlexNet;
MobileNetV1;
MobileNetV2;
MobileNetV3 Large;
MobileNetV3 Small.
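The optimizers compared in this section differ only in their parameter-update rules. As an illustrative sketch (plain NumPy on a one-dimensional quadratic, not the TensorFlow training code used in this study), SGD with momentum and Adagrad can be contrasted as follows:

```python
import numpy as np

def minimize(update, steps=200, x0=5.0):
    """Minimize f(x) = x^2 (gradient 2x) with a given update rule."""
    x, state = x0, {}
    for _ in range(steps):
        g = 2.0 * x
        x = update(x, g, state)
    return x

def sgd_momentum(x, g, s, lr=0.1, beta=0.9):
    s["v"] = beta * s.get("v", 0.0) + g       # velocity accumulates gradients
    return x - lr * s["v"]

def adagrad(x, g, s, lr=0.1, eps=1e-8):
    s["G"] = s.get("G", 0.0) + g * g          # accumulated squared gradients
    return x - lr * g / (np.sqrt(s["G"]) + eps)  # per-step size shrinks over time

for name, rule in [("SGD+momentum", sgd_momentum), ("Adagrad", adagrad)]:
    print(name, round(minimize(rule), 4))
```

On this toy problem, SGD with momentum drives the parameter close to the optimum within 200 steps while Adagrad is still far from it; in practice, however, the ranking depends on the model and the data, as the results below show.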
The CNN models above are evaluated by applying different optimization techniques. The confusion matrices of the best and worst optimizers are discussed in the following sections.
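A confusion matrix of the kind discussed below simply counts, for each true class, how the predictions were distributed across all classes; its diagonal holds the correct predictions. A minimal sketch (with toy labels, not the PlantVillage tomato classes):

```python
import numpy as np

def confusion_matrix(y_true, y_pred, n_classes):
    """cm[i, j] = number of samples of true class i predicted as class j."""
    cm = np.zeros((n_classes, n_classes), dtype=int)
    for t, p in zip(y_true, y_pred):
        cm[t, p] += 1
    return cm

# Toy 3-class example: class 1 is partly misclassified as class 2.
y_true = [0, 0, 1, 1, 1, 2, 2]
y_pred = [0, 0, 1, 2, 2, 2, 2]
cm = confusion_matrix(y_true, y_pred, 3)
print(cm)

accuracy = np.trace(cm) / cm.sum()  # diagonal entries are correct predictions
print(round(accuracy, 3))
```

Large off-diagonal entries, such as `cm[1, 2]` here, are exactly the kind of systematic misclassification (e.g., of Early Blight) highlighted in the matrices below.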
3.2.1. InceptionV3
The comparison between different optimizers used while training the dataset using the InceptionV3 CNN Model is shown in
Figure 8.
Based on
Figure 8, it is concluded that the SGD optimizer with momentum converged in approximately 10 epochs and achieved the highest accuracy of 99.92% with a loss value of 0.0027. The Adagrad optimizer took more than 25 epochs to converge and achieved an accuracy of 99.53% with a loss value of 0.0146. The lowest accuracy was achieved by the Adam optimizer at a value of 99.06% with a loss value of 0.0255. The evaluation results of using different optimizers are summarized in
Table 3.
Figure 9a shows the confusion matrix for the Adam optimizer, which achieved the lowest test accuracy of 93.76% and a loss value of 0.0901 due to a large misclassification of the Early Blight disease.
Figure 9b shows the confusion matrix after testing the InceptionV3 model optimized with SGD, which achieved the highest test accuracy of 99.62% and a loss value of 0.011.
3.2.2. ResNet50
The comparison between the different optimizers used while training the dataset with the ResNet50 CNN model is shown in
Figure 10. The figure demonstrates that the Adagrad optimizer converged at approximately 20 epochs and achieved the highest accuracy of 99.80% with a loss value of 0.0069, while the SGD optimizer achieved a slightly lower accuracy of 99.74% and a loss value of 0.01 but converged at 15 epochs. The lowest accuracy was achieved by the Adam optimizer with a value of 99.08% and a loss value of 0.0288. The evaluation results of the different optimizers are summarized in
Table 4.
Figure 11a shows that the model trained with the Adam optimizer misclassified multiple diseases and achieved a test accuracy of 79.08% with a loss value of 2.4515. In the training phase, the Adagrad optimizer achieved the highest accuracy of 99.80% while the SGD optimizer achieved a marginally lower accuracy of 99.74%. However, in the testing phase, the SGD optimizer achieved a higher accuracy of 99.62% with a loss value of 0.0126, as demonstrated in
Figure 11b.
3.2.3. AlexNet
The comparison between different optimizers used while training the dataset using the AlexNet CNN Model is shown in
Figure 12.
Figure 12 demonstrates that the Adam, Adagrad and RMSProp optimizers had not converged by 50 epochs and that the Adagrad optimizer failed to surpass 50% accuracy. The highest accuracy was achieved by the SGD optimizer at a value of 98.26% with a loss value of 0.0520, while the lowest accuracy was achieved by the Adagrad optimizer at a value of 28.95% with a loss value of 2.0. The evaluation results of the different optimizers are summarized in
Table 5.
Table 5 shows the accuracy achieved by the AlexNet model when trained with different optimizers. The Adagrad optimizer achieved the lowest accuracy and did not converge; this is most likely because it scales down the learning rate so much that the algorithm stops before reaching the optimum [
32].
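This learning-rate decay can be seen numerically: Adagrad divides each step by the square root of the accumulated squared gradients, so even with a constant gradient the effective step size keeps shrinking. A minimal sketch (illustrative only, not the training code):

```python
import math

lr, grad, G = 0.1, 1.0, 0.0
steps = []
for t in range(1, 10001):
    G += grad * grad                         # accumulate squared gradients
    steps.append(lr * grad / math.sqrt(G))   # effective step at iteration t

# With a constant gradient, the step size decays like lr / sqrt(t).
print(round(steps[0], 4), round(steps[99], 4), round(steps[9999], 4))
# → 0.1 0.01 0.001
```

After enough iterations, the steps become too small for the model to make further progress, which is consistent with the stalled AlexNet training curve observed with Adagrad.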
Figure 13a shows that the AlexNet model trained with the Adagrad optimizer failed to converge within 50 epochs and achieved a test accuracy of 34.50% with a loss value of 1.8941. The highest test accuracy was achieved by the SGD optimizer at a value of 96.68% with a loss value of 0.0957, as demonstrated in
Figure 13b.
3.2.4. MobileNetV1
The comparison between different optimizers used while training the dataset using the MobileNetV1 CNN Model is shown in
Figure 14. The MobileNetV1 model trained with SGD converged earlier than with the other optimizers and achieved the highest accuracy of 99.83% with a loss value of 0.0039. The lowest accuracy was achieved by the Adam optimizer, with a marginally lower value of 99.46% and a loss value of 0.0811. The evaluation results of the different optimizers are summarized in
Table 6.
Figure 15 shows the confusion matrices evaluated for the Adam- and SGD-optimized MobileNetV1 CNN model.
Figure 15a shows that the Adam optimizer achieved the lowest test accuracy of 98.76% with a loss value of 0.0322, while
Figure 15b shows that the SGD optimizer achieved the highest test accuracy of 99.49% with a loss value of 0.0130.
3.2.5. MobileNetV2
The comparison between different optimizers used while training the dataset using the MobileNetV2 CNN Model is shown in
Figure 16. This comparison indicates that, similarly to MobileNetV1, the SGD optimizer achieved the highest accuracy of 99.89% with a loss value of 0.0035 and converged earlier than the other optimizers. The lowest accuracy was achieved by the Adam optimizer with a value of 99.17% and a loss value of 0.0262. The evaluation results of the different optimizers are summarized in
Table 7.
Figure 17 shows the confusion matrix evaluated for the Adam and SGD optimizers for the MobileNetV2 CNN model.
Figure 17a demonstrates that the model optimized by Adam misclassified multiple diseases, hence achieving the lowest test accuracy with a value of 85.53% and a loss value of 1.0276.
Figure 17b shows the confusion matrix for the SGD optimizer and it can be concluded that it achieved the highest test accuracy of 99.49% and a loss value of 0.0130.
3.2.6. MobileNetV3 Large
The comparison between different optimizers used while training the dataset using the MobileNetV3 Large CNN Model is shown in
Figure 18.
Figure 18 demonstrates that the model optimized with SGD converged earlier than with the other optimizers and achieved an accuracy of 99.92% with a loss value of 0.0029, while the Adagrad optimizer achieved a marginally higher accuracy of 99.98% with a loss value of 0.005. The lowest accuracy was achieved by the RMSProp optimizer at a value of 99.49%. The evaluation results of the different optimizers are summarized in
Table 8.
Figure 19 shows the confusion matrices evaluated for the Adagrad and RMSProp optimizers. Even though all optimizers reached similar training accuracy, in the testing phase
Figure 19b demonstrates that the Adagrad optimizer achieved the highest test accuracy by a wide margin, with a value of 99.81% and a loss value of 0.0088. The lowest test accuracy was achieved by the RMSProp optimizer at a value of 92.67%, as shown in
Figure 19a.
3.2.7. MobileNetV3 Small
The comparison between different optimizers used while training the dataset using the MobileNetV3 Small CNN Model is shown in
Figure 20. Unlike in all previous results, SGD took the longest to converge and achieved an accuracy of 99.59%, while the Adagrad optimizer converged earlier than the others and achieved the highest accuracy of 99.86% with a loss value of 0.0039. The evaluation results of the different optimizers are summarized in
Table 9.
Figure 21 shows the confusion matrices evaluated for the Adagrad and RMSProp optimizers.
Figure 21b demonstrates that the Adagrad optimizer had the best performance, achieving a test accuracy of 98.99% with a loss value of 0.0331, while RMSProp achieved the lowest accuracy of 94.11% with a loss value of 0.9751, leading to a large misclassification of diseases, as demonstrated in
Figure 21a.
4. Hardware Deployment
The previous CNN models were trained on the workstation described in the Materials and Methods section. The same workstation was used to test and evaluate the performance of these models. A Raspberry Pi 4 Model B was also used to test and evaluate the CNN models' performance in a less powerful hardware environment. The Raspberry Pi 4 Model B was chosen for its powerful computing capability relative to its size [33] and its compact design; it is the first step in this research toward creating a handheld device capable of real-time tomato leaf disease detection. The trained models discussed in the previous section were transferred to the device to measure the time taken to predict a single image.
Figure 22 compares the prediction time taken by each CNN model to identify a single image on the workstation. MobileNetV3 Large made a prediction in 50 ms, half the time taken by InceptionV3, with only a marginal difference in accuracy between the two models.
Based on
Figure 22, MobileNetV3 Small achieved the minimum latency of all the models tested on the Raspberry Pi 4. However, the ratio between the prediction time on the workstation and on the Raspberry Pi 4 is not uniform, which could be due to the different architectures of their processors and requires further research.
The ResNet50 and InceptionV3 CNN models took the longest to predict a single image, owing to their complexity and deep architecture. AlexNet had a reasonable latency with an accuracy of 96.68%. The three versions of MobileNet had similar latencies, varying from 50 ms to 74 ms on the workstation and from 315 ms to 348 ms on the Raspberry Pi 4, with the exception of MobileNetV3 Small, which had a latency of 66 ms on the workstation and 251 ms on the Raspberry Pi 4 while achieving an accuracy of 98.99% and a loss value of 0.0331 using the Adagrad optimizer. Both the workstation and the Raspberry Pi 4 were used to generate the confusion matrices; generation took 52 s on the workstation and 322 s on the Raspberry Pi 4.
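Per-image latencies of the kind reported above can be measured by timing repeated forward passes after a few warm-up calls. A minimal sketch using Python's `time.perf_counter`, with a stand-in `predict` function (hypothetical, in place of the actual TensorFlow model call):

```python
import time

def predict(image):
    """Stand-in for a model's single-image inference; any callable works here."""
    return sum(image) % 10  # dummy "class index"

def latency_ms(fn, arg, warmup=3, runs=20):
    """Average wall-clock time per call in milliseconds."""
    for _ in range(warmup):          # warm-up runs avoid one-off startup costs
        fn(arg)
    start = time.perf_counter()
    for _ in range(runs):
        fn(arg)
    return (time.perf_counter() - start) / runs * 1000.0

image = list(range(224 * 224))       # stand-in for a 224x224 input image
print(f"{latency_ms(predict, image):.3f} ms per prediction")
```

Averaging over many runs, rather than timing a single call, reduces the influence of scheduling noise, which matters especially on a small device such as the Raspberry Pi 4.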