1. Introduction
Artificial intelligence (AI) is becoming a driving force in many aspects of life, and the agricultural and food industries are no exception. AI has been applied in numerous areas, including healthcare, education, and agriculture. In healthcare, AI has been used for diagnosing diseases such as skin cancer [1], identifying different anatomical objects [2], predicting neurodevelopmental disorders in children [3], and addressing mental health and other issues [4,5]. In agriculture, the world faces challenges such as a growing global population, global warming, and other human-caused environmental hazards, which may eventually increase the demand for food supplies. This is where AI and the computer-vision-driven Agtech industry can help, by speeding up harvesting, quality control, picking and packing, sorting, grading, and other processes [6]. Fruits are very delicate and decay quickly. Around 30–35% of harvested fruits are wasted due to improper and delayed identification, classification, and grading undertaken by non-skilled workers. Fruit classification is considered the most difficult and vital process in selling and purchasing fruit. A person who sells or buys fruit therefore needs to be able to recognize the different varieties of fruit for pricing purposes.
Many fruit, vegetable, and seed identification, classification, and grading methods have been developed [7], and different classification methods have been proposed for different classes of fruits. For instance, Altaheri et al. [8] proposed a robotic harvesting model designed to classify five different types of date fruits. The model was trained and tested on an in-house dataset of 8000 images and achieved around 99% accuracy. In another study, Shamim Hossain et al. [9] developed a fruit classification model for industrial applications. They used publicly available datasets to train and test their model, one of which contained images of fruits that are complex to identify. The proposed model achieved an accuracy of 85%. Gulzar et al. [10] proposed a model for seed classification based on VGG16. They used thirteen different types of seeds, and the model achieved 99% accuracy. Hamid et al. [11] proposed a model based on the same dataset, with MobileNetV2 as the base model. They incorporated a transfer learning technique [12], and the model achieved 99% accuracy. Saranya et al. [13] undertook a comparative study in which they trained different machine learning and deep learning models on a public dataset containing images of different fruits, such as apples, bananas, oranges, and pomegranates. They concluded that deep learning-based models outperform machine learning models. Rojas-Aranda et al. [14] developed a deep learning model to classify fruits in retail stores, with the aim of improving the checkout process. The model showed an accuracy of 95% when the fruits were within the plastic bags, whereas the accuracy was recorded as 93% when the fruits were not covered by plastic. Sridhar et al. [15] proposed a hybrid model for 31 different types of fruits. They combined a CNN and an autoencoder to handle the large volume of data for these 31 fruits, and they claim that their model achieved 99% accuracy. Zhou et al. [16] developed a model to detect the plumpness of strawberries and attained around 86% accuracy in detecting strawberries in the greenhouse. They used RGD data while training the proposed model. Mamat et al. [17] proposed a deep learning model based on You Only Look Once (YOLO) versions and adopted transfer learning for palm oil fruit. The model attained 98.7% accuracy for palm oil fruit.
Some researchers have focused on the identification and classification of fruit diseases. One such study [18] used the VGG19 architecture as a base model, and its authors claimed that the proposed model obtained around 99% accuracy in classifying fruits and their diseases. In another study, Assuncao et al. [19] proposed a deep learning model designed to operate on mobile devices. This model classifies peaches based on their freshness and identifies three types of diseases found in peach fruit. The authors incorporated some preprocessing techniques to improve the accuracy of the model, which was recorded as 96%.
Some studies have focused on the quality of the fruits [20,21,22,23]. Garillos-Manliguez et al. [20] proposed a model for estimating the maturity of papaya fruit. Unlike other models, this model is trained on both hyperspectral and visible-light images. These images not only show the external characteristics but also provide details about the inside of the fruit. The model achieved 97% accuracy in estimating the maturity of papaya fruit. Herman et al. [21] studied the ripeness of oil palm fruit. The dataset they used contained around seven different ripeness levels of the oil palm fruit. They trained two well-known architectures (AlexNet and DenseNet) on this dataset and concluded that DenseNet outperformed AlexNet in terms of accuracy by 8%. Mahmood et al. [23] performed a comparative study of two well-known architectures (AlexNet and VGG16) for determining the maturity level of jujube fruit. The dataset contained images at three ripeness levels (unripe, ripe, and over-ripe). They also used some preprocessing techniques, such as data augmentation. They claimed that VGG16 outperformed the AlexNet architecture by achieving an accuracy of 98%.
The apple is a fruit of the Rosaceae family that originated in Asia. It is grown in over 63 countries throughout the world, with China being the main producer. Due to their high water content, carbohydrates, organic acids, vitamins, minerals, and dietary fibers, apples are regarded as a highly nutritious food. The apple, which is the fourth most widely cultivated and consumed fruit on the planet, can be divided into several types depending on its qualitative characteristics [24]. There are around 7500 different varieties of apples in the world [25], and different varieties offer different health benefits. For a non-expert, it is not easy to identify all kinds of apples, or other fruits with many varieties. Therefore, there is a need for a deep learning-based approach that can identify different kinds of fruits, remove the dependence on an expert, and improve the efficiency and accuracy of identifying and classifying different fruit types.
In this study, a deep learning approach is proposed for the classification and identification of different kinds of fruits. The proposed model incorporates a transfer learning technique, which helps when the available training data are insufficient. Because the model is not trained from scratch, this technique also significantly reduces training time. A well-known deep learning model, MobileNetV2 [26], was used as the base model and was modified by adding five different layers to improve the accuracy and reduce the error rate during classification. The proposed model is trained on a dataset containing 40 varieties of fruits. The results show that the proposed model achieved a high accuracy rate in identifying different types of fruits. The proposed model can be deployed in a mobile application for practical usage. Further details about the model are given in Section 2.
The following points summarize the contributions of this paper:
A detailed review was conducted to examine the most notable work in fruit classification via machine learning and deep learning.
A fruit classification problem was re-introduced based on a pre-trained MobileNetV2 CNN model, in which different kinds of fruits were classified.
A modified model was proposed using advanced deep learning techniques for fruit classification, and different model-tuning techniques were used to reduce the chances of model overfitting, such as dropout and data augmentation techniques.
An optimization technique was developed to monitor any improvement in the validation accuracy and validation error rate. Whenever an improvement was detected, a backup of the best-performing model was saved to ensure that the proposed model shows optimal accuracy and the least validation loss.
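The last contribution, backing up the model whenever validation performance improves, can be sketched as follows. This is a minimal illustration rather than the authors' implementation: the class name `BestModelKeeper`, the tie-breaking rule (equal accuracy with lower loss counts as an improvement), and the metric values in the usage lines are all assumptions made for the example.

```python
import copy


class BestModelKeeper:
    """Hypothetical helper: keeps a backup of the best model state seen so far."""

    def __init__(self):
        self.best_acc = float("-inf")
        self.best_loss = float("inf")
        self.best_state = None
        self.best_epoch = None

    def update(self, epoch, val_acc, val_loss, state):
        """Back up `state` if validation accuracy improved, or if accuracy
        is unchanged and validation loss is lower (assumed rule).
        Returns True when a backup was taken."""
        improved = val_acc > self.best_acc or (
            val_acc == self.best_acc and val_loss < self.best_loss
        )
        if improved:
            self.best_acc = val_acc
            self.best_loss = val_loss
            self.best_epoch = epoch
            # Deep copy so later training steps cannot alter the backup.
            self.best_state = copy.deepcopy(state)
        return improved


# Illustrative (made-up) validation history: (epoch, accuracy, loss).
history = [(1, 0.15, 2.10), (2, 0.80, 0.90), (3, 0.80, 0.70), (4, 0.78, 0.65)]
keeper = BestModelKeeper()
for epoch, acc, loss in history:
    keeper.update(epoch, acc, loss, {"weights": [epoch]})
print(keeper.best_epoch, keeper.best_acc, keeper.best_loss)
```

In a deep learning framework the same idea is usually provided by a checkpointing callback; the sketch only shows the comparison logic.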
The remainder of this article is organized as follows: In Section 2, the description of the dataset, model selection, proposed model, model tuning, and experimental settings are reported and discussed. In Section 3, the results and discussions are provided, whereas Section 4 describes the conclusion.
3. Results
This section presents the performance of the proposed model in terms of training accuracy, training loss, validation accuracy, and validation loss, as shown in Figure 5. The proposed model was trained for 100 iterations. The figure shows that the training accuracy started at 44% in the first iteration and then increased sharply, reaching 90% within the first 10 iterations. At the 30th iteration, the model reached maximum accuracy (100%), and the accuracy remained at this maximum from the 30th iteration until the end of training.
Figure 5 also shows the training loss of the proposed model. At the beginning of the training phase, the training loss is high because the model has not yet been exposed to the data. As the model processes the images and learns from them, the training loss gradually decreases. The training loss reaches 0.6 within the first 20 iterations and continues to fall with each iteration. By the 100th iteration, the training loss has reached 0.3, which is characteristic of a well-trained model.
Generally, most models perform well during training but do not perform as well during the validation phase, because during training the model has only been exposed to the labeled training data. To find out how well it generalizes, it is important to validate the model. For this purpose, the proposed model was validated as well.
Figure 5 also shows the validation accuracy and loss of the proposed model. The model starts with 15% accuracy at the beginning of the process and reaches 100% within 10 iterations. From the 5th iteration onwards, the model’s validation accuracy remains steady. As for the validation loss, the figure shows that the model starts with a very high validation loss, which is common in deep learning models. The validation loss then falls sharply and reaches 0.5 within the first 10 iterations. From the 10th iteration until the 50th iteration, the validation loss remains constant, and it starts falling again from the 55th iteration. Finally, it reaches 0.35 at the end of the 100th iteration.
During the training and validation process, the model shows stability in its performance. The proposed model shows a very high accuracy rate during training, and this is reflected during validation. The training loss and validation loss curves indicate that the model does not overfit. This is because the preprocessing techniques used in the proposed model helped it to achieve better results without overfitting. Moreover, the data augmentation technique incorporated in the proposed model played a vital role in exposing the model to different variations of the images. Additionally, the use of the dropout technique helped the model’s validation performance by ensuring that the model does not deviate much from its training performance.
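To make the role of dropout concrete, the following is a minimal sketch of the commonly used "inverted dropout" scheme on a plain list of activations. It is an illustration under stated assumptions, not the layer used in the proposed model: the function name `inverted_dropout`, the rate of 0.5, and the sample activations are hypothetical.

```python
import random


def inverted_dropout(activations, rate, training, rng=None):
    """Zero each activation with probability `rate` during training.

    Surviving activations are scaled by 1 / (1 - rate) so that their
    expected value is unchanged. Outside training (validation/testing)
    the activations pass through untouched.
    """
    if not 0.0 <= rate < 1.0:
        raise ValueError("rate must be in [0, 1)")
    if not training or rate == 0.0:
        return list(activations)
    rng = rng or random.Random()
    keep = 1.0 - rate
    return [a / keep if rng.random() < keep else 0.0 for a in activations]


features = [0.5, 1.0, 1.5, 2.0]      # made-up activations
rng = random.Random(0)               # fixed seed only for repeatability
train_out = inverted_dropout(features, rate=0.5, training=True, rng=rng)
val_out = inverted_dropout(features, rate=0.5, training=False)
print(train_out)
print(val_out)
```

Because units are dropped only during training, the network cannot rely on any single activation, which is one reason validation performance tends to stay close to training performance.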
Table 3 shows the performance of the TL-MobileNetV2 model for each fruit class in terms of precision, recall, F1-score, and support. The support shows the number of images used for training and validating the model after applying the data augmentation technique. As mentioned earlier, for each image, the image augmentation technique creates ten different instances of that image, which are used for model training. The table shows that the model achieved the maximum value of precision, recall, and F1-score for every class except for a few fruits, such as Apple Golden 1, Apple Red 3, Apple Red Yellow 2, Banana Lady Finger, Cantaloupe 2, Cherry Wax Red, and Kaki. The high scores can be attributed to the fact that the dataset has no variation in the background of the images. As stated earlier, the background of each instance was removed, which makes this fruit dataset fit for the proposed model. However, the precision of Apple Golden 1, Apple Red 3, and Apple Red Yellow 2 is 0.97, 0.96, and 0.98, respectively. This is because Apple Golden 1 resembles Apple Golden 2 in terms of color; likewise, Apple Red 3 resembles Apple Red 2, and Apple Red Yellow 2 resembles Apple Red Yellow 1. The overall accuracy of the proposed model for all classes of fruits during training is 100%, as shown in the figure. It is important to note that applying different preprocessing techniques in the model helped to achieve a high accuracy rate.
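For readers unfamiliar with the per-class metrics reported in Table 3, the sketch below shows how precision, recall, F1-score, and support can be computed from true and predicted labels. The function name `per_class_report` and the tiny label lists are illustrative assumptions; they are not the paper's data.

```python
from collections import Counter


def per_class_report(y_true, y_pred):
    """Return {label: {precision, recall, f1, support}} for each class."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    tp, fp, fn = Counter(), Counter(), Counter()
    for t, p in zip(y_true, y_pred):
        if t == p:
            tp[t] += 1          # correct prediction for class t
        else:
            fp[p] += 1          # p was predicted but is wrong
            fn[t] += 1          # t was missed
    support = Counter(y_true)   # number of true instances per class
    report = {}
    for label in sorted(set(y_true) | set(y_pred)):
        pred_total = tp[label] + fp[label]
        true_total = tp[label] + fn[label]
        precision = tp[label] / pred_total if pred_total else 0.0
        recall = tp[label] / true_total if true_total else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        report[label] = {"precision": precision, "recall": recall,
                         "f1": f1, "support": support[label]}
    return report


# Made-up labels: one Apple Golden 1 image is mistaken for Apple Golden 2.
y_true = ["Apple Golden 1"] * 3 + ["Apple Golden 2"] * 3 + ["Kaki"] * 2
y_pred = (["Apple Golden 1", "Apple Golden 1", "Apple Golden 2"]
          + ["Apple Golden 2"] * 3 + ["Kaki"] * 2)
report = per_class_report(y_true, y_pred)
for label, scores in report.items():
    print(label, scores)
```

In this toy example the single confusion lowers the recall of Apple Golden 1 and the precision of Apple Golden 2, which mirrors how visually similar varieties reduce individual class scores.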
Models usually perform well on the data they were trained on but often perform worse on real-world data. For that reason, we tested the proposed model’s performance by feeding it unseen data during the testing phase. The dataset contains a separate set of images for testing purposes. It is important to note that the model had not seen the testing images before, so using this set for testing gives a fair, unbiased assessment of the proposed model’s performance.
Table 4 presents the testing results of the TL-MobileNetV2 model in terms of precision, recall, F1-score, and support. The table shows that the proposed model achieved the maximum score in all classes except for a few, such as Apple Golden 1, Apple Red 3, Apple Red Yellow 2, Banana Lady Finger, Cantaloupe 2, Cherry Wax Red, and Kaki. Their precision is recorded as 0.92, 0.96, 0.98, 0.92, 0.95, 0.92, and 0.94, respectively, whereas their F1-score is 0.94, 0.97, 0.96, 0.94, 0.96, 0.93, and 0.95, respectively. This may be because false negative predictions can occur between Golden 1 and Golden 2, as well as between Red 1 and Red 3, and Red Yellow 1 and Red Yellow 2, as these classes are similar in terms of color. The model achieved 99% accuracy in both the training and testing phases. This also suggests that the model did not overfit; otherwise, there would have been a noticeable difference between the training and testing scores of the TL-MobileNetV2 model.
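One simple way to check which classes are mistaken for each other is to count misclassifications per (true, predicted) pair. The sketch below is an assumed helper, `most_confused_pairs`, applied to made-up labels; it does not reproduce the paper's test results.

```python
from collections import Counter


def most_confused_pairs(y_true, y_pred, top_n=3):
    """Count errors per (true label, predicted label) pair,
    returning the most frequent pairs first."""
    errors = Counter((t, p) for t, p in zip(y_true, y_pred) if t != p)
    return errors.most_common(top_n)


# Made-up test labels for illustration only.
y_true = ["Apple Red 1", "Apple Red 1", "Apple Red 3",
          "Apple Golden 1", "Apple Golden 1", "Apple Golden 1"]
y_pred = ["Apple Red 3", "Apple Red 1", "Apple Red 3",
          "Apple Golden 2", "Apple Golden 2", "Apple Golden 1"]
pairs = most_confused_pairs(y_true, y_pred)
for (true_label, predicted_label), count in pairs:
    print(f"{true_label} -> {predicted_label}: {count}")
```

A list like this makes it easy to confirm whether the errors are concentrated among similarly colored varieties, as suggested above.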
4. Conclusions
Machine learning techniques, particularly those suited to computer vision, have started to be widely used in precision agriculture. These techniques are used in various areas, such as fruit classification, quality analysis, yield estimation, and disease prediction. The success of these techniques has encouraged the development of deep learning models for fruit classification. In this study, a deep learning model, TL-MobileNetV2, was developed based on the MobileNetV2 architecture. A dataset of forty types of fruits was used to train and test the proposed model. In the TL-MobileNetV2 model, five different layers were added after removing the classification layer present in the MobileNetV2 architecture to improve the efficiency and accuracy of the model. Along with this, different preprocessing and model-tuning techniques were used to make TL-MobileNetV2 perform well on the dataset without overfitting. The experimental results show that TL-MobileNetV2 performed well on the fruit dataset, attaining 99% accuracy.
In future work, the mobile-based application will be enhanced to cover a larger number of different fruits, widening the range of fruit classification. This application will help people with limited knowledge to classify different types of fruit and their varieties. Furthermore, different CNN models will be trained on the dataset, and their results will be compared to identify the best-fit model in terms of accuracy and efficiency.