Article

Radish Growth Stage Recognition Based on GAN and Deep Transfer Learning

School of Mechanical and Electrical Engineering, Soochow University, Suzhou 215006, China
* Author to whom correspondence should be addressed.
Appl. Sci. 2023, 13(14), 8306; https://doi.org/10.3390/app13148306
Submission received: 4 June 2023 / Revised: 10 July 2023 / Accepted: 14 July 2023 / Published: 18 July 2023

Abstract

Image recognition of plant growth states provides technical support for crop monitoring; this reduces labor costs and promotes efficient planting. However, difficulties in data collection, the high levels of algorithm efficiency required, and the lack of computing resources create challenges for the development of intelligent agriculture. As a result, a deep transfer learning algorithm is proposed in this paper. The main motivation for this study was the lack of a dataset of plant growth stages. The key idea was to collect radish growth stage images in an experimental field using standardized equipment and to generate more images using DCGAN. By improving the deep transfer learning model, radish growth stages can be identified much more accurately. In this study, five different deep transfer models were selected, namely, Inception-v3, MobileNet, Xception, VGG-16, and VGG-19. Our experiment demonstrated that Inception-v3 was the most suitable model for the recognition of plant growth states. Based on Inception-v3, we propose three improved models. The test accuracies for the radish and Oxford Flower datasets were 99.5% and 99.3%, respectively. The models also performed well on the pest and disease dataset, achieving an accuracy of 94.7%, 2.4% higher than that of the original model. These results demonstrate the wide applicability of our model and the rationality of constructing a radish growth stage dataset.

1. Introduction

The state of agricultural development has a decisive effect on national development, and mechanization alone is insufficient. With the rapid development of computer vision technology and machine learning, especially deep learning, an increasing number of researchers are applying these technologies to the agricultural field. Unlike previous methods, computer vision technology offers low cost, good real-time data acquisition, and contact-free measurement, thereby avoiding damage to the tested object. The ability of smart agriculture to determine the plant growth stage is of great significance to the healthy growth and high yield of crops. Plant growth stage identification would provide valuable data for the planning, organization, and timely implementation of agricultural activities (spraying, irrigation, fertilization, etc.), thereby playing an important role in guiding such activities and predicting crop yield [1,2,3].
Of the many existing plant families, Cruciferae is economically valued for its variety and worldwide distribution. Numerous varieties are closely related to people's daily life as sources of food and nutrition, and their applications are extensive: some can be used as vegetables and oil crops, while others can be made into medicinal herbs, spices, and so on. Cruciferous vegetables fall mainly into two genera, Brassica and the more common radish genus, which includes the large green radish, the common radish, and so on. Vegetable planting technology is mature, but it demands considerable expertise from growers. Scientific and efficient planting can generate great economic benefits and social value, so improving the economic benefits of vegetable planting and realizing automatic management is of great significance. Studies have shown that cruciferous plants have different needs in terms of illumination, nutrition, and water at different growth stages. In addition, corresponding field operations, such as thinning, topdressing, and soil raising, carried out according to the growth state, are crucial to the quality and yield of crops. Traditional identification methods are based on manual observation. Correct identification of plant growth stages and states requires professional knowledge, and this knowledge takes time to learn and practice. At present, information on plant growth stage and status must be manually and regularly recorded from collected images and then analyzed. Once tagged and archived, the information can be used for the development of irrigation, fertilization, and disinfection systems, as well as yield prediction. However, this method is time consuming and costly, and the direct contact involved may damage the plants.
In recent years, with breakthroughs in computer vision technology, image recognition methods based on deep learning have become widely used. Image recognition technology is not only low cost but also convenient for real-time analysis. Plants of the radish genus share similar growth characteristics, with rosette basal leaves and alternate stem leaves, so their morphologies at different growth stages are very similar. We can use this characteristic to identify the growth stages of radish plants and to develop different auxiliary and automatic planting strategies according to each growth stage, thereby promoting the development of smart agriculture.
A convolutional neural network (CNN) [4] automatically extracts the important features in a picture, which saves labor costs. However, research on plant growth stage identification and growth state monitoring remains incomplete. This is because few datasets exist for the individual growth stages of plants; moreover, training a complex convolutional neural network on a small dataset leads to overfitting, making it difficult to achieve good recognition performance. In view of this, this paper introduces solutions that avoid the cumbersome process of manual monitoring and bring the comprehensive automation of smart agriculture closer; this, in turn, is of great significance to the development of smart agriculture in general. The main contributions of this paper are (1) a dataset of radish growth stages; (2) a growth stage recognition method based on DCGAN [5] for image preprocessing and data enhancement; (3) a deep transfer learning method that improves the accuracy of image classification using a small dataset; and (4) a demonstration of the effectiveness of the proposed model using the dataset collected in this paper, the official Oxford Flower dataset, and the AI Challenger 2018 pest and disease classification dataset.

2. Related Work

The convolutional neural network (CNN), the core element of deep learning methods in the field of image recognition, has developed rapidly and achieved good results in image recognition applications. In 2012, AlexNet was proposed [6], and its remarkable performance triggered the upsurge in convolutional neural network research. In 2014, the Inception architecture was introduced and won the ImageNet Large Scale Visual Recognition Challenge (ILSVRC), the most demanding challenge in visual object recognition. The emergence of large datasets such as ImageNet and the rapid improvement of GPU computing performance drove the rapid development of research on convolutional neural networks. VGG, GoogLeNet, and Inception are also considered among the most popular deep learning architectures due to their excellent performance in object recognition [7,8], face detection [9,10], scene understanding [11], and other fields. As a powerful technique in the field of artificial intelligence, deep learning is now a popular method for fruit detection and has achieved accurate results on desktop computers [12,13,14]. Häni et al. [15] combined U-Net with Faster R-CNN for apple detection in an orchard and reached a yield estimation accuracy of 97.8%. Santos et al. [16] utilized Mask R-CNN with ResNet-101 to detect grapes and reported an F1-score of 0.84 at an intersection over union (IoU) of 0.5. The authors of [17] proposed a method for detecting plant diseases using deep ensemble neural networks (DENN); their method fine-tuned multiple pre-trained neural network models, including ResNet-50, Inception-v3, DenseNet, MobileNet-v3, and NasNet. When applied to the PlantVillage database, DENN outperformed the individual pre-trained neural network models. A method for the identification and classification of plant leaf diseases is presented in [18]: the authors used K-means clustering for image segmentation and the GLCM for feature extraction, and a support vector machine (SVM) achieved a classification accuracy of 99.99% on a plant dataset. The above research shows that deep learning is helpful for plant recognition, and relevant research has achieved good results.
Most convolutional neural networks, however, have too many convolutional layers and parameters, consuming a great amount of computing resources and reducing computational efficiency. Moreover, although datasets in the field of image recognition are increasingly rich, they still fail to meet the needs of individual researchers, and the vast majority of datasets still require annotation by the researchers themselves. Producing a large dataset requires considerable time and effort, and researchers may also lack professional equipment. Therefore, deep transfer learning, which can yield a model with good generalization performance even when trained on a small dataset, can be widely used in image recognition research and is regarded as a future trend.
Deep transfer learning (DTL) applies weight parameters trained on a large-scale dataset to related problems by introducing high-performance, lightweight deep convolutional models such as Inception-v3 [19,20,21] and MobileNet [22]. Deep transfer learning is generally divided into two methods, freezing and fine-tuning [23,24], of which the freezing method is suitable for smaller datasets. Deep transfer learning models, however, offer limited structural flexibility, and it is difficult to change the network structure. For example, in 2014, Tzeng et al. proposed the DDC method [25], which adds a maximum mean discrepancy (MMD) adaptation criterion before the classification layer, incorporating the difference between the source domain and the target domain into the loss function; however, its training capacity was limited. Later, building on DDC, the DAN method [26] increased the number of adaptation layers and improved the MMD criterion; however, this method was only applied to AlexNet, and experimental verification is needed for other networks. To establish a connection between the training domain and the testing domain, domain transfer can be combined with task transfer [27]. However, this complicates training and undoubtedly burdens the computation of the convolutional neural network, and the problem of overfitting caused by too small a dataset remains unresolved. The adaptive batch normalization (AdaBN) method [28] can easily transfer a trained model to a new domain by adjusting the statistics in the BN layers; the method is simple to implement and saves computing resources.
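For readers unfamiliar with MMD, its simplest (linear-kernel) form compares the means of source and target feature batches. The sketch below is a generic illustration in Python/NumPy, not the implementation used in [25,26]; the function and variable names are ours.

```python
import numpy as np

def linear_mmd(source_feats: np.ndarray, target_feats: np.ndarray) -> float:
    """Linear-kernel maximum mean discrepancy between two feature batches.

    source_feats, target_feats: arrays of shape (n_samples, n_features),
    e.g. activations taken before the classification layer.
    """
    delta = source_feats.mean(axis=0) - target_feats.mean(axis=0)
    return float(delta @ delta)  # squared distance between the batch feature means

# DDC-style total objective: classification loss plus a weighted MMD penalty.
# `cls_loss`, `lambda_mmd`, `src_f`, and `tgt_f` are placeholders, not real variables here.
# total_loss = cls_loss + lambda_mmd * linear_mmd(src_f, tgt_f)
```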

3. The Proposed Model

In this paper, cherry radishes of the radish genus were selected as the research object. The growth cycle of plants in the radish genus is similar, generally divided into three stages: the germination, seedling, and leaf growth flourishing stages. The germination period lasts from seed germination to cotyledon expansion. The seedling stage lasts from the first pair of true leaves to the appearance of four or so rosette leaves. During the vigorous phase of leaf growth, the stem surface near the ground cracks open, the leaf area increases rapidly, and fleshy roots grow until maturity. Sample images of the different stages are provided in Figure 1; they illustrate the changes in size and characteristics of the cherry radish, which affect the accuracy of model verification, as explained in the next section. The cherry radish is rich in nutrition, and its roots and stems can be used as medicine. In addition, the growth cycle of the cherry radish is short, lasting approximately one month, and cherry radishes require little space to grow, allowing them to be cultivated in a laboratory environment. Due to these characteristics, the cherry radish was selected as a typical research object for plant growth stage identification; these properties also shortened the whole experimental cycle and accelerated the acquisition of images at different growth stages. After image collection and classification, we obtained 1519 images of the germination stage, 1520 images of the seedling stage, and 1543 images of the leaf growth flourishing stage.
The research model included two deep learning networks. The first network was DCGAN, and the second network was an improved deep transfer learning model. DCGAN was used in the data preprocessing stage, and the deep transfer learning model was used in the training and testing stages. The overall framework of the model is shown in Figure 2.
Firstly, the collected images were preprocessed by resizing them to 80 × 80 pixels, and class labels were then assigned. These images were fed to DCGAN to generate a new dataset; the dataset information is described in Table 1. This operation sacrificed image quality to some extent but greatly reduced the training time. The generated images had the same size as the input images, and the newly generated dataset also carried the corresponding labels. Parameters pre-trained on the large ImageNet dataset were loaded into the convolutional neural network, and all the layers before the original model's classification layer were frozen. The improved model proposed in this paper was then trained with the newly constructed dataset, and the model performance was finally evaluated.
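The freezing pipeline can be summarized with a short sketch. The paper does not publish code, so the following Keras/TensorFlow snippet is only an assumed illustration; IMG_SIZE, NUM_CLASSES, and the optimizer settings are placeholders drawn from the text, not the authors' exact configuration.

```python
import tensorflow as tf
from tensorflow.keras import layers, models

IMG_SIZE = (128, 128)   # assumed training resolution; the paper also uses 80 x 80 crops
NUM_CLASSES = 3         # germination, seedling, and leaf-growth flourishing stages

# Load Inception-v3 pre-trained on ImageNet and freeze everything before the new head.
base = tf.keras.applications.InceptionV3(
    include_top=False, weights="imagenet",
    input_shape=IMG_SIZE + (3,), pooling="avg")
base.trainable = False  # the "freezing" variant of deep transfer learning

model = models.Sequential([
    base,
    layers.Dense(NUM_CLASSES, activation="softmax"),  # new classification layer
])
model.compile(optimizer=tf.keras.optimizers.SGD(learning_rate=0.01, momentum=0.9),
              loss="sparse_categorical_crossentropy", metrics=["accuracy"])
```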

3.1. Deep Convolution Generates Adversarial Network

DCGAN consists of two different networks that are trained simultaneously in an end-to-end manner. The first network generates images, while the other discriminates real images from generated ones. The generator network consists of five transposed convolutional layers, four ReLU layers, four batch normalization layers, and a Tanh layer at the end of the model. The discriminator network consists of five convolutional layers, four Leaky ReLU layers, and three batch normalization layers. The main function of the batch normalization layers is to accelerate convergence. Figure 3 shows the structure and sequence of layers of the network.
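As a rough illustration of the layer counts listed above, the following Keras sketch builds a generator and discriminator of this shape. The latent dimension, kernel sizes, and filter counts are not specified in the paper; they are chosen here only so that an 80 × 80 image size works out.

```python
from tensorflow.keras import layers, models

LATENT_DIM = 100  # assumed noise-vector size

def build_generator():
    # Five transposed convolutions: BN + ReLU after the first four, Tanh output (80 x 80 x 3).
    z = layers.Input(shape=(LATENT_DIM,))
    x = layers.Reshape((1, 1, LATENT_DIM))(z)
    x = layers.Conv2DTranspose(512, 5, strides=1, padding="valid", use_bias=False)(x)  # 5 x 5
    x = layers.ReLU()(layers.BatchNormalization()(x))
    for filters in (256, 128, 64):  # 10 x 10 -> 20 x 20 -> 40 x 40
        x = layers.Conv2DTranspose(filters, 4, strides=2, padding="same", use_bias=False)(x)
        x = layers.ReLU()(layers.BatchNormalization()(x))
    img = layers.Conv2DTranspose(3, 4, strides=2, padding="same", activation="tanh")(x)  # 80 x 80
    return models.Model(z, img, name="generator")

def build_discriminator():
    # Five convolutions with Leaky ReLU; BN only on the middle three layers, as described above.
    img = layers.Input(shape=(80, 80, 3))
    x = layers.LeakyReLU(0.2)(layers.Conv2D(64, 4, strides=2, padding="same")(img))  # 40 x 40
    for filters in (128, 256, 512):  # 20 x 20 -> 10 x 10 -> 5 x 5
        x = layers.Conv2D(filters, 4, strides=2, padding="same", use_bias=False)(x)
        x = layers.LeakyReLU(0.2)(layers.BatchNormalization()(x))
    logit = layers.Conv2D(1, 5, strides=1, padding="valid")(x)  # 1 x 1 real/fake score
    return models.Model(img, layers.Flatten()(logit), name="discriminator")
```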
DCGAN was used to overcome the overfitting problem caused by the limited number of images in the dataset. Compared with the original dataset, the edges of the output samples were relatively fuzzy, but the contour shapes were basically the same. To avoid affecting the training accuracy, an appropriate number of generated images was added; these fuzzier samples also simulate the low-resolution images encountered in natural environments. The newly generated images (64 per generated batch) were added to the collected dataset, which is beneficial to the robustness of model training and the recognition accuracy in practical applications. Taking the germination period as an example, DCGAN output samples are shown in Figure 4.

3.2. Deep Transfer Learning Improvement

3.2.1. Inception-v3 Model

The Inception-v3 model is an updated version of the Inception-v1 model. It is a 48-layer pre-trained convolutional neural network; the underlying convolution operation is given in Equation (1). Inception-v3 uses several refinements to increase accuracy and reduce computational complexity. It replaces large convolution kernels with smaller ones, and the smaller n × n kernel is further split into two asymmetric convolution kernels (1 × n and n × 1). As a result, Inception-v3 provides higher accuracy on smaller image datasets than other machine learning techniques. Typically, an Inception module includes one max-pooling branch and three convolution branches of various sizes [29].
$$A * B = \begin{bmatrix} A_{1,1} & \cdots & A_{1,N} \\ A_{2,1} & \cdots & A_{2,N} \\ \vdots & & \vdots \\ A_{M,1} & \cdots & A_{M,N} \end{bmatrix} * \begin{bmatrix} B_{1,1} & \cdots & B_{1,N} \\ B_{2,1} & \cdots & B_{2,N} \\ \vdots & & \vdots \\ B_{M,1} & \cdots & B_{M,N} \end{bmatrix} = \sum_{i=0}^{M-1} \sum_{j=0}^{N-1} A_{(M-i),(N-j)}\, B_{(i+1),(j+1)} \tag{1}$$
After the convolution operations, the outputs of the parallel branches and the preceding layer are aggregated along the channel dimension, and non-linear fusion is then performed. In this way, overfitting can be avoided while simultaneously enhancing the network's expressiveness and its flexibility to various scales, as shown in Figure 5.
Additionally, a key feature of Inception-v3 is its ability to scale to large datasets and to handle images of different sizes and resolutions. This is very important for monitoring plant growth state, because crop images can vary greatly in resolution and quality across different natural environments.
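The asymmetric factorization mentioned above (replacing an n × n kernel with a 1 × n followed by an n × 1 kernel) and the channel-wise aggregation of parallel branches can be sketched as follows. This is a toy module with illustrative filter counts, not the exact Inception-v3 block.

```python
from tensorflow.keras import layers

def factorized_conv_branch(x, filters: int, n: int = 7):
    """Replace an n x n convolution with a 1 x n followed by an n x 1 convolution,
    the asymmetric factorization used inside Inception-v3 modules."""
    x = layers.Conv2D(filters, (1, n), padding="same", activation="relu")(x)
    x = layers.Conv2D(filters, (n, 1), padding="same", activation="relu")(x)
    return x

def toy_inception_module(x):
    # Parallel branches (1 x 1, factorized n x n, pooling) concatenated along the channel axis.
    b1 = layers.Conv2D(64, 1, padding="same", activation="relu")(x)
    b2 = factorized_conv_branch(
        layers.Conv2D(48, 1, padding="same", activation="relu")(x), 64)
    b3 = layers.Conv2D(64, 1, padding="same", activation="relu")(
        layers.MaxPooling2D(3, strides=1, padding="same")(x))
    return layers.Concatenate()([b1, b2, b3])  # channel aggregation before the next layer
```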

3.2.2. MobileNet Model

MobileNet was proposed by the Google team in 2017 as a lightweight CNN for mobile and embedded devices. Compared with a traditional CNN, the number of model parameters and the amount of computation are greatly reduced at the cost of only a small drop in accuracy. Since the introduction of Inception-v1, Inception-v2, and Inception-v3, various network structures have been continuously improved. As shown in Figure 6, the MobileNet architecture is based on depthwise separable convolution, a form of factorized convolution that decomposes a standard convolution into a depthwise convolution followed by a 1 × 1 convolution known as a pointwise convolution. The BN layer normalizes the input of each layer, which can greatly improve the generalization ability of the model.
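A depthwise separable block of the kind MobileNet stacks can be sketched as follows; the filter count and stride are parameters, and the snippet is illustrative rather than a reproduction of the original architecture.

```python
from tensorflow.keras import layers

def depthwise_separable_block(x, pointwise_filters: int, stride: int = 1):
    """MobileNet-style block: a depthwise 3 x 3 convolution followed by a
    1 x 1 (pointwise) convolution, each with batch normalization and ReLU6."""
    x = layers.DepthwiseConv2D(3, strides=stride, padding="same", use_bias=False)(x)
    x = layers.ReLU(6.0)(layers.BatchNormalization()(x))   # ReLU6, as in the original MobileNet
    x = layers.Conv2D(pointwise_filters, 1, padding="same", use_bias=False)(x)
    x = layers.ReLU(6.0)(layers.BatchNormalization()(x))
    return x
```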

3.2.3. The Improved Models

Deep transfer learning includes the freezing and fine-tuning methods [20,21,22,23,24,25,26,27,28,29]. The fine-tuning method requires a large amount of training data to show its advantages, and there is still no established way to determine the number of layers to freeze. We therefore propose improvements on the basis of the freezing method. Bousmalis et al. [30] propose that both the source domain and the target domain are composed of a public part and a private part: the public part learns shared features, while the private part maintains the independence of each domain. In the freezing method, the parameters trained in the source domain are frozen for all layers before the last fully connected layer, which guarantees the extraction of common features; however, the private features of the target domain are not fully learned.
To solve this problem, we propose extracting the private features by adding network structure before the classification layer, as shown in Figure 7a, where all the layers with frozen parameters are collectively referred to as the bottleneck layer. Private features are trained by adding an appropriate fully connected layer, Batch Normalization (BN), Leaky ReLU, and a Dropout layer after the bottleneck layer. Compared with ReLU, Leaky ReLU alleviates the vanishing gradient problem. The AdaBN method [28] regards the means and variances of the BN layers as representing the characteristics of different domains. A difference in distribution between the source domain and the target domain may lead to a mismatch, while the normalization operation not only computes statistics consistently but also gathers the data in the sensitive region of the nonlinear activation function, avoiding the vanishing gradient problem. The normalization operation is calculated as follows:
$$\mu = \frac{1}{m}\sum_{i=1}^{m} x_i \tag{2}$$
$$\sigma^2 = \frac{1}{m}\sum_{i=1}^{m}\left(x_i - \mu\right)^2 \tag{3}$$
$$\hat{x}_i = \frac{x_i - \mu}{\sqrt{\sigma^2 + \varepsilon}} \tag{4}$$
where μ is the mean value, σ² is the variance, x_i is the input data, x̂_i is the output data, and ε is a small constant added for numerical stability.
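A minimal sketch of improvement 1, assuming a Keras implementation on top of the frozen Inception-v3 bottleneck features; the 2048-dimensional bottleneck size and the exact dropout rate are assumptions for illustration.

```python
from tensorflow.keras import layers, models

def improved_head_v1(bottleneck_dim: int = 2048, num_classes: int = 3):
    """Improvement 1: a small trainable head appended after the frozen bottleneck layer,
    i.e. fully connected layer, batch normalization, Leaky ReLU, and dropout,
    followed by the softmax classifier."""
    features = layers.Input(shape=(bottleneck_dim,))   # frozen Inception-v3 bottleneck output
    x = layers.Dense(50, use_bias=False)(features)     # 50 neurons, matching Section 3.2.3
    x = layers.BatchNormalization()(x)                 # Equations (2)-(4)
    x = layers.LeakyReLU(0.2)(x)
    x = layers.Dropout(0.2)(x)                         # assumes the paper's 0.8 is a keep-probability
    out = layers.Dense(num_classes, activation="softmax")(x)
    return models.Model(features, out)
```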
However, because the depth of the network increases, overfitting problems may arise during training. We therefore further improved the model, as shown in Figure 7b. Inspired by the Inception module, this method extracts features of more dimensions and inhibits overfitting by increasing the width of the model. SGD with momentum (SGD-M) was selected as the optimizer, and the number of neurons in the fully connected layer was 50. To speed up training, the initial weights were drawn from a normal distribution with a standard deviation of 0.001, the initial bias was set to 0.1, the dropout values were 0.8 and 0.6, the learning rate was 0.01, 2000 steps were executed, and the batch size was 100. The experimental results are shown in the next section. In addition, we drew on the idea of the attention mechanism [31] to enhance the important features of the image and further improve the model structure, as shown in Figure 7c. The model improvement focuses on the branch containing fully connected layer b. This branch passes through a fully connected layer containing 50 neurons, and the sigmoid connected after it plays the role of self-gating. This limits all values to the range (0, 1), and the outputs represent weights of different degrees of importance. These weights are multiplied by the output of the branch containing fully connected layer a, and the product is used as the input of the next fully connected layer. The improved structure assigns more weight to the important features, which accelerates training and improves recognition accuracy.
Figure 7. Deep transfer structure improvements: (a) Model improvement method 1; (b) Model improvement method 2; (c) Model improvement method 3.
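Improvement 3 adds a sigmoid self-gating branch, in the spirit of squeeze-and-excitation [31]. The following sketch is our reading of Figure 7c with assumed layer sizes; it is not the authors' released code.

```python
from tensorflow.keras import layers, models

def improved_head_v3(bottleneck_dim: int = 2048, num_classes: int = 3):
    """Improvement 3: two parallel fully connected branches on the frozen bottleneck
    features; branch b ends in a sigmoid self-gate whose output re-weights branch a
    before the final classifier."""
    features = layers.Input(shape=(bottleneck_dim,))

    # Branch a: feature transformation (50 neurons assumed, matching the paper's head width).
    a = layers.Dense(50)(features)
    a = layers.BatchNormalization()(a)
    a = layers.LeakyReLU(0.2)(a)

    # Branch b: self-gating weights in (0, 1), one per unit of branch a.
    b = layers.Dense(50, activation="sigmoid")(features)

    gated = layers.Multiply()([a, b])   # important features receive larger weights
    gated = layers.Dropout(0.4)(gated)  # dropout rate illustrative
    out = layers.Dense(num_classes, activation="softmax")(gated)
    return models.Model(features, out)
```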

4. Experimental Results

The model proposed in this paper was coded in Python 3.6 and developed on an Intel Core i7-9700U processor with 32 GB RAM and a GeForce RTX 2080 Ti GPU; the experimental results were obtained using the TensorBoard visualization tool. The performance of Inception-v3 was compared with that of MobileNet and three other transfer learning models (Xception, VGG-16, and VGG-19). The training variables and parameter settings are shown in Table 2. We used the "step" learning strategy; the basic learning rate of the other transfer learning models was set to 0.001~0.0001 and decreased every five epochs by a factor of 0.5. DropBlock was set to 0.8 in the Inception-v3 and MobileNet models; the other transfer learning models did not use this optimization in the convolution process.
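The "step" learning strategy can be expressed as a simple schedule. The snippet below is an illustrative Keras callback; the base rate is taken from Table 2, and the decay interval follows the text (Table 2 lists a 10-epoch interval), both shown only as examples.

```python
import tensorflow as tf

BASE_LR = 0.001   # basic learning rate from Table 2

def step_decay(epoch, lr):
    # "Step" strategy: multiply the base learning rate by 0.5 every 5 epochs.
    return BASE_LR * (0.5 ** (epoch // 5))

lr_callback = tf.keras.callbacks.LearningRateScheduler(step_decay)
# model.fit(train_ds, epochs=50, callbacks=[lr_callback])  # 50 training epochs per Table 2
```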
As shown in Table 3, we compared the accuracy of the different models at an image size of 128 × 128, with the training parameters set as in Table 2. Each model was run 10 times. The mean and minimum accuracies for Inception-v3 were 98.75% and 97.84%, respectively, and its best accuracy reached 99.32%, higher than that of the other methods. The mean accuracies of the MobileNet, Xception, VGG-16, and VGG-19 models were 98.54%, 96.98%, 96.24%, and 95.44%, respectively. According to the experimental data, Inception-v3 achieved the best prediction results, and the MobileNet prediction accuracy was also very high. The mean accuracy of Inception-v3 was 1.77%, 2.51%, and 3.31% higher than that of the Xception, VGG-16, and VGG-19 models, respectively.
As Inception-v3 and MobileNet performed better in terms of accuracy, the experiment used the collected dataset to compare and test five original frozen deep transfer learning models: Inception-v3, MobileNet_0.25_128, MobileNet_0.50_128, MobileNet_0.75_128, and MobileNet_1.0_128. In order to verify the effectiveness and robustness of these models, 75% of the experimental data were randomly assigned to the training set and the remaining 25% to the test set. We used accuracy, loss, precision, recall, and F1-score to evaluate the performance of these models for the detection of radish growth stages, calculated with Equations (5)–(8):
$$\mathrm{Accuracy} = \frac{TP + TN}{TP + FP + FN + TN} \times 100 \tag{5}$$
$$\mathrm{Precision} = \frac{TP}{TP + FP} \times 100 \tag{6}$$
$$\mathrm{Recall} = \frac{TP}{TP + FN} \times 100 \tag{7}$$
$$F1\text{-}score = \frac{2 \times \mathrm{Precision} \times \mathrm{Recall}}{\mathrm{Precision} + \mathrm{Recall}} \tag{8}$$
In these formulas, TP (true positives) denotes positive samples classified as positive, FP (false positives) denotes negative samples classified as positive, FN (false negatives) denotes positive samples classified as negative, and TN (true negatives) denotes negative samples classified as negative. A comparison of the performance of Inception-v3 and MobileNet is shown in Table 4.
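For reference, the four metrics can be computed directly from the confusion-matrix counts defined above; the counts used in the usage example below are hypothetical.

```python
def classification_metrics(tp: int, fp: int, fn: int, tn: int):
    """Accuracy, precision, recall, and F1-score (all in percent) from
    per-class confusion-matrix counts, following Equations (5)-(8)."""
    accuracy = (tp + tn) / (tp + fp + fn + tn) * 100
    precision = tp / (tp + fp) * 100
    recall = tp / (tp + fn) * 100
    f1 = 2 * precision * recall / (precision + recall)
    return accuracy, precision, recall, f1

# Hypothetical counts, for illustration only.
print(classification_metrics(tp=95, fp=5, fn=4, tn=280))
```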
To verify the performance of the Inception-v3 model, we evaluated the precision, recall, and F1-score for the identification of growth stages in cherry radish images. The identification performance results for Inception-v3 and the MobileNet variants are compared in Table 4. The identification precision of Inception-v3 was 95%, the recall was 96%, and the F1-score was 97%, all higher than the corresponding values for the four MobileNet models. Additionally, the recognition accuracies of Inception-v3, MobileNet_0.25_128, MobileNet_0.50_128, MobileNet_0.75_128, and MobileNet_1.0_128 were 99.5%, 97.2%, 98.7%, 98.5%, and 98.9%, respectively. The loss of Inception-v3 was 0.06; for all the other models, the loss was higher than 0.10.
Table 4. Precision, recall, F1-score, and accuracy results for the Inception-v3 and MobileNet models.

Model                Precision   Recall   F1-Score   Accuracy   Loss
Inception-v3         95%         96%      97%        99.5%      0.06
MobileNet_0.25_128   88%         91%      89%        97.2%      0.35
MobileNet_0.50_128   92%         88%      89%        98.7%      0.24
MobileNet_0.75_128   94%         93%      93%        98.5%      0.22
MobileNet_1.0_128    78%         89%      83%        98.9%      0.19
The test accuracy and loss curves for the five models are shown in Figure 8. It can be seen from the figure that, as the width multiplier (and hence the number of convolution kernels) of MobileNet increases, the model reaches high accuracy more rapidly. However, due to the small dataset, MobileNet is more prone to overfitting and oscillation, and its generalization performance is poorer than that of Inception-v3; accordingly, the subsequent experiments were based on Inception-v3.
In order to verify the effectiveness of the improved deep transfer learning methods proposed in this paper, the improved Models 1, 2, and 3 shown in Figure 7 were compared with the original method. The experimental results for these models on the collected dataset are shown in Figure 9.
It can be clearly seen from the figure that although Model 1 achieved high accuracy more rapidly than the original model, its curve fluctuated greatly and it was prone to overfitting. The proposed Model 2 reached the same final accuracy as the original model while converging rapidly, reducing the loss to a lower level, greatly reducing oscillation around local optima, and significantly improving recognition performance. The improved Model 3 also performed well up to 700 steps, but its performance was less stable in the subsequent training, possibly due to the small dataset. The accuracy and loss for each model are shown in Table 5.
Figure 9. (a) Accuracy of the validation set of four models. (b) Loss of the validation set of four models.
The Oxford Flower dataset was used to further test the general applicability of the model proposed in this paper. The test dataset contained five types of flowers: 633 daisies, 898 dandelions, 641 roses, 699 sunflowers, and 799 tulips. Example images are shown in Figure 10.
The experimental results are shown in Figure 11. The figures show that the original model converged slowly and produced large oscillations. Among the four models, proposed Models 2 and 3 still performed very well, both for accuracy and loss. The consistency of the experimental results demonstrates the rationality of the dataset produced in this study. Table 6 shows the specific data for accuracy and loss, from which we can see that proposed Model 3 has the highest identification accuracy (99.3%) and the lowest identification loss (0.02). Compared with the original model, the validation accuracy of proposed Models 1 and 2 was also higher, and their loss was less than 0.1.
Additionally, we used the AI Challenger 2018 pest and disease classification dataset to verify the models proposed in this paper. This dataset of crop leaf images comprises 50,000 labeled images covering 27 diseases across 10 plant species. We used 75% of the images for training and the remaining 25% for testing; the images were selected randomly from the whole dataset. In testing, we achieved average accuracies of 94.4%, 93.6%, and 94.7% with proposed Models 1, 2, and 3, respectively; these figures were 2.1%, 1.3%, and 2.4% higher than the accuracy of the original model. The loss rates of the three proposed models were also much lower than that of the original model, with that of Model 2 being the lowest (0.04). These results demonstrate the advantage of the proposed models for recognizing the state of plants. The experimental results are shown in Figure 12 and Table 7.

5. Discussion

This paper focused on an identification method for radish growth stages. We took the cherry radish as the research object and generated a dataset using DCGAN. Parameters pre-trained on the source dataset were loaded into Inception-v3, and all layers before the final fully connected layer were frozen. The target dataset was then used to continue training the improved model to identify three stages of growth: germination, seedling, and vigorous leaf growth. As a very important element of smart agriculture, plant growth monitoring is of great significance for ensuring the timely irrigation and fertilization of plants; this helps improve the yield and quality of crops and, thus, increases the economic efficiency of agriculture.
The experimental results show that the model presented in this paper performed well. Based on the improved model proposed in this paper, the test accuracy and loss for the radish dataset were 99.5% and 0.01, respectively, which demonstrates its strong applicability to the task of recognizing plant growth state. In order to verify the robustness of the proposed models, the Oxford Flower dataset and the pest and disease classification dataset were selected for the experiment. The validation accuracy and loss for both public datasets exhibited good performance. The accuracy of the improved models is shown in Figure 13.

6. Conclusions

The plant growth stage identification model proposed in this paper has achieved preliminary results. Because the field is still young, research into intelligent agriculture based on machine vision remains at the preliminary exploration stage: the datasets collected so far are relatively few and generally of a single type. Future work should focus on creating more convenient image acquisition systems and professional data collection methods, expanding the number and types of images, and improving the versatility of systems. The model in this paper recognizes the growth stage without performing target detection. Further enrichment of the dataset with target detection labels would allow identification and detection to be integrated, reducing the running time of the program.

Author Contributions

Conceptualization, X.Z. and S.L.; methodology, X.Z., X.Y. and S.L.; software, X.Z.; validation, X.Z. and S.L.; formal analysis, X.Z. and S.L.; investigation, S.L.; resources, X.Z., X.Y. and S.L.; data curation, X.Z. and S.L.; writing—original draft preparation, X.Z.; writing—review and editing, X.Y. and S.L.; visualization, X.Y.; supervision, X.Y.; project administration, X.Z., X.Y. and S.L.; funding acquisition, X.Z., X.Y. and S.L. All authors have read and agreed to the published version of the manuscript.

Funding

This work was supported by the National Natural Science Foundation of China (Grant No. 61971297).

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

All data included in this study are available upon request by contacting the corresponding author.

Conflicts of Interest

The authors declare no conflict of interest.

References

  1. Fang, Y.; Wang, X.; Shi, P.; Lin, C.; Zhai, R. Automatic identification of two growth stages for rapeseed plant: Three leaf and four leaf stage. In Proceedings of the 4th International Conference on Agro-Geoinformatics, Istanbul, Turkey, 20–24 July 2015. [Google Scholar]
  2. Jiang, B.; Wang, P.; Zhuang, S.; Li, M.; Gong, Z. Drought stress detection in the middle growth stage of maize based on Gabor filter and deep learning. In Proceedings of the 38th Chinese Control Conference, Guangzhou, China, 27–30 July 2019. [Google Scholar]
  3. Lin, T.L.; Chang, H.Y.; Chen, K.H. Pest and Disease Identification in the Growth of Sweet Peppers using Faster R-CNN. In Proceedings of the IEEE International Conference on Consumer Electronics-Taiwan (ICCE-TW), Yilan, Taiwan, 20–22 May 2019. [Google Scholar]
  4. Gu, J.; Wang, Z.; Kuen, J.; Ma, L.; Shahroudy, A.; Shuai, B.; Liu, T.; Wang, X.; Wang, G.; Cai, J.; et al. Recent advances in convolutional neural networks. Pattern Recognit. 2018, 77, 354–377. [Google Scholar] [CrossRef] [Green Version]
  5. Radford, A.; Metz, L.; Chintala, S. Unsupervised Representation Learning with Deep Convolutional Generative Adversarial Networks. arXiv 2015, arXiv:1511.06434. [Google Scholar]
  6. Krizhevsky, A.; Sutskever, I.; Hinton, G. ImageNet Classification with Deep Convolutional Neural Networks; Curran Associates Inc.: Red Hook, NY, USA, 2012. [Google Scholar]
  7. Ren, S.; He, K.; Girshick, R.; Sun, J. Faster R-CNN Towards Real-Time Object Detection with Region Proposal Networks. IEEE Trans. Pattern Anal. Mach. Intell. 2017, 39, 1137–1149. [Google Scholar] [CrossRef] [PubMed] [Green Version]
  8. Hayat, S.; Kun, S.; Tengtao, Z.; Yu, Y.; Tu, T.; Du, Y. Deep Learning Framework Using Convolutional Neural Network for Multi-Class Object Recognition. In Proceedings of the IEEE 3rd International Conference on Image, Vision and Computing (ICIVC), Chongqing, China, 27–29 July 2018; pp. 194–198. [Google Scholar] [CrossRef]
  9. Schroff, F.; Kalenichenko, D.; Philbin, J. FaceNet: A Unified Embedding for Face Recognition and Clustering. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Boston, MA, USA, 7–12 June 2015. [Google Scholar]
  10. Ding, C.; Tao, D. Trunk-Branch Ensemble Convolutional Neural Networks for Video-Based Face Recognition. IEEE Trans. Pattern Anal. Mach. Intell. 2017, 40, 1002–1014. [Google Scholar] [CrossRef] [PubMed] [Green Version]
  11. Zhou, B.; Lapedriza, A.; Xiao, J.; Torralba, A.; Oliva, A. Learning deep features for scene recognition using places database. In Advances in Neural Information Processing Systems; MIT Press: Cambridge, MA, USA, 2014; pp. 487–495. [Google Scholar]
  12. Fu, L.; Feng, Y.; Wu, J.; Liu, Z.; Gao, F.; Majeed, Y.; Al-Mallahi, A.; Zhang, Q.; Li, R.; Cui, Y. Fast and accurate detection of kiwifruit in orchard using improved YOLOv3-tiny model. Precis. Agric. 2021, 22, 754–776. [Google Scholar] [CrossRef]
  13. Fu, L.; Majeed, Y.; Zhang, X.; Karkee, M.; Zhang, Q. Faster R-CNN-based apple detection in dense-foliage fruiting-wall trees using RGB and depth features for robotic harvesting. Biosyst. Eng. 2020, 197, 245–256. [Google Scholar] [CrossRef]
  14. Gao, F.; Fu, L.; Zhang, X.; Majeed, Y.; Li, R.; Karkee, M.; Zhang, Q. Multi-class fruit-on-plant detection for apple SNAP system using Faster R-CNN. Comput. Electron. Agric. 2020, 176, 105364. [Google Scholar] [CrossRef]
  15. Häni, N.; Roy, P.; Isler, V. A comparative study of fruit detection and counting methods for yield mapping in apple orchards. J. Field Robot. 2020, 37, 263–282. [Google Scholar] [CrossRef] [Green Version]
  16. Santos, T.T.; de Souza, L.L.; dos Santos, A.A.; Avila, S. Grape detection, segmentation, and tracking using deep neural networks and three-dimensional association. Comput. Electron. Agric. 2020, 170, 105247. [Google Scholar] [CrossRef] [Green Version]
  17. Pham, T.N.; Van Tran, L.; Dao, S.V.T. Early disease classification of mango leaves using feed-forward neural network and hybrid metaheuristic feature selection. IEEE Access 2020, 8, 189960–189973. [Google Scholar] [CrossRef]
  18. Singh, A.; Kaur, H. Potato Plant Leaves Disease Detection and Classification using Machine Learning Methodologies. In Proceedings of the IOP Conference Series: Materials Science and Engineering, Sanya, China, 12–14 November 2021; IOP Publishing: Bristol, UK, 2021; Volume 1022, p. 012121. [Google Scholar]
  19. Szegedy, C.; Liu, W.; Jia, Y.; Sermanet, P.; Reed, S.; Anguelov, D.; Erhan, D.; Vanhoucke, V.; Rabinovich, A. Going deeper with convolutions. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Boston, MA, USA, 7–12 June 2015; pp. 1–9. [Google Scholar] [CrossRef] [Green Version]
  20. Ioffe, S.; Szegedy, C. Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift. In Proceedings of the 32nd International Conference on Machine Learning, Lille, France, 6–11 July 2015. [Google Scholar]
  21. Szegedy, C.; Vanhoucke, V.; Ioffe, S.; Shlens, J.; Wojna, Z. Rethinking the Inception Architecture for Computer Vision. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Las Vegas, NV, USA, 27–30 June 2016; pp. 2818–2826. [Google Scholar]
  22. Howard, A.G.; Zhu, M.; Chen, B.; Kalenichenko, D.; Wang, W.; Weyand, T.; Andreetto, M.; Adam, H. MobileNets: Efficient Convolutional Neural Networks for Mobile Vision Applications. arXiv 2017, arXiv:1704.04861. [Google Scholar]
  23. Yosinski, J.; Clune, J.; Bengio, Y.; Lipson, H. How transferable are features in deep neural networks? arXiv 2014, arXiv:1411.1792. [Google Scholar]
  24. Jang, Y.; Lee, H.; Hwang, S.J.; Shin, J. Learning What and Where to Transfer. arXiv 2019, arXiv:1905.05901. [Google Scholar]
  25. Tzeng, E.; Hoffman, J.; Zhang, N.; Saenko, K.; Darrell, T. Deep Domain Confusion: Maximizing for Domain Invariance. arXiv 2014, arXiv:1412.3474. [Google Scholar]
  26. Long, M.; Cao, Y.; Wang, J.; Jordan, M. Learning Transferable Features with Deep Adaptation Networks. In Proceedings of the 32nd International Conference on Machine Learning, Lille, France, 6–11 July 2015. [Google Scholar]
  27. Tzeng, E.; Hoffman, J.; Darrell, T.; Saenko, K. Simultaneous Deep Transfer Across Domains and Tasks. In Proceedings of the IEEE International Conference on Computer Vision (ICCV), Santiago, Chile, 7–13 December 2015. [Google Scholar]
  28. Li, Y.; Wang, N.; Shi, J.; Liu, J.; Hou, X. Revisiting Batch Normalization For Practical Domain Adaptation. Pattern Recognit. 2016, 80. [Google Scholar] [CrossRef]
  29. Guan, Q.; Wan, X.; Lu, H.; Ping, B.; Li, D.; Wang, L.; Zhu, Y.; Wang, Y.; Xiang, J. Deep convolutional neural network inception-v3 model for differential diagnosing of lymph node in cytological images: A pilot study. Ann. Transl. Med. 2019, 7, 307. [Google Scholar] [CrossRef] [PubMed]
  30. Bousmalis, K.; Trigeorgis, G.; Silberman, N.; Krishnan, D.; Erhan, D. Domain Separation Networks. In Advances in Neural Information Processing Systems; MIT Press: Cambridge, MA, USA, 2016. [Google Scholar]
  31. Hu, J.; Shen, L.; Sun, G. Squeeze-and-Excitation Networks. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Salt Lake City, UT, USA, 18–23 June 2018. [Google Scholar]
Figure 1. Dataset samples.
Figure 2. Model framework.
Figure 3. DCGAN structure.
Figure 4. The output samples.
Figure 5. Architecture of the Inception network.
Figure 6. Architecture of the MobileNet network.
Figure 8. (a) Accuracy for the test set of five models. (b) Loss for the test set of five models.
Figure 10. Example images from the Oxford Flower dataset.
Figure 11. (a) Validation accuracy for the Oxford Flower dataset. (b) Validation loss for the Oxford Flower dataset.
Figure 12. (a) Validation accuracy for the pest and disease classification dataset. (b) Validation loss for the pest and disease classification dataset.
Figure 13. The accuracy of the proposed models on the three datasets.
Table 1. The dataset information.

Growth Stage        Image Quantity   Image Size (Pixels)
Germination stage   1519             80 × 80
Seedling stage      1520             80 × 80
Flourishing stage   1543             80 × 80
Table 2. Experiment settings: variables and parameters.

Variable              Definition                                              Inception-v3   MobileNet   Other Transfer Learning Models
Batch_size            Number of samples processed in each training batch      128            128         128
Basic learning rate   Initial learning rate                                   0.001          0.001       0.001~0.0001
Learning rate decay   Decay factor applied to the learning rate every 10 epochs   0.5        0.5         0.5
Training epoch        Total training iterations                               50             50          50
DropBlock             Regularization technique in the convolution process     0.8            0.8         -
Learning strategy     Learning rate change strategy                           Step           Step        Step
Table 3. Comparison results for Inception-v3, MobileNet, and the other transfer learning models (Xception, VGG-16, VGG-19).

Run No.   Inception-v3   MobileNet   Xception   VGG-16   VGG-19
1         98.73          98.54       97.40      96.08    94.94
2         99.05          97.79       97.33      97.34    95.40
3         99.32          98.21       96.88      97.25    95.58
4         97.84          98.96       97.14      92.82    93.32
5         98.78          99.04       97.18      96.95    95.55
6         99.20          98.31       96.93      97.01    96.92
7         98.44          97.84       96.72      95.86    97.81
8         98.98          99.33       96.55      95.69    95.12
9         98.05          99.22       97.02      96.84    95.16
10        99.11          98.19       96.69      96.59    96.97
Mean      98.75          98.54       96.98      96.24    95.44
Std       0.47           0.53        0.27       1.26     1.03

Std: the standard deviation of the ten runs.
Table 5. Comparison results based on our data.

Model              Mean Accuracy   Mean Loss
Original model     99.5%           0.06
Proposed Model 1   98.9%           0.03
Proposed Model 2   99.5%           0.01
Proposed Model 3   99.0%           0.03
Table 6. Comparison results based on the Oxford Flower dataset.

Model              Mean Accuracy   Mean Loss
Original model     96.8%           0.14
Proposed Model 1   98.0%           0.08
Proposed Model 2   99.2%           0.02
Proposed Model 3   99.3%           0.02
Table 7. Comparison results based on the pest and disease classification dataset.

Model              Mean Accuracy   Mean Loss
Original model     92.3%           0.25
Proposed Model 1   94.4%           0.09
Proposed Model 2   93.6%           0.04
Proposed Model 3   94.7%           0.06