Article

Surface Defect Detection of “Yuluxiang” Pear Using Convolutional Neural Network with Class-Balance Loss

College of Agricultural Engineering, Shanxi Agricultural University, Jinzhong 030801, China
* Author to whom correspondence should be addressed.
Agronomy 2022, 12(9), 2076; https://doi.org/10.3390/agronomy12092076
Submission received: 31 July 2022 / Revised: 29 August 2022 / Accepted: 30 August 2022 / Published: 31 August 2022
(This article belongs to the Section Precision and Digital Agriculture)

Abstract

With increasing consumer expectations for the quality and safety of agricultural products, intelligent quality detection and grading are of considerable significance in agricultural production. Surface defects are an important indicator of quality, but for "Yuluxiang" pears they are still identified mainly by inefficient manual inspection. Because image acquisition in agriculture is uncertain and difficult, data imbalance between categories is a common problem. To address these problems, class balance (CB) was used in this study to re-weight the sigmoid cross-entropy loss (SGM-CE), softmax cross-entropy loss (SM-CE), and focal loss (FL) functions. CB-SGM-CE, CB-SM-CE, and CB-FL were each used to train a GoogLeNet network as a generalized convolutional neural network (CNN) feature extractor, combined with transfer learning, to build detection models. The results showed that CB-SGM-CE, CB-SM-CE, and CB-FL outperformed SGM-CE, SM-CE, and FL, respectively. CB-FL achieved the best detection results (F1 scores of 0.993–1.000) among the three CB loss functions. CB-FL was then used to train VGG 16, AlexNet, SqueezeNet, and MobileNet V2 networks based on transfer learning. Machine learning (ML) and CNN classification models were also built and compared. Compared with the ML models and the other four CNN models, the CB-FL-GoogLeNet model achieved the best detection results (accuracy of 99.78%). A surface defect detection system was developed, on which the testing accuracy of the CB-FL-GoogLeNet model was 95.28%. This study realizes surface defect detection of the "Yuluxiang" pear on an unbalanced dataset and provides a method for intelligent detection in agriculture.

1. Introduction

Agriculture is transitioning from traditional planting to smart agriculture as machine vision and artificial intelligence develop. Smart agriculture improves the productivity, efficiency, and sustainability of agricultural production through the integration of modern information technology with agriculture [1]. The agricultural products industry is an important part of modern informational agriculture. With increasing consumer expectations of quality and safety, the demand for intelligent quality detection and grading of agricultural products is growing, and this has become a focal issue in the current trade and supply of agricultural products.
As a Chinese geographical indication protection product with a wide suitable planting range and high nutritional value, the "Yuluxiang" pear has been exported to many countries [2,3]. Surface defects are an important factor affecting its quality and are directly related to consumers' willingness to buy, and their detection is an important part of post-harvest fruit grading. At present, surface defects of the "Yuluxiang" pear are identified mainly by human vision, which is inefficient and labor-intensive. Therefore, intelligent detection of surface defects on the "Yuluxiang" pear is important for achieving a reasonable determination of quality and price.
Machine vision technology [4] transforms the target into an image signal through an image acquisition unit, and the image is converted into a digital signal based on information such as pixel distribution, brightness, and color. An image processing system then performs various operations on these signals to extract target features for detection and discrimination. With the advantages of high intelligence and rapid, non-destructive detection, this technology has been widely applied in fruit quality detection [5,6,7]. In fruit defect detection, traditional methods based on image processing and machine learning (ML) required manual feature extraction (such as color, texture, and morphology) or region segmentation on small sample sets [8,9,10], which was inefficient, applicable only to specific objects, and poorly repeatable.
Convolutional neural networks (CNNs) share convolutional kernels and automatically learn features from a dataset in a hierarchical manner [11,12]. Because of their flexible structure, CNNs can adapt to different learning strategies. Li et al. [13] proposed an improved VGG model for Hami melon surface defect detection and achieved a recognition accuracy of 93.5% in the developed detection software. In apple defect detection, a CNN model (prediction accuracy of 96.5%) obtained better results than a model established by traditional image processing combined with a support vector machine (prediction accuracy of 87.1%) [14]. Xue et al. [15] reported that a GoogLeNet model obtained better results than ML methods for apple defect detection, with a prediction accuracy of 91.91%. CNNs have been successfully used in fruit defect detection, but their application to "Yuluxiang" pear defect detection has not been reported. For pears of other varieties, Jiang [16] and Chen [17] used ML to detect surface defects, and Zhang et al. [18] applied CNNs to the detection of black-spot defects on Korla pears; the CNN (best accuracy of 97.35%) achieved better prediction results than ML.
In actual production, the numbers of pears with different defect types are uneven, so data imbalance between classes is prevalent in the datasets actually collected. When the number of samples differs between categories, the model tends to learn the categories with few samples inadequately, which degrades the classification performance of the trained model [19,20,21]. Data imbalance is therefore an important factor affecting model performance. In the quality classification of pistachios, Gao et al. [22] proposed an automatic balancing method that applied class-wise augmentation to the training set before the data were input into the CNN model, increasing the average test accuracy from 96.75% to 99.26%. To address the problem that traditional CNNs could not accurately and quickly identify tea leaf disease because of the uneven distribution of images, Li et al. [23] used the focal loss function to reduce the weights of easily classified samples and built an improved DenseNet model; the recognition accuracy was 92.66%, higher than with the standard cross-entropy loss function. Data augmentation is generally required in CNN training, but the newly added samples may repeat information from the original samples, so the marginal benefit the model extracts from the data may decrease as the sample size increases. Therefore, although some progress has been achieved in processing unbalanced datasets, challenges remain in sorting agricultural products with small sample sizes.
To address the low level of intelligent detection of surface defects and the inter-class imbalance of the small "Yuluxiang" pear dataset, class balance loss (CB) was introduced in this study to re-weight different defect classes based on the effective number of samples. CNN detection models with CB were trained using a transfer learning strategy, and the effect of CB on the models was analyzed. To determine the preferred detection model, the classification performance of different CNN and ML models was compared. To validate model performance, a detection system was developed, providing a theoretical basis for the intelligent quality classification of "Yuluxiang" pears.

2. Materials and Methods

2.1. Dataset Construction

In this study, the "Yuluxiang" pears were collected from an orchard in Taigu district, Shanxi province, China. Fruit russet and canker are the two main surface defects of "Yuluxiang" pears at harvest, as shown in Figure 1.
The image acquisition system was composed of a computer, a dark chamber, two LED light bars, a USB camera (Shengyue SY8031), a background plate, and a platform, as shown in Figure 2. External light mixed into the scene could introduce stray light during image acquisition, degrading image quality and thus affecting the classification results, so the dark chamber was used to ensure that image acquisition was not disturbed by external light and that the data of the target samples were acquired effectively. The background plate was white to distinguish the sample from the background, and the color temperature of the LED bars was 5500 K to ensure that the captured images had no color bias.
In this study, a total of 630 images were collected, including 101 russeting samples, 206 intact samples, and 323 cankered samples. The numbers of samples in the three classes differed considerably, with a ratio of approximately 1:2:3, so the dataset was imbalanced between categories. The dataset of each class was divided into training, validation, and test sets in a ratio of 6:2:2.
The acquired images were large, and inputting them directly into the CNN model would increase training time and memory consumption. Therefore, the size of each image was adjusted to 224 × 224 pixels using an equal-scale method before training. A large number of images is required for CNN training. Data augmentation expands the scale and richness of data using techniques such as rotation, cropping, and scaling, which can alleviate data scarcity in deep learning [24]. The images were augmented using the original image, rotation by 90°, rotation by 180°, rotation by 270°, lightness enhancement by 20, lightness reduction by 20, and saturation enhancement by 20, expanding the dataset to seven times its original size.
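As an illustration, a minimal sketch of the resizing and augmentation steps is given below, assuming Pillow; the authors' exact implementation is not specified, and the enhancement factors are illustrative stand-ins for the ±20 lightness and +20 saturation adjustments.

```python
# Sketch of the resizing and sevenfold augmentation described above (Pillow assumed).
# The enhancement factors are illustrative stand-ins for the +/-20 lightness and +20 saturation changes.
from PIL import Image, ImageEnhance

def augment(path):
    img = Image.open(path).convert("RGB").resize((224, 224))   # resize to 224 x 224 pixels
    variants = [img]                                            # original image
    variants += [img.rotate(a) for a in (90, 180, 270)]         # rotations by 90, 180, 270 degrees
    variants.append(ImageEnhance.Brightness(img).enhance(1.2))  # lightness enhanced
    variants.append(ImageEnhance.Brightness(img).enhance(0.8))  # lightness diminished
    variants.append(ImageEnhance.Color(img).enhance(1.2))       # saturation enhanced
    return variants                                             # 7 images per original (7x expansion)
```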

2.2. Class Balance

Because the sample size of "Yuluxiang" pears is small and the amount of data is unbalanced between categories, the classifier may perform poorly on the categories with few samples, which affects the detection performance of the deep learning model. Therefore, in this study the losses were rebalanced by weighting the loss function according to the effective number of samples per category to improve detection accuracy. A weighting factor is introduced in the class balance loss (CB) [25], which is calculated as shown in Equation (1).
$$\mathrm{CB}(P, y) = \frac{1}{E_{n_y}} L(P, y) = \frac{1-\beta}{1-\beta^{n_y}} L(P, y) \quad (1)$$
where $E_{n_y}$ is the effective number of samples, β is a hyperparameter used to adjust the class balance term with β ∈ [0, 1), $n_y$ is the number of samples in the true class y, and L(P, y) is the loss function. P is the vector of predicted class probabilities, P = [p1, p2, …, pC]^T, where C is the total number of classes and $p_i$ ∈ [0, 1].
Based on the sigmoid cross-entropy loss (SGM-CE), softmax cross-entropy loss (SM-CE), and focal loss (FL) [26], the class-balanced sigmoid cross-entropy loss (CB-SGM-CE), class-balanced softmax cross-entropy loss (CB-SM-CE), and class-balanced focal loss (CB-FL) were constructed by adding the class balance term to each original loss function. SGM-CE, SM-CE, FL, CB-SGM-CE, CB-SM-CE, and CB-FL were used as loss functions to construct detection models and to analyze the effect of the class balance loss on the model.
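As a sketch of how Equation (1) can be combined with the focal loss in PyTorch, the snippet below re-weights a softmax-based focal loss by the inverse effective number of samples of each true class; the per-class counts follow Section 2.1, while β, γ, and the normalization are illustrative choices rather than values reported by the authors.

```python
# Sketch of class-balanced focal loss (CB-FL) following Eq. (1); beta and gamma are illustrative.
import torch
import torch.nn.functional as F

counts = torch.tensor([101.0, 206.0, 323.0])             # russeting, intact, cankered samples
beta = 0.999
effective_num = (1.0 - beta ** counts) / (1.0 - beta)     # E_ny = (1 - beta^ny) / (1 - beta)
cb_weights = 1.0 / effective_num                          # weight of each class = 1 / E_ny
cb_weights = cb_weights / cb_weights.sum() * len(counts)  # normalize so the weights average to 1

def cb_focal_loss(logits, targets, gamma=2.0):
    """Softmax focal loss, re-weighted per sample by the CB weight of its true class."""
    log_p = F.log_softmax(logits, dim=1)
    log_pt = log_p.gather(1, targets.unsqueeze(1)).squeeze(1)  # log-probability of the true class
    pt = log_pt.exp()
    focal = -((1.0 - pt) ** gamma) * log_pt                    # focal modulation of cross-entropy
    return (cb_weights[targets] * focal).mean()
```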

2.3. CNN Networks

A CNN is composed of an input layer, convolutional layers, activation functions, pooling layers, and fully connected layers, and it automatically learns features at spatial levels from low to high. The convolutional layer is the core of a CNN and extracts features based on the principle of weight sharing. The pooling layer compresses features, extracts the main features, and reduces the required weights and computation. The activation function adds nonlinear factors to make the network more expressive. The fully connected layer maps the extracted features to the label space of the samples.
Training a CNN requires a large amount of data and many parameters, and the computation is heavy, whereas the sample size of "Yuluxiang" pears is small; if the number of images is too small, the accuracy and generalization ability of the model decrease. The shallow features learned by convolutional neural networks are generalizable. Transfer learning [27,28] uses pre-trained deep networks as feature extractors, transfers knowledge from the source domain to the target domain through their similarity, and improves learning of the target domain with the help of the knowledge acquired in the source domain and the target task. Transfer learning reduces the dependence on the amount of data and speeds up training, and has been widely used to train complex network structures [29,30]. In this study, a transfer learning strategy was adopted to fine-tune pre-trained deep network weights on the new dataset: the weights of the feature extraction layers were frozen, the original fully connected layer was replaced by a newly designed fully connected layer, and only the weights of the fully connected layer were updated during training. GoogLeNet, VGG 16, AlexNet, SqueezeNet, and MobileNet V2 networks with transfer learning were used for the detection of surface defects.
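A minimal PyTorch sketch of this transfer-learning setup is shown below for GoogLeNet (torchvision assumed): the ImageNet-pretrained weights serve as the frozen feature extractor, and the final fully connected layer is replaced with a new three-class head.

```python
# Sketch of the transfer-learning setup: frozen feature extractor + new fully connected head.
import torch.nn as nn
from torchvision import models

model = models.googlenet(pretrained=True)        # ImageNet-pretrained GoogLeNet as feature extractor
for param in model.parameters():
    param.requires_grad = False                   # freeze the feature-extraction layers
model.fc = nn.Linear(model.fc.in_features, 3)     # new trainable head: russet / intact / canker
```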
AlexNet [31] contains 5 convolutional layers and 3 fully connected layers, uses local response normalization to enhance the generalization of the model, and uses data augmentation and dropout to prevent overfitting. VGG 16 [32] has 16 layers containing parameters (13 convolutional layers and 3 fully connected layers) and improves performance by continuously deepening the network structure; it contains 5 convolutional blocks, each composed of a set of convolutional layers, an activation function, and a max pooling function. GoogLeNet [33] is a deep neural network model based on the Inception module. A single layer of the network contains multiple convolution kernels of different scales; the multi-scale feature information extracted by these kernels is fused to obtain a better characterization of the image, and the module also increases or reduces the dimensionality of the output features. GoogLeNet therefore not only increases the network width but also enhances the adaptability of the model to scale. SqueezeNet and MobileNet [34] are lightweight neural networks. MobileNet V2 [35] improves on MobileNet V1: a linear bottleneck layer is introduced to prevent the loss of information in the nonlinear layer, and separable convolution is applied to the residual structure to form an inverted residual block, which is stacked to form a linear bottleneck block. SqueezeNet [36] is composed of several fire modules, a network unit structure similar to Inception; a fire module contains a squeeze convolutional layer (with only 1 × 1 convolution kernels) and an expand convolutional layer (with 1 × 1 and 3 × 3 convolution kernels).

2.4. Experimental Environment and Parameter

Windows 10 with an Intel Core i7-10875H processor and an NVIDIA RTX 2060 GPU with 6 GB of graphics memory was used in this study. Experiments were performed in Jupyter Notebook and programmed in Python 3.7 and PyTorch 1.7 with the CUDA and cuDNN libraries.
Considering the performance of the hardware and the ability of the model to fully learn the data features, the number of epochs was 100 and the batch size was 16 for each experiment. The SGD optimizer was used with a momentum of 0.9. A dynamically adjusted learning rate was used, with an initial learning rate of 0.0001 and a weight decay of 5 × 10−4.
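A sketch of this training configuration is given below, reusing the `model` from the previous snippet; the exact learning-rate schedule is not specified in the paper, so a simple step decay is used purely as an illustration.

```python
# Training configuration from Section 2.4: SGD, momentum 0.9, lr 1e-4, weight decay 5e-4,
# 100 epochs, batch size 16. Only the unfrozen fully connected layer is updated.
import torch

optimizer = torch.optim.SGD(
    (p for p in model.parameters() if p.requires_grad),  # trainable parameters only
    lr=1e-4, momentum=0.9, weight_decay=5e-4)
# The paper's "dynamically adjusted learning rate" is unspecified; a step decay is assumed here.
scheduler = torch.optim.lr_scheduler.StepLR(optimizer, step_size=30, gamma=0.1)

for epoch in range(100):
    ...  # one pass over the training DataLoader (batch size 16) using cb_focal_loss
    scheduler.step()
```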

2.5. Evaluation Indicators

To determine the effectiveness of the model for surface defect detection of “Yuluxiang” pears, the evaluation indexes were Accuracy, Precision, Recall, and F1 Score. The calculation formulas are shown in Equations (2)–(5), respectively.
$$\mathrm{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN} \quad (2)$$
$$\mathrm{Precision} = \frac{TP}{TP + FP} \quad (3)$$
$$\mathrm{Recall} = \frac{TP}{TP + FN} \quad (4)$$
$$\mathrm{F1\ Score} = \frac{2 \times \mathrm{Precision} \times \mathrm{Recall}}{\mathrm{Precision} + \mathrm{Recall}} \quad (5)$$
where TP and TN are the numbers of positive and negative samples correctly predicted as positive and negative, respectively, and FP and FN are the numbers of negative and positive samples incorrectly predicted as positive and negative, respectively.
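For reference, the sketch below computes these indicators per class from a confusion matrix (rows as actual classes, columns as predicted classes); applying it to the confusion matrix later reported in Table 6 reproduces the recall, F1 score, and accuracy values given there.

```python
# Per-class precision, recall, F1, and overall accuracy from a confusion matrix (Eqs. 2-5).
import numpy as np

def per_class_metrics(cm):
    cm = np.asarray(cm, dtype=float)      # rows = actual class, columns = predicted class
    tp = np.diag(cm)
    fp = cm.sum(axis=0) - tp              # predicted as this class but actually another class
    fn = cm.sum(axis=1) - tp              # actually this class but predicted as another class
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    accuracy = tp.sum() / cm.sum()
    return precision, recall, f1, accuracy

# Example: the detection-system confusion matrix of Table 6 (russet, intact, canker)
print(per_class_metrics([[32, 2, 0], [1, 55, 0], [2, 1, 34]]))
```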

3. Results and Discussion

3.1. Effect of Class Balance Loss on Detection Models

3.1.1. Modeling Based on Class Balance Loss

To analyze the effect of the class balance loss on the surface defect detection model for "Yuluxiang" pears, SGM-CE, SM-CE, FL, CB-SGM-CE, CB-SM-CE, and CB-FL were each used as the loss function to train a GoogLeNet network as a generalized feature extractor, and detection models were built based on a transfer learning strategy. The accuracy and loss curves of the validation set with the six loss functions are shown in Figure 3 and Figure 4, respectively.
Figure 3 and Figure 4 show that the validation accuracy and loss curves of all six loss functions converged stably. For the validation accuracy curves in Figure 3, each class-balanced loss function was higher than the corresponding original loss function; for the validation loss curves in Figure 4, each class-balanced loss function was lower than the corresponding original loss function. Among the original loss functions, FL had a lower loss curve than SGM-CE and SM-CE; among the class-balanced loss functions, the loss curve of CB-FL converged faster than those of CB-SGM-CE and CB-SM-CE. The optimal validation results using the six loss functions are shown in Table 1.
As shown in Table 1, adding the class balance term to the SGM-CE, SM-CE, and FL loss functions increased the validation accuracy by 9.29%, 1.70%, and 2.61%, respectively. Among the three CB loss functions, CB-FL had the highest validation accuracy of 99.55% with a loss value of 0.0428. Because the dataset was unbalanced, evaluating the model by accuracy alone would be inaccurate: a model can have high accuracy while the categories with little data have a high classification error rate. Therefore, the confusion matrices of the models based on the different loss functions are shown in Figure 5.
In Figure 5, each row represents an actual category and each column a predicted category. For each original loss function, the number of incorrectly predicted samples in each class decreased after the class balance term was added. Based on Figure 5, the precision, recall, and F1 score of each sample category were calculated; the results are shown in Figure 6, Figure 7 and Figure 8.
Figure 6, Figure 7 and Figure 8 show that, for the models built with SGM-CE, SM-CE, and FL, the russeting samples (the smallest class) gave poorer validation results than the intact and cankered samples. SGM-CE performed worst, with a recall of 42.86% and an F1 score of 0.597 on the validation set. After adding CB to SGM-CE, SM-CE, and FL, the precision, recall, and F1 score on the validation set all improved, and the russeting samples improved more markedly than the intact and cankered samples, which suggests that CB performs well on classes with few samples. In particular, with the addition of CB to SGM-CE, the validation results for the russeting samples improved the most, with a 55.00% increase in recall and a 0.354 increase in F1 score. After adding the CB term to SM-CE and FL, the F1 scores of the russeting samples on the validation set improved by 0.028 and 0.069, respectively. Overall, the validation precision, recall, and F1 scores of CB-FL were higher than those of the other loss functions, and its validation results for the russeting samples were significantly better, indicating that the model built with CB-FL obtained the best validation results. With CB-FL, the validation precision, recall, and F1 score were 98.58%, 99.29%, and 0.989 for russeting samples; 99.65%, 99.30%, and 0.994 for intact samples; and 99.78%, 99.78%, and 0.998 for cankered samples.

3.1.2. Model Checking

To check the effectiveness of the proposed models in detecting surface defects on "Yuluxiang" pears, the GoogLeNet transfer learning models based on the six loss functions were each used to classify the samples of the test set. The accuracy, precision, recall, and F1 scores of the test set were calculated separately; the results are shown in Table 2.
Table 2 shows that adding the CB term to SGM-CE, SM-CE, and FL increased the test accuracy, recall, and F1 scores of the models, with test accuracy increases of 8.55%, 0.90%, and 1.58%, respectively. Within the same model, the classification performance for russeting samples was worse than for intact and cankered samples. The model built with SGM-CE had the worst test results for the russeting samples, with a recall of 42.85% and an F1 score of 0.600. After adding CB to SGM-CE, the russeting samples showed the greatest improvement in test results, with a 51.71% increase in recall and a 0.372 increase in F1 score. Adding the CB term to SM-CE and FL increased the test F1 scores of the russeting samples by 0.029 and 0.050, respectively. The test results of the models built with SM-CE and FL were similar, but both were worse than those built with CB-SGM-CE, CB-SM-CE, and CB-FL. The best test results among the class-balanced losses were obtained with CB-FL, with a test accuracy of 99.78% and F1 scores of 0.993, 1.000, and 0.998 for russeting, intact, and cankered samples, respectively.
For accuracy, precision, recall, and F1 scores on both the validation and test sets, each model with a class-balanced loss was higher than the model with the corresponding original loss function. For the russeting samples with little data, CB-SGM-CE, CB-SM-CE, and CB-FL performed significantly better than SGM-CE, SM-CE, and FL on both the validation and test sets, and CB-FL performed better than CB-SGM-CE and CB-SM-CE among the class-balanced losses. The model developed with CB-FL obtained the best validation and test results for russeting, intact, and cankered samples. Therefore, applying the class-balanced loss improved the performance of the surface defect detection model for "Yuluxiang" pears, and the best classification performance was obtained with CB-FL.

3.2. Model Comparison

3.2.1. Comparison of CNN Models

To evaluate the performance of different network models for detecting surface defects of "Yuluxiang" pears, GoogLeNet, VGG 16, AlexNet, SqueezeNet, and MobileNet V2 networks were compared. With CB-FL as the loss function, the five networks combined with transfer learning were each used to build a model; the optimal results are shown in Table 3. The accuracy of the validation and test sets of all five models was above 90%. The SqueezeNet model achieved the worst results (93.08% and 90.10% accuracy on the validation and test sets, respectively), while the GoogLeNet model achieved the best. Compared with the VGG 16, AlexNet, and MobileNet V2 models, the GoogLeNet model was 1.70%, 3.52%, and 3.74% higher in validation accuracy and 1.35%, 1.24%, and 0.79% higher in test accuracy, respectively. The results of these four models were similar, and all performed well.
To show more clearly how the models classified each category of samples, the test set was classified with the five models, and the precision, recall, and F1 score were calculated for each category. The results are shown in Table 4.
The SqueezeNet model had the lowest precision, recall, and F1 scores among the five models in Table 4, indicating the worst classification performance on the dataset created in this study. It classified cankered samples well but classified russeting samples very poorly (recall of 41.50%, precision of 96.83%, and F1 score of 0.581). For the other four models, the test precision was 97.14–100.00%, recall was 91.84–100.00%, and F1 scores were 0.951–1.000, indicating that all four models classified all three sample classes well.
The GoogLeNet model had the highest test precision, recall, and F1 scores for russeting, intact, and cankered samples. Compared with the MobileNet V2, AlexNet, and VGG 16 models, its F1 score was higher by 0.024, 0.039, and 0.042 for russeting samples, by 0.005, 0.009, and 0.003 for intact samples, and by 0.005, 0.007, and 0.011 for cankered samples, respectively. Therefore, the GoogLeNet network with CB-FL combined with transfer learning had the best classification performance, achieving a recall above 98.64%, a precision above 99.56%, an F1 score above 0.993, and a test accuracy of 99.78%.

3.2.2. Comparison of Traditional Machine Learning Models

For a more detailed study of defect detection methods for "Yuluxiang" pears, traditional ML methods were used to build models for comparison with the CNN models. Least squares support vector machines (LS-SVM) [37], partial least squares (PLS) [38], back-propagation neural networks (BPNN) [16], random forests (RF) [39], and decision trees (DT) [17] were used to prove the effectiveness of the proposed methods. The test results of the ML models are shown in Table 5.
Table 5 shows that RF and LS-SVM gave better test results than BPNN, PLS, and DT, with RF achieving the highest test accuracy of 88.19%. In terms of recall and F1 score, RF was worse than LS-SVM for russeting samples but better for intact and cankered samples. The five ML models had difficulty discriminating between intact and russeting samples, and the classification results were unsatisfactory: the F1 score for russeting samples was only 0.343–0.704. Combining Table 4 and Table 5, the CNN models with CB-FL (accuracy of 90.10–99.78%) outperformed the ML models (accuracy of 80.32–88.19%) on the test set. Compared with the ML models, the precision, recall, F1 score, and accuracy of the GoogLeNet model with CB-FL were significantly improved. The russeting samples improved most among the three classes, with precision, recall, and F1 score increased by 36.36–57.14%, 8.16–70.07%, and 0.289–0.650, respectively. Among the CNN and ML models, the GoogLeNet network with CB-FL using transfer learning achieved the best detection results.

3.3. Model Test

To validate the practical ability of the model based on the GoogLeNet network with CB-FL combined with transfer learning, a surface defect detection system for the "Yuluxiang" pear was developed based on PyQt5 and Python 3.7. The front end of the system contains a data collection button, a run button, a stop button, and a display of the recognition results, as shown in Figure 9. In the back end, the camera is accessed through OpenCV by a Python program to acquire images, and the developed model is called to detect surface defects on the pear.
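A minimal sketch of this back-end detection step is given below (OpenCV and torchvision assumed); the PyQt5 interface and the exact acquisition code are not described in the paper, so this only illustrates how a captured frame could be passed to the trained model.

```python
# Sketch of the back-end detection step: capture a frame with OpenCV and classify it.
import cv2
import torch
from torchvision import transforms

CLASSES = ["russet", "intact", "canker"]
preprocess = transforms.Compose([
    transforms.ToPILImage(),
    transforms.Resize((224, 224)),
    transforms.ToTensor(),
])

def detect(model, camera_index=0):
    cap = cv2.VideoCapture(camera_index)            # USB camera
    ok, frame = cap.read()
    cap.release()
    if not ok:
        return None
    rgb = cv2.cvtColor(frame, cv2.COLOR_BGR2RGB)    # OpenCV gives BGR; the model expects RGB
    model.eval()
    with torch.no_grad():
        logits = model(preprocess(rgb).unsqueeze(0))
    return CLASSES[int(logits.argmax(dim=1))]
```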
Based on this system, surface defect detection of "Yuluxiang" pears was carried out with 34 russeting samples, 56 intact samples, and 37 cankered samples; the results are shown in Table 6. The recall and F1 scores were 91.89–98.21% and 0.928–0.965, respectively, indicating that the model achieved good detection performance for all three classes, and the accuracy of the model on this system was 95.28%. This realizes the detection of surface defects, which has practical significance and application value for the automatic quality classification of the "Yuluxiang" pear.
Traditional ML methods are mainly used in the current defect detection of pears. Jiang [16] used BPNN and SVM to detect surface defects on "Dangshan" pears, with average accuracies for the defective samples of 87% and 91.5%, respectively. Chen [17] used DT to identify defects on Korla fragrant pears, with an average accuracy of 90.5% for defective samples. However, these methods required pre-processing and manual extraction of feature information on small sample sets and were difficult to extend to large datasets. CNNs were used to classify intact and black-spot defects on Korla pears (accuracy of 92.25–97.35%) and achieved better prediction results than ML (accuracy of 73.29–91.26%) [18]. CNNs can automatically extract features, which reduces the complexity of manual extraction. Compared with the above studies, CB was introduced in this study to improve the CNN for the classification of russeting, intact, and cankered "Yuluxiang" pears. The results showed that the CNN (best accuracy of 99.78%) was better than ML (accuracy of 80.32–88.19%), confirming the capability of CNNs in defect recognition.
Because of the uncertainty and high complexity of agriculture, the data arising in the management of agronomic systems are diverse, and data imbalance between categories is a common problem in the datasets actually collected. In the detection of tea disease [23] and of the status of hydroponic lettuce seedlings [40], the impact of data imbalance on CNN detection models was reduced by applying FL. In this study, with SGM-CE, SM-CE, and FL as loss functions, the GoogLeNet model combined with transfer learning achieved good results for intact and cankered samples but unsatisfactory results for the russeting samples with few samples (a test recall of 42.85–91.16%). With CB-SGM-CE, CB-SM-CE, and CB-FL, the prediction results improved for all three classes, most significantly for the russeting samples (a recall improvement of 4.76–51.71%), and CB-FL obtained the best prediction results among the three CB loss functions. Re-weighting the loss function using CB significantly enhanced the detection ability of the model for classes with small sample sizes and improved the overall performance of the model. CNNs include a variety of classic network structures, such as GoogLeNet, VGG 16, AlexNet, and MobileNet V2; when CB-FL was used to train these networks with transfer learning, good results were achieved in this study, and GoogLeNet achieved the best. CB is independent of the loss function and the predicted class probabilities, so it can be applied to a wide range of loss functions and deep networks in agriculture. For future application in the "Yuluxiang" pear industry, a surface defect detection system was developed, achieving a test accuracy of 95.28% and F1 scores of 0.928–0.965. During training of the CNN model with CB, the weight depends only on the category to which a sample belongs, so CB achieves general class weighting and provides a method for the automatic quality classification of agricultural products. This study addresses the impact of imbalanced data on CNN detection models in agriculture, realizes the detection of "Yuluxiang" pear surface defects, and helps to meet the needs of primary processing and reduce the post-production loss rate of agricultural products, which has practical significance and application value.

4. Conclusions

Fruit russet and canker are the two main surface defects of "Yuluxiang" pears, which are currently classified mainly by inefficient manual identification. In addition, the collected dataset is imbalanced because the numbers of pears with different defect types are uneven in actual production. To realize surface defect detection of the "Yuluxiang" pear and reduce the impact of data imbalance on the detection model, this study introduced the class-balanced loss to train a convolutional neural network as a generalized feature extractor and compared the performance of CNN and machine learning models. Compared with the original loss functions, CB-SGM-CE, CB-SM-CE, and CB-FL improved performance, increasing the test accuracy of the GoogLeNet model based on transfer learning by 8.55%, 0.90%, and 1.58%, respectively. The class-balanced loss achieved general class weighting based on the effective number of samples in each class and alleviated the suboptimal prediction of classes with small sample sizes. Among the three class-balanced losses, CB-FL achieved the best classification performance, with test F1 scores of 0.993, 1.000, and 0.998 for russeting, intact, and cankered samples, respectively. Compared with the ML models and the other CNN models (VGG 16, AlexNet, SqueezeNet, and MobileNet V2 with CB-FL), the GoogLeNet model with CB-FL achieved the best detection results for all three classes, with a test accuracy of 99.78%. To validate the performance of the models, a surface defect detection system for the "Yuluxiang" pear was developed; based on this system, the testing accuracy of the GoogLeNet model with CB-FL was 95.28%. This study realized defect detection of the "Yuluxiang" pear and demonstrated the classification ability of the improved CNN with class balance on unbalanced agricultural datasets with small sample sizes. The proposed method could be extended to other studies of intelligent detection in agriculture and can help researchers and farmers improve agricultural production and agronomic management. This study did not deploy the model on an industrial production line; in the future, a defect sorting production line for "Yuluxiang" pears will be developed, and the sorting efficiency of the model will be explored.

Author Contributions

Conceptualization, H.S.; methodology, H.S. and R.R.; software, H.S., L.S. and R.R.; writing—original draft, H.S.; writing—review and editing, H.S. and S.Z. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by the Applied Basic Research Project of Shanxi Province (Project No: 201901D211359), Science and Technology Innovation Fund Project of Shanxi Agricultural University (Project No: 2020BQ02), Award-funded Scientific Research Projects for Outstanding Doctors to Work in Shanxi Province (Project No: SXYBKY2019049), and The Key Research and Development Program of Shanxi Province (Project No: 201903D221027).

Data Availability Statement

Data is contained within the article.

Conflicts of Interest

The authors declare no conflict of interest.

References

  1. Zhao, C. State-of-the-art and recommended developmental strategic objectives of smart agriculture. Smart Agric. 2019, 1, 1–7. [Google Scholar] [CrossRef]
  2. Yang, S.; Bai, M.; Hao, G.; Zhang, X.; Guo, H.; Fu, B. Transcriptome survey and expression analysis reveals the adaptive mechanism of ‘Yulu Xiang’ Pear in response to long-term drought stress. PLoS ONE 2021, 16, e0246070. [Google Scholar] [CrossRef]
  3. Wu, X.; Shi, X.; Bai, M.; Chen, Y.; Li, X.; Qi, K.; Cao, P.; Li, M.; Yin, H.; Zhang, S. Transcriptomic and Gas Chromatography-Mass Spectrometry Metabolomic Profiling Analysis of the Epidermis Provides Insights into Cuticular Wax Regulation in Developing ‘Yuluxiang’ Pear Fruit. J. Agric. Food Chem. 2019, 67, 8319–8331. [Google Scholar] [CrossRef]
  4. Guo, Z.; Wang, Q.; Song, Y.; Zou, X.; Cai, J. Research progress of sensing detection and monitoring technology for fruit and vegetable quality control. Smart Agric. 2021, 3, 14–28. [Google Scholar] [CrossRef]
  5. Shi, H.; Wang, Q.; Gu, W.; Wang, X.; Gao, S. Non-destructive Firmness Detection and Grading of Bunches of Red Globe Grapes Based on Machine Vision. Food Sci. 2021, 42, 232–239. [Google Scholar] [CrossRef]
  6. Dhakshina Kumar, S.; Esakkirajan, S.; Bama, S.; Keerthiveena, B. A microcontroller based machine vision approach for tomato grading and sorting using SVM classifier. Microprocess. Microsyst. 2020, 76, 103090. [Google Scholar] [CrossRef]
  7. Azarmdel, H.; Jahanbakhshi, A.; Mohtasebi, S.S.; Muñoz, A.R. Evaluation of image processing technique as an expert system in mulberry fruit grading based on ripeness level using artificial neural networks (ANNs) and support vector machine (SVM). Postharvest Biol. Technol. 2020, 166, 111201. [Google Scholar] [CrossRef]
  8. Patel, K.K.; Kar, A.; Khan, M.A. Common External Defect Detection of Mangoes Using Color Computer Vision. J. Inst. Eng. Ser. A 2019, 100, 559–568. [Google Scholar] [CrossRef]
  9. Ireri, D.; Belal, E.; Okinda, C.; Makange, N.; Ji, C. A computer vision system for defect discrimination and grading in tomatoes using machine learning and image processing. Artif. Intell. Agric. 2019, 2, 28–37. [Google Scholar] [CrossRef]
  10. Narendra, V.; Pinto, A. Defects detection in fruits and vegetables using image processing and soft computing techniques. In Proceedings of the 6th International Conference on Harmony Search, Soft Computing and Applications, Istanbul, Turkey, 22–24 April 2020; Springer: Singapore, 2021. [Google Scholar]
  11. Xie, W.; Wei, S.; Zheng, Z.; Yang, D. A CNN-based lightweight ensemble model for detecting defective carrots. Biosyst. Eng. 2021, 208, 287–299. [Google Scholar] [CrossRef]
  12. Zhang, S.; Gao, T.; Ren, R.; Sun, H. Detection of Walnut Internal Quality Based on X-ray Imaging Technology and Convolution Neural Network. Trans. Chin. Soc. Agric. Mach. 2022, 53, 383–388. [Google Scholar] [CrossRef]
  13. Li, X.; Ma, B.; Yu, G.; Chen, J.; Li, Y.; Li, C. Surface defect detection of Hami melon using deep learning and image processing. Trans. Chin. Soc. Agric. Eng. 2021, 37, 223–232. [Google Scholar] [CrossRef]
  14. Fan, S.; Li, J.; Zhang, Y.; Tian, X.; Wang, Q.; He, X.; Zhang, C.; Huang, W. On line detection of defective apples using computer vision system combined with deep learning methods. J. Food Eng. 2020, 286, 110102. [Google Scholar] [CrossRef]
  15. Xue, Y.; Wang, L.; Zhang, Y.; Shen, Q. Defect Detection Method of Apples Based on GoogLeNet Deep Transfer Learning. Trans. Chin. Soc. Agric. Mach. 2020, 51, 30–35. [Google Scholar] [CrossRef]
  16. Jiang, L. Identification of DangShan Pears Surface Defects Based on Machine Vision. Master’s Thesis, Nanjing Forestry University, Nanjing, China, 2018. [Google Scholar]
  17. Chen, F. Research on On-line Detection of External Defects of Korla Fragrant Pear. Master’s Thesis, Tarim University, Talimu, China, 2021. [Google Scholar]
  18. Zhang, Y.; Wa, S.; Sun, P.; Wang, Y. Pear Defect Detection Method Based on ResNet and DCGAN. Information 2021, 12, 397. [Google Scholar] [CrossRef]
  19. Johnson, J.M.; Khoshgoftaar, T.M. The Effects of Data Sampling with Deep Learning and Highly Imbalanced Big Data. Inf. Syst. Front. 2020, 22, 1113–1131. [Google Scholar] [CrossRef]
  20. Krawczyk, B. Learning from imbalanced data: Open challenges and future directions. Prog. Artif. Intell. 2016, 5, 221–232. [Google Scholar] [CrossRef]
  21. Guo, H.; Li, Y.; Shang, J.; Gu, M.; Huang, Y.; Gong, B. Learning from class-imbalanced data: Review of methods and applications. Expert Syst. Appl. 2017, 73, 220–239. [Google Scholar] [CrossRef]
  22. Gao, J.; Ni, J.; Yang, H.; Han, Z. Pistachio visual detection based on data balance and deep learning. Trans. Chin. Soc. Agric. Mach. 2021, 52, 367–372. [Google Scholar] [CrossRef]
  23. Li, Z.; Xu, J.; Zheng, L.; Tie, J.; Yu, S. Small sample recognition method of tea disease based on improved DenseNet. Trans. Chin. Soc. Agric. Eng. 2022, 38, 182–190. [Google Scholar] [CrossRef]
  24. Shorten, C.; Khoshgoftaar, T.M. A survey on Image Data Augmentation for Deep Learning. J. Big Data 2019, 6, 60. [Google Scholar] [CrossRef]
  25. Cui, Y.; Jia, M.; Lin, T.; Song, Y.; Belongie, S. Class-Balanced loss based on effective number of samples. In Proceedings of the 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Long Beach, CA, USA, 15–20 June 2019; IEEE: Piscataway, NJ, USA, 2019; pp. 9260–9269. [Google Scholar] [CrossRef]
  26. Lin, T.-Y.; Goyal, P.; Girshick, R.; He, K.; Dollár, P. Focal Loss for Dense Object Detection. In Proceedings of the 2017 IEEE International Conference on Computer Vision (ICCV), Venice, Italy, 22–29 October 2017; pp. 2999–3007. [Google Scholar] [CrossRef]
  27. Pan, S.; Yang, Q. A survey on transfer Learning. IEEE Trans. Knowl. Data Eng. 2010, 22, 1345–1359. [Google Scholar] [CrossRef]
  28. Gong, X.; Chen, Z.; Wu, L.; Xie, Z.; Xu, Y. Transfer learning based mixture of experts classification model for high-resolution remote sensing scene classification. Acta Opt. Sin. 2021, 41, 2301003. [Google Scholar] [CrossRef]
  29. Su, S.; Qiao, Y.; Rao, Y. Recognition of grape leaf diseases and mobile application based on transfer learning. Trans. Chin. Soc. Agric. Eng. 2021, 37, 127–134. [Google Scholar] [CrossRef]
  30. Rismiyati, R.; Luthfiarta, A. VGG 16 transfer learning architecture for salak fruit quality classification. Telematika 2021, 18, 37–48. [Google Scholar] [CrossRef]
  31. Krizhevsky, A.; Sutskever, I.; Hinton, G.E. ImageNet classification with deep convolutional neural networks. Commun. ACM 2017, 60, 84–90. [Google Scholar] [CrossRef]
  32. Simonyan, K.; Zisserman, A. Very Deep Convolutional Networks for Large-Scale Image Recognition. In Proceedings of the International Conference on Learning Representations, San Diego, CA, USA, 10 April 2015. [Google Scholar]
  33. Szegedy, C.; Liu, W.; Jia, Y.; Sermanet, P.; Reed, S.; Anguelov, D.; Erhan, D.; Vanhoucke, V.; Rabinovich, A. Going deeper with convolutions. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Boston, MA, USA, 7 June 2015. [Google Scholar]
  34. Howard, A.; Zhu, M.; Chen, B.; Kalenichenko, D.; Wang, W.; Weyand, T.; Andreetto, M.; Adam, H. MobileNets: Efficient Convolutional Neural Networks for Mobile Vision Applications. In Proceedings of the Computer Vision and Pattern Recognition, Honolulu, HI, USA, 17 April 2017. [Google Scholar]
  35. Sandler, M.; Howard, A.; Zhu, M.; Zhmoginov, A.; Chen, L.C. MobilenetV2: Inverted residuals and linear bottlenecks. In Proceedings of the 2018 IEEE Conference on Computer Vision and Pattern Recognition, Salt Lake City, UT, USA, 18–22 June 2018. [Google Scholar]
  36. Iandola, F.N.; Song, H.; Moskewicz, M.W.; Ashraf, K.; Dally, W.J.; Keutzer, K. SqueezeNet: AlexNet-Level Accuracy with 50x Fewer Parameters and <0.5 MB Model Size. Available online: https://arxiv.org/abs/1602.07360 (accessed on 4 November 2016).
  37. Suykens, J.; Vandewalle, J. Least squares support vector machine classifiers. Neural Process. Lett. 1999, 9, 293–300. [Google Scholar] [CrossRef]
  38. Geladi, P.; Kowalski, B. Partial least-squares regression: A tutorial. Anal. Chim. Acta 1986, 185, 1–17. [Google Scholar] [CrossRef]
  39. Luciano, A.C.d.S.; Picoli, M.C.A.; Duft, D.G.; Rocha, J.V.; Leal, M.R.L.V.; Maire, G. Empirical model for forecasting sugarcane yield on a local scale in Brazil using Landsat imagery and random forest algorithm. Comput. Electron. Agric. 2021, 184, 106063. [Google Scholar] [CrossRef]
  40. Li, Z.; Li, Y.; Yang, Y.; Guo, R.; Yang, J.; Yue, J.; Wang, Y. A high-precision detection method of hydroponic lettuce seedlings status based on improved Faster RCNN. Comput. Electron. Agric. 2021, 182, 106054. [Google Scholar] [CrossRef]
Figure 1. "Yuluxiang" pear. (a) Intact sample; (b) Russeting sample; (c) Cankered sample.
Figure 2. Image acquisition system.
Figure 3. Accuracy curve for validation set using different loss functions.
Figure 4. Loss curve for validation set using different loss functions.
Figure 5. Confusion matrix based on different loss functions: (a) SGM-CE; (b) SM-CE; (c) FL; (d) CB-SGM-CE; (e) CB-SM-CE; (f) CB-FL.
Figure 6. Precision using different loss functions.
Figure 7. Recall using different loss functions.
Figure 8. F1 Score using different loss functions.
Figure 9. Example of surface defect detection for "Yuluxiang" pear: (a) Russeting pears; (b) Intact pears; (c) Cankered pears.
Table 1. Results of training and validation using different loss functions.

| Loss Function | Accuracy of Train/% | Loss Value of Train | Accuracy of Validation/% | Loss Value of Validation |
|---|---|---|---|---|
| SGM-CE | 82.27 | 0.5355 | 89.12 | 0.4906 |
| CB-SGM-CE | 100.00 | 0.0230 | 98.41 | 0.0950 |
| SM-CE | 92.76 | 0.2167 | 96.94 | 0.1849 |
| CB-SM-CE | 99.92 | 0.0160 | 98.64 | 0.0495 |
| FL | 91.89 | 0.0815 | 96.94 | 0.0593 |
| CB-FL | 100.00 | 0.0030 | 99.55 | 0.0428 |
Table 2. Results of testing using different loss functions.

| Loss Function | Precision/% (Russet/Intact/Canker) | Recall/% (Russet/Intact/Canker) | F1 Score (Russet/Intact/Canker) | Accuracy/% |
|---|---|---|---|---|
| SGM-CE | 100.00/85.93/92.48 | 42.85/100.00/100.00 | 0.600/0.924/0.961 | 90.55 |
| CB-SGM-CE | 100.00/100.00/98.27 | 94.56/100.00/100.00 | 0.972/1.000/0.991 | 99.10 |
| SM-CE | 99.26/99.29/97.63 | 91.16/100.00/99.78 | 0.950/0.996/0.987 | 98.43 |
| CB-SM-CE | 100.00/100.00/98.70 | 95.92/100.00/100.00 | 0.979/1.000/0.993 | 99.33 |
| FL | 99.25/98.58/97.63 | 89.80/100.00/99.78 | 0.943/0.993/0.987 | 98.20 |
| CB-FL | 100.00/100.00/99.56 | 98.64/100.00/100.00 | 0.993/1.000/0.998 | 99.78 |
Table 3. Results of training and validation using different CNN models.

| Network | Accuracy of Train/% | Accuracy of Validation/% | Accuracy of Test/% |
|---|---|---|---|
| GoogLeNet | 100.00 | 99.55 | 99.78 |
| VGG 16 | 99.51 | 97.85 | 98.43 |
| AlexNet | 99.92 | 96.03 | 98.54 |
| SqueezeNet | 89.24 | 93.08 | 90.10 |
| MobileNet V2 | 99.51 | 95.81 | 98.99 |
Table 4. Results of testing set using different CNN models.

| Network | Precision/% (Russet/Intact/Canker) | Recall/% (Russet/Intact/Canker) | F1 Score (Russet/Intact/Canker) |
|---|---|---|---|
| GoogLeNet | 100.00/100.00/99.56 | 98.64/100.00/100.00 | 0.993/1.000/0.998 |
| SqueezeNet | 96.83/84.82/92.86 | 41.50/99.30/100.00 | 0.581/0.915/0.963 |
| MobileNet V2 | 97.92/100.00/98.70 | 95.92/98.95/100.00 | 0.969/0.995/0.993 |
| VGG 16 | 98.54/99.65/97.64 | 91.84/99.65/99.78 | 0.951/0.997/0.987 |
| AlexNet | 97.14/99.65/98.30 | 93.79/98.61/100.00 | 0.954/0.991/0.991 |
Table 5. Results of testing set using machine learning models.

| Model | Precision/% (Russet/Intact/Canker) | Recall/% (Russet/Intact/Canker) | F1 Score (Russet/Intact/Canker) | Accuracy/% |
|---|---|---|---|---|
| LS-SVM | 57.58/100.00/96.83 | 90.48/75.61/93.85 | 0.704/0.861/0.953 | 87.40 |
| BPNN | 54.55/93.55/98.41 | 85.71/70.73/95.39 | 0.667/0.806/0.969 | 85.83 |
| RF | 63.64/91.89/94.12 | 66.67/82.93/98.46 | 0.651/0.872/0.962 | 88.19 |
| PLS | 54.29/93.75/100.00 | 90.48/73.17/92.31 | 0.679/0.822/0.960 | 85.83 |
| DT | 42.86/91.89/81.58 | 28.57/82.93/95.39 | 0.343/0.872/0.879 | 80.32 |
Table 6. Results of testing set based on detection system.

| True Class | Predicted Russet | Predicted Intact | Predicted Canker | Recall/% | F1 Score |
|---|---|---|---|---|---|
| Russet | 32 | 2 | 0 | 94.12 | 0.928 |
| Intact | 1 | 55 | 0 | 98.21 | 0.965 |
| Canker | 2 | 1 | 34 | 91.89 | 0.958 |
Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.
