*2.5. The Correction of Apple Defect Detection Based on Pruned YOLO V4 Network*

Although as a lightweight semantic segmentation model BiSeNet V2 could realize real-time semantic segmentation, it might incorrectly segment the apple stem/calyx region as defect region. Therefore, the object detection model was further used to accurately determine the location of the defect region.

The result of object-detection algorithm required not only identifying the object category in the pictures but also marking the position parameters of the objects. Among them, RCNN, Fast RCNN, SPP-Net [21] and Faster RCNN [22] could be divided into two main parts: region proposal and extraction regions. Therefore, YOLO model had less computation and was faster than two parts methods, as YOLO model replaced numerous regions through grid division and anchor method. YOLO V4 [23] model was implemented based on Darknet framework, which could easily and flexibly use C++ language to deploy the trained network model in practical application. Therefore, a defect detection model based on the YOLO V4 was proposed to identify the defect region in RGB apple images.

The YOLO V4 object-detection algorithm was an improved version of YOLO V3 [24]. Compared with the YOLO V3 object-detection algorithm, YOLO V4 improved the speed and accuracy of real-time detection of the algorithm [25].

CSP (Cross Stage Partial) module could improve the learning ability of the network. CBL module was composed of the Convolution, batch normalization and Leaky\_ReLU and CBM module was composed of the Convolution, batch normalization and Mish [26]. These two modules were used to extract input image features. SPP (Spatial pyramid pooling) module used the max-pooling of different scales to pool the input feature layers and then stacked them, which could greatly increase the receptive field.

Recently, YOLO V4 has been used for defect detection of a variety of objects. With the proposal of YOLOv5 and YOLOX, many researchers focus on the newly proposed network, but from the perspective of practical application, YOLO V4 is easier to deploy and realize the online detection of apple defects. The architecture of YOLO V4 network is shown in the Figure 3.

**Figure 3.** The architecture of YOLO V4 network.

In order to realize the real-time detection of apple surface defects, it was necessary to deploy the YOLO V4 model into the apple automatic grading system. After experiment comparison, it was found that a large number of network parameters, network layers and complex structure of YOLO V4 lead to excessive calculation. It could not meet the real-time requirements of apple surface defect detection. Considering the memory occupation of the YOLO V4 and ensuring the real-time and stability in the detection process, a lightweight model, pruned YOLO V4, was obtained by compressing the YOLO V4 network based on RGB images of apples. Model pruning method could achieve the best balance between model detection speed and detection accuracy. It was also a method to automatically obtain the simplest network structure of the original model. The pruned YOLO V4 model could be deployed on the Windows 10 operating system and hardware and realized the real-time detection of apple surface defects.

In order to obtain pruned YOLO V4, firstly, the sparsity training was introduced into the YOLO V4 network. The scale factors were sorted after sparsity training, and the maximum scale factor meeting the requirements of pruning rate was set as the threshold. Then, it deleted the channels that were less than the threshold. High contribution channels were retained, and low contribution channels were deleted according to the scale factor. So, the channel pruning was completed. However, the object detection model still could

not meet the requirements of real-time detection after the channel pruning of the model, which compressed the network width. So, the depth of the network model was further compressed using the layer-pruning method. The mean value of the scale factor of each layer was sorted. The layer with the lower mean value was selected for pruning, which completed the layer pruning. After completing the channel pruning and layer pruning of the YOLO V4 model, the accuracy of the model may decline. Fine-tune operation was used to improve the detection accuracy of the pruned model. Finally, a pruned YOLO V4 model for defect detection was obtained. The result of pruning is shown in Figure 4.

**Figure 4.** The channel changes of each layer of YOLO V4 model before and after pruning.

Pruned YOLO V4 network could accurately locate the location of apple defects and identify the apple stem/calyx. For the defection of regions identified by the pruned YOLO V4 network, the corresponding location information of each defect region was compared with the binary image IB generated by BiSeNet V2 network. For the position where the gray value Bv was in the defect area determined by pruned YOLO V4 network, it was determined as the defect area. Finally, the defect region in apple image was segmented accurately by combining BiSeNet V2 network and pruned YOLO V4 network.
