2.3.3. Experimental Platform and Parameter Setting

The experiments were run with PyTorch 1.8.0 on hardware comprising an Intel(R) Core(TM) i5 CPU, an NVIDIA GeForce RTX 3060 Ti GPU, and 16 GB of memory, under the Windows 10 operating system, as shown in Table 1. During training of the YOLOv5 model, Adaptive Moment Estimation (Adam), a stochastic optimization algorithm with low memory requirements, was used. The momentum factor was set to 0.937 and the initial learning rate to 0.001. The learning rate was adjusted by the cosine annealing method [42]. The weight decay coefficient, batch size, and number of training epochs were set to 0, 16, and 300, respectively. Label smoothing, with a value of 0.01, was applied to the classification labels of the images to help avoid overfitting.
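The two scheduling tricks above can be sketched in plain Python. This is a minimal illustration, not the authors' code: the cosine annealing formula is the standard one (PyTorch's `CosineAnnealingLR` implements the same schedule), and `smooth_labels` shows the usual label-smoothing rule with the paper's value of 0.01; the function names are our own.

```python
import math

def cosine_annealing_lr(epoch, total_epochs, lr_max=0.001, lr_min=0.0):
    """Cosine-annealed learning rate: starts at lr_max, decays to lr_min."""
    return lr_min + 0.5 * (lr_max - lr_min) * (1 + math.cos(math.pi * epoch / total_epochs))

def smooth_labels(one_hot, eps=0.01):
    """Label smoothing: move eps of the probability mass to the other classes."""
    k = len(one_hot)
    return [y * (1 - eps) + eps / k for y in one_hot]

# At epoch 0 the rate equals the initial 0.001; at epoch 300 it has decayed to 0.
lr_start = cosine_annealing_lr(0, 300)
lr_end = cosine_annealing_lr(300, 300)

# A 4-class one-hot label softened with eps = 0.01.
soft = smooth_labels([1, 0, 0, 0])
```

With four classes, the true class keeps 0.99 + 0.01/4 = 0.9925 of the mass, so the target distribution still sums to 1 but no longer demands absolute certainty, which is what discourages overfitting.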

**Table 1.** Experimental platform configuration.


#### **3. Results**

*3.1. Comparison of Experimental Results on a Public Dataset*

In order to compare the performance of the improved model with that of the original YOLOv5 model, an experimental comparison was made on the PASCAL VOC dataset. The experimental results are shown in Table 2.

**Table 2.** Experimental results on PASCAL VOC dataset.


The improved YOLOv5 model increases the mAP by 1.11% and enhances object detection performance.

#### *3.2. Ablation Study*

In order to compare the performance of the improved model with that of other model variants, ablation studies were carried out. The label smoothing method was used to process the images during training. The model was evaluated every 10 epochs over a total of 300 training epochs. The experimental results are shown in Table 3.

**Table 3.** Ablation study results for model on self-built construction waste dataset.


YOLOv5\_Y, YOLOv5\_C, YOLOv5\_S, YOLOv5\_D, YOLOv5\_CS, YOLOv5\_CD, and YOLOv5\_SD represent the original YOLOv5 model, the model with CBAM, the model with the SimSPPF module, the model with improved multi-scale detection, the model with the CBAM and SimSPPF modules, the model with CBAM and improved multi-scale detection, and the model with the SimSPPF module and improved multi-scale detection, respectively. "Ours" denotes the proposed YOLOv5 model, i.e., the model combining CBAM, the SimSPPF module, and improved multi-scale detection. Additionally, the AP values of the four classes are shown in Figure 9 for the original YOLOv5 model and in Figure 10 for the improved YOLOv5 model.

**Figure 9.** The AP value of the four classes for the original YOLOv5 model.

**Figure 10.** The AP value of the four classes for our improved method.

It can be seen from Table 3 that the original YOLOv5 model has the lowest mAP on the construction waste dataset, at only 0.8991. Adding CBAM improved mAP by 4.6%, to 0.9451, whereas replacing the SPP module with the SimSPPF module increased mAP by only 3.88%. The improved multi-scale detection alone increased mAP by 4.69%. With both CBAM and the SimSPPF module, mAP increased by 4.76%; with CBAM and improved multi-scale detection, by 4.26%; and with the SimSPPF module and improved multi-scale detection, by 4.54%. However, our proposed method, which combines CBAM, the SimSPPF module, and improved multi-scale detection, improved mAP by 4.89%, the highest among all the models.
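For reference, AP and mAP can be computed from a ranked list of detections. The sketch below uses the simplest (non-interpolated) form of AP, so it is an illustration of the metric rather than the exact PASCAL VOC evaluation protocol, which additionally applies an interpolated precision envelope; the function names are our own.

```python
def average_precision(ranked_hits, num_gt):
    """AP from detections ranked by confidence.

    ranked_hits: list of booleans, True if the detection matches a ground truth.
    num_gt: number of ground-truth objects of this class.
    """
    tp = 0
    precisions = []
    for i, hit in enumerate(ranked_hits, start=1):
        if hit:
            tp += 1
            precisions.append(tp / i)  # precision at each new recall level
    return sum(precisions) / num_gt if num_gt else 0.0

def mean_average_precision(per_class_aps):
    """mAP is simply the mean of the per-class AP values."""
    return sum(per_class_aps) / len(per_class_aps)
```

For example, with detections ranked [hit, miss, hit] and two ground-truth objects, precision is 1.0 at the first hit and 2/3 at the second, giving AP = (1.0 + 2/3)/2 ≈ 0.833.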

Similarly, the F1-scores of brick, wood, stone, and plastic for the original YOLOv5 model are 0.84, 0.82, 0.89, and 0.85, respectively, as shown in Table 3. With the proposed method, which combines CBAM, the SimSPPF module, and improved multi-scale detection, the F1-scores of brick, wood, stone, and plastic are 0.89, 0.91, 0.93, and 0.92, increases of 5%, 9%, 4%, and 7%, respectively. Therefore, the improved YOLOv5 model has higher accuracy and better usability in object detection, thereby improving the efficiency of sorting construction waste.
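The F1-score used above is the standard harmonic mean of precision and recall, and the per-class gains follow directly from the reported values; the snippet below is a small check of that arithmetic, not the authors' evaluation code.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)

# Per-class F1 reported in the text (brick, wood, stone, plastic).
baseline = [0.84, 0.82, 0.89, 0.85]
improved = [0.89, 0.91, 0.93, 0.92]
gains = [round(b - a, 2) for a, b in zip(baseline, improved)]
```

Here `gains` reproduces the stated improvements of 0.05, 0.09, 0.04, and 0.07.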

#### *3.3. Contrast Experiment*

To further verify the advantages and effectiveness of the improved YOLOv5 model on the construction waste dataset, a contrast experiment was also carried out, comparing the improved model with other conventional models, namely the YOLOv7, YOLOv5, YOLOv4, YOLOv3, and Faster R-CNN models. The loss and mAP of each model during training and testing are shown in Figure 11.

**Figure 11.** The loss and mAP values of the different models.

The loss of every model decreases rapidly in the first 20 epochs, indicating that training has not yet reached a stable state; once training stabilizes, the loss curve flattens rather than remaining steep. When training reaches a relatively stable state, the loss of our model is lower than that of the other models. Meanwhile, the mAP of every model increases rapidly in the first 60 epochs, and ours shows the most pronounced improvement among all the models. After 200 epochs of training, all models become steadier, and the mAP of ours is significantly higher than that of the other models.

The values of the evaluation indicators are also given in Table 4. Compared with the YOLOv7, YOLOv5, YOLOv4, YOLOv3, and Faster R-CNN models, the mAP of our model is higher by 2.15%, 4.89%, 8.24%, 16.12%, and 7.78%, respectively. This shows that the improved YOLOv5 model improves the accuracy of classifying and detecting construction waste.


**Table 4.** Performance contrast of the different models.
