*2.8. Visual Recognition of Heatmap*

To explain visually how the YOLOv7+CBAM network model arrives at its detections, this paper adopts gradient-weighted class activation mapping (Grad-CAM) [29] as the visualization method and compares the recognition behavior of the YOLOv7, YOLOv7+ECA, and YOLOv7+CBAM network models. In Grad-CAM, the fusion weight of each target feature map is derived from gradients: the gradients of the class score with respect to that feature map are globally averaged to give its weight. Once the weights of all feature maps for a given category have been obtained, the feature maps are combined in a weighted sum to produce the heatmap.
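The weighting step can be illustrated with a minimal sketch. It is not the implementation used in this study: the function name `grad_cam`, the plain nested-list representation, and the toy values are assumptions made for illustration, and in practice the activations and gradients would be taken from a chosen layer of the trained network. The ReLU and the scaling to [0, 1] follow the usual Grad-CAM formulation and common display practice, respectively.

```python
def grad_cam(activations, gradients):
    """Compute a Grad-CAM heatmap from K feature maps and their gradients.

    activations, gradients: lists of K maps, each an H x W list of floats.
    Returns an H x W heatmap scaled to [0, 1].
    """
    height = len(activations[0])
    width = len(activations[0][0])

    # Channel weights: global average of the gradients over each map.
    weights = [
        sum(sum(row) for row in grad_map) / (height * width)
        for grad_map in gradients
    ]

    # Weighted sum of the feature maps, followed by ReLU.
    heatmap = [
        [
            max(0.0, sum(w * fmap[i][j] for w, fmap in zip(weights, activations)))
            for j in range(width)
        ]
        for i in range(height)
    ]

    # Scale to [0, 1] for display; an all-zero map is left unchanged.
    peak = max(max(row) for row in heatmap)
    if peak > 0:
        heatmap = [[v / peak for v in row] for row in heatmap]
    return heatmap


# Toy example (hypothetical values): two 2x2 feature maps.
activations = [
    [[1.0, 2.0], [3.0, 4.0]],
    [[4.0, 3.0], [2.0, 1.0]],
]
gradients = [
    [[1.0, 1.0], [1.0, 1.0]],      # mean gradient  1.0
    [[-0.5, -0.5], [-0.5, -0.5]],  # mean gradient -0.5
]
heatmap = grad_cam(activations, gradients)
```

In this toy case the second feature map receives a negative weight, so positions where it dominates are suppressed by the ReLU, while the position with the largest weighted sum is scaled to 1 and would be drawn in the warmest color.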

The heatmap shows intuitively where the model focuses its attention when extracting features. The warmer the color, the more attention the model pays to that region, and the red (warmest) areas mark the model's main focus. Grad-CAM heatmaps were generated for leafy, occluded, and densely distributed buds, as shown in Figure 11. As Figure 11 shows, the YOLOv7+CBAM network model focuses accurately on the targets across the different image types and is little affected by background factors, which further supports the conclusion that the network proposed in this study improves the detection of famous and excellent tea.

**Figure 11.** Grad-CAM activation maps of the different models for each class of famous and excellent tea image.
