**3. Results**

#### *3.1. Experimental Configuration and Evaluation Metrics*

All experiments were conducted on a server running the Ubuntu 16.04 operating system. The server was equipped with two Tesla P40 GPUs and a Xeon Gold 5118 CPU. All images in the dataset were resized to 560 × 420. Owing to transfer learning, the model converged rapidly, so the number of training epochs was set to 10 when each branch was trained separately and increased to 20 when the two branches were trained jointly. For all experiments, the initial learning rate was set to 0.0001 and the Adam optimizer was used. Additionally, 5-fold cross-validation was implemented. The training group in which fold *K* was held out for testing is referred to as training group *K*.
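
The 5-fold grouping described above can be sketched as follows. This is a minimal illustration rather than the authors' code: the `CONFIG` dictionary layout, the `make_training_groups` helper, the shuffling step, and the random seed are all assumptions introduced here; only the hyperparameter values come from the text.

```python
import random

# Hyperparameter values are those stated in the text; the dictionary
# layout and key names are illustrative assumptions.
CONFIG = {
    "image_size": (560, 420),   # as written in the text
    "epochs_separate": 10,      # each branch trained separately
    "epochs_joint": 20,         # two branches trained jointly
    "learning_rate": 1e-4,      # initial learning rate
    "optimizer": "Adam",
    "num_folds": 5,
}


def make_training_groups(sample_ids, num_folds=5, seed=0):
    """Return {K: (train_ids, test_ids)} where fold K is held out for testing.

    Shuffling with a fixed seed is an assumption; the text does not say
    how samples were assigned to folds.
    """
    ids = list(sample_ids)
    random.Random(seed).shuffle(ids)
    # Deal the shuffled ids into num_folds roughly equal folds.
    folds = [ids[i::num_folds] for i in range(num_folds)]
    groups = {}
    for k in range(num_folds):
        test_ids = folds[k]
        train_ids = [s for j, fold in enumerate(folds) if j != k for s in fold]
        groups[k + 1] = (train_ids, test_ids)  # training group K, 1-based
    return groups


groups = make_training_groups(range(100), CONFIG["num_folds"])
```

Each sample appears in exactly one test fold, so every image is evaluated once across the five training groups.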

Intersection over union (IoU) was used as the evaluation metric for the segmentation tasks in this paper. IoU is calculated as follows:

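$$IoU = \frac{TP}{TP + FP + FN} \tag{5}$$

where *TP*, *FP* and *FN* denote the numbers of true-positive, false-positive and false-negative pixels, respectively.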
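
As a concrete illustration of Eq. (5), the sketch below counts pixels in a pair of binary masks. The function name `binary_iou`, the nested-list mask representation, and the choice to return NaN when both masks are empty are assumptions made for this example.

```python
def binary_iou(pred, target):
    """IoU = TP / (TP + FP + FN) for two binary masks given as nested lists."""
    tp = fp = fn = 0
    for p_row, t_row in zip(pred, target):
        for p, t in zip(p_row, t_row):
            if p and t:
                tp += 1      # predicted foreground, truly foreground
            elif p and not t:
                fp += 1      # predicted foreground, truly background
            elif t and not p:
                fn += 1      # missed foreground pixel
    denom = tp + fp + fn
    # Assumed convention: undefined (NaN) when neither mask has foreground.
    return tp / denom if denom else float("nan")


pred = [[1, 1, 0],
        [0, 1, 0]]
target = [[1, 0, 0],
          [0, 1, 1]]
iou = binary_iou(pred, target)  # TP = 2, FP = 1, FN = 1, so IoU = 0.5
```

True-negative pixels do not appear in the formula, so large background areas do not inflate the score.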

In the experiments on the foreground extraction task, only the IoU of the foreground class, which represents the bodies of the target vehicles, was counted.

In the EV fire trace recognition task, the number of classes was set to 4, so the mean IoU (mIoU) over the 4 classes was calculated to evaluate performance. As discussed in 3.1, the 4 classes were IN, MB, SB and BG, and the mIoU was calculated as follows:

$$mIoU = \frac{1}{4}(IoU_{BG} + IoU_{IN} + IoU_{MB} + IoU_{SB})\tag{6}$$
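
Eq. (6) can be sketched as a per-class IoU followed by an unweighted average. The integer label assignment (BG = 0, IN = 1, MB = 2, SB = 3), the function names, and the small example label maps are assumptions for illustration only; the source does not specify how classes are encoded.

```python
CLASSES = ("BG", "IN", "MB", "SB")  # assumed index order: 0, 1, 2, 3


def class_iou(pred, target, cls):
    """IoU of one class between two integer label maps (nested lists)."""
    tp = fp = fn = 0
    for p_row, t_row in zip(pred, target):
        for p, t in zip(p_row, t_row):
            if p == cls and t == cls:
                tp += 1
            elif p == cls:
                fp += 1
            elif t == cls:
                fn += 1
    denom = tp + fp + fn
    # NaN if the class is absent from both maps (assumed convention).
    return tp / denom if denom else float("nan")


def mean_iou(pred, target, num_classes=len(CLASSES)):
    """Unweighted mean of the per-class IoUs, as in Eq. (6)."""
    ious = [class_iou(pred, target, c) for c in range(num_classes)]
    return sum(ious) / num_classes


target = [[0, 0, 1, 1],
          [2, 2, 3, 3]]
pred = [[0, 1, 1, 1],
        [2, 2, 2, 3]]
per_class = {name: class_iou(pred, target, i) for i, name in enumerate(CLASSES)}
miou = mean_iou(pred, target)  # (1/2 + 2/3 + 2/3 + 1/2) / 4 = 7/12
```

Because the average is unweighted, a small class such as SB contributes as much to the mIoU as the background.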

Additionally, to evaluate the accuracy of vehicle body segmentation, the union of the IN, MB and SB regions was regarded as the "Vehicle Body" (VB) region; to evaluate the segmentation accuracy of burnt regions as a whole, the union of the MB and SB regions was regarded as the "Fire Trace" (FT) region. Their IoUs were thus calculated as follows:

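$$IoU_{VB} = \frac{I_{IN \cup MB \cup SB}}{U_{IN \cup MB \cup SB}} \tag{7}$$

$$IoU_{FT} = \frac{I_{MB \cup SB}}{U_{MB \cup SB}}\tag{8}$$

where *I* and *U* denote the intersection and the union, respectively, of the predicted and ground-truth regions formed by the subscripted classes.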
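
The merged-region IoUs of Eqs. (7) and (8) can be sketched by treating a set of class labels as one region before comparing prediction and ground truth. The label encoding (BG = 0, IN = 1, MB = 2, SB = 3), the helper name `region_iou`, and the example label maps are assumptions, not taken from the source.

```python
BG, IN, MB, SB = 0, 1, 2, 3  # assumed label encoding


def region_iou(pred, target, labels):
    """IoU of the region formed by merging all classes in `labels`."""
    labels = set(labels)
    inter = union = 0
    for p_row, t_row in zip(pred, target):
        for p, t in zip(p_row, t_row):
            in_pred = p in labels
            in_true = t in labels
            if in_pred and in_true:
                inter += 1   # pixel belongs to the merged region in both maps
            if in_pred or in_true:
                union += 1   # pixel belongs to the merged region in either map
    # NaN if the merged region is empty in both maps (assumed convention).
    return inter / union if union else float("nan")


target = [[0, 0, 1, 1],
          [2, 2, 3, 3]]
pred = [[0, 1, 1, 1],
        [2, 2, 2, 3]]
iou_vb = region_iou(pred, target, {IN, MB, SB})  # Vehicle Body: 6 / 7
iou_ft = region_iou(pred, target, {MB, SB})      # Fire Trace: 4 / 4 = 1.0
```

In this example one SB pixel is predicted as MB, yet the FT score is 1.0: confusion between classes inside a merged region does not lower the merged IoU, which is why these scores measure the region as a whole rather than the burn severity.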
