**4. Individual Results**

Figure 12 shows sample detection results from our detection network. The first column shows infrared images, the second column shows visible images, and the third column shows detection results. The first three rows show that our method accurately recognizes low-contrast objects. Owing to the focused feature enhancement module, the proposed network also detects small objects well (see the fourth row in Figure 12). Thanks to the proposed cascaded semantic extension module, the large object in the fifth row is accurately located. Our method still performs well in the daytime (see the last row).

**Figure 12.** (**a**–**c**) Sample detection results from our detection network. To display the results more clearly, the detections are drawn on the fused images. The images in the first three rows have very low contrast; the objects in the fourth row are relatively small; the object in the fifth row is relatively large; and the images in the last row were captured in the daytime.
