*4.4. Ablation Studies*

4.4.1. Evaluation of Temporally Aligned Pooling Manners

We evaluate the three temporally aligned pooling manners of our STAR algorithm on the iLIDS-VID dataset: average pooling (STAR\_avg), max pooling (STAR\_max), and key frame pooling (STAR\_key). The results are shown in Table 2. STAR with average pooling performs slightly better than with max pooling and much better than with key frame pooling. STAR\_key performs the worst of the three; we believe this is because the key points are very difficult to localize exactly, owing to the discreteness of the frames and to noise. To isolate the effect of temporal pooling, Table 2 also reports STAR with no pooling (STAR\_no). STAR\_no performs worse than every pooled variant, including key frame pooling. This validates the important role of temporally aligned pooling in video-based representation for re-id and indicates that average pooling or max pooling is a good choice. Because STAR\_avg performs best, we employ average pooling for the quantitative comparison in Section 4.2.
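
To make the difference between the pooling manners concrete, the following is a minimal sketch that pools a small set of per-frame feature vectors in the three ways compared here. It is illustrative only and is not the STAR implementation: the function names, the toy feature values, and the `key_index` argument (which stands in for a key frame that would have to be localized by some other procedure) are all assumptions introduced for this example.

```python
from typing import List, Sequence

Vector = List[float]


def average_pool(frames: Sequence[Sequence[float]]) -> Vector:
    """Element-wise mean over the temporally aligned frames."""
    n = len(frames)
    return [sum(column) / n for column in zip(*frames)]


def max_pool(frames: Sequence[Sequence[float]]) -> Vector:
    """Element-wise maximum over the temporally aligned frames."""
    return [max(column) for column in zip(*frames)]


def key_frame_pool(frames: Sequence[Sequence[float]], key_index: int) -> Vector:
    """Keep only the feature of one frame.

    key_index is a hypothetical input: how the key frame is found is
    outside the scope of this sketch.
    """
    return list(frames[key_index])


# Toy data (assumed): three aligned frames, each with a 2-D feature.
frames = [[1.0, 4.0], [3.0, 2.0], [2.0, 0.0]]

avg_feature = average_pool(frames)       # uses every frame
max_feature = max_pool(frames)           # uses every frame
key_feature = key_frame_pool(frames, 1)  # depends on a single frame

print(avg_feature, max_feature, key_feature)
```

In this sketch, average and max pooling aggregate all aligned frames, whereas key frame pooling depends entirely on one selected frame, so an error in choosing `key_index` changes the output feature directly. This is consistent with the explanation given above for the weaker STAR\_key result, although the sketch itself does not demonstrate it.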

**Table 2.** Evaluation of three different pooling manners on iLIDS-VID. Bold values indicate the best performance.


Surprisingly, when compared with the results in Table 1, the proposed STAR with key frame pooling performs comparably with TDL, and even STAR without pooling outperforms GEI + RSVM [59], HOG3D + DVR [30], Color + LFDA [82], STFV3D + KISSME [31], CS-FAST3D + RMLLC [57], and SRID [55]. This demonstrates the strong performance of the proposed method.
