**5. Conclusions**

We have proposed a novel superpixel-based, temporally aligned representation for video-based person re-identification that addresses both the spatial and the temporal alignment problems in video-based representations. To achieve temporal alignment, we select a video fragment corresponding to one walking cycle and describe it with temporally aligned pooling. To further improve spatial alignment, superpixels are introduced to extract motion information and to describe each still image. Unlike most previous video-based representations for re-id, which use all frames to build a spatio–temporal feature, we use only a "best" walking cycle, reducing redundant information while retaining the most discriminative content. Extensive experiments on the iLIDS-VID, PRID 2011, and MARS datasets demonstrate that our method outperforms state-of-the-art approaches.
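As an illustration of the walking-cycle selection summarized above, the sketch below picks a fragment between two consecutive local minima of a per-frame motion-energy signal, a common proxy for gait periodicity. The function name, the choice of motion-energy signal, and the cycle-length bounds are illustrative assumptions, not the paper's exact formulation.

```python
import numpy as np

def select_walking_cycle(motion_energy, min_len=10, max_len=40):
    """Select a candidate walking cycle from a per-frame motion-energy signal.

    `motion_energy` is a 1-D array with one scalar per frame (e.g., the mean
    optical-flow magnitude inside the person's bounding box). A walking cycle
    is taken as the span between two consecutive local minima of this signal;
    among plausible spans, the pair with the lowest combined energy ("best"
    cycle) is returned as (start_frame, end_frame), or None if no pair fits.
    """
    e = np.asarray(motion_energy, dtype=float)
    # Local minima: frames whose energy is below both neighbours.
    minima = [i for i in range(1, len(e) - 1)
              if e[i] < e[i - 1] and e[i] < e[i + 1]]
    best, best_score = None, np.inf
    for a, b in zip(minima, minima[1:]):
        if not (min_len <= b - a <= max_len):
            continue  # implausible length for a single walking cycle
        score = e[a] + e[b]  # prefer the "deepest" pair of minima
        if score < best_score:
            best, best_score = (a, b), score
    return best

# Usage on a synthetic periodic signal (minima roughly every 20 frames):
t = np.arange(60)
energy = 1.0 + np.sin(2 * np.pi * t / 20)
cycle = select_walking_cycle(energy)
```

In practice the motion-energy signal would come from optical flow on the tracked sequence; the sinusoid here only stands in for that periodic gait pattern.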

**Author Contributions:** Conceptualization, C.G. and J.W.; methodology, C.G.; software, C.G.; validation, C.G., L.L. and J.-G.Y.; writing–original draft preparation, C.G.; writing–review and editing, C.G., J.-G.Y. and N.S.; visualization, J.W. and L.L.; supervision, N.S.; project administration, N.S.; funding acquisition, C.G.

**Funding:** This work was supported by the National Natural Science Foundation of China (Grant No. 61876210) and the Natural Science Foundation of Hubei Province (Grant No. 2018CFB426).

**Conflicts of Interest:** The authors declare no conflict of interest.
