**3. Superpixel-Based Temporally Aligned Representation**

This paper focuses on appearance representation for video-based person re-id. In this section, we introduce the proposed superpixel-based temporally aligned representation, which is built in three steps: (1) extracting motion information by superpixel tracking (Section 3.1), (2) selecting the "best" walking cycle with an unsupervised method (Section 3.2), and (3) constructing a 3D representation from the superpixel-based representation (Section 3.3) and temporally aligned pooling (Section 3.4). Figure 1 depicts the entire framework.

**Figure 1.** Framework of the proposed superpixel-based temporally aligned representation.
