**Step 1: Map sample vectors**

In this step, we use Least Square Projection (LSP) [37], which scales well to large data sets. LSP first maps a representative subset of the samples and then embeds the remaining samples into the layout of that subset. To select a subset that best represents the original distribution, we apply the SF-Kmedoids algorithm [38] to the submatrix *DD*, splitting the samples into multiple clusters and taking the centroid of each cluster as a control point. The classical MDS algorithm [20] then maps the control points into 2D space. Because LSP assumes that each point lies in the convex hull of its neighboring points, we embed the remaining samples into the control-point layout according to the neighborhood relationships among all samples. Figure 4a shows the projection of the samples $s_i$ ($1 \le i \le n$), obtained from the air quality data analyzed in this paper. Each orange point represents a sample, and the Chengdu sample for March 2014 is marked in the figure.
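
The two-stage idea described above (map the control points with classical MDS, then place every sample from its neighborhood relations) can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the implementation used in this paper: the control points are assumed to be already chosen (SF-Kmedoids is not reproduced here), the neighbor weights are uniform, and the names `classical_mds`, `lsp_embed`, `control_idx`, and `k` are hypothetical.

```python
import numpy as np


def classical_mds(D, dim=2):
    """Classical (Torgerson) MDS of a square distance matrix D."""
    m = D.shape[0]
    J = np.eye(m) - np.ones((m, m)) / m       # centering matrix
    B = -0.5 * J @ (D ** 2) @ J               # double-centered Gram matrix
    w, V = np.linalg.eigh(B)                  # eigenvalues in ascending order
    top = np.argsort(w)[::-1][:dim]           # indices of the largest ones
    return V[:, top] * np.sqrt(np.clip(w[top], 0.0, None))


def lsp_embed(D, control_idx, k=8, dim=2):
    """Embed all n samples given a distance matrix and control-point indices.

    Assumption: uniform weights 1/k over the k nearest neighbors.
    """
    n = D.shape[0]
    control_idx = np.asarray(control_idx)
    # Stage 1: lay out the control points with classical MDS.
    Yc = classical_mds(D[np.ix_(control_idx, control_idx)], dim)

    # Stage 2: each row of L asks sample i to sit at the mean of its
    # neighbors, i.e. inside their convex hull.
    L = np.eye(n)
    for i in range(n):
        order = np.argsort(D[i])
        nbrs = order[order != i][:k]          # k nearest, excluding i itself
        L[i, nbrs] = -1.0 / len(nbrs)

    # Extra rows tie each control point to its MDS position.
    C = np.zeros((len(control_idx), n))
    C[np.arange(len(control_idx)), control_idx] = 1.0

    A = np.vstack([L, C])
    b = np.vstack([np.zeros((n, dim)), Yc])
    Y, *_ = np.linalg.lstsq(A, b, rcond=None)  # least-squares solution
    return Y


# Toy usage on synthetic data (not the air quality data set).
rng = np.random.default_rng(0)
X = rng.normal(size=(60, 5))
D = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=-1)
Y = lsp_embed(D, control_idx=np.arange(0, 60, 6), k=8)
```

Because the control-point rows enter the system as soft constraints, the control points in `Y` end up close to, but not exactly at, their MDS coordinates; the least-squares solve balances them against the neighborhood equations.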
