#### 3.3.2. Exploratory Spatial Data Analysis

Exploratory spatial data analysis (ESDA) examines the correlation and aggregation of values across neighboring spatial units and can therefore verify whether regional CLUTs are spatially clustered. Two types of autocorrelation coefficients are usually used for this measurement. The first is the global spatial autocorrelation coefficient (the global Moran index), which, together with the Moran scatter plot, shows the spatial correlation of the CLUT across the study area. The expression is

$$I = \frac{n\sum\_{i=1}^{n}\sum\_{j=1}^{n} W\_{ij}\left(x\_{i}-\overline{x}\right)\left(x\_{j}-\overline{x}\right)}{\sum\_{i=1}^{n}\sum\_{j=1}^{n} W\_{ij}\sum\_{i=1}^{n}\left(x\_{i}-\overline{x}\right)^{2}}\tag{1}$$

where *I* is the global Moran index; *x<sub>i</sub>* and *x<sub>j</sub>* are the CLUT indices of cities *i* and *j*, respectively; *x̄* is the average of the CLUT indices; *n* is the number of cities; and *W<sub>ij</sub>* is the element of the spatial weight matrix for cities *i* and *j*. In this study, a spatial adjacency matrix constructed with GeoDa software was used. The value of *I* lies in [−1, 1]. *I* = 0 indicates no spatial autocorrelation, *I* > 0 indicates positive spatial autocorrelation, and *I* < 0 indicates negative spatial autocorrelation. The closer the absolute value of *I* is to 1, the stronger the spatial autocorrelation; values close to 1 indicate pronounced clustering of similar values.
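As an illustration of Equation (1), the following is a minimal sketch in Python with NumPy. It is not the GeoDa workflow used in this study; the function name `global_morans_i`, the four-unit adjacency matrix, and the sample values are assumptions made only for the example.

```python
import numpy as np

def global_morans_i(x, w):
    """Global Moran's I following Equation (1).

    x : 1-D sequence of attribute values (one per spatial unit)
    w : n x n spatial weight matrix (here a binary adjacency matrix)
    """
    x = np.asarray(x, dtype=float)
    w = np.asarray(w, dtype=float)
    n = x.size
    z = x - x.mean()              # deviations from the mean
    numerator = n * (z @ w @ z)   # n * sum_i sum_j W_ij z_i z_j
    denominator = w.sum() * (z @ z)
    return numerator / denominator

# Hypothetical example: four units arranged in a line, each adjacent
# only to its immediate neighbors.
w_line = np.array([
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
])

smooth = global_morans_i([1, 2, 3, 4], w_line)       # similar neighbors
alternating = global_morans_i([1, 4, 1, 4], w_line)  # dissimilar neighbors
print(smooth, alternating)
```

In this toy case the smoothly increasing series gives a positive index, whereas the alternating series gives a negative one, in line with the interpretation of the sign of *I*.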

The second type is the local spatial autocorrelation coefficient (the local Moran index), which is mapped as a LISA (local indicators of spatial association) cluster map to examine spatial heterogeneity and to reveal how strongly the attribute value of each spatial unit is correlated with those of its adjacent units. The formula is as follows:

$$I\_i = \frac{n\left(x\_i - \overline{x}\right) \sum\_{j=1}^{n} W\_{ij} \left(x\_j - \overline{x}\right)}{\sum\_{i=1}^{n} \left(x\_i - \overline{x}\right)^2} \tag{2}$$

When *I<sub>i</sub>* > 0, the unit and its neighbors have similar values: high-high (low-low) means that a unit with a high (low) value is surrounded by units with high (low) values, so the local spatial difference is small. When *I<sub>i</sub>* < 0, the unit differs from its neighbors: low-high (high-low) means that a unit with a low (high) value is surrounded by units with high (low) values, so the local spatial difference is large.
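The local index in Equation (2) and the four association types can be sketched as follows. This is an illustrative NumPy implementation under stated assumptions, not the GeoDa procedure: the names `local_morans_i` and `quadrant`, the adjacency matrix, and the sample values are hypothetical, and the permutation-based significance test that LISA cluster maps normally apply is omitted.

```python
import numpy as np

def quadrant(z_i, lag_i):
    """Classify one unit from its own deviation and its neighbors' summed deviation."""
    if z_i > 0 and lag_i > 0:
        return "HH"   # high value surrounded by high values
    if z_i < 0 and lag_i < 0:
        return "LL"   # low value surrounded by low values
    if z_i > 0 and lag_i < 0:
        return "HL"   # high value surrounded by low values
    if z_i < 0 and lag_i > 0:
        return "LH"   # low value surrounded by high values
    return "undefined"  # a deviation of exactly zero has no quadrant

def local_morans_i(x, w):
    """Local Moran's I_i following Equation (2), plus a quadrant label per unit."""
    x = np.asarray(x, dtype=float)
    w = np.asarray(w, dtype=float)
    z = x - x.mean()
    lag = w @ z                              # sum_j W_ij (x_j - mean)
    local_i = x.size * z * lag / (z @ z)
    labels = [quadrant(zi, li) for zi, li in zip(z, lag)]
    return local_i, labels

# Hypothetical example: four units in a line with one high outlier.
w_line = np.array([
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
])
local_i, labels = local_morans_i([1, 9, 1, 1], w_line)
print(local_i, labels)
```

With these definitions, the sum of the local indices divided by the sum of all weights reproduces the global index of Equation (1), which provides a simple consistency check.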

#### 3.3.3. Standard Deviation Ellipse

The standard deviation ellipse (SDE) is used to quantitatively describe the spatial characteristics of the elements. The azimuth of the ellipse represents the main trend direction, the long axis represents the dispersion of the geospatial elements in its direction, and the center of gravity represents the relative position. The results of the SDE calculation can reflect the spatial change in the CLUT (Equations (3)–(5)).

$$\tan \theta = \frac{\left(\sum\_{i=1}^{n} w\_i^2 x\_i'^2 - \sum\_{i=1}^{n} w\_i^2 y\_i'^2\right) + \sqrt{\left(\sum\_{i=1}^{n} w\_i^2 x\_i'^2 - \sum\_{i=1}^{n} w\_i^2 y\_i'^2\right)^2 + 4 \left(\sum\_{i=1}^{n} w\_i^2 x\_i' y\_i'\right)^2}}{2 \sum\_{i=1}^{n} w\_i^2 x\_i' y\_i'} \tag{3}$$

$$\overline{X}\_{w} = \frac{\sum\_{i=1}^{n} w\_i x\_i}{\sum\_{i=1}^{n} w\_i} ;\quad \overline{Y}\_{w} = \frac{\sum\_{i=1}^{n} w\_i y\_i}{\sum\_{i=1}^{n} w\_i} \tag{4}$$

$$\sigma\_{x} = \sqrt{\frac{\sum\_{i=1}^{n} \left(w\_i x\_i' \cos \theta - w\_i y\_i' \sin \theta\right)^2}{\sum\_{i=1}^{n} w\_i^2}};\quad \sigma\_{y} = \sqrt{\frac{\sum\_{i=1}^{n} \left(w\_i x\_i' \sin \theta + w\_i y\_i' \cos \theta\right)^2}{\sum\_{i=1}^{n} w\_i^2}}\tag{5}$$

where *θ* is the azimuth angle of the ellipse, i.e., the angle formed by the clockwise rotation from due north to the long axis of the ellipse; (*X̄<sub>w</sub>*, *Ȳ<sub>w</sub>*) are the center of gravity coordinates; *x<sub>i</sub>* and *y<sub>i</sub>* are the coordinates of element *i*; *w<sub>i</sub>* represents its weight; *x′<sub>i</sub>* and *y′<sub>i</sub>* represent the deviations of the coordinates of each element from the mean center; and *σ<sub>x</sub>* and *σ<sub>y</sub>* are the standard deviations along the x- and y-axes, respectively.
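The SDE parameters can be computed directly from the coordinates and weights. The sketch below is a minimal NumPy illustration under stated assumptions, not the software routine used in this study: the function name `standard_deviational_ellipse` and the sample points are hypothetical. The cross term under the square root is taken as the squared sum of *w<sub>i</sub>*² *x′<sub>i</sub>* *y′<sub>i</sub>*, and *σ<sub>y</sub>* uses the sum *x′<sub>i</sub>* sin *θ* + *y′<sub>i</sub>* cos *θ*, which is the usual rotated-coordinate form of these formulas.

```python
import numpy as np

def standard_deviational_ellipse(x, y, w=None):
    """Return (center_x, center_y, theta, sigma_x, sigma_y); theta in radians."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    w = np.ones_like(x) if w is None else np.asarray(w, dtype=float)

    # Weighted mean center (center of gravity)
    cx = np.sum(w * x) / np.sum(w)
    cy = np.sum(w * y) / np.sum(w)

    # Deviations of each element from the mean center (x'_i, y'_i)
    dx, dy = x - cx, y - cy
    w2 = w ** 2

    # Rotation angle; arctan2 avoids a division by zero when the
    # cross term c is zero (points aligned with a coordinate axis)
    a = np.sum(w2 * dx**2) - np.sum(w2 * dy**2)
    c = np.sum(w2 * dx * dy)
    theta = np.arctan2(a + np.sqrt(a**2 + 4.0 * c**2), 2.0 * c)

    # Standard deviations along the two rotated axes
    sin_t, cos_t = np.sin(theta), np.cos(theta)
    sigma_x = np.sqrt(np.sum((w * dx * cos_t - w * dy * sin_t) ** 2) / np.sum(w2))
    sigma_y = np.sqrt(np.sum((w * dx * sin_t + w * dy * cos_t) ** 2) / np.sum(w2))
    return cx, cy, theta, sigma_x, sigma_y

# Hypothetical example: four equally weighted points on the line y = x.
cx, cy, theta, sigma_x, sigma_y = standard_deviational_ellipse(
    [0, 1, 2, 3], [0, 1, 2, 3]
)
print(cx, cy, np.degrees(theta), sigma_x, sigma_y)
```

For points lying exactly on the line *y* = *x*, the sketch returns an azimuth of 45° measured clockwise from north, with all dispersion on one axis and essentially none on the other, which matches the geometric meaning of the ellipse.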

#### 3.3.4. Data Collection

The socioeconomic data come from the provincial and municipal statistical yearbooks of Hubei, Hunan, Jiangxi, Anhui, and Jiangsu in 2002, 2008, 2014, and 2020. Because the ESA land cover data are authoritative, temporally continuous, openly available, and cover the research time points, the land-use data in this research are derived from the ESA global 300 m land cover data for 2001, 2007, 2013, and 2019. The administrative zoning data were obtained from the 1:4 million dataset of the China National Basic Geographic Information Center. To preprocess the land-use data, we first grouped the land cover classes into six land-use types: cultivated land, forestland, grassland, water bodies, construction land, and unused land. The six types were then recoded as CLT1 to CLT6. On this basis, Fragstats software was used to measure the fragmentation, aggregation, and landscape morphology indices of the cultivated land.
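The recoding step can be sketched as a lookup applied to the land cover grid. The example below is illustrative only: the text does not give the actual correspondence between the ESA classes and CLT1 to CLT6, so the table `RECODE` lists one hypothetical source code per target type, and the function name `recode_land_cover` and the `nodata` value are likewise assumptions.

```python
import numpy as np

# Hypothetical lookup table: source land cover code -> CLT number.
# One illustrative source code per type; a real table would list
# every source class assigned to each of the six types.
RECODE = {
    10: 1,    # cultivated land  -> CLT1
    50: 2,    # forestland       -> CLT2
    130: 3,   # grassland        -> CLT3
    210: 4,   # water bodies     -> CLT4
    190: 5,   # construction land -> CLT5
    200: 6,   # unused land      -> CLT6
}

def recode_land_cover(grid, table=RECODE, nodata=0):
    """Map source class codes to CLT numbers; unlisted codes become nodata."""
    grid = np.asarray(grid)
    out = np.full(grid.shape, nodata, dtype=np.uint8)
    for source_code, clt in table.items():
        out[grid == source_code] = clt
    return out

# Hypothetical 2 x 3 grid of source codes; 999 is not in the table.
sample = np.array([[10, 50, 130],
                   [210, 190, 999]])
recoded = recode_land_cover(sample)
print(recoded)
```

A recoded grid of this kind, with one integer per land-use type, is the form of input that class-level landscape metrics are computed from.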

### **4. Results**
