2.4.1. Trend Analysis

The method used in this study is shown in Figure 3. The spatiotemporal trends of LSP during 2001–2019 were estimated using *Sen's slope* method, also known as the Theil-Sen median method. The method is a robust nonparametric statistical method for trend calculation that is insensitive to measurement errors and is far more accurate than nonrobust simple linear regression [36]. *Sen's slope* was calculated using Equation (2):

$$Sen's\,slope = Median\left(\frac{\mathbf{x}\_j - \mathbf{x}\_i}{j - i}\right) \tag{2}$$

where the median is the mean value of all the slopes, and *xi* and *xj* represent the LSP dates of years *i* and *j*. A negative *Sen's slope* indicates an advancing trend, whereas a positive *Sen's slope* indicates a delaying trend.

**Figure 3.** Flowchart of the method used in this study. MODIS 13A2 NDVI (2001–2019), in which the spatiotemporal distributions of NDVI for 2001, 2010, 2015 and 2019 are presented as examples of NDVI time series used as extract phenology metrics. Land cover (2001–2019), in which the spatiotemporal distribution of land cover for 2001, 2010, 2015 and 2019, used as an example of the changes in different vegetation types. Environmental factors, the first column indicates the drivers for SOS and the second column indicates the drivers for EOS, and shows the spatial distribution of each driver separately, which is used to example the predictors used for RF models.

Then, we used the Mann-Kendall (MK) method to test the significance of time series trends, which is a nonparametric statistical test and is robust to outliers [37]. We used the normal cumulative distribution function to determine the *p*-value of the MK test statistic with a significant confidence level of *p* < 0.1. In this study, we used *Sen's slope* and MK test to trend analysis and significance test the spatial distribution (average growth) and interannual variation (interannual growth) of LSP for different vegetation types from 2001 to 2019, all using MATLAB 2017a were completed.

#### 2.4.2. Change Pattern and Relative Attribution Analysis

To further understand the seasonal changes of vegetation growth in the QMs in the past two decades, we divided the trend changes of LSP (SOS, EOS, and LOS) into six combinations, where each combination represents a pattern of vegetation growth (Table 2). We used spatial analysis to count the proportion and significance of different vegetation for each pattern and to derive the dominant pattern of seasonal changes in the growth of each vegetation. All analysis was accomplished using ArcGIS 10.4.


**Table 2.** Six change patterns of plant growing seasons, i.e., six combination types of SOS, EOS, and LOS change trends. The plus and minus signs represent the trend direction corresponding to SOS, EOS, or LOS, respectively.

To evaluate the symmetry of SOS and EOS for LOS changes, we used the C-index proposed by Garonna et al. [38] to calculate the relative contribution of trends in SOS and EOS for LOS changes. It was calculated as follows:

$$C = -1 + \frac{2 \cdot \left| SOS\_{slope} \right|}{\left| SOS\_{slope} \right| + \left| EOS\_{slope} \right|} \tag{3}$$

where *SOSslope* and *EOSslope* are the *Sen's slope* of SOS and EOS, respectively. A positive *C* value indicates that the trend in LOS is mainly attributable to changes in EOS, and a negative *C* value indicates that the trend in LOS is mainly attributable to changes in SOS. The variation of *C* value is from −1 to 1.

#### 2.4.3. Analysis of the Relative Importance of Different Drivers

Based on previous studies on the drivers of interannual variation in vegetation phenology [18–24], we selected drivers such as Table 3 to simulate SOS and EOS. These drivers are divided into three main categories: meteorological, soil and biological factors, which are further divided into preseason cumulative values and cumulative values throughout the growing season.

**Table 3.** Predictive variables used in the modeling of the LSP dates. The 12 predictive variables were classified into three categories: meteorological factors, soil factors, and biological factors. Meteorological factors include TP, TG, PP, PG, SWP, and SWG (6 in total). Soil factors include STP, STG, SMP, and SMG (4 in total). Biological factors include MN and MD (2 in total). Growing season is defined as the days between SOS and EOS.


\* Predicted over a period of 30 days. \*\* Predicted over a period of 60 days.

In this study, the RF model was used to assess the relative importance of the drivers affecting interannual variations in LSP. First, for each pixel, we calculated the partial correlation coefficients between environmental factors (TP, TG, PP, PG, SWP, SWG, STP, STG, SMP, SMG, MN, and MD) and SOS and EOS during 0, 1, 2, ... *n* months before SOS

and EOS, and we separately derived the highest correlation with SOS and EOS for the time range of 30 days before SOS and 60 days before EOS for environmental data. Moreover, a subset of variables highly correlated with SOS and EOS was selected, and the values of the variables at selected years and locations (spatiotemporal models) were combined into a set of input feature vectors that are used as inputs for the RF algorithm. Then, these feature vectors were divided equally into two subsets, with 2/3 of the dataset used for model training (in bag) and the remaining 1/3 of the dataset used as an additional test of the RF internal computation (out of bag, i.e., OOB) to estimate the importance of each variable. Variable importance can also be measured by OOB, which compares the increases in OOB error with that variable randomly permuted and all others unchanged [39]. The importance score of a variable is as follows:

$$VI\left(X^{j}\right) = \frac{1}{ntrne} \sum\_{t} \left( err'OOB\_{t}^{j} - errOOB\_{t}^{j} \right) \tag{4}$$

where *Xj* is the *j*th variable, *ntree* is the number of trees, *errOOBjt* is the OOB error of each tree *t*, and *errOOBjt* is the OOB error when *Xj* is permuted, while all other variables remain unchanged among OOB data. For regression, the OOB error is the mean square error.

Finally, to optimize the model, the hyperparameter search was used to select the best tested hyperparameter set, and the optimal model was trained on the whole training set for accurate prediction of the LSP dates. Besides modeling multi-year-scale LSP variation for the entire region, subregional variation according to different vegetation types was also modeled to quantify the relative importance of different drivers. These models were constructed using the RF package in R statistical software.

To evaluate the predictiveness of the model and to further test the applicability of the model, we used a randomly selected subset (1/3 of the dataset) for model validation. Both the proportion of explained variance (R2) and the root mean square error (RMSE) were used to assess the performance of the model on the complete datasets. The stability of the model fit is explained by the mean absolute error (MAE) [40]. These statistics were calculated as follows:

$$\mathcal{R}^2 = 1 - \left[ \frac{\sum\_{i=1}^n \left( y\_i - \widehat{\overline{y}}\_i \right)^2}{\sum\_{i=1}^n \left( y\_i - \overline{y}\_i \right)^2} \right] \tag{5}$$

$$RMSE = \sqrt{\frac{\sum\_{i=1}^{n} \left(y\_i - \widehat{y}\_i\right)^2}{n-1}} \tag{6}$$

$$MAE = \sum\_{i=1}^{n} \left| \frac{y\_i - \stackrel{\frown}{y}\_i}{n} \right| \tag{7}$$

where *yi* is the observed satellite-based SOS or EOS values, is - *y i* the predicted RF-based SOS or EOS values, and *yi* is the mean satellite-observed SOS or EOS values of all selected test pixels for 2001–2019. *n* is the sum of all selected test pixels.
