**1. Introduction**

Vegetation is a critical component of terrestrial ecosystems and is very sensitive to climate change [1–3]. The global average surface temperature increased by 0.85 ◦C from 1880 to 2012 [4], which triggered phenological changes in different vegetation types in different regions. The increase in temperature, as one of the causes of variation in vegetation, has led to a significant overall change in vegetation, manifested by an increase in the Normalized Difference Vegetation Index (NDVI) during the vegetation growth season in the Northern Hemisphere [5], and the growth rate of NDVI in forests is greater than that of other vegetation types [6–8]. The community structure of snow-meadow vegetation has changed significantly as a result of climate change in Northern Japan over the last 40 years [9]. In the Siberian Mountains, the birch area has increased by 10%, and birch stands and the treeline boundary have moved upslope at a rate of 1.4 m yr<sup>−</sup><sup>1</sup> and 4.0 m yr<sup>−</sup>1, respectively, since the 1970s with the onset of warming [10]. In China, the zone of tundra vegetation of the Changbai Mountains has been invaded by herbaceous plants with the rising temperature over the last 30 years [11].

As the third pole of the earth, the Tibetan Plateau (TP) is highly sensitive to climate change and has been experiencing a rapid warming of 0.4◦ 10 yr<sup>−</sup><sup>1</sup> over the last 30 years [12,13] and with precipitation increasing by 1.96 mm 10 yr<sup>−</sup><sup>1</sup> in 1994–2015 [14].

**Citation:** Liu, J.; Lu, Y. How Well Do CMIP6 Models Simulate the Greening of the Tibetan Plateau? *Remote Sens.* **2022**, *14*, 4633. https://doi.org/10.3390/rs14184633

Academic Editor: Lunche Wang

Received: 1 July 2022 Accepted: 13 September 2022 Published: 16 September 2022

**Publisher's Note:** MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

**Copyright:** © 2022 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https:// creativecommons.org/licenses/by/ 4.0/).

This "warm–humid" trend has led to tremendous changes on the land surface, such as glaciers collapsing [15], permafrost thawing [16], and lakes expanding [17], as well as surface vegetation growth. Liu et al. [18] found that the vegetation coverage on the TP showed a trend of "overall increase and partial degradation" from 1981 to 2005, with the area of improvement much larger than the area of degradation. Wei et al. [19] found that "warm-humid" has a significant promoting effect on the improvement of vegetation on the TP, and Zhang et al. [20] found that the overall NDVI of grassland in the growing season of the TP also shows an increasing trend. Xu et al. [21] used the leaf area index inversion by NOAA–AVHRR to study the temporal and spatial changes in vegetation cover characteristics in the TP, and also found an overall increase in vegetation cover. Zhang et al. [13] found that the green-up dates with the alpine vegetation in the Plateau had a continuous advancing trend with a rate of ~1.04 <sup>d</sup>·y<sup>−</sup><sup>1</sup> from 1982 to 2011.

Remote sensing, as one of the major tools for studying vegetation's response to climate change [22], was used to study the vegetation on the TP, with various long-term vegetation leaf area index (LAI) datasets derived through satellite remote sensing, such as GLASS LAI [23], GLOBMAP LAI [24], GIMMS LAI [25], and MODIS LAI [5]. Hua et al. (2018) [26] used the GIMMS NDVI dataset (NDVI-3g) to study the temporal and spatial variations in vegetation dynamics controlled by climate on the Tibetan Plateau during 1982–2011 and found that the potential cause of the change in vegetation dynamics might be controlled by the climate, particularly the increasing precipitation and the significant temperature rise in the Central and Southeastern Tibetan Plateau. Although remote sensing products are very useful for understanding historical vegetation variations, satellite remote sensing could not directly measure future vegetation dynamics. Another powerful tool, the state-ofthe-art Earth System Models that incorporate a process-based vegetation growth module, can simulate not only historical variations in vegetation but also those in future climate. Zhu et al. [27] built the first pedotransfer function to simulate temporal variations in vegetation coverage (VC) and found that the pedotransfer function more accurately simulated temporal variation in VC than a multiple linear regression in an alpine meadow on the Tibetan Plateau. Lu et al. [28] found that net primary productivity (NPP) and LAI decreased from the southeast to the northwest of the Tibetan Plateau by using the atmosphere–vegetation interaction model (AVIM) to simulate the distribution of LAI and NPP over the Tibetan Plateau. The accuracy of the simulation results varies greatly due to the design and use of the model itself, so it is very important to evaluate the accuracy of the simulation data before using the simulations.

The International Coupled Model Comparison Program (CMIP), proposed by the World Climate Research Program Group, currently in the sixth generation (CMIP6), has been widely used for studying various environmental changes. Tian et al. [29] analyzed changes in the annual mean surface air temperature (SAT) and precipitation, and also the related uncertainties using historical simulations and future projections under the Representative Concentration Pathway scenarios (RCPs) from the CMIP5 models across China and in its seven sub-regions. Zhang et al. [30] demonstrated that there may be a basic spatial scale limit below which it may not be useful to further refine climate model predictions based on an integrated analysis of coupled model simulations and projections from CMIP3 and CMIP5. Using the established linear relationship and monthly temperature simulations from CMIP5 models over the Northern Hemisphere during the 21st century, Xia et al. [31] found the start of the vegetation growing season (SOS) will have advanced by 4.7 days under RCP2.6 (Representative Concentration Pathway) by 2040–2059. After CMIP5, more and more models have incorporated a dynamic vegetation growth module, and therefore evaluating CMIP vegetation simulations has drawn much attention. Anav et al. [32] assessed the ability of 18 Earth system models (ESMs) in CMIP5 and found that most models overestimated the global average LAI and half of the models also overestimated the LAI trend for 1986–2005. Zhao et al. [33] analyzed the changes in projected global LAI from 16 CMIP5 ESMs and 17 CMIP6 ESMs, and found that the CMIP6 models had a better ability to describe the global area-averaged LAI time series.

Lawrence et al. [34] did not evaluate the performance of the simulated global tree height of the CMIPs' ESMs but gave the biases of tree height for the offline simulations of CLM5BGC. Brovkin et al. [35] evaluated the performance of MPI-ESM, and Seller [36] evaluated UKESM1-0-LI in terms of vegetation distribution; both found that the two models overestimated the fraction of tree coverage. Most evaluations have focused on the global scale; few have focused on regional scales such as the Tibetan Plateau. Bao et al. [37] evaluated 12 CMIP5 ESMs for reproducing vegetation cover and LAI over the Tibetan Plateau in 1986–2005, and found that INMCM4, BCC-CSM-1.1M, MPI-ESM-LR, IPSL-CM5A-LR, HadGEM2-ES, and CCSM4 were the best six models for capturing vegetation among the 12 models. CMIP6 has had the largest participation since its implementation [38]. However, how well the CMIP6 models simulate vegetation growth, especially the recent greening of the Tibetan Plateau, is unknown.

LAI is usually defined as half of the total leaf surface area per unit of surface area [39], and NDVI is defined as the ratio of the difference between the near-infrared band (NIR) and the visible red band (R), and the sum of the two bands, NDVI = (NIR − R)/(NIR + R). NDVI is directly obtained from the satellites' reflection information and the real-time variation of vegetation after a simple calculation, which can quantitatively reflect the actual variation of vegetation, including the vegetation structure, the vegetation growth, and the vegetation coverage during the observation period, and is widely used in the field of vegetation remote sensing [40–42]. LAI and NDVI are both important indices for quantifying the vegetation variations, but only LAI could be validated because NDVI is not an output of the dynamic vegetation growth models in CMIP6. LAI, as a key indicator of vegetation growth [43], has been widely used in global climate models, ecological models, hydrological models, and ecosystem productivity models [44]. Therefore, we focused on LAI validations in our work rather than NDVI.

In recent decades, although greening is one of the most important changes in the Tibetan Plateau, few works have particularly focused on the performance of the model simulations on the greening of the Tibetan Plateau. We developed our own ranking method that considered the temporal and spatial simulations' abilities to give an overall assessment of CMIP6 models. We also quantified the growth of different vegetation types. Our goals with this work are to evaluate the performance to simulate the LAI trend and LAI of the CMIP6 model during the growing season and to provide a reference for the selection of simulation data of vegetation changes, aid the research into vegetation in the Tibetan Plateau, and analyze the sources of temporal and spatial error in each model, laying a foundation for model optimization.

### **2. Data and Methods**

### *2.1. Study Area*

The TP [45,46] is located at 26–39◦N latitude in Southwest China. Surrounded by high mountains on the edge of the area, the internal topography is complex, including plateaus, basins, glaciers, lakes, and swamps [47]. Its geographical features, such as the high altitude, and the complex and changeable topography, have created special climatic conditions and water and heat distribution in this area, and have also created its distinctive vegetation distribution. As the largest alpine grassland ecosystem in the world, the TP is dominated by meadows and grasslands (Figure 1), concentrated across a wide range of Central Tibet. The vegetation types in Tibet have spatial distribution characteristics that gradually change from southeast to northwest. From southeast to northwest in Tibet, the vegetation types are distributed in the order of forests, shrubs, meadows, grassland, and desert (Figure 1). The dataset is derived from the 1:1 million vegetation data set collected in China in 2001, and it is provided by the National Cryosphere Desert Data Center (http://www.ncdc.ac.cn) (accessed on 9 December 2021).

**Figure 1.** (**a**) The location of the Tibetan Plateau [48] in the world map, and the world map from ArcGIS. (**b**) Distribution of vegetation types on the Tibetan Plateau [49].

### *2.2. Satellite Data*

To evaluate the ability of the 35 models from the CMIP6 to reproduce the LAI over the Tibetan Plateau, the 1981–2018 LAI data from the Global Land Surface Satellite (GLASS) dataset with an eight-day temporal frequency and a 0.5◦ × 0.5◦ spatial resolution were

used as a benchmark in our study. GLASS LAI uses generalized regression neural networks (GRNNs) to invert LAI from Time-Series AVHRR Surface Reflectance data; the algorithm trains GRNNs using preprocessed AVHRR Time-Series AVHRR Surface Reflectance, and then uses rolling processing to produce time-continuous long-term GLASS LAI products from the preprocessed AVHRR Surface Reflectance [23]. Compared with other LAI datasets, GLASS LAI data have a long observation period, high quality, and good accuracy [50]. They have more complete trajectories than the MODIS LAI product and also show lower uncertainty than the MODIS and CYCLOPES LAI products compared with 20 ground-measured LAI reference maps. Many studies use GLASS LAI as a reference database for research or validation [51–55]. All these factors make it an ideal long-term dynamic LAI observation dataset in this study. The GLASS LAI product (V50) used in this study is available from the University of Maryland and the Center for Global Change Data Processing and Analysis of Beijing Normal University (http://www.glass.umd.edu/Download.html, accessed on 9 March 2021).

### *2.3. CMIP6 Model Simulations*

Thirty-five CMIP6 models with no missing data were selected in this study, and the LAI from outputs of historical simulations for 1850–2014 was used (https://esgf-node.llnl. gov/search/cmip6/, accessed on 16 August 2021).

In order to facilitate the comparison of the simulation and observational data, all simulations were downloaded and converted to a 0.5◦ × 0.5◦ spatial resolution by bilinear interpolation from low to high resolution. The overlaps of the GLASS datasets and CMIP6 were 1981–2014, so our analysis focused on 1981–2014. The model's information is shown in Table 1.

### *2.4. Evaluation Approach*

A series of evaluation indicators was applied to quantify the agreemen<sup>t</sup> between the observed and simulated LAI and the trend of the CMIP6 models. In this study, we calculated the average LAI during the growing season (May–September) for each year as the average LAI, a linear regression trend of the average LAI from 1981 to 2014 as the trend, and an increasing trend indicated TP greening. We also calculated the monthly average LAI for each month of the growing season, and the TP averaged monthly average LAI during 1981–2014 as the monthly LAI. Then, we calculated the linear regression trend of the monthly average LAI for each month during the growing season from 1981 to 2014, and the TP averaged trend of the monthly average LAI as the monthly LAI trend. We obtained monthly variations from the monthly LAI and the monthly LAI trend during the growing season. In the following, we further describe the metrics used for model evaluation and the method used for ranking the models.

### 2.4.1. Evaluation Metrics

The spatial correlation (pattern correlation) was used to quantify the correlation between the grid cell trend (or the grid cell average LAI from 1981 to 2014) distribution in the models and observations. Through a combination of the definitions of Bao et al. [37] and Chang et al. [80], the spatial correlation formula for the simulated and observed trends in this study was defined as follows:

$$\text{Pattern correlation} = \frac{\frac{1}{N} \sum\_{i}^{N} \mathcal{W}\_{i} \left( M\_{i} - \overline{M} \right) \left( O\_{i} - \overline{O} \right)}{\sqrt{\frac{1}{N} \sum\_{i}^{N} \mathcal{W}\_{i} \left( M\_{i} - \overline{M} \right)^{2}} \sqrt{\frac{1}{N} \sum\_{i}^{N} \mathcal{W}\_{i} \left( O\_{i} - \overline{O} \right)^{2}}} \,. \tag{1}$$

where *N* is the total number of grid cells under evaluation, *Mi* and *Oi* are the simulated and observed trend (or the average LAI from 1981 to 2014) from the CMIP6 models and the GLASS of the grid cell *i* , and *Wi* is the area weight of the grid cell *i* (all grid weights add up to 1) [37]. We calculated *Wi* in the Pearson correlation coefficient equation as the area of

each grid cell associated with the central geographic latitude of each grid cell [37]. In the TP, the variation in *Wi* is not obvious and the value of *Wi* can almost be neglected.



The bias between the simulated and observed grid cell trend (or the grid cell average LAI from 1981 to 2014) was calculated to quantify the main bias between the model simulations and GLASS observations. In our study, we subtracted the observed trend (or the average LAI from 1981 to 2014) from the simulated trend (or the average LAI from 1981 to 2014) to ge<sup>t</sup> trend (or the average LAI from 1981 to 2014) bias at the single grid cell *i* by Equation (2). We thus obtained a value of the bias at every grid cell and the distribution of the bias across the whole study region. The relative bias of grid cell trend (or the grid cell average LAI from 1981 to 2014) was calculated as the ratio of the trend (or the average LAI from 1981 to 2014) bias to the observed trend (or the average LAI from 1981 to 2014) at the grid cell *i* in Equation (3). We also calculated the TP averaged bias using Equation (4).

$$Bias = M\_i - O\_i \tag{2}$$

$$Relative\_{Bias} = \frac{Bias\_i}{O\_i} \tag{3}$$

$$Bias\_{\text{avg}} = \frac{\sum\_{1}^{N} |M\_{i} - O\_{i}|}{N} \tag{4}$$

The root-mean-square error (RMSE) was used to measure the difference between the simulations and observations. Similar to bias, we calculated the trend (the average LAI from 1981 to 2014) of the two datasets at grid cell *i* , and then aggregated the results over the entire TP. Next, we converted spatial two-dimensional data of trend (or the average LAI from 1981 to 2014) in simulations and observations into one dimension and calculated the RMSE of the two columns (the simulations and observations) of the one-dimensional data by Equation (5). This RMSE was used in the ranking. Moreover, we used RMSE to quantify the difference in the average LAI from 1981 to 2014 sequence between the simulation and observation during the growing season in 1981–2014 at single grid cell *i* , and then obtained the distribution of RMSE across the region.

$$RMSE = \sqrt{\frac{\sum\_{1}^{N} (M\_{\text{i}} - O\_{\text{i}})}{N}} \tag{5}$$

The ratio of the standard deviation (*Ratioδ*) was used to quantify the magnitude of the difference in variation between the simulation and the observation. Similar to RMSE, we first converted the spatial two-dimensional data of the grid cell trend (or the grid cell average LAI from 1981 to 2014) in simulations and observations into one dimension, and then calculated the standard deviation of the simulations and observations by Equation (6), and finally calculated the ratio of the two standard deviations. Furthermore, *δM* and *δO* were the standard deviations of the model simulations and the GLASS observations, respectively. The ratio of trend (*Ratiotrend*) was used to quantify the variation of the simulated trend and the observed trend as either overestimation or underestimation. We calculated the *Ratiotrend* by Equation (7); *trend M* was the simulated trend, and the *trendO* was the GLASS trend. A ratio less than 0 indicated that the trend was not captured, contrary to the trend in GLASS. A ratio greater than 0 but less than 1 indicated that the greening or declined trend was captured, but was underestimated. A ratio greater than 1 indicated an overestimation of the greening or declined trend.

$$Ratio\_{\delta} = \frac{\delta\_M}{\delta\_O} \tag{6}$$

$$Ratio\_{trend} = \frac{trend\_M}{trend\_O} \tag{7}$$

### 2.4.2. Significant Test Method

We used two methods for significance testing, the Student's *t*-test and the Mann– Kendall trend test. The Student's *t*-test was used for the significant difference test between simulations and observations. The Mann–Kendall trend test was used to detect whether a time series was steadily increasing/decreasing or unchanging.

### 2.4.3. Ranking Method

A ranking scheme was developed by Brunke et al. to score the multi-bulk aerodynamic algorithm for calculating the turbulence fluxes on the ocean surface [81]. Decker et al. [82] ranked the bias and standard deviation of error between reanalysis products and flux tower measurements using the same method as Brunke et al. On the basis of Decker et al., Wang et al. [83] extended this ranking approach and increased the statistical parameters to four, including the correlation coefficient (ρ), the standard deviation ratio (σr/σobs), the standard deviation error (σd), and the difference (bias) to rank the ability of six kinds of reanalysis data to reproduce climate characteristics over the Tibetan Plateau. Since then, this ranking approach, as a good example of model performance evaluation, has been

used in many studies [32,37]. In this study, we adjusted the ranking method used by Wang and Zeng, and the ranking metrics were changed into the spatial correlation (pattern correlation), the bias (Bias), the root mean square error (RMSE), and the ratio of standard deviation (*Ratioδ*).

In the context of this study, the simulation with the highest pattern correlation, the lowest bias and RMSE, and the closest ratio was considered to have the best performance for reproducing the trend (or the LAI) over the Tibetan Plateau. The models were ranked from 1 to 35, with 1 being the model with the lowest value in magnitude of bias, RMSE, or |ratio-1| (or the highest pattern correlation) and 35 being the model with the highest value in magnitude of bias, RMSE, or |ratio-1| (or the lowest pattern correlation) [82]. We then calculated the total score of the four metrics for a single model and defined the total score as the "error ranking". The higher the model's error ranking, the closer the relationship between the simulations and observations.
