### 3.1.1. Site-Specific Evaluation of MODIS GPP Products and MOD17 Algorithm

The eight-day EC flux tower GPP (GPP\_obs) was compared with four estimates: the MOD17A2H GPP product (GPP\_MODIS), GPP simulated with in situ meteorological forcing data (GPP\_Insitu), GPP simulated with an optimized maximum LUE parameter (GPP\_LUEopt), and GPP simulated with all five parameters optimized (GPP\_Fiveopt). As illustrated in Figure 3a, the eight-day MOD17A2H GPP product substantially underestimated the EC-observed GPP. Across all sites, the RMSE between the MOD17A2H product and the in situ flux observations was 1.80 gC/m2/day, R<sup>2</sup> was 0.71, and the regression slope was 0.49; that is, the product explained 71% of the variance in tower-observed GPP, while the slope well below 1 reflects the underestimation. As shown in Figure 3b, driving the MOD17 model with in situ meteorological data gave a better correlation between simulation and observation: the model explained 79% of the variance in the observations (R<sup>2</sup> = 0.79), but the slope was 0.43 and the bias remained large, close to that of the MOD17A2H product. The modeled GPP therefore still underestimated the observed GPP, which indicates that the meteorological forcing data were not the main cause of the underestimation. By contrast, optimizing the maximum LUE parameter (Figure 3c) significantly improved model performance across all sites, with R<sup>2</sup> = 0.86, RMSE = 1.01 gC/m2/day, and rRMSE = 6.99%, and brought the regression line closer to the 1:1 line, which underlines the importance of the LUE parameter in GPP modelling. Optimizing all five parameters improved performance further: almost all points lay close to the 1:1 line (Figure 3d), with R<sup>2</sup> = 0.91, RMSE = 0.81 gC/m2/day, and rRMSE = 5.59%, the best performance among these simulations.

**Figure 3.** Comparisons of eight-day gross primary productivity (GPP) of MOD17A2H products and GPP simulations by MOD17 model with the flux tower GPP for all sites. Eight-day GPP scatter plots of the EC-observed GPP and (**a**) the original MOD17A2H products; (**b**) in situ meteorology forcing data; (**c**) only LUE optimized results; and (**d**) all parameters optimized results. The units of RMSE and rRMSE are gC/m2/day and %, respectively.
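The statistics reported for Figure 3 (regression slope, R<sup>2</sup>, RMSE, and rRMSE) can be computed along the following lines. This is a minimal sketch, not the authors' code: the function name `evaluate_gpp` is hypothetical, R<sup>2</sup> is taken as the squared Pearson correlation, and the normalizer for rRMSE is an assumption because this section does not define it (the mean of the observations is used by default, with the observed range as an alternative).

```python
import numpy as np

def evaluate_gpp(obs, sim, norm="mean"):
    """Compare simulated with observed GPP (hypothetical helper).

    obs, sim : paired GPP values in the same units (e.g. gC/m2/day).
    norm     : "mean" or "range"; normalizer for rRMSE (an assumption,
               the source section does not state which one is used).
    """
    obs = np.asarray(obs, dtype=float)
    sim = np.asarray(sim, dtype=float)

    # Keep only pairs where both values are present.
    ok = np.isfinite(obs) & np.isfinite(sim)
    obs, sim = obs[ok], sim[ok]

    # Least-squares line sim = slope * obs + intercept.
    slope, intercept = np.polyfit(obs, sim, 1)

    # R2 as the squared Pearson correlation coefficient.
    r = np.corrcoef(obs, sim)[0, 1]

    # Root-mean-square error and its relative form in percent.
    rmse = np.sqrt(np.mean((sim - obs) ** 2))
    scale = obs.mean() if norm == "mean" else obs.max() - obs.min()

    return {
        "slope": float(slope),
        "intercept": float(intercept),
        "r2": float(r ** 2),
        "rmse": float(rmse),
        "rrmse": float(100.0 * rmse / scale),
    }
```

A simulation that is exactly half of the observations gives R<sup>2</sup> = 1 with a slope of 0.5, which illustrates why a high R<sup>2</sup> and a strong underestimation can occur together, as in Figure 3a,b.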

When the observed and simulated GPP were accumulated to annual totals for each site, the MOD17A2H GPP product again showed a substantial underestimation (Figure 4), similar to the result at the eight-day time scale. On an annual time scale, the MOD17A2H GPP was reasonably well correlated with the tower-observed GPP across all sites (R<sup>2</sup> = 0.69), although the errors were large (RMSE = 347.31 gC/m2/y, rRMSE = 60.48%; Figure 4a). Using in situ climate forcing data gave a somewhat better relationship between modeled and tower-observed GPP (R<sup>2</sup> = 0.73; Figure 4b), but the modeled GPP was still underestimated relative to the observations. Optimizing the model parameters improved the modeled GPP markedly (Figure 4c,d): almost all points lay close to the 1:1 line. Optimizing only LUEmax and optimizing all five parameters gave R<sup>2</sup> of 0.87 and 0.92 and rRMSE of 23.93% and 19.55%, respectively, so the five-parameter optimization performed better, which signifies the importance of also optimizing the temperature and water constraint factors in arid regions.

**Figure 4.** Comparisons of annual GPP of MOD17A2H products and GPP simulations by MOD17 model with the flux tower GPP for all sites: (**a**) Original MOD17A2H products; (**b**) in situ meteorology forcing data; (**c**) only LUE optimized results; and (**d**) all parameters optimized results. The units of RMSE and rRMSE are gC/m2/y and %, respectively.
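The accumulation from eight-day values to annual totals can be sketched as follows. The helper `annual_gpp` is hypothetical, and the sketch makes several assumptions: each input value is a mean daily rate (gC/m2/day) for one eight-day period, the periods start on 1 January and are gap-free, and the last period of the year is shortened to 5 or 6 days. How the authors handled missing periods is not stated in this section.

```python
import calendar

def annual_gpp(eight_day_mean, year):
    """Sum eight-day mean daily GPP (gC/m2/day) to an annual total (gC/m2/y).

    Assumes a complete, gap-free series of consecutive eight-day periods
    starting on 1 January of `year`.
    """
    days_in_year = 366 if calendar.isleap(year) else 365
    total = 0.0
    for i, rate in enumerate(eight_day_mean):
        start_day = 8 * i  # zero-based day of year at which this period starts
        length = min(8, days_in_year - start_day)  # final period is truncated
        if length <= 0:
            raise ValueError("more eight-day periods than fit in one year")
        total += rate * length  # daily rate times number of days in the period
    return total
```

For example, a constant rate of 1 gC/m2/day over 46 periods yields 365 gC/m2/y in a non-leap year.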

### 3.1.2. Biome-Specific Evaluation of MODIS GPP Product and MOD17 Algorithm

The MOD17A2H GPP product and the three MOD17-based GPP simulations were compared with the flux-derived eight-day GPP for each biome type (Figure 5). We divided the original grassland class into two types, grassland and desert grassland, because of the large differences in species and climate conditions among these sites. As shown in Figure 5, the original MOD17A2H GPP product substantially underestimated GPP in grassland, cropland, and forest ecosystems, but not in desert ecosystems. The correlation between the MOD17A2H GPP product and EC-observed GPP was strongest in grassland ecosystems (R<sup>2</sup> = 0.82), followed by cropland (R<sup>2</sup> = 0.80) and forest ecosystems (R<sup>2</sup> = 0.53), and weakest in desert ecosystems (R<sup>2</sup> = 0.42). The slope of the linear regression in each scatter plot also reveals the bias between MOD17A2H GPP and tower-observed GPP. The regression line for the forest ecosystems departs furthest from the 1:1 line, which indicates the largest bias between MOD17A2H GPP and tower-based GPP, followed by the cropland ecosystems and then the grassland and desert ecosystems. Large biases therefore exist in most ecosystems in the arid regions of China, especially in forest and cropland ecosystems. Using the in situ forcing data did not significantly improve the results for any biome type, and GPP simulated with in situ data was still underestimated in most ecosystems. However, optimizing the parameters of the MOD17 model improved the GPP simulations significantly in most ecosystems. The scatter points of modeled and EC-measured GPP were distributed closely around the 1:1 line, indicating that optimizing LUEmax and the other parameters can improve GPP simulation in most ecosystems in the arid region. Nevertheless, a relatively large bias remained even after parameter optimization, and the optimization had less effect on the GPP simulation for desert ecosystems, indicating that it is difficult to simulate GPP effectively in desert ecosystems with the current MOD17 model.

**Figure 5.** Comparison between eight-day GPP of MOD17A2H products and GPP simulations by MOD17 model with the flux tower GPP for the major ecosystems including: (**a**) Grassland; (**b**) cropland; (**c**) desert steppe; and (**d**) forest. The unit of RMSE is gC/m2/day. The blue points represent original MOD17A2H products; green points represent in situ meteorology forcing data; pink points represent only LUE optimized results; and red points represent all parameters optimized results.
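To make the role of the optimized parameters concrete, the following is a minimal sketch of the standard MOD17 light-use-efficiency formulation, in which GPP is the product of a maximum LUE, a minimum-temperature scalar, a VPD scalar, and absorbed PAR (taken as 0.45 × shortwave radiation × FPAR). It assumes that the five parameters referred to in this section are the maximum LUE and the lower and upper limits of the two ramp functions; the function names and the parameter values in the usage note are illustrative placeholders, not the values used or fitted by the authors.

```python
import numpy as np

def ramp(x, lo, hi):
    """Linear ramp scalar: 0 at or below `lo`, 1 at or above `hi`."""
    x = np.asarray(x, dtype=float)
    return np.clip((x - lo) / (hi - lo), 0.0, 1.0)

def mod17_gpp(swrad, fpar, tmin, vpd,
              lue_max, tmin_min, tmin_max, vpd_min, vpd_max):
    """Sketch of the MOD17 GPP equation.

    swrad   : incoming shortwave radiation (MJ/m2/day)
    fpar    : fraction of absorbed PAR (0 to 1)
    tmin    : daily minimum air temperature (deg C)
    vpd     : daytime vapour pressure deficit (Pa)
    lue_max : maximum light use efficiency (gC/MJ)
    The remaining four arguments are the ramp limits of the two scalars.
    Returns GPP in gC/m2/day.
    """
    apar = 0.45 * np.asarray(swrad, dtype=float) * fpar  # absorbed PAR
    f_tmin = ramp(tmin, tmin_min, tmin_max)              # cold limitation
    f_vpd = 1.0 - ramp(vpd, vpd_min, vpd_max)            # dryness limitation
    return lue_max * f_tmin * f_vpd * apar
```

With the placeholder call `mod17_gpp(20.0, 0.5, 20.0, 500.0, lue_max=1.0, tmin_min=-8.0, tmin_max=12.0, vpd_min=650.0, vpd_max=5300.0)`, neither constraint is active and GPP equals `lue_max` times absorbed PAR. Because GPP is linear in `lue_max`, changing that parameter alone only rescales the output, whereas the four ramp limits change when and how strongly the temperature and VPD constraints apply.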
