**4. Discussion**

#### *4.1. Yg0 Is Partly Persistent*

It is infeasible to optimize the summer maize SDT perfectly across the entire study region and thereby eliminate Yg0 completely. The optimum SDT depends on weather conditions during the growing season, which generally lasts ~100 days for summer maize, and weather cannot at present be predicted precisely over such a long period. During the growing season, unfavorable weather conditions (e.g., shifts of heat, radiation, and precipitation among different development stages) may cause SDT optimization to fail. Therefore, the contribution to Yg of suboptimum SDT that results from unfavorable weather conditions is non-persistent. Separating the persistent factors affecting SDT (primarily the knowledge and management skills of farmers) from the non-persistent ones will help in understanding the likelihood of reducing Yg by optimizing SDT.

Here, we adopted the method (Supplementary Text S2) of Farmaha, Lobell, Boone, Cassman, Yang and Grassini [2] to assess persistent Yg0 based on both 1 km Yg0 and 5 km Yg0 time series (Supplementary Figure S3). The 1 km result was derived from the Yg0 time series in croplands continuously cropped with maize (Supplementary Figure S2). We show only the persistent factor percentage (PFP) in Yg based on the "small Yg group (SYg)," PFPSYg. Figure S3a,b show the percentage of persistent values in Yg0 based on 1 km Yg0 and 5 km Yg0, respectively. The 1 km result covers a smaller spatial extent, and a pixel-by-pixel comparison between the two results was performed over the overlapping region (Figure S3c). The spatial variations in the PFP of the two results were moderately correlated, with a correlation coefficient (R) of 0.59. However, the 5 km result is higher overall: the regional PFP values of the 1 and 5 km results over the overlapping region are 59% and 69%, respectively. The higher PFP estimated from 5 km Yg0 may arise because some spatial dynamics in Yg0 were eliminated in 5 km Yg0. Nevertheless, both panels (Figure S3a,b) indicate non-negligible non-persistent components in Yg0 and significant variations in persistent Yg0 over space. Smaller percentages of persistent Yg0 are found in the south of the NCP. Figure S3 presents two examples of assessing persistent Yg0. These examples imply that further studies are required to reveal the impact of climate on SDT, and that assessing persistent Yg0 within smaller regions using high-resolution RS data is necessary to understand the likelihood of narrowing Yg0 on a local scale.
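The pixel-by-pixel comparison described above can be illustrated with a minimal sketch. It assumes, purely for illustration, that the 1 km PFP map is brought onto the 5 km grid by block averaging and that the two maps are then compared with a Pearson correlation over pixels valid in both; the resampling actually used for Figure S3c is not stated here. The function names `block_mean` and `masked_pearson` and the synthetic arrays are hypothetical.

```python
import numpy as np

def block_mean(a, factor):
    """Aggregate a 2-D array by averaging factor x factor blocks, ignoring NaN."""
    a = np.asarray(a, dtype=float)
    rows, cols = a.shape
    rows -= rows % factor              # drop incomplete edge blocks
    cols -= cols % factor
    blocks = a[:rows, :cols].reshape(rows // factor, factor, cols // factor, factor)
    valid = np.isfinite(blocks)
    counts = valid.sum(axis=(1, 3))
    sums = np.where(valid, blocks, 0.0).sum(axis=(1, 3))
    out = np.full(counts.shape, np.nan)
    np.divide(sums, counts, out=out, where=counts > 0)   # blocks with no data stay NaN
    return out

def masked_pearson(x, y):
    """Pearson correlation over pixels that are finite in both maps."""
    x = np.asarray(x, dtype=float).ravel()
    y = np.asarray(y, dtype=float).ravel()
    ok = np.isfinite(x) & np.isfinite(y)
    return float(np.corrcoef(x[ok], y[ok])[0, 1])

# Synthetic stand-ins for the two PFP maps (percent); NOT data from this study.
rng = np.random.default_rng(0)
pfp_1km = rng.uniform(30.0, 90.0, size=(50, 50))
pfp_1km[:5, :5] = np.nan                                  # pixels outside the 1 km extent
pfp_1km_on_5km = block_mean(pfp_1km, factor=5)
pfp_5km = pfp_1km_on_5km + rng.normal(10.0, 5.0, size=pfp_1km_on_5km.shape)

r = masked_pearson(pfp_1km_on_5km, pfp_5km)
bias = np.nanmean(pfp_5km - pfp_1km_on_5km)               # mean 5 km minus 1 km difference
print(f"R = {r:.2f}, mean difference = {bias:.1f} percentage points")
```

Reporting the mean difference alongside R separates agreement in spatial pattern from a systematic offset, which is the distinction drawn above between the moderate correlation and the higher regional PFP of the 5 km result.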

#### *4.2. Yp Derived from Ya and Yp0*

Ya derived from remotely sensed data is also useful for quantifying Yp and Yg [1,28]. Pixel-level Yp could be estimated as a high percentile (95th or 99th) of pixel-level Ya values within a small region around the pixel under investigation. However, as previously mentioned, this method assumes that optimum field management is achieved in some croplands (or pixels) within the domain under investigation. This study proposed a novel satellite-based approach to estimating Yp that avoids this assumption; the new method assumes only that optimum SDT is achieved in on-farm management. The "potential yield" derived from Ya is equivalent to the "potential farmers' yield (Ypf)," as defined in Liu, Yang, Lin, Hubbard, Lv and Wang [6]. In the maize belt of the US, where farmers' management skills were maintained at a high level, Ypf was close to Yp; however, in other regions, where crop growth was strongly stressed, Yp was poorly represented by Ypf [1]. We investigated the differences between Ypf and Yp derived using the new method proposed in this study (Supplementary Figure S4). Modeled Ypf values were significantly smaller than modeled Yp. The regional-scale mean annual Ypf is 8.7 t hm−2, which is significantly less than the Yp estimated in this study and in previous studies [11,67,68], whereas our method produced a result closer to previous estimates. This implies that large gaps exist between farmers' potential yield and the Yp of summer maize over the NCP. The method developed in this study provides a more reliable approach to estimating Yp and can improve the understanding of the Yg of summer maize in the study region.
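As an illustration of the percentile-based estimate discussed above, the following sketch computes a Ypf-like surface as the 95th percentile of Ya within a square moving window. The window size, the function name `window_percentile_yield`, and the synthetic Ya field are assumptions made for the example; they are not the settings or data used in this study or in the cited work.

```python
import numpy as np

def window_percentile_yield(ya, half_window=2, q=95.0):
    """Per-pixel q-th percentile of Ya within a (2*half_window+1)^2 window.

    NaN marks non-cropland pixels and is ignored inside each window.
    """
    ya = np.asarray(ya, dtype=float)
    rows, cols = ya.shape
    ypf = np.full_like(ya, np.nan)
    for i in range(rows):
        for j in range(cols):
            r0, r1 = max(0, i - half_window), min(rows, i + half_window + 1)
            c0, c1 = max(0, j - half_window), min(cols, j + half_window + 1)
            window = ya[r0:r1, c0:c1]
            if np.isfinite(window).any():      # skip windows without cropland
                ypf[i, j] = np.nanpercentile(window, q)
    return ypf

# Synthetic Ya field in t hm-2 (hypothetical values, not results of this study).
rng = np.random.default_rng(0)
ya = rng.uniform(4.0, 9.0, size=(20, 20))
ya[3, 4] = np.nan                              # one non-cropland pixel

ypf = window_percentile_yield(ya, half_window=2, q=95.0)
gap = ypf - ya                                 # percentile-based yield gap per pixel
print(f"mean Ya = {np.nanmean(ya):.2f}, mean Ypf = {np.nanmean(ypf):.2f} t hm-2")
```

Because every window value is an observed Ya, the resulting surface can never exceed the best yield farmers actually achieved nearby, which is why such an estimate represents Ypf rather than Yp where crop growth is widely stressed.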

#### *4.3. Limits of the Method in This Study*

PRYM–Maize has been shown to perform reasonably well in reproducing regional crop yield, with performance comparable to or even better than that of models in recent studies in terms of RMSD [2,69]. PRYM–Maize was then used to develop a new method in this study to quantify Yp, Yp0, Yg, and Yg0. This new method produced a Yp magnitude similar to that produced by the calibrated CGM (Section 3.2), and the spatial pattern in Yp over the NCP simulated using our model was also close to the results of Li [68], who reproduced regional Yp across the NCP using CGM simulations at multiple meteorological sites. This new method can also be used in other regions where farmers' potential yields are far below the potential levels. However, one should be careful with the value of PLAIopt. We used PLAIopt = 7.5 in this study; this value is a conservative estimate obtained by analyzing historical field trials over China. The value of PLAIopt may be lower in higher latitude or altitude regions, where low temperatures and heat dominate the growth of maize.

The accuracy of an RS-based approach to estimating crop yield depends strongly on the input RS data. The accuracy of Ya estimates over the NCP in this study was degraded by mixed pixels. There was an overall underestimation of Ya over Shandong Province, where many pixels were mixed with plastic greenhouses. Plastic greenhouses are widely used across Shandong, and they weaken the vegetation signal [70], reducing the yield estimated from mixed pixels. Future studies are required to resolve such issues. Using higher-resolution images may be feasible, but the temporal resolution of such data (e.g., Landsat, Sentinel-2, and SPOT) may become a new limitation. Alternative approaches, such as pixel downscaling, can also be useful for addressing this issue. For example, RS data of coarse spatial but high temporal resolution (e.g., MODIS) can be merged with high-spatial-resolution panchromatic products [71] or other bands [72] to obtain high-spatial- and high-temporal-resolution data to drive the RS-based crop yield model, thereby reducing the effect of non-vegetation information. In addition, the pixel change detection method [73] is available for further removing bad pixels or outliers in the yield or Yg values produced by the RS-based model. We will consider these approaches in future work to improve the quantification of regional crop yield and Yg.

The WBM is a critical part of PRYM–Maize for modeling crop photosynthesis under future climate change. Extreme climate events (e.g., drought and heatwaves) have great impacts on crop water status and thus on crop yield [74,75]. With the increasing intensity of these extreme climate events [76,77], water availability estimated using the WBM of PRYM–Maize will play a more important role in quantifying crop yield. The WBM consists of evapotranspiration (water loss from croplands) and soil water balance processes; hence, water availability can, in turn, affect the water balance process through its impact on evapotranspiration [38,40]. Evapotranspiration modeling in the current PRYM–Maize does not explicitly include the effect of extreme climate events. Therefore, future work will need to improve how PRYM–Maize characterizes the water status of crops during extreme climate events.
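The feedback between water availability and evapotranspiration noted above can be illustrated with a generic single-layer "bucket" water balance. This sketch is not the WBM of PRYM–Maize; the linear stress function, the storage capacity, the function name `run_bucket`, and the forcing values are all assumptions chosen only to show the mechanism.

```python
def run_bucket(precip, pet, capacity=150.0, initial=75.0):
    """Daily single-layer soil water bucket; all quantities in mm.

    Actual evapotranspiration (AET) is potential ET scaled by relative soil
    water, so drier soil evaporates less (the feedback described in the text).
    """
    storage = initial
    storages, aets = [], []
    for p, e in zip(precip, pet):
        stress = storage / capacity            # 0 (dry) to 1 (full)
        aet = min(e * stress, storage)         # cannot remove more water than stored
        storage = storage - aet + p
        storage = min(storage, capacity)       # excess is lost as runoff or drainage
        storages.append(storage)
        aets.append(aet)
    return storages, aets

# Hypothetical 30-day drought: no rain and a constant potential ET of 5 mm per day.
days = 30
storages, aets = run_bucket(precip=[0.0] * days, pet=[5.0] * days)
print(f"AET day 1 = {aets[0]:.2f} mm, day {days} = {aets[-1]:.2f} mm")
print(f"soil water after {days} days = {storages[-1]:.1f} mm")
```

In this toy setup, AET declines steadily as the soil dries even though atmospheric demand is constant. A scheme with a fixed stress response of this kind has no explicit term for heatwave or drought extremes, which is the type of limitation the paragraph above identifies for future work.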
