*2.2. Overall Methodological Approach*

Satellite data for the village were obtained from the Landsat satellites (30 m pixels) for 2008–2018 using the USGS explorer website [21]. Level 2 processed imagery was requested and analyzed following the protocols described on the Landsat Explorer [21]. The Landsat data came from 2 different sensors having a different wavelength range in the near-infrared bands; one was used from 2008 to 2011 (Landsat 5 with 0.76–0.90 μm with Landsat Thematic Mapper - TM sensor [21]) and another from 2013 to 2018 (Landsat 8 with 0.85–0.88 μm with Operational Land Imager - OLI sensor [21]). Atmospherically-corrected cloud-free scenes from the end of April/beginning of May were selected to coincide with flowering in winter wheat. The choice of using images at that particular developmental stage was justified by the strong correlation between remote sensing data and grain yield [22–24]. In addition, in Northern China, the use of proximal and/or remote sensing as a proxy of canopy N and yield has been validated on several crops and in different geographical regions [25–28]. The Green Normalized Difference Vegetation Index (GNDVI) [29] was calculated for each scene. The GNDVI is more closely related to the photosynthetically absorbed radiation than NDVI and has shown a linear correlation with the Leaf Area Index (LAI) and biomass [29]. This makes the GNDVI more sensitive to changes in biomass and chlorophyll concentration compared to the NDVI. It was calculated as follows:

$$GNDVI = \frac{\rho\_{NIR} - \rho\_{Green}}{\rho\_{NIR} + \rho\_{Green}} \tag{1}$$

where ρ*NIR* and ρ*Green* are the reflectance in the Near Infrared and Green bands, respectively.

Soil Brightness (SOB) data at 3 m spatial resolution were purchased from Courtyard Agriculture Ltd. in the UK and were derived from RapidEye optical satellite images at a time in the season when the land was un-cropped (bare soil). Most of the fields were contiguous, with 'boundaries' being narrow ridges of bare soil. There was little other vegetation or paths/roads present. Therefore, most of the pixels corresponded to cropland (Figure 1).

Prior to sowing of wheat in 2015, soil sampling was carried out in the village. For each field, 10 samples were collected at 0–20 cm depth following "S" patterns. The samples were pooled together, mixed thoroughly and divided into four subsamples. One subsample of about 1–2 kg was kept for determining inorganic soil N and soil organic carbon (OC). The soil samples were later dried, ground to powder and analyzed for total soil N concentration using the Kjeldahl digestion method [30]. Soil OC

was analyzed following the wet combustion method [31]. Soil texture information was also available at selected points and used for this study and additional information is available elsewhere [32].

Spatial and temporal variability in GNDVI were quantified following the method of Basso et al. (2012). The spatial variability of GNDVI from 2008 to 2018 was calculated from the relative percentage difference of GNDVI at each 30 m pixel from the average GNDVI obtained for the whole cropped area of the village, according to Equation (2) [13]:

$$
\overline{\sigma}\_{si}^2 = \frac{1}{n} \sum\_{k=1}^n \left[ \frac{\overline{y}\_{ik} - \overline{y}\_k}{\overline{y}\_k} \times 100 \right] \tag{2}
$$

where *n* is the total number of available years, *k* = 1, ... , *n* is the integer corresponding to every year, σ2*si* is the average percentage difference at location *i*, *yk* is the average of the variable obtained for the whole village at year *k*, *yi*,*k* is the variable monitored at location *i* at year *k*. Pixels that have high values of σ2*si* are associated with high yield (under the assumption that GNDVI is directly proportional to yield) and pixels with low values are lower yielding.

The temporal variability, which is a measure of stability, was calculated as temporal variance to overcome the issues with varying stability over time [33]. The temporal variance of patterns in the GNDVI data was calculated using Equation (3):

$$
\overline{\sigma}\_{ti}^2 = \frac{1}{n} \sum\_{k=1}^n \left( y\_{i,k} - \overline{y}\_{i,n} \right)^2 \tag{3}
$$

where σ2*ti* is the temporal variance value at location *i*, *yi*,*k* is the value of the variable monitored at location *i* at year *k*, *yi*,*<sup>n</sup>* is the average variable value at location *i* over the *n* years. The higher the temporal variance, the more unstable the GNDVI measurement (and thus the yield) at that particular location over time. Threshold values have previously been used to identify stable zones within fields, however, the choice of the threshold for determining the stable zone can affect the result considerably [13]. To overcome this problem, a clustering algorithm was applied to the spatial and temporal variability layers, which is further described below. The GNDVI variability metrics and the soil brightness images were summarized at the field scale for consistency with the measured soil variables that were used for evaluation purposes.

For site-specific agronomic managemen<sup>t</sup> of any input, there is the need to develop MZs that will be subject to a unique combination of potential yield-limiting factors [12]. Management zones are most effective if the variation (of the factors under consideration) within them is minimized, and there are a manageable number of zones. Clustering is an important method in precision agriculture [34–38]. In all these studies it was found that the *k*-means was not among the best methods to define MZs, but the best algorithm differed. For example, in smallholder farming systems, Possiblistic Fuzzy C Mean worked best but in other conditions, the McQuitty method seemed to give more consistent results [36,37]. Although the *k*-means is still widely adopted, it was decided to use the partitioning around medoids (PAM) method [39] to derive the MZs. The PAM is a clustering algorithm that, like *k*-means clustering, aims to minimize the distances between points within a cluster and the point at the center of the cluster. It is a more robust alternative to *k*-means clustering when noisy data are expected, as is the case in this study due to unmeasured variation in growing conditions due to the soil spatial variability. The term medoid refers to an observation within a cluster for which the sum of the distances between it and all the other members of the cluster is a minimum. PAM requires a priori information of the number of clusters, but it does more computation than the more commonly used *k*-means clustering to ensure that the medoids that it finds are truly representative of the observations within a given cluster. It has been found that the PAM showed better results than the *k*-means in terms of execution time, sensitivity towards outliers, reduction of noise in the data due to minimization of the sum of dissimilarities within the dataset [36]. Field-scale variables for (i) spatial and temporal

variability in GNDVI; and (ii) mean soil brightness were normalised and used as variables with the PAM method, using R software, to define clusters.

Soil N and soil OC were used for MZ evaluation as reported in [4]. Relative variance (RV) of measured soil properties at the field scale was used to evaluate the accuracy of this approach for delineating MZs given by Equation (4):

$$RV = 1 - \frac{S\_w^2}{S\_T^2} \times 100\tag{4}$$

where *S*2 *w* is the total within-zone variance of the soil property and *S*2 *T* is the total village-level variance of the corresponding property. RV approximates the amount of variability explained by the MZ delineation and can be interpreted similarly to the *R*<sup>2</sup> value in regression analysis in terms of the percentage of variation explained. RV was calculated on a per-field basis for measured total N and OC because they best characterise areas that would benefit from di fferential management.
