*Technical Note* **Detection of Southern Beech Heavy Flowering Using Sentinel-2 Imagery**

**Ben Jolly 1,\*, John R. Dymond 1, James D. Shepherd 1, Terry Greene <sup>2</sup> and Jan Schindler <sup>1</sup>**


**\*** Correspondence: jollyb@landcareresearch.co.nz

**Abstract:** The southern beech (genus *Fuscospora* and *Lophozonia*) forest in New Zealand periodically has "mast" years, during which very large volumes of seeds are produced. This excessive seed production results in a population explosion of rodents and mustelids, which then puts pressure on native birds. To protect the birds, extra pest controls, costing in the order of NZD 20 million, are required in masting areas. To plan pest control and keep it cost-effective, it would be helpful to have a map of the masting areas. In this study, we developed a remote sensing method for the creation of a national beech flowering map. It used a temporal sequence of Sentinel-2 satellite imagery to determine areas in which a yellow index, which was based on red and green reflectance (red-green)/(red + green), was higher than normal in spring. The method was used to produce national maps of heavy beech flowering for the years 2017 to 2021. In 2018, which was a major beech masting year, of the 4.1 million ha of beech forest in New Zealand, 27.6% was observed to flower heavily. The overall classification accuracy of the map was 90.8%. The method is fully automated and could be used to help to identify areas of potentially excessive seed fall across the whole of New Zealand, several months in advance of when pest control would be required.

**Keywords:** southern beech; masting; beech flowering; seed fall; Fuscospora; Sentinel-2

#### **1. Introduction**

New Zealand southern beech (genus *Fuscospora* and *Lophozonia*, formerly *Nothofagus* [1]) forest dominates over 2 million hectares (ha) of New Zealand forest and features in almost 2 million ha more [2]. It comprises five species: mountain beech (*Fuscospora cliffortioides*); red beech (*Fuscospora fusca*); silver beech (*Lophozonia menziesii*); black beech (*Fuscospora solandri*); and hard beech (*Fuscospora truncata*). The trees reproduce almost yearly, with periodic highly productive seasons known as "mast" years that produce large volumes of seeds [3]. The seeds are a significant food source for a number of birds and mammals. During mast years, the rodent population increases significantly [4,5], providing an abundant food source for mustelids (especially stoats—*Mustela erminea*) [6]. All rodent and mustelid species have been introduced to New Zealand and also prey on native bird species. When seeds begin to run out, increasing pressure is put on native biota as additional food sources are sought. The control of populations of introduced predators is essential for preserving populations of native and endemic birds, reptiles, and invertebrates in New Zealand, especially during beech mast years [6].

The New Zealand Department of Conservation (DOC) is responsible for managing the forests on public land, preserving native species, and coordinating pest control. The ability to predict the extent and intensity of significant mast events is a critical component for planning pest control efforts to limit the explosion of predator populations [4]. A number of approaches are currently used to plan management interventions: (i) modeling; (ii) field observations; and (iii) the in situ sampling of developing seed crops in tree canopies. (i) The

**Citation:** Jolly, B.; Dymond, J.; Shepherd, J.; Greene, T.; Schindler, J. Detection of Southern Beech Heavy Flowering Using Sentinel-2 Imagery. *Remote Sens.* **2022**, *14*, 1573. https://doi.org/10.3390/rs14071573

Academic Editor: Brian Alan Johnson

Received: 17 February 2022 Accepted: 22 March 2022 Published: 24 March 2022

**Publisher's Note:** MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

**Copyright:** © 2022 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https:// creativecommons.org/licenses/by/ 4.0/).

"delta-*T*" (Δ*T*) model [7] uses the difference in mean temperature from the previous two summers (*Tn*−<sup>1</sup> − *Tn*−2) to predict likely seed fall for the following autumn at a national scale. However, it relies on temperature data, which are currently only available on a modeled 5 × 5 km grid, so it misses smaller scale micro-climate effects. While historical temperature is an important factor in synchronizing mast events [7], other factors, such as nutrient availability, also play a role [8]. New research suggests that rising temperatures caused by climate change may alter the mast cycle of beech trees [8,9], increasing the spatial and temporal complexity of mast patterns [8] and potentially de-synchronizing flowering and seeding, effectively reducing the impact and predictive power of temperature on the timing of the reproductive cycle [10]. (ii) Field staff from DOC are well placed to provide observations of beech flowering in certain areas as part of their normal duties. However, there are large areas of forest that remain unobserved and spatial extent is often difficult to define, especially at a regional or national scale. (iii) Extensive sampling campaigns are conducted during years in which a heavy mast is expected using helicopters so that staff can reach and clip the upper branches of trees in order to count the seeds. This task is expensive, labour intensive, and dangerous.

Remote sensing has proven to be an effective tool for monitoring vegetation phenology, particularly when a rich time series of imagery is available [11–16]. With sensors, such as Landsat 8 and Sentinel-2, it is possible to map phenology over millions of hectares and create national maps in great detail [11,17]. Phenological characteristics are usually derived by first fitting a curve to the time series of remote sensing data and then using either threshold-based methods, moving averages, inflection points or the time of maximum increase [18–22]. Seasonality is usually assumed [22]. As mast events are considered a deviation from the "median" annual cycle, it should be possible to use change detection techniques [23,24] to identify or even predict mast seasons when identifying features are visible from above [14,19,21].

Vegetation indices, such as the normalized difference vegetation index (NDVI) and the enhanced vegetation index (EVI), are often used to differentiate between flowers or seed pods/cones and to summarize data as one variable for analysis [14,19,22,25]. Often, multiple indices are combined in an effort to investigate multiple physical properties. For example, a study by Garcia et al. [21] investigating white spruce mast events found that vegetation indices targeting moisture were more effective than the traditional color-based indices, but they still struggled to reliably predict masting. Fernández-Martínez et al. [19] were more successful and used increasing EVI the winter before, along with weather data during spring, as an indicator of potential mast seasons for Mediterranean oaks. Neither study were able to find a usable signature for flowering or seed/cone production to map the mast events. Dixon et al. [14] and Chen et al. [25] both successfully used multi-scale imagery to map tree-scale flowering over landscapes: Dixon et al. [14] by training a random forest model with drone data from known events and Chen et al. [25] by developing an enhanced bloom index (EBI) from drone data and successfully translating that approach to CERES, PlanetScope, Sentinel-2, and Landsat data to increase coverage.

In this study, we investigated the use of freely available satellite remote sensing to identify large areas of significant southern beech flowering in New Zealand. We used imagery from the European Space Agency (ESA) Sentinel-2 "a" and "b" satellites to obtain a high rate of repeat passes in order to maximize the chances of multiple cloud-free observations and to produce detailed coverage at a national scale. Very little ground data on flowering are available. The irregular nature of the flowering events and the ruggedness and remoteness of much of the terrain made the planned collection of ground data for this study difficult. As significant flowering is an irregular phenological event, we modeled the phenology of beech forest per pixel and identified departures from this image-by-image during spring seasons to identify heavy beech flowering. Using this method, we produced national maps of heavy beech flowering for 5 years, 2017–2021, with a minimum mapping unit of 1 ha. We assessed the accuracy of the method by comparing it to a human operator at 1000 randomly selected sites. A national map of detectable heavy beech flowering could be produced at the end of every spring to assist DOC in identifying potential mast "hot spots" and thus, aid in the planning of pest control operations.

#### **2. Materials and Methods**

#### *2.1. Area of Interest*

This study covered the extent of known beech and mixed-beech forest in New Zealand, according to EcoSat Forests [2,26–28], and covered both North and South Islands from 36.1◦S, 178.0◦E to 46.4◦S, 166.4◦E. The terrain is largely mountainous with slopes of up to 45◦, as any habitat deemed suitable for agriculture was cleared in the 1800s and early 1900s. Due to the large variations in both the altitude and latitude of the study area, the mean annual temperatures ranged from 3.8 ◦C to 16.6 ◦C and the mean cumulative annual rainfall ranged from 465 mm to 9305 mm.

#### *2.2. Data*

This study used the Copernicus Sentinel-2 Level-1C calibrated top of atmosphere (TOA) reflectance values, as downloaded from the ESA archives, with the edges masked to remove pixels that did not contain data for every band. The Sentinel-2 satellite mission consists of two satellites (Sentinel-2a and -2b) moving in sun-synchronous orbits that repeat every 10 days. These orbits are 180◦ out of phase with each other, which produces a 5-day revisit period. The swath width of each pass is 290 km, with five "passes" required to cover the mainland of New Zealand, each on a different day. The overlap between passes means that some areas of the country have a higher revisit rate. Each Sentinel-2 satellite is equipped with a multi-spectral imaging sensor that captures wavelengths from ultraviolet to short-wave infrared.

Our analysis used all available Sentinel-2 data over New Zealand from 1 September 2016 to 31 December 2021, with images every 10 days before Sentinel-2b came online 8 July 2017 and images every 5 days thereafter. A "mega-mast" occurred during spring 2018 and autumn 2019 [29,30]. Other years showed some small flowering events or none at all. Cloudy pixels were excluded from the temporal sequence using the methodology outlined in Shepherd et al. [31].

#### *2.3. Methods*

Southern beech flowers produce a reddening of the forest canopy that is visible from the ground, especially during a significant mast season. At the 10 m pixel nominal spatial resolution of Sentinel-2, the reddening of the canopy appears to the human eye as a subtle yellowing in the "natural color" (RGB) image as the red flowers increase the red component of a pixel but not to the point of dominating the green. Higher-wavelength image bands (and the associated indices) do not respond to this flowering, with the exception of Band 5 (red edge); however, this is also associated with the "flushes" of new foliage in late spring.

The yellowing effect of the canopy is subtle in natural color renders of Sentinel-2 imagery, as demonstrated in Figure 1a–d. Sub-figures (e–h) feature an exaggerated "red" (B4) band, which highlights the effect at the expense of non-forested areas. Figure 1 covers one month of imagery in an area of overlap between orbits 29 and 72 and thus, has approximately double the number of overpasses compared to other areas of the country. In that 30-day period, there were 13 overpasses that resulted in 5 usable cloud-free images, 4 of which are displayed. The flowering event was just starting on October 14th (Figure 1a,e), with the lower western and southern slopes in full flower 7 days later (b,f) and the upper and eastern slopes in full flower 15 days later (c,g) as the areas in (b,f) gave way to fresh green foliage. Eight days later, on 13 November (d,h), most of the flowering had vanished. Generally, there was a two- to three-week period, at most, in which a cloud-free image was required in order to observe flowering; however, there was no way of reliably knowing when this window would occur as it varied with season and location.

In order to detect the yellowing that is associated with southern beech flowering, we applied two approaches: the calculation of a normalized difference yellowing index (NDYI) to describe the red–green band relationship and the modeling of this index over the time series of the image archive to detect variations from the expected (non-flowering) state. The effect is subtle enough that the technique required was very sensitive to cloud and shadow contamination. In addition to the cloud masking performed above, "invalid" pixels from the cloud mask (cloud, shadow, snow or water) were buffered by 30 pixels (300 m) and patches of "valid" pixels smaller than 100 ha were re-classified as "invalid". Extra spectral filtering was then used to mask the remaining pixels that were too bright or dark to be forest (or useful): B4red < 650, B3green < 900, B2blue < 1000, B8NIR > 1000, B4red − B5red edge < −1500. The result was then further buffered by 3 pixels.

**Figure 1.** Sentinel-2 imagery from the Hawdon Valley, South Island, New Zealand, showing a flowering event during the spring of 2018: (**a**–**d**) are natural color; (**e**–**h**) are natural color with an exaggerated stretch to amplify the "red" band (Band 4). The images are organized by column, e.g., (**a**,**e**) are the natural color and stretched color for 14 October 2018, respectively.

The NDYI is similar to the well-known normalized difference vegetation index (NDVI) [32], using the "red" (Band 4665 ± 15 nm) and "green" (Band 3560 ± 18 nm) Sentinel-2 bands instead of the "near-infrared" and "red" bands. It is also very similar to the green-red vegetation index (GRVI) [33,34], with the order of the bands merely reversed:

$$\text{NDYI} = \frac{(\text{B4}\_{\text{red}} - \text{B3}\_{\text{green}})}{(\text{B4}\_{\text{red}} + \text{B3}\_{\text{green}})} \tag{1}$$

The NDYI was calculated using the Level-1C TOA reflectance product re-projected to the New Zealand Transverse Mercator (NZTM) coordinate reference system (EPSG:2193) and with any invalid pixels masked. As the NDYI represents the ratio of red–green and both are affected similarly by transient atmospheric conditions, it was negative over most areas of forest most of the time (i.e., when a pixel was "green"), increasing to near or slightly above 0 during heavy flowering events.

Substantial annual variation was present in the TOA reflectance observations over forest due to climatic conditions, vegetation phenology (e.g., new leaves), and sun angle. The NDYI signal visually observed during flowering was subtle enough within the context of a year of data that setting simple thresholds was inadequate to produce a reliable result. Additionally, flowering can occur at different times during the spring season, depending on latitude and altitude [3,8]. The temporal sequence of NDYI for a pixel in the Hawdon Valley (South Island, New Zealand) is shown in Figure 2 as an example. The orange line is a modeled NDYI time series that is unique to that pixel, using an approach similar to that used in the TMASK methodology developed for cloud detection in Landsat 8 imagery [35]. The model used robust regression to calculate unique per pixel (*i*, *j*) coefficients for the sine

and cosine terms for intra- (*a*1,*i*,*j*, *a*2,*i*,*j*) and inter-annual (*a*3,*i*,*j*, *a*4,*i*,*j*) variability, as well as a constant term (*ci*,*j*). Where *x* is the number of days since the start of the temporal sequence, *Tyr* is the number of days per year, and *Tall* is the number of days in the sequence:

**Figure 2.** The observed normalized difference yellowing index (NDYI) time series for an example pixel from Figure 1 in the Hawdon Valley (blue) with superimposed modeled values (orange). The gray areas indicate austral spring seasons (September–November) where the NDYI is expected to peak during a mast. The red circle shows the NDYI that was higher than expected during the spring of 2018 (the "mega-mast" season).

Date

The observed NDYI for each pixel and date were then subtracted from the modeled value for that pixel and date to produce ΔNDYI and the maximum value was found for each flowering season (1 September to 10 December) using:

$$\text{NANDYI} = \text{NDYI} - \text{NDYI}\_{mod} \tag{3}$$

Finally, ΔNDYI values were converted into a map of "heavy flowering detected" vs. "heavy flowering not detected" by following a method similar to Shepherd et al. [36]. First, a high ΔNDYI threshold of 0.08 was chosen by assessing ΔNDYI against seed trap data [4] during the 2018 "mega-mast". This threshold was used to create "seed" areas, which were grown outward by progressively lowering the threshold to 0.04. The resulting "flowering" pixels were then buffered by 2 pixels, followed by a 5 × 5 majority filter that was then eroded by 2 pixels. "Heavy flowering not detected" patches that were smaller than 1 ha (minimum mapping unit) were removed by being re-coded as "heavy flowering detected" or "no data" (majority of surrounding pixels), then the "heavy flowering detected" patches that were smaller than 1 ha were re-coded as "heavy flowering not detected" to reduce small-scale noise.

As no reliable spatial dataset of beech flowering exists beyond occasional field reports from DOC staff, the national-scale map for the 2018 mast year was accuracy assessed by a human operator. At 1000 randomly selected sites, 500 in "heavy flowering detected" and 500 in "heavy flowering not detected", the operator determined whether heavy flowering was observed in the temporal sequence of 2018 cloud-free imagery in comparison to a median spring image (excluding 2018). Heavy flowering was easily observed in the temporal progression of spectral reflectance relative to the median image, especially when the spatial extent of the flowering moved upward in elevation as the season progressed (using the exaggerated red band stretch shown in Figure 1). A confusion matrix of proportions was estimated using the method of Card [37], from which precision and recall were calculated [38].

#### **3. Results**

The austral spring (September–November) of 2017 was a light flowering season for southern beech in New Zealand. This was followed by a "mega-mast" in 2018, with heavy flowering observed from the ground during spring and a corresponding heavy seed fall the following autumn. Maps of the maximum spring ΔNDYI were produced for each year of data, with the results for the 2018 season shown in Figure 3. The spatial patterns corresponded with anecdotal reports from DOC staff who were based at field offices around New Zealand. There was heavy flowering throughout most of the northwestern corner of the South Island and sporadic heavy flowering in eastern Fiordland. The inset of Figure 3 highlights the level of detail available and shows the heavy flowering on the lower slopes of the Hawdon and Poulter Valleys, dissipating as altitude increases up the valley walls (the black areas are not beech forest; they are either alpine or riverbed in this location).

**Figure 3.** The maximum ΔNDYI (from the model) for spring 2018 in areas of known southern beech forest in New Zealand. Green denotes areas of low (<0.02) maximum ΔNDYI, while red is high (>0.08) and indicates heavy beech flowering. The inset shows the Hawdon and Poulter Valleys near Arthur's Pass (1:250,000 at 42.95◦S, 171.82◦E; see white box).

Figure 4 shows the maps of heavy beech flowering as detected by our method for years 2017–2021. In spring 2018, a lot of heavy beech flowering was detected in the north-west of the South Island, which was synonymous with a "mega-mast" event in that region. In the North Island and the south-west of the South Island, some pockets of heavy flowering were detected. The following year, in spring 2019, the flowering was much reduced in the north-west of the South Island, but in the North Island, a lot of heavy flowering was detected, which was synonymous with another "mega-mast". The south-west of the South Island had pockets of heavy flowering, much the same as in 2018. In years 2020 and 2021, minimal beech flowering was detected in most areas.

**Figure 4.** The maps of heavy beech flowering during spring time for four years using Sentinel-2 imagery (2017–2021 inclusive). The classes are "heavy flowering detected" (red), "heavy flowering not detected" (green), and "no cloud-free imagery" (gray).

In the 2018 map, heavy flowering was detected in 27.6% (1,144,382 ha) of the 4.1 million ha of beech forest in New Zealand. Heavy flowering was not detected in 51.2% (2,122,201 ha) of the beech forest. In the remaining 21.2% of the beech forest (878,406 ha), there was no cloud-free imagery in the spring to enable a decision to be reached. In each of the two classes for "heavy flowering detected" and "heavy flowering not detected", we generated 500 random locations at which we compared reference data to map data. The reference data were determined from the visual interpretation of all cloud-free spring imagery for that year. Table 1 shows the confusion matrix of proportions. The overall classification accuracy was 90.8%. The precision (user's accuracy) scores indicate that 90.4% of the area mapped as "heavy flowering detected" was actually heavy flowering. The recall (producer's accuracy) scores indicate that 84.4% of actual "heavy flowering detected" was successfully mapped as "heavy flowering detected". The overall F1-score for "heavy flowering detected" was 0.873 and "heavy flowering not detected" was 0.928. Thus, the ΔNDYI method is likely to underestimate (slightly), rather than overestimate, areas of heavy flowering; however, the scores reflect well on the method overall.

**Table 1.** A confusion matrix showing detected and not detected heavy flowering (proportion of map) for the ΔNDYI method (Mapped), assessed against a human operator (Reference). The proportions were estimated from a random sample of 500 locations in the "heavy flowering detected" class (weighted by proportion of "heavy flowering detected" in map = 0.35) and a random sample of 500 locations in the "heavy flowering not detected" class (weighted by 0.65).


#### **4. Discussion**

We developed a method that can produce a national map of heavy beech flowering from a temporal sequence of Copernicus Sentinel-2 imagery (Figure 4). The method detects elevated values of a yellow index (NDYI) the are above those normally expected in spring. A ΔNDYI value greater than 0.08 indicates especially heavy flowering; however, these regions can be "grown" into adjacent pixels where ΔNDYI is greater than 0.04 to better capture all heavy flowering. The elevated yellow index is caused by the production of red flowers that obscure the green leaves. The national map of beech flowering can be produced at the end of spring, several months before the subsequent mast event actually occurs and seeds drop to the ground. It can then be provided to DOC, which is the national agency in charge of pest control, to aid with planning additional pest control to be implemented several months later. In the spring of 2018, a nationwide beech mast event was detected and mapped using this method. A manual accuracy assessment determined the heavy flowering map to have an overall accuracy of 90%. The spatial distribution of beech flowering, as mapped by the method, was also consistent with anecdotal observations from DOC field staff.

The national map of beech flowering can be used to provide extra detail to augment the existing Δ*T* model [7] as it provides a higher spatial resolution of 10 m, as opposed to 5 km. It is also a map of confirmed flowering, which is one less degree of separation from the actual seed fall than the Δ*T* model, as physical factors, such as carbon availability and soil moisture conditions, also affect flowering and seed productivity [39]. However, one issue with our method is the requirement for cloud-free satellite imagery at critical flowering times. This means that flowering may have been missed in some areas, which effectively makes the map a better indicator of "presence" rather than "absence".

Not all heavy beech flowering in spring results in heavy seed fall in the following autumn. Heavy frost or very wet weather can interfere with seed production [3]. Figure 5 shows how well the maximum ΔNDYI compared to seed counts in trays located on the floor of the beech forest (seed traps are spread throughout beech forests in New Zealand as part of a long-running monitoring program conducted by DOC [4]). Data were restricted to locations with at least eight valid observations in order to obtain representation over the majority of the spring season. There was noise in the data, nevertheless high seed counts generally corresponded with high maximum ΔNDYI (*r*<sup>2</sup> = 0.397). For reference, the Δ*T* model had *r*<sup>2</sup> values between 0.331 and 0.556 for the same species range [7] (although that study used the older genus name *Nothofagus*). Reasons for the mismatches between the ΔNDYI and seed count include: cloud coverage still obscuring the flowering event, despite the high number of revisits (low ΔNDYI vs. high seed count); inaccurate trap location (variable impact on relationship); trap location relative to flowering trees combined with wind direction during seed fall (high ΔNDYI vs. low seed count); different beech species (different relationship between ΔNDYI and seed count); climate, adverse weather events, and nutrient availability (lower seed count vs. higher ΔNDYI); and inaccuracies

in the method (addressed in accuracy assessment). We recommend that the national map of flowering/not flowering be regarded as a map of potential high seed fall for initial planning purposes, which is to be confirmed later with additional information, such as selected field observations.

**Figure 5.** The relationship between the maximum ΔNDYI (spring 2018) and the number of seeds collected from seed traps in the permanent trap network (autumn/winter 2019) for the 2018/2019 masting season. The locations are filtered to exclude those with fewer than eight valid satellite observations.

One way to address the paucity of valid observations would be to add more data sources. As the technique developed in this study relies only on red, green, blue, and NIR (for quality control) wavelengths, it should be possible to include data from commercial satellite constellations that have higher revisit rates but lower spectral ranges or resolutions, such as the Planet "Dove" constellation. Adding freely available Landsat 8 data could also increase the probability of obtaining valid observations at critical times. Targeted aerial imaging campaigns could also provide valuable information in areas of known data paucity, particularly when they are informed by observations from field staff. This study has shown that the resolution requirement is low, by aerial imaging standards, which would allow for higher flight altitudes and larger image footprints, thereby substantially reducing cost. Multiple studies have shown that the fusion of these separate data sources is useful in remote sensing [11,14,16,20,40,41], though the spatial complexity and rugged terrain of the beech forests in New Zealand is likely to reduce the utility of the coarse resolution MODIS optical imagery.

This study successfully mapped the presence of heavy flowering in beech trees at a large scale (greater than 4 million ha) using a visible change in canopy color. A similar study by Garcia et al. [21] was less successful, but did show that moisture-based indices in the lead-up to a flowering or seed/cone event could provide additional information. Fernández-Martínez et al. [19] were also successful in predicting mast events using a combination of the enhanced vegetation index (EVI) and weather data during spring. A number of challenges exist in the context of detecting mast events and the ΔNDYI approach attempts to minimize these. The NDYI index was chosen to specifically target the red and green image bands, avoiding the red edge and near-infrared bands that also respond to vegetation conditions and thus, increase noise. The effectiveness of the multi-year sine and cosine model for modeling the typical behaviour of NDYI, the utilization of extreme ΔNDYI values as "seeds" for regions that grow into areas of lower ΔNDYI values, and the ability to tune the spectral value constraints all contribute to the effectiveness of our approach. To further improve the performance of the ΔNDYI method, it would be worth investigating the use of supporting indices, such as Garcia et al. [21] and Fernández-Martínez et al. [19], in addition to adding extra data sources. Further work into distinguishing different beech

species would also add greater value for DOC as those with larger seeds (red and hard beech) have a disproportionately larger impact on rodent irruption.

The temporal analysis of Sentinel-2 satellite imagery has proven to be successful at detecting heavy flowering in New Zealand beech forests. To achieve this, cloud clearing had to be accurate (because the yellow index is sensitive to missed cloud) and automated (because many images are required). The automation of the cloud clearing [31] and other processing means that beech flowering maps can be produced in a timely and cost-effective way. In future, we plan to produce a national map of heavy beech flowering at the end of each spring. This would allow for several months of analysis to plan the extra pest control that would be required in the autumn, thereby improving targeted pest control in masting areas and leading to better outcomes for native fauna.

#### **5. Conclusions**

This study used Sentinel-2 top of atmosphere (TOA) imagery to detect and map atypical yellowing that was associated with the heavy flowering of southern beech (*Fuscospora* and *Lophozonia*) forests in New Zealand over 4.1 million ha at an unprecedented 10 m spatial resolution. This was achieved by modeling a normalized difference yellowing index (NDYI) over 5 years of observation and investigating any deviations from the expected values during spring months (September–November). A "threshold" ΔNDYI value of 0.08 was used to identify areas of heavy flowering, with connected areas of ΔNDYI > 0.04 also likely flowering. The method was automated and could be run for all of New Zealand in less than a day on a cluster of approximately 1000 CPU cores. Using Sentinel-2 imagery, the method typically provided information on heavy flowering for 80% of the beech forests in New Zealand with a high overall classification accuracy of 90.8%, resulting in useful information for the planning of national-scale pest control efforts.

**Author Contributions:** Conceptualization, B.J., J.R.D., T.G. and J.S.; data curation, B.J., J.D.S. and J.S.; formal analysis, B.J.; funding acquisition, J.R.D.; investigation, B.J., J.R.D. and T.G.; methodology, B.J., J.R.D., J.D.S. and T.G.; project administration, J.R.D.; resources, J.R.D.; software, B.J., J.D.S. and J.S.; supervision, J.R.D.; visualization, B.J.; writing—original draft, B.J.; writing—review and editing, J.R.D., J.S. and T.G. All authors have read and agreed to the published version of the manuscript.

**Funding:** This research was funded by the New Zealand Ministry of Business, Innovation and Employment (MBIE) Endeavour Fund, under the Advanced Remote Sensing of Aotearoa research program (C09X1709).

**Data Availability Statement:** Data available on reasonable request.

**Acknowledgments:** The authors would like to thank the New Zealand Department of Conservation (DOC) for providing advice and validation data.

**Conflicts of Interest:** The authors declare no conflict of interest. The funders had no role in the design of the study, in the collection, analyses or interpretation of data, in the writing of the manuscript or in the decision to publish the results.

#### **Abbreviations**

The following abbreviations are used in this manuscript:

DOC Department of Conservation EBI Enhanced Bloom Index ESA European Space Agency EVI Enhanced Vegetation Index GRVI Green-Red Vegetation Index NDVI Normalized Difference Vegetation Index NDYI Normalized Difference Yellowing Index NZTM New Zealand Transverse Mercator TOA Top of Atmosphere

#### **References**


**Xiaoping Sun <sup>1</sup> and Yang Xiao 2,\***


**Abstract:** Areas of grassland improvement and degradation were mapped and assessed to identify the driving forces of change in vegetation cover in the Three Rivers headwater region of Qinghai, China. Based on linear regression at the pixel level, we analyzed the vegetation dynamics of the grasslands of this region using MODIS NDVI data sets from 2000 to 2010. Correlation coefficients were computed to quantitatively characterize the long-term interrelationship between vegetation NDVI and precipitation/temperature variability during this period. The use of time series residuals of the NDVI/precipitation linear regression to normalize the effect of precipitation on vegetation productivity and to identify long-term degradation was extended to the local scale. Results showed that significant improvements occurred in 26.4% of the grassland area in the Three Rivers Headwater region between 2000 and 2010. The study area, which represents about 86.4% of the total grassland area of this headwater region, showed a general trend of improvement with no obvious trend of degradation.

**Keywords:** NDVI; grassland; MODIS; precipitation variability; human activities

#### **1. Introduction**

The impact of climate change is multi-scale, all-round and multi-level, with both positive and negative impacts, but its negative impacts are more concerned. After entering the 21st century, climate change has gradually become a comprehensive problem affecting the whole world. As a result of climate change, some grasslands around the world are already experiencing a decline in primary productivity and biodiversity, which is caused by man-made activities and also caused by the climate. These two causes interact and produce a superimposed effect. On the one hand, due to the dry climate, which affects the adaptability of the grassland ecosystem itself, human activities have largely changed the native grassland, which also leads to the decline of the adaptability of the grassland ecosystem itself. A large number of new varieties introduced by artificial and semi-artificial pastures need strict management and protection to create higher pasture yield. Under the climate change model, frequent pasture management measures exert great pressure on the grassland ecosystem. On the other hand, climate change makes the grassland ecosystem itself face more natural disturbances. Climate change affects precipitation and grassland temperature, which causes instability such as year-round drought, summer floods, and winter snowfall. For instance, Orusa and Mondino studied climate change effects on rangelands, and found that phenological and evapotranspiration-related processes and snowpack melting time have been dramatically changed in the last two decades in Aosta Valley (Orusa and Mondino, 2021). Located in the hinterland of the Qinghai-Tibet Plateau

**Citation:** Sun, X.; Xiao, Y. Vegetation Growth Trends of Grasslands and Impact Factors in the Three Rivers Headwater Region. *Land* **2022**, *11*, 2201. https://doi.org/10.3390/ land11122201

Academic Editors: Matteo Convertino and Jie Li

Received: 2 November 2022 Accepted: 29 November 2022 Published: 4 December 2022

**Publisher's Note:** MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

**Copyright:** © 2022 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https:// creativecommons.org/licenses/by/ 4.0/).

in southern Qinghai, the source area of the three rivers refers to the source areas of the Yangtze, Yellow, and Lancang Rivers [1]. This headwater area is the highest, largest, and most concentrated wetland in the world and has the most abundant biodiversity and the most sensitive ecosystem in China. The three rivers supply about 60 billion m3 of water each year, and the partial water of the Yangtze River, Yellow River, and Lancang River comes from the Three Rivers Headwater Region, thus establishing the area as a veritable "Chinese water tower". The Three Rivers headwater region is a unique area that affects the development of the west; once damaged, recovery is difficult because of the harsh conditions found there and the area's fragile ecosystems [2].

Grassland is the dominant ecosystem in the Three Rivers headwater region, and grassland animal husbandry is the leading industry. One of the main supply functions of this grassland ecosystem is providing herbage allowance for animal husbandry production, which then provides direct benefits to the people through the production of grass-fed livestock [3,4]. In recent decades, due to the impact of the uncontrolled exploitation and overuse by humans, serious grassland degradation has occurred [5]. By the early part of this century, degraded grassland occupied 26–46% of available grassland, and this has had severe impacts on the ecological environment, its security, and on the sustainable development of grassland animal husbandry of the region. To address this problem, the State Council in 2005 approved the Ecosystem Conservation project in the Three Rivers headwater region with the aim of implementing ecological restoration [6]. To design a scientific strategy for the recovery, management, and utilization of the grassland and to effectively evaluate the project [7], analyzing the dynamic change of grassland productivity before and after the start of regional restoration work is essential. This allows an assessment of the trends of the supply function of herbage allowance and the natural and cultural drivers leading to the changes of the grassland ecosystem [8].

A vegetation index refers to ground vegetation coverage gathered through satellite remote sensing; it is a comprehensive, abstract, and indirect method [9] that is accomplished either through empirical modeling or mixed-pixel decomposition [10,11]. An empirical model is applied to achieve vegetation coverage over a large area through the correspondence between the measured data of vegetation coverage in the sample and the vegetation index [12]; however, the application of this method is subject to temporal conditions. With mixed-pixel decomposition, pixel information gathered by remote sensing is simplified into information or non-information on vegetation, and vegetation coverage is estimated from the proportion of vegetation information [13,14]. Studies have shown that this method is not subject to the latest data, and so it is generally applied to dynamic monitoring of remote sensing of vegetation coverage [15].

In this paper, we integrate various remote sensing analytical techniques (trend analysis, correlation analysis, and residual analysis with consideration of hysteresis effect of rainfall on vegetation) to map and assess grassland improvement and degradation areas to identify the driving forces of change in vegetation cover in the Three Rivers headwater region.

#### **2. Study Area**

The Three Rivers headwater region (TRHR) refers to the source areas of the Yangtze River, the Yellow River, and the Lancang River [16,17]. The TRHR is the highest, the largest, and the most concentrated wetland in the world, while also being the region with the most abundant biodiversity and the most sensitive ecosystem owing to its varied topography, complex hydrographic network, and numerous lakes. Wetland and meadow plants are the main types of vegetation in the TRHR, and these play a vital role in water conservation, runoff mitigation, and biodiversity maintenance [18].

Additionally, as the primary water source of the Yangtze, Yellow, and Lancang rivers, the Three Rivers headwater region, named "China water tower", is vital for maintaining the water security of 548 million people on the downstream Yellow River basin, Yangtze River basin and Lancang River basin. However, since 1990s, the ecosystem of the Three Rivers headwater region has shown a severe trend of degradation, which seriously threatens

the production and living of the local residents in the downstream areas. Therefore, the Chinese government has started to implement a series of ecological policies to curb the degradation of the ecosystem (Figure 1).

**Figure 1.** Grassland distribution of the Three Rivers headwater region (The gray bold dashed lines represent the mountains with a name beside it; the blue lines are the rivers; spatial reference: Albers conical equal area; datum: D WGS 1984).

#### **3. Materials and Methods**

#### *3.1. Data Sets and Pre-Processing*

MODIS data: The normalized difference vegetation index (NDVI) is a normalized ratio of red and near-infrared reflectance. It is used as a terrestrial vegetation growth and conditions proxy, being sensitive to chlorophyll-related plant canopy structural variations and being closely correlated to the fraction of potential photosynthesis and the physiological status of vegetation [19,20]. In this study, we used the moderate resolution imaging spectroradiometer (MODIS) MOD13Q1 time series data set from 2000 to 2010 obtained from the Land Processes Distributed Active Archive Center (LP DAAC). This data set has a spatial resolution of 250 m and temporal resolution of 16 days, an improved sensitivity to vegetation, and a reduced influence of external factors (such as atmosphere, observation angle, sun azimuth, and cloud cover), which have been verified using stable desert control points.

The MODIS data set was geocoded to the universal transverse Mercator (UTM) coordinate system using the MODIS reprojection tool. As these products may be affected by cloud cover, atmosphere, or ice/snow cover, we first reconstructed the NDVI time series data set using the asymmetric Gaussian function filter in TIMESAT 2.3 software program to reduce noise and improve data quality during the data pre-processing procedure [21]. The 16-day MODIS-NDVI was then compiled into monthly NDVI data by applying maximum value compositing (by overlaying multiple raster maps, the value of each raster cell is then taken as the largest of the multiple maps), which was processed in the interactive data language (IDL, https://www.harrisgeospatial.com, 30 November 2022).

Meteorological data: Meteorological data from 2000 to 2010 (including monthly scale precipitation and temperature) were obtained from the Chinese National Metrological Information Center/China Meteorological Administration. Monthly meteorological data derived from station-based information were interpolated to the whole research area at a spatial resolution of 250 m using the kriging interpolation method. In addition, the monthly temperature was calibrated using the digital elevation model, and the coefficient was reduced by a 0.47 degree/100 m increase in elevation. This coefficient was obtained by linear regression between elevation and temperature.

Grassland data: Information related to grassland distribution in the study area was obtained from an ecosystem map of China that was generated and interpreted based on 1:100,000 Landsat TM satellite remote sensing products from the Data Center for Resources and Environmental Sciences, Chinese Academy of Sciences. (Figure 1). Sources of cartographic data and statistics are listed in Table 1.


**Table 1.** Sources of principal data.

#### *3.2. Methods*

#### (1) Trend analysis

To assess variation trends of NDVI and climate (precipitation and temperature) throughout the 2000–2010 study period, we used a linear least-squares regression model to obtain the changing trends of every pixel by fitting a linear equation of NDVI or climate variables as a function of the variable of year to obtain an image of changing slopes [22]. The linear least-squares regression method, which is a commonly used method in trend analysis [23], was applied as follows:

$$y = a + b \times t + \varepsilon \tag{1}$$

where *y* represents NDVI or climate variables; *t* is year; *a* and *b* are fitted variables (*b* is the slope as a proxy of trend and a is the intercept); and ε is the residual error. If *b* > 0, there is an increasing trend of NDVI or climate; conversely, if *b* < 0, there is a decreasing trend. *p* < 0.05 was considered a significant change for both increasing and decreasing trends (Table 2).

**Table 2.** Evaluation standard of trend significance.


In addition, the estimation of parameters *a* and *b* uses the least square method, ε*<sup>i</sup>* is a random error, the fitting value of parameters *a* and *b* is expressed as:

$$\hat{\theta} = \frac{\sum\_{i=1}^{n} \left( t\_i - \overline{t} \right) (y\_i - \overline{y})}{\sum\_{i=1}^{n} \left( t\_i - \overline{t} \right)^2} \tag{2}$$

$$
\mathfrak{a} = \mathfrak{y} - \hat{\mathfrak{b}} \times \mathbb{F} \tag{3}
$$

The calculation process of changing slopes per pixel was programmed with interactive data language (IDL).

#### (2) Correlation analysis

We used the Pearson correlation coefficient to explore the relationship between trends of NDVI and climate as follows:

$$\tau = \frac{\sum\_{i=1}^{n} (\mathbf{x}\_i - \overline{\mathbf{x}})(y\_i - \overline{\mathbf{y}})}{\sqrt{\sum\_{i=1}^{n} (\mathbf{x}\_i - \overline{\mathbf{x}})^2} \sqrt{\sum\_{i=1}^{n} (y\_i - \overline{\mathbf{y}})^2}} \tag{4}$$

where *r* represents the linear correlation coefficient; *x* is the NDVI variable; *y* is the precipitation or temperature variable; and *n* is the number of variables [24].

To evaluate the influence of anthropogenic activities on the trend in NDVI, we used partial correlation between the slope of the NDVI and anthropogenic factors (i.e., population growth and a change in livestock numbers). We also employed stepwise regression between the trend in NDVI and the impact factors (precipitation, temperature, population, and livestock numbers) to assess their relative contribution in influencing spatial characteristics of the NDVI trend [25].

#### (3) Residual analysis

Residual NDVI is the difference between the observed value *yi* (observed NDVI) of the dependent variable and the predicted value *y*ˆ*<sup>i</sup>* (predicted NDVI) calculated according to the estimated regression equation, denoted by *e*. It reflects the error caused by using the estimated regression equation to predict the dependent variable *yi*. The residual of the *i* th observation is:

$$x\_i = y\_i - \mathcal{Y}\_i \tag{5}$$

Residual analysis is used to investigate trends in residual differences (residual NDVI) using a regression model involving. The trends of predicted NDVI were interpreted as climate-induced changes (rainfall as the explanatory variable [26]), while trends in residual NDVI (*ei*) were interpreted as human-induced changes [26].

The residual analysis mainly involved the following steps. First of all, a line regression model between observed NDVI and precipitation factor was used to calculate the residual difference (*e*) between observed NDVI and predicted NDVI (regression result). Then, the trend analysis of residuals as a function of time was processed to investigate the humaninduced vegetation degradation (Table 3). Pixels exhibiting marginal decreases (i.e., <5%) in 2000–2010 were excluded or considered as stable because they reflected the potential uncertainties caused by differences due to image dates within the season/month or image calibration processes.

**Table 3.** Indicators and definitions related to residual analysis methods.


Previous studies reported that annual maximum NDVI representing the growth of grasslands is strongly correlated with climatic variables [26]. Thus, NDVImax as the highest NDVI value can be used to gauge grassland vegetation growth in this study. Based the residual analysis, measured NDVImax values showed both positive and negative deviations from the fitting curve on the NDVImax/rainfall linear regression, suggesting that vegetation is not only responsive to rainfall, but also influenced by human activities represented by residuals.

As the rainfall period was most strongly correlated with grassland growth, we calculated the precipitation accumulation periods and lag periods from September to August

of next year. And analysis the correlation between the cumulative rainfall and NDVImax, to identify the optimal relationship between them in the study area [27]. By prolonging and changing the cumulative period, the process was repeated until all possible combinations were tested, and then the best relevant cumulative period was determined (with the strongest correlation), which is from 1 September to 1 August of next year.

#### **4. Results**

#### *4.1. Spatial and Temporal Characteristics of Grassland Variation*

Trend analysis showed overall positive NDVI trends for the grasslands (Figure 2). The study area, which is about 84.25% area of the total grassland in the Three Rivers headwater region, showed a continuous trend of improvement from 2000 to 2010.

**Figure 2.** Spatial distribution of NDVI trends in the grassland ecosystem. (**A**) NDVI trend from 2000 to 2005; (**B**) NDVI trend from 2006 to 2010; (**C**) NDVI trend from 2000 to 2010; the fork symbol in the figure indicates the significance of NDVI trend, where *p* value < 0.05. (Spatial reference: Albers conical equal area; datum: D WGS 1984).

Linear regression analysis of NDVI from 2000 to 2005 indicated that about 37.01% of the total grassland area experienced a declining trend in vegetation production (Figure 2A). The declining patches of vegetation were found mainly in the western and southern pastures; however, this change between 2001 and 2005 was not significant. On the contrary, the long-term productivity of vegetation shows a positive or stable trend in the northern high-altitude pastoral areas affected by fog and haze.

Correlation between grassland degradation and triggered factors:

The correlation between NDVI values and accumulated precipitation revealed a strong positive correlation (r = 0.5 to 0.98) for about 38.41% of the grassland area in the Three Rivers headwater region (Figure 3A). A negative correlation was found in the densely populated southern and eastern parts of the study area, which have low vegetation cover and high levels of human activity.

Moreover, there is little correlation between residual NDVI and rainfall, suggesting that residual analysis can effectively remove climatic factors (precipitation) and effectively reflect the impact of human activities on grassland degradation (Figure 3B).

**Figure 3.** Correlation results based on person correlation analysis. (**A**) Person correlation between NDVI and Rainfall; (**B**) Person correlation between residual NDVI and Rainfall. (spatial reference: Albers conical equal area; datum: D WGS 1984).

#### *4.2. Impacts of Human Activities on Grassland Degradation*

Analysis showed that approximately 25.85% of the study area did not show an increase in vegetation growth despite increasing precipitation from 2000 to 2010 (Table 4). The declining patches of vegetation, which can be attributed to human activities, were found mainly in the southern and eastern pastures (Figure 4A). When evaluating the first and second halves of the decade, linear regression analysis of residual NDVI from 2000 to 2005 showed that about 35.72% of the total grassland area experienced a declining trend in vegetation production (Figure 4B). In contrast, about 17.66% of the total grassland area showed a declining trend in vegetation production from 2006 to 2010 (Figure 4C).

**Table 4.** Proportion of area showing non-climate degradation (units %) and the trend of the residual mean value.

**2000–2010 2000–2005 2006–2010**

**Figure 4.** Grassland degradation trend due to human activities (**A**) residual NDVI from 2000 to 2005; (**B**) residual NDVI from 2006 to 2010; (**C**) residual NDVI from 2000 to 2010. (spatial reference: Albers conical equal area; datum: D WGS 1984).

#### **5. Discussion**

Our findings showed that an increasing NDVI occurred mainly in the northern plains, while southern and eastern grazing pastures, which are densely populated, showed a small positive trend in long-term vegetation productivity. Degraded patches of vegetation were

located on steep slopes (slope > 20◦), supporting the findings of Fan et al. (2010) [16]. In addition, our field observations suggested that high wind erosion and intensive human activities were impacting vegetation productivity. Grassland is the most important ecosystem type on the Qinghai Tibet Plateau and the basis of animal husbandry on the plateau. Though a series of grazing grassland ecological protection subsidy incentives such as grassland ecological protection in the construction project, the Qinghai Tibet Plateau grassland conservation effect appeared gradually, but the strength of the human activity factor is different, we also found that declining patches of vegetation were found mainly in the pastures [28].

A large proportion of the total grassland area (80.79%) showed an increasing vegetation trend between 2000 and 2005. Less than 20% of the total grassland showed degradation, and this was patchy rather than continuous. By 2006, the extent of vegetation coverage was even higher. We attribute this improvement to the 2005 approval of the Ecosystem Conservation project in the Three Rivers headwater region [29], which has led to less human disturbance and a focus on vegetation recovery [6,30]. Whether changes in precipitation variability or human activities led to this recovery is not yet clear. However, we found only a limited correlation between precipitation and residual NDVI (Figure 4b), indicating that the trends in vegetation cover were unlikely to be caused as a result of climatic conditions.

To analyze temporal trends in grassland NDVI, we concentrated on the development of the annual maximum NDVI, as proposed by Evans and Geerken (2004) [25]. Vegetation production in a cold alpine environment has been shown to fluctuate strongly according to interannual precipitation and temperature variability [16,31]. However, temperature is periodic and takes years for little change. Therefore, we assume that the correlation between temperature and vegetation growth is unlikely to be used to identify the temporal and spatial trends of grassland, so we ignore temperature. The NDVImax values are calculated and correlated with precipitation, as proposed by Evans and Geerken (2004) and Madonsela et al. (2018) [26,32]. We found that a decreasing NDVI was mainly concentrated in the western region due to enhancing human activity such as over grazing and an increasing NDVI was distributed throughout the rest of the study area due to increasing rainfall and ecological protection policies such as banning grazing [33].

Our findings document the benefits that have occurred since the initiation in 2006 of the ecological restoration project, which included returning grazing land to grassland, the ecological migration project, black soil land recovery (grassland with extreme degradation), and rodent control [16,34]. Non-climatic (human) activities led to serious grassland degradation of the Three Rivers headwater region up until 2005 (Figure 4B), but since 2006, the grassland degradation caused by these activities has decreased and an increase in vegetation cover has occurred. This finding combined with the decreasing residual NDVI indicates that the increasing vegetation cover over time is not related to precipitation variability but to better land management practices.

#### **6. Conclusions**

We found an overall positive NDVI trend between 2000 and 2010 for the grasslands in our study area, which cover about 84.25% of the total grassland area in the Three Rivers headwater region. The declining trend in vegetation production between 2000 and 2005, which affected about 37.01% of the total grassland area, appears to have been reversed between 2006 and 2010, most likely by the effectiveness of the government-approved program for ecosystem conservation.

The amount of precipitation in the Three Rivers headwater region has a strong positive correlation with NDVI in about 38.41% of the grassland area. Negative correlations were found in the southern and eastern regions, which have dense populations, low vegetation coverage, and intensive human activity. Our study shows that to promote grassland recovery in Three Rivers headwater region, a number of factors must be taken into account, especially human activities. Our findings also support the need for intensive management

of water resources to enhance vegetation and avoid ecological damage in this important ecological area.

**Author Contributions:** Y.X. conceived and designed the research; X.S. and Y.X. analyzed the data; X.S. and Y.X. wrote the main manuscript; X.S. prepared Figures and Tables. All authors reviewed the manuscript. X.S. conceived and designed the experiments, performed the experiments, contributed reagents/materials/analysis tools, analyzed the data, prepared figures and/or tables, and wrote the main manuscript. Y.X. conceived and designed the experiments, contributed reagents/materials/ analysis tools, wrote the main manuscript, authored or reviewed drafts of the paper, and approved the final draft. All authors have read and agreed to the published version of the manuscript.

**Funding:** This work was funded jointly by the National Natural Science Foundation of China (NSFC) (No: 71904060), the General Project of Natural Science Research of Jiangsu Province Higher Education Institutions (22KJD180005), the Natural Science Foundation of Jiangsu Province (BK20211363), opening grant of Jiangsu Key Laboratory for Bioresources of Saline Soils (JKLBZ202001).

**Data Availability Statement:** The raw/processed data required to reproduce these findings cannot be shared at this time as the data also form part of an ongoing study.

**Conflicts of Interest:** The authors declare no competing interest.

#### **References**

