*4.1. Yield Prediction*

Exactly 66% of the observed fields have a complete time series of Landsat NDVIs at the pixel level across the growing season. For this dataset, C-Crop performs the best with Landsat images at field (*R*<sup>2</sup> = 0.68; Figure 4a), 250-m (*R*<sup>2</sup> = 0.85; Figure 4d), and 25-m (*R*<sup>2</sup> = 0.48; Figure 4g) pixel resolutions for yield prediction pooled for wheat, barley, and canola. MODIS-based model yields were at least 10% less in terms of the *R*<sup>2</sup> than when using complete time-series Landsat data. The model performs almost identically when using MODIS and L–M blended data for the predictions at both field and pixel scales. However, it produces the lowest bias for field (*MBE* = −0.32 t/ha; Figure 4c), 250-m (*MBE* = −0.23; Figure 4f), and 25-m (*MBE* = −0.21; Figure 4i) pixels when using the L–M blended data.

**Figure 4.** Validation of C-Crop-predicted yield pooled for wheat, barley, and canola using Landsat, MODIS, and L–M blended data where the complete Landsat time series are observed (the fraction of missing Landsat series = 0). From top to bottom, the first (**a**–**c**), second (**d**–**f**), and third (**g**–**i**) rows show the comparison between observed (*x*-axis) and model-predicted yields (*y*-axis) on the field scale (*n* = 139), 250-m pixel level (*n* = 2367), and 25-m pixel level (n = 113,329), respectively, where *n* is the sample size. From left to right, the first (**a**,**d**,**g**), second (**b**,**e**,**h**), and third (**c**,**f**,**i**) columns delineate the validation using Landsat, MODIS, and L–M blended data, respectively. The solid black line is the line of best fit, the purple and the yellow lines represent the upper and lower bounds of the prediction confidence intervals (i.e., *p* = 0.01 and *p* = 0.05), and the black dashed line is the 1:1 line.

Figure 5 shows that 210 fields have incomplete time-series Landsat pixels during the growing season. These series are incomplete due to clouds in some of the Landsat images acquired in the specific growing season when the yield data are observed. Using this set of data, the L–M blended data-based model performs the same as the MODIS-based model when aggregating to 250-m (*R*<sup>2</sup> = 0.86, *RMSE* = 0.52 t/ha, *MBE* = −0.40 t/ha; Figure 5c,d) and field (*R*<sup>2</sup> = 0.66, *RMSE* = 0.82 t/ha, *MBE* = −0.49 t/ha; Figure 5a,b) scales. The L–M blended data-based model shows its advantages at the 25-m pixel level, which explains an extra 7% of the variability in the observed yields when the Landsat time-series pixels are incomplete (Figure 5e,f).

**Figure 5.** Validation of C-Crop-predicted yield pooled for wheat, barley, and canola using MODIS and L–M blended data when the complete Landsat time series are not available (the fraction of missing Landsat series ≥0.083). From top to bottom, the first (**a**,**b**), second (**c**,**d**), and third (**e**,**f**) rows show the comparison between observed (*x*-axis) and model-predicted yields (*y*-axis) on the field scale (*n* = 210), 250-m pixel level (*n* = 3978), and 25-m pixel level (*n* = 231,667), respectively, where *n* is the sample size. From left to right, the first (**a**,**c**,**e**) and second (**b**,**d**,**f**) columns delineate the validation using MODIS and L–M blended data, respectively. The solid black line is the line of best fit, the purple and the yellow lines represent the upper and lower bounds of the prediction confidence intervals (i.e., *p* = 0.01 and *p* = 0.05), and the black dashed line is the 1:1 line.

#### *4.2. Identification of the Threshold*

Figure 6a shows that the 25-m pixel-level yield prediction accuracy fluctuates between 0.35 and 0.75 using MODIS and *LLM* data in time and space, where time-series Landsat observations are incomplete. The *R*<sup>2</sup> derived from MODIS is steady (between 0.4 and 0.5) where the fraction of missing Landsat data ranged between 0.05 and 0.42, which is lower than the *R*<sup>2</sup> (0.62–0.5) derived from *LLM* data. The *RMSE* derived from both data sources remains approximately 0.9 t/ha for the same fraction of missing Landsat data (Figure 6b). When more incomplete time-series Landsat data are observed (from 0.42 to 0.75 on both Figure 6a,b), the *R*<sup>2</sup> based on MODIS increases to around 0.7 and the *RMSE* decreases to 0.4 t/ha (as MODIS is not as cloud-affected as Landsat due to imagery being acquired on more days), whereas the *R*<sup>2</sup> derived from the *LLM* fluctuates between 0.6 and 0.4 and the *RMSE* changes between 1.3 and 0.8 t/ha (Figure 6). Given this, up to 42% of missing Landsat data in the growing season is defined as the threshold for when L–M blending is optimal to implement. That is, the 25-m pixel-level yield prediction accuracy can be improved using *LLM* when the fraction of missing Landsat data at the coordinates is below this threshold. For instance, the *LLM*-based model increases *R*<sup>2</sup> by up to 20% when the fraction of the missing Landsat data is below 42% (Figure 6a).

**Figure 6.** The statistical analysis of missing data in Landsat for 25-m pixel-level yield prediction, by evaluating (**a**) *R*<sup>2</sup> and (**b**) *RMSE* against the fraction of Landsat missing data during the growing season. L–M blended data were used to fill the gaps (*LLM*) in the incomplete Landsat series.

#### *4.3. Evaluation of the Improvement in Yield Prediction Accuracy*

Given the availability of cloud-free Landsat and MODIS data across the wheatbelt (2000–2018), in concert with the previously determined threshold (Figure 6) and finding (Figure 5), we can demonstrate which imagery (i.e., either MODIS, Landsat, and *LLM*) is best suitable for 25-m pixel-level crop yield prediction (Figure 7). Figure 7 shows the selection when using multiple satellite products for crop yield estimates across the wheatbelt over the past two decades. For the wheatbelt north of 35◦ south (S), there is a higher probability of obtaining complete cloud-free Landsat observations over the growing season in the east–west overlapping areas of adjacent Landsat Path-Rows, whilst *LLM* is optimal elsewhere north of 35◦ S. South of 35◦ S, MODIS is optimal, with *LLM* being optimal in the east–west overlapping areas of adjacent Landsat Path-Rows (in Figure 7).

**Figure 7.** Multi-sensor optimal data selection across the wheatbelt (2000–2018) for 25-m pixel-level crop yield prediction, using the probability of MODIS, Landsat, and *LLM* images. Blue denotes regions where incomplete Landsat series have a fraction of missing data exceeding 42%, thus indicating where MODIS should be used for yield estimates because it provides more frequent observations than Landsat. Green areas show where adjacent Landsat orbits overlap and, thus, where a complete once every 16-day Landsat series over the whole growing season is available. Areas colored red are those where the fraction of Landsat missing data is below the 0.42 threshold identified previously in Figure 6 when L–M blended data improve the yield prediction accuracy.

For 25-m pixel-level yield prediction, blended data should be preferred on average for 33% of the Australian wheatbelt area (Figure 8a). MODIS and Landsat data remain the preferred data sources in 50% and 17% of cases, respectively. Figure 8b shows that the area percentage is positively correlated with annual precipitation for MODIS and negatively correlated for Landsat and the L-M blended data. MODIS has a larger scatter when annual precipitation is greater than 400 mm/year.

**Figure 8.** Yearly analysis of multi-sensor data selection for 25-m pixel-level crop yield prediction (2000–2018) by evaluating (**a**) the area percentage and (**b**) its correlation with the annual precipitation (mm/year). The white dashed line shows that the area percentage is 50. μ: mean of population values; σ: standard deviation; *r*: correlation coefficient. The symbols in (b) are labeled with the last two digits of the year.

Of the 53-million-ha Australia wheatbelt, complete Landsat time series were suitable for 28.6% of that area, MODIS for 31.5%, and *LLM* for 39.9% during the 2015 growing season (Figure 8a). In this season, there are 104 fields available for assessing the accuracy improvement in yield prediction accuracy for nationwide crop yield prediction (Table S1, Supplementary Materials). Within these observed yield data, 69 fields are located where L–M blended images can improve the prediction accuracy according to the previously defined (see Figure 6) 42% Landsat missing data threshold. Figure 9 shows the predicted yields against the observed values for 63 fields in Western Australia (i.e., WA), and six fields in eastern Australia (i.e., Queensland (QLD), New south Wales (NSW), Victoria (VIC), and South Australia (SA)).

**Figure 9.** Scattergram for C-Crop-predicted yield against observed values for 2015 at the field level for (**a**) Western Australia (*n* = 63) and (**b**) eastern Australia (*n* = 6) wheatbelt using MODIS and L–M blended data for gap-filling Landsat (*LLM*) when the fraction of incomplete Landsat series is below 42% for the growing season. The *RMSE* statistics have units of t/ha.

In these areas (see Figure 9), when the fraction of incomplete Landsat series across the growing season of 2015 falls below 42%, using *LLM* reduced the field-level yield prediction errors by 0.15 t/ha for Western Australia and 0.28 t/ha for eastern Australia. More specifically, the bias of 598 kilotons in grain production over the cropping area of 39,882 km2 in Western Australia and of 1672 kilotons in grain production over the cropping area of 59,716 km<sup>2</sup> in eastern Australia can be decreased when using the threshold determined herein to decide the areas where the blended data can improve the prediction accuracy.

#### **5. Discussion**

Globally, wheat-growing regions are distributed within areas that experience a relatively high frequency of clouds [59]. Persistent cloud cover limits the use of HSR sensors over large spatial extents due to the low frequency of the observations, whereas the other sensors that have HTF are constrained by the LSR. Although blending improves the monitoring of rapidly changing processes such as crop growth, it may introduce unforeseen spatial and temporal variances in the blended images when images are often not observed concurrently [10], and implementing the current suite of algorithms across large areas is currently computationally expensive. Therefore, it is important to systematically quantify and evaluate the utility of blending imagery across time and space before embarking on the development of nationwide blending capabilities.

This study developed a parsimonious approach to identify when and where blending is beneficial for crop yield prediction at the 25-m pixel resolution. When incomplete Landsat series falls below 42% of the possible imagery in the growing season, blending is recommended because it improves the accuracy of the yield estimates. When incomplete Landsat series exceeds 42% of data, the use of HTF LSR data (e.g., MODIS) for the 25-m pixel-level yield prediction is recommended, reducing the computational need for blending. LTF HSR data show benefits, for example, in providing detailed information on plant photosynthesis across space; thus, they should be solely used for crop yield prediction when enough cloud-free images are available throughout the growing season.

Annual precipitation, which is strongly associated with cloudiness [60,61] and, thus, partly governs the amount of missing Landsat data, indicates which data sources should be preferably used (Figure 8). For instance, MODIS should be used for the prediction for greater than 50% of the wheatbelt during wet years (e.g., 2003, 2010, 2011, and 2016). During dry years (e.g., 2006 and 2018), the proportion of the wheatbelt where Landsat is suitable for yield prediction increases by >10% compared to the average proportion (17%). For crop yield prediction, blended data can be optimally applied across approximately 33% of the wheatbelt during dry to normal years.

Crop production fluctuates from year to year due to the rapidly changing patterns of precipitation and temperature [42,62,63]. Over the last 50 years, the amount of precipitation received by the Australian wheatbelt declined [64]. Based on model projections, it was also suggested that the changing climatic conditions will affect both the frequency and intensity of the extreme events such as floods, heatwaves, and droughts, which will impact agricultural production [65,66]. Within the context of the present study, an increase in the number of clear sky days can be beneficial for the use of RS in crop yield forecasting. Moreover, a decline in the number of rain days in recent decades within the wheatbelt area can also be noted (see Figure S4, Supplementary Materials). Here, this decline in the number of rain days is assumed to correlate with the increase in the number of days with clear skies. However, a decrease in the number of rain days does not necessarily imply an absence of clouds, as there could be non-precipitating clouds. However, studies such as Norris et al. [67], using observations and Coupled Model Intercomparison Project Phase 5 (CMIP5) model outputs, showed that, on average, there is a decline in cloud amount across the region from 60◦ S–60◦ north (N). This decline was attributed to the poleward shift of mid-latitude storm tracks, among others.

The potential influence of climate change on grain production suggests a decline in world food supply [68] and an increase in the level of exposure of the global population to the risk of hunger [69]. It is, therefore, important to ensure the accuracy and efficiency of yield prediction at anytime, anywhere, and at any scale. A more precise description of grain weight patterns in time and space provides more accurate information for precision agriculture to improve its production and sustainability. The semi-empirical carbon turn-over model used for crop yield prediction is based on historical yield data, and, in the future, relevant drivers (e.g., precipitation and temperature) and their interactions may change under climate change, thus requiring model re-calibration [4,6,70,71]. Our approach was tested with C-Crop but could be extended to other semi-empirical models [72] and models based on machine learning [73].

Development of a national-scale yield prediction system requires time series of HSR and HTF imagery to describe the crop growth; therefore, next-generation HSR and HTF satellite data like Sentinel-2 [74,75] should be tested for nationwide yield estimates post the growing season of 2015. Harmonized Landsat and Sentinel-2 surface reflectance products should be tested for national-scale crop yield prediction [76] when observed yield data for more recent years are available. However, spatio-temporal fusion of MODIS and Landsat data is still necessary for long-term studies that involve historical satellite images collected before 2015 (back to the year 2000, when MODIS was launched). Future studies should also focus on using active RS technologies, such as radio detecting and ranging (Radar) [77,78] and light detection and ranging (LiDAR) [79], for a national yield estimation because of their ability to penetrate clouds and detect a plant's three-dimensional structural characteristics. Data blending between HSR optical imagery and active RS data (e.g., Radar and LiDAR) [80] warrants further study.

#### **6. Conclusions**

Sparse time series of satellite remote sensing, due to LTF and/or cloud contamination, represent one of the main barriers limiting accurate crop yield estimation at regional to national scales. Blending of HSR but LTF images with LSR but HTF images was proposed to increase the temporal resolution and maintain spatial details. In this study, the benefits of blending were tested for crop yield prediction across the Australian wheatbelt. We found that, when time series are gap-free, yield prediction from Landsat is the most accurate on both field and pixel scales. When Landsat time series contain <42% missing values during the growing season, blending is recommended for nationwide crop yield prediction at the 25-m pixel resolution. When Landsat time series contain ≥42% missing observations, MODIS outperforms blending. Across Australia, these recommendations would improve yield estimates by 0.15 t/ha for Western Australia and 0.28 t/ha for eastern Australia on average. By identifying where and when to blend, this work paves the way to more accurate monitoring of biophysical processes and yields, while keeping computational costs low.

*Remote Sens.* **2020**, *12*, 1653

**Supplementary Materials:** The following are available online at http://www.mdpi.com/2072-4292/12/10/1653/s1, Table S1: Summary of observed data (the number of field-years) used herein; Figure S1. Long-term (1901–2018) average monthly accumulated precipitation (mm/month) (left) and the long-term average rain days (right) across the wheatbelt for (a, b) Queensland, (c, d) New South Wales, (e, f) Victoria, (g, h) South Australia, (i, j) Western Australia, and (k, l) Tasmania. The precipitation data are sourced from Jeffrey et al. [37], and the wheat belt mask was resampled from native 2-km<sup>2</sup> horizontal resolution onto the precipitation data grid with 5-km<sup>2</sup> horizontal resolution using nearest neighbor [40]. The vertical black lines (a–f) represent one standard deviation of the monthly wheatbelt specific precipitation for each state. The horizontal line (g–l) represents the median of the data, the box spans from 25th to 75th quartiles of the data, and the rain day threshold is 0 mm/day. It should be noted that Northern Territory is not included in the analysis due to its small cropping area; Figure S2. An example of MODIS and Landsat blended NDVI using ESTARFM. The *x*-axis and *y*-axis are time (*t*, DOY) and spatial resolution (m), respectively. Both MODIS and Landsat have valid observations at *t1* and *t2*. The image at the center of the top row is a valid observation of MODIS at *ts*, and that at the center of the bottom row is Landsat-like blended data at *ts* created using the 16-day MODIS composite images on the DOYs 257, 273, and 289, as well as the Landsat images acquired on the DOYs 257 and 289; Figure S3: The implementation flowchart of validation and identification of threshold for when to blend. The shaded areas indicate the original data sources; Figure S4: Long-term (1900–2018) average monthly accumulated precipitation (mm/month) across (a) eastern Australian, (b) Western Australia, and (c) the wheatbelt. The precipitation data are sourced from Jeffrey et al. [37].

**Author Contributions:** Conceptualization, Y.C., T.R.M., and R.J.D.; methodology, Y.C.; software, N.O. and N.G.; validation, Y.C.; formal analysis, Y.C.; investigation, Y.C. and F.W.; resources, L.L. and N.G.; writing—original draft preparation, Y.C.; writing—review and editing, T.R.M., R.J.D., F.W., N.G., and R.L.; visualization, Y.C. and N.G.; supervision, T.R.M. and R.J.D.; project administration, R.L.; funding acquisition, R.L. All authors read and agreed to the published version of the manuscript.

**Funding:** The authors acknowledged the support of the Digiscape Future Science Platform, funded by the CSIRO.

**Acknowledgments:** Appreciation is extended to numerous industry participants who provided training data in the form of yield maps, accessed by the project team. Randall J. Donohue and Tim R. McVicar acknowledge the support of the ARC Center of Excellence for Climate Extremes (Australian Research Council grant CE170100023). We thank the two anonymous reviewers and the Remote Sensing Academic Editor for helpful comments that helped us improve this paper.

**Conflicts of Interest:** The authors declare no conflicts of interest.
