*3.1. Classification Accuracy*

First, we examined the error rate of the RF classifier on the validation points we reserved before training the classifier, as described in Section 2.5. Accuracies for all years, dry years (6/15 to 7/31 precip. < 2001–2016 mean), and wet years (6/15 to 7/31 precip. > 2001–2016 mean) are 82%, 85% and 78%, respectively (Table 3). It is not surprising that the classification accuracy is lower in wet years that received plenty of precipitation. In wet years, the input indices describing the crop status may not be significantly different between rainfed and irrigated fields, inducing higher commission error and lower omission error than in dry years (Table 3). It is notable that the difference between the accuracies in dry and wet years is small. This suggests the utility of spatial anomaly and weather-sensitive remote sensing indices in distinguishing irrigated fields from rainfed even under humid conditions.

**Table 3.** Irrigation mapping accuracy evaluated using manually labeled data points. Omission error describes the percentage of irrigated training points that are classified as rainfed (false negative), while commission error describes the percentage of rainfed training points that were classified as irrigated (false positive). The accuracies of RF classifier and MIrAD-US [12] are compared for 2012 when both the manually labeled data points and MIrAD-US map are available.


We then compared the county irrigated area classified by RF with NASS Agricultural Census statistics [2] in 2002, 2007, 2012, as shown in Figure 4. For county statistics, there is a good overall agreemen<sup>t</sup> - *r*2 = 0.69. Figure 5 reports the annual total irrigated area in the study domain. While spread is noticeable in the county data (Figure 4), the total irrigated area agrees well with NASS statistics.

**Figure 4.** County irrigated area for 2002, 2007, 2012 according to the RF-based irrigation maps. Color encodes different counties. As explained in Section 3.1, RF significantly underestimates irrigated area in St. Joseph county due to widespread seed corn production in that area.

**Figure 5.** Empirical probability density function of green index following maximum dryspell (GI\_ppt) estimated from manually labelled dataset.

Next, we compared the error rate of RF classifier and MIrAD-US using the manually labeled data (Sections 2.5 and 2.6) in 2012 when both the manually labeled data points and MIrAD-US map are available. As shown in Table 3, the MIrAD-US error rate (defined as 1—accuracy) is 26%, and the RF classifier error rate is 16%. In addition, RF irrigation maps have lower omission and commission errors than MIrAD-US, suggesting a higher accuracy of RF-derived irrigation maps.

The high omission error of the irrigation classification (Table 3) may be due to the agricultural managemen<sup>t</sup> practice of deficit irrigation in the study area. Notably from Figure 4, the RF classifier significantly underestimated irrigated area in St. Joseph county where seed corn is the dominant crop [18], and deficit irrigation in late season (August) is commonly applied to dry up corn in plots. These locations would thus exhibit lower vegetation indices. The irrigation maps may underrepresent locations where a deficit irrigation strategy is applied in the rest part of the study domain.

The accuracy assessed on the validation points is lower than previous study that used a similar method to map irrigated area in a semi-arid to arid region [13]. In a more arid climate, the vegetation indices of rainfed crops are distinctively lower than those of irrigated crops. This is not the case in humid to subhumid climates. As shown in Figure 5, while the mean value of GI\_ppt is higher in irrigated fields, the distributions of the two classes largely overlap. Such mixing also occurs for other input variables, making separation of the two classes challenging in humid regions.

The accuracy of our irrigation maps is also subject to the uncertainties of the input data. As described in Section 2.2, the irrigation maps are developed based on crop masks derived from NLCD and CDL. Thus, misclassification of either product affects the validity of training points and the accuracy of irrigation maps. For years before 2007, the crop mask based on NLCD includes all row crop fields, thus the classified irrigated fields likely include irrigated fields other than corn and soybean. Furthermore, cloud coverage inevitably leads to missing scenes during the critical crop development phase. For locations with few available Landsat scenes, important information regarding the crop status may be missing, and resulting classification may be misled. The issue of cloud coverage may be alleviated using fusion of remote sensing products across recent platforms [41,42] such as radar imagery [15]. Finally, fields smaller than the 30-m resolution may not be well captured by the Landsat-based mapping method.

#### *3.2. Important Input Variables*

During the training process, the RF classifier calculates the variable importance scores as how much impurity (i.e., irrigated versus rainfed) can be explained by each input variable. Figure 6 depicts the 30 variables receiving the highest scores. Most of these higher-ranking variables are weather-sensitive remote sensing indices from Landsat scenes immediately following a dry period

identified by low soil water content, negative PDSI, or limited precipitation. This finding highlights the importance of selecting Landsat scenes that provide the most information for separating irrigated fields from rainfed ones. These critical scenes capture crop performance under a water stress condition; irrigated crops are expected to exhibit higher vegetation indices than rainfed crops. Such differences are not captured by peak vegetation indices, as rainfed crops may exhibit vegetation indices as high as irrigated crops during periods with sufficient precipitation. Simple remote sensing indices such as peak vegetation indices are not among the variables that explain most of irrigation spatiotemporal variability. In humid regions, maximum vegetation indices can be biased due to extensive cloud coverage.

Composites and some spatial anomaly indices receive high importance score, suggesting the utility of these variables to identify irrigated fields. Besides climate and remote sensing data, latitude is among the most important variables, likely due to the gradient of increasing fraction of seed corn from north to south. It is common to regularly irrigate seed corn as required in contracts. In addition, water supply indicators such as PDSI and pre-season precipitation (p\_early) receive high ranks because they can explain the interannual climate variability. Other climate variables received lower importance scores.

We found that soil properties and slope are not important factors for simulating the spatial distribution of irrigation in this region. This is not surprising because sandy soil with low AWC and mild terrain are prevalent in the agricultural lands of the study area. Other variables that do not appear to be important include soil water content, precipitation, aridity and extreme weather condition indices such as GDD, dryspell and heatwave, likely because the resolution of the meteorological data used to calculate these indices is too coarse to capture the fine-scale heterogeneity of irrigation. However, these variables portray large-scale water supply and demand, and we have shown that they can be used to select Landsat scenes that provide the best information for separating irrigated fields from rainfed ones. In particular, soil water content is simulated by NLDAS-Noah, which does not account for irrigation and estimates wetness under rainfed conditions.

**Figure 6.** Top 30 (of 98) important variables as identified by RF; variables are grouped into six categories as indicated by different colors (Sections 2.2 and 2.3, Tables 1 and 2, Table S2, Supplementary Material).

#### *3.3. Expansion of Irrigation*

From the RF classified irrigation maps we calculated the total irrigated area for the study region for 2001–2016 (Figure 7) and compared this to NASS statistics. Temporal fluctuation is noticeable, with limited irrigated area in 2009–2011 and high irrigated area in 2013–2014. The peak in 2014 is likely a combination of three factors. First, the critical crop development phase in 2014 had 21% higher than average (Figure 7) and more frequent precipitation (the dryspell of the study area is 13 days in 2014 and 17 averaged over 2001–2016), leading to robust rainfed crops and correspondingly high vegetation index values across the region. Thus, the RF classifier may overestimate the irrigated area in this year. Second, as described in Section 2.5, 2014 has 8.2% pixels with no Landsat scenes during the critical crop development phase. To fill in the gaps, we labeled pixels as irrigated in 2014 if they were classified as

irrigated in both 2012 and 2013. This may result in commission errors (i.e., classifying rainfed fields as irrigated) in those pixels. Third, farmers may have switched to irrigated agriculture after the crop losses in the 2012 drought.

**Figure 7.** Annual total irrigated area in SW MI according to the random forest-based irrigation maps (line with squares), NASS Consensus (triangles, [2]). Gray bars show precipitation during the 15 June–31 July period, averaged over the study domain. Linear regression with and without 2014 irrigated area is also shown.

There is a statistically significant (*p* < 0.05) increasing trend in irrigation in this study region despite notable interannual variability. Over the 16-year period, irrigated area tripled (increased by 204%), according to the linear regression shown in Figure 7. The estimated slope of 70.8 km2/year is approximately twice the estimate from NASS statistics in 2002, 2007 and 2012 (slope = 35.6 km2/year). In order to isolate the likely skewness due to high irrigated area estimated in 2014, we performed another linear regression excluding 2014 (Figure 7). An increasing trend is statistically significant (*p* < 0.05) with estimated slope of 49.1 km2/year.

We also calculated the change in irrigated area for corn and soybeans, respectively, for 2007–2016, when CDL is available for the study area. The irrigated area fractions for corn and soybeans increased from 19.1% in 2007 to 24.9% in 2016, and from 9.2% in 2007 to 17.9% in 2016, respectively.

To examine the spatial pattern, irrigation trends are calculated based on linear regression (Figure 8). To do this, we aggregated the 30-m irrigation maps to a larger grid to perform linear regression on the irrigated area through time. We chose the 9-km<sup>2</sup> grid to examine relatively fine-scale spatial patterns of irrigation trends across the region (vs. county level, for instance). Slight decreasing trends in lakeshore area suggests discontinuation of irrigation. The highest increase rate (up to 0.25 km2/year per 9-km<sup>2</sup> cell) was found in southern part of the study area. The irrigation expansion is believed to be associated with the promotion of seed corn in this area [16,18]. Seed corn is usually irrigated because irrigation is typically required by the contracts between farmers and seed corn companies.

**Figure 8.** Rate of change of irrigated area over time based on linear regression calculated for an aggregated 9 km<sup>2</sup> grid. Gray indicates non-significant trend (*α* > 0.1).

In addition, correlation analyses sugges<sup>t</sup> crop commodity price is another factor affecting irrigation decisions. The annual irrigated area of the study region is found to be correlated with previous year's corn price [2] (Figure 9, *r* = 0.66, *p* = 0.009). Irrigation may double corn yields and increase soybean yields by more than 66% in SW MI [16]. Given the easy accessibility to irrigation water, adoption of irrigation will likely increase farmers' revenue. Irrigation expansion may be further encouraged by devastating crop losses in the 2012 drought in fields without irrigation [43,44].

**Figure 9.** Commodity price of corn (**a**) and its correlation with irrigated area (**b**) 2001–2016.
