*Multilevel Models*

Multilevel models have the advantage of examining individual farms embedded within states and assessing the variation at both the farm and state levels. The multilevel regression model is commonly viewed as a hierarchical regression model [63]. A multilevel linear modeling technique is used to analyze the effects of influential factors on land allocation, water application, crop yield, and *EIWUE*.

For the research questions, we have *N* individual crop-specific farms (*i* = 1, ... , *Nj*) in *J* states (*j* = 1, ... , *J*). *Xij* represents a set of independent variables at the farm level, and *Zj* represents a series of state-level independent variables. The model estimation includes two steps. In the first step, a separate regression equation is specified for each state to predict the effects of the independent variables on the dependent variable:

$$\mathbf{Y}\_{ij} = \beta\_{0j} + \beta\_{1j}\mathbf{X}\_{ij} \tag{5}$$

In the second step, the intercepts *β*0*j* are treated as parameters that vary across states as a function of a grand mean (*γ*00), the state-level variables (*Zj*), and a random term (*u*0*j*). The slopes *β*1*j* are also assumed to vary across states and are expressed as a function of a fixed parameter (*γ*10) and a random term (*u*1*j*):

$$\beta\_{0j} = \gamma\_{00} + \gamma\_{01} \mathbf{Z}\_j + u\_{0j} \tag{6a}$$

and

$$
\beta\_{1j} = \gamma\_{10} + u\_{1j} \tag{6b}
$$

Combining Equations (5), (6a) and (6b), we have

$$\mathbf{Y}\_{ij} = \gamma\_{00} + (\gamma\_{10} + u\_{1j})\mathbf{X}\_{ij} + \gamma\_{01}\mathbf{Z}\_{j} + u\_{0j} \tag{7}$$

The model is called a random-intercept and random-slope model because not only is the intercept parameter in the Level-1 model, *β*0*j*, assumed to vary at Level 2 (state) [64], but the slope is also random, with an error term *u*1*j*. The *γ*01 coefficient captures the effects of the state-level variables (*Zj*) on the intercepts *β*0*j*, whereas *γ*10 is the fixed component of the slope *β*1*j*, which deviates from it in each state by the error *u*1*j*.
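As an illustration of how the two steps collapse into the combined form, the following minimal sketch simulates a small two-level dataset and checks that computing the response through Equations (5), (6a), and (6b) gives the same value as Equation (7). All parameter values, the sample sizes, and the function name `simulate_two_level` are hypothetical choices made for this example; they are not estimates from the study, and the sketch uses a single farm-level and a single state-level predictor for simplicity.

```python
import random


def simulate_two_level(n_states=5, farms_per_state=4, seed=0,
                       gamma00=2.0, gamma01=0.5, gamma10=1.5,
                       sd_u0=0.3, sd_u1=0.2):
    """Generate responses two ways: step by step and from the combined model.

    All default values are hypothetical and chosen only for illustration.
    """
    rng = random.Random(seed)
    rows = []
    for j in range(n_states):
        z_j = rng.uniform(-1.0, 1.0)   # state-level predictor Z_j
        u0j = rng.gauss(0.0, sd_u0)    # random intercept term u_0j
        u1j = rng.gauss(0.0, sd_u1)    # random slope term u_1j

        # Step 2: state-specific intercept and slope, Eqs. (6a) and (6b).
        beta0j = gamma00 + gamma01 * z_j + u0j
        beta1j = gamma10 + u1j

        for i in range(farms_per_state):
            x_ij = rng.uniform(0.0, 10.0)  # farm-level predictor X_ij

            # Step 1: within-state regression, Eq. (5).
            y_two_step = beta0j + beta1j * x_ij

            # Combined model, Eq. (7).
            y_combined = (gamma00 + (gamma10 + u1j) * x_ij
                          + gamma01 * z_j + u0j)

            rows.append((j, i, y_two_step, y_combined))
    return rows


rows = simulate_two_level()
# Largest absolute difference between the two formulations.
max_gap = max(abs(a - b) for _, _, a, b in rows)
print(len(rows), max_gap)
```

The two formulations differ only by floating-point rounding, which reflects that Equation (7) is an algebraic substitution of Equations (6a) and (6b) into Equation (5) rather than a separate model.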

To analyze multi-crop production, four sequential models are estimated for each decision, because the decision variables are continuous: an unconstrained two-level model with a random intercept only and no predictors (Model 1); a model with a random intercept and fixed effects at Level 2 (Model 2); a model with a random intercept and both fixed and random effects at Level 1 (Model 3); and a model with a random intercept, fixed and random effects at Level 1, and fixed effects at Level 2 (Model 4) (see Table S1 in the Supplementary Materials for specifications and comparisons of the four models). To determine how much of the variability in the responses is accounted for by factors at the state level, the intraclass correlation coefficient (ICC) is computed from the null model (Model 1) [65] as follows:

$$ICC = \frac{\tau\_{00}}{\tau\_{00} + 3.29} \tag{8}$$

where *τ*00 is the covariance parameter estimate for the intercept, and 3.29 is the estimated level-1 error variance [66].
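The calculation in Equation (8) can be sketched as a small helper. The function name `intraclass_correlation` and the input value 0.82 are hypothetical and used only to show the arithmetic; the constant 3.29 is taken directly from Equation (8).

```python
LEVEL1_VARIANCE = 3.29  # level-1 error variance used in Eq. (8)


def intraclass_correlation(tau00, level1_variance=LEVEL1_VARIANCE):
    """Share of total variance attributable to the state level, Eq. (8).

    tau00 is the intercept variance estimate from the null model (Model 1).
    """
    if tau00 < 0:
        raise ValueError("tau00 must be non-negative")
    return tau00 / (tau00 + level1_variance)


# Hypothetical intercept variance, not a value reported in the study.
icc_example = intraclass_correlation(0.82)
print(round(icc_example, 3))
```

An ICC near zero would suggest that little of the variation lies between states, whereas larger values indicate that state-level grouping accounts for more of the variability and supports the multilevel specification.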

The data were analyzed using the SAS package in the USDA data lab in St. Louis, Missouri, with official permission.

## **5. Data and Variables**

This study uses a national dataset from the 2013 USDA FRIS. Null models for all equations of 17 crops are estimated to calculate the intraclass correlation coefficient. However, only the subsequent models of land allocation [67], water application, crop yield, and *EIWUE* for corn and soybeans are estimated and presented in this paper, as these two crops have the most observations but different distribution patterns across the five regions (specified below).

The lower 48 states are grouped into five regions according to the USDA National Agricultural Statistics Service (NASS) [68]: the Western, Plains, Midwestern, Southern, and Atlantic states [69]. The descriptive statistics of the corn and soybean farms [70] at the national level are presented in Table 1. Of the 19,272 irrigated farms, 6030 farms grow corn for grain with an average area of 357 acres, and 3933 farms grow soybeans with an average area of 341 acres [71]. For corn farms, the mean water application is 1.11 acre-feet/acre, the mean yield is 190 bu/acre, and the mean *EIWUE* is 1311 USD/acre-foot. For soybean farms, the mean water application, yield, and *EIWUE* are 0.81 acre-feet/acre, 55 bu/acre, and 1221 USD/acre-foot, respectively.

The independent variables are at two levels. At the farm level, the explanatory variables relate to water sources, costs of surface water and energy, expenditures on irrigation equipment, labor payments, farm characteristics (including the farming area, number of wells, and irrigation systems), barriers to improvements that conserve water, and information sources related to irrigation. Variables related to water sources, federal assistance, barriers, and information sources are dummy variables (Yes = 1, No = 0), and all other independent variables are continuous.

At the state level, in addition to the dummy variables related to the five regions, six explanatory variables on state-wide weather conditions are included using the data from the United States National Oceanic and Atmospheric Administration. The variables are state average precipitation changes in 2011, 2012, and 2013, and the temperature changes in 2011, 2012, and 2013.


**Table 1.** The summary statistics of crop-specific dependent variables and state-level weather-related independent variables.
