*2.2. Analytic Method*

We employed difference in differences (DID) analysis to consider whether the FAFSA completion intervention had an effect independent of other state-level factors or the announcement of the Say Yes to Education scholarship. The DID method is an econometric tool that allows researchers to more closely approximate experimental treatment and control groups when random assignment is impossible. A simple time series difference analysis considers the outcome before and after the policy implementation and assumes that any measurable difference is a consequence of the policy. However, there may be something unique about the schools where the intervention is implemented. The DID approach allows researchers to use non-treated groups as a counterfactual: the pre-treatment difference between the treatment and control groups is removed, and the remaining post-treatment difference is attributed to the intervention. The DID model assumes that the trajectories of the treatment and control groups would have been parallel in the absence of the intervention, so that any difference between the two groups that remains after the pre-treatment difference is removed is attributable to the program. We were unable to test the parallel trends assumption given our data constraints, so we added several controls that may account for sources of variation: school size, the percentage of students eligible for free or reduced-price lunch, the percentage of students of color enrolled in the school, and the percentage of students suspended in the school in a year. Each school's controls were available through the New York State Report Card.
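The double-differencing logic described above can be illustrated with a small numerical sketch. The completion rates below are made-up values chosen only for illustration (they are not the study's data), and the function name `did_estimate` is hypothetical.

```python
from statistics import mean

# Hypothetical FAFSA completion rates (proportions) by school; NOT study data.
treated_pre = [0.40, 0.44, 0.42]
treated_post = [0.55, 0.57, 0.56]
control_pre = [0.50, 0.52, 0.54]
control_post = [0.55, 0.57, 0.59]


def did_estimate(treated_pre, treated_post, control_pre, control_post):
    """Return the change in the treated group mean minus the change in the control group mean."""
    treated_change = mean(treated_post) - mean(treated_pre)
    control_change = mean(control_post) - mean(control_pre)
    return treated_change - control_change


# A simple before/after comparison looks only at the treated schools.
naive_change = mean(treated_post) - mean(treated_pre)

# DID subtracts the change that the control schools experienced over the same period.
did = did_estimate(treated_pre, treated_post, control_pre, control_post)

print(f"Before/after change in treated schools: {naive_change:.3f}")
print(f"Difference in differences estimate:     {did:.3f}")
```

In this toy example the before/after comparison overstates the effect, because part of the increase in the treated schools also appears in the control schools; the DID estimate nets that shared change out.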

The difference in differences model takes the form:

$$y_{it} = \beta_0 + \beta_1 X_i + \beta_2 T_t + \beta_3 (X_i \times T_t) + \varepsilon_{it} \tag{1}$$

In Formula (1), X is a dummy variable for assignment to the treatment group; T is a dummy variable for time, equal to 1 in the post-treatment period; and the coefficient of interest (β3) is the coefficient on the interaction of treatment group assignment and the post-treatment period. We added several school characteristics to the model to account for observable differences that might explain variation in FAFSA completion rates. The complete model takes the form:

$$y_{it} = \beta_0 + \beta_1 X_i + \beta_2 T_t + \beta_3 (X_i \times T_t) + \beta_4 \text{Class}_{it} + \beta_5 \text{FRL}_{it} + \beta_6 \text{URM}_{it} + \beta_7 \text{Susp}_{it} + \varepsilon_{it} \tag{2}$$

In Formula (2), the size of the graduating class, the proportion of students eligible for free or reduced-price lunch, the percentage of underrepresented minority students enrolled, and the percentage of students suspended each year are included as controls. DID has been used to analyze the effects of several higher education policies [33], notably to evaluate the Georgia HOPE scholarship and the federal tax credit [34], as well as the adoption of state high school graduation requirement policies [35].
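As a rough sketch of how a model of the form in Formula (2) could be estimated, the following fits it by ordinary least squares on simulated data. Everything here is an assumption for illustration: the data are synthetic, the variable names (`class_size`, `frl`, `urm`, `susp`) are hypothetical stand-ins for the four controls, and the effect of 0.08 is set by the simulation rather than taken from the study. The software and inference procedure actually used in the analysis are not described in this section.

```python
import numpy as np

rng = np.random.default_rng(0)
n_schools = 200

# Synthetic panel: two observations per school (pre-period = 0, post-period = 1).
school = np.repeat(np.arange(n_schools), 2)
T = np.tile([0.0, 1.0], n_schools)             # post-treatment period dummy
X = (school < n_schools // 2).astype(float)    # treatment-group dummy

# Hypothetical controls mirroring those named for Formula (2).
class_size = rng.normal(150, 40, size=school.size)
frl = rng.uniform(0.2, 0.9, size=school.size)
urm = rng.uniform(0.1, 0.9, size=school.size)
susp = rng.uniform(0.0, 0.3, size=school.size)

true_effect = 0.08  # assumed treatment effect used only to simulate the outcome
y = (0.50 - 0.05 * X + 0.03 * T + true_effect * X * T
     + 0.0002 * class_size - 0.10 * frl - 0.05 * urm - 0.20 * susp
     + rng.normal(0, 0.02, size=school.size))

# Design matrix columns: intercept, X, T, X*T, then the four controls.
design = np.column_stack(
    [np.ones_like(y), X, T, X * T, class_size, frl, urm, susp]
)

# Ordinary least squares fit; beta[3] is the coefficient on the interaction term.
beta, *_ = np.linalg.lstsq(design, y, rcond=None)
beta3 = beta[3]

print(f"Estimated beta_3: {beta3:.3f} (simulated effect: {true_effect})")
```

The sketch reports only the point estimate of β3; standard errors and any adjustment for repeated observations of the same school are omitted.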
