*2.1. Data Source*

The year we began the project, the U.S. Department of Education (ED) published FAFSA completion numbers by high school and state on a bi-weekly basis. Each report included the number of FAFSAs submitted and the number completed (accepted by the federal government) during the current filing year as of the release date, along with comparable figures for the same school and date from the prior year. ED continued to make these data available each year so that counselors, schools, and college-access programs could track their progress at the school level. We used the bi-weekly data to examine completion trends for Buffalo Schools throughout the project and to compare them with the same time points in the previous year. For the regression analyses, we used the April 12 data file because it covered the final reporting period available at the end of the project.
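The same-date comparison described above reduces to simple arithmetic on two counts. The following is a minimal sketch, not the procedure used in the study: the function name and the example counts are hypothetical, and the percent change is defined here relative to the prior-year count.

```python
def year_over_year_change(current, prior):
    """Compare a current-year FAFSA count with the prior-year count
    reported for the same school and the same date.

    Returns (absolute_change, percent_change). The percent change is None
    when the prior-year count is zero, because the ratio is undefined.
    """
    absolute = current - prior
    percent = None if prior == 0 else 100 * absolute / prior
    return absolute, percent


# Hypothetical counts for one school on one release date (not study data).
change = year_over_year_change(current=120, prior=100)
print(change)
```

The same function applies to either submitted or completed counts, as long as both figures refer to the same school and the same point in the filing year.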

The data are publicly available through Federal Student Aid [32]. They were extracted from submitted FAFSA forms by linking each student to the high school from which they graduated, restricting the records to first-time college enrollees, and limiting students' age to no greater than 19. These data are appropriate for aggregate comparisons across schools, but they likely under-report total numbers for two reasons. First, at the time of this project, students could type in the name of their high school and submit the form without verifying the school against the list of identified schools in the application. Those applications cannot be assigned to the appropriate school or district and do not appear in the data. Second, it is not uncommon for a student in BPS to graduate after age 19, and students older than 19 do not appear in the aggregated high school numbers. This is a limitation of the data, but we expect similar patterns across all high schools and districts included in the study because we are comparing schools and districts that serve demographically similar students.

Each school-year was one case in this analysis. There were 44 high schools across the four urban centers, resulting in 88 school-year cases. Only two years of data were used, for two reasons. First, we had access to only one year of data before the implementation of the project, and we wanted the sample to be balanced in terms of treatment and control group sizes. Second, the later year was the first year of the intervention, which was ideal for assessing the effects of the intervention independent of the ways a system may adapt over time.
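The unit of analysis can be illustrated by enumerating the cases: 44 schools observed in each of two years yields 88 school-year cases. The sketch below is illustrative only. The school identifiers and the year labels `pre` and `post` are placeholders, not names from the Federal Student Aid files.

```python
from itertools import product

N_SCHOOLS = 44            # high schools across the four urban centers
YEARS = ("pre", "post")   # placeholder labels: year before, first year of intervention


def build_cases(school_ids, years):
    """Return one record per (school, year) pair, i.e. one school-year case."""
    return [{"school_id": s, "year": y} for s, y in product(school_ids, years)]


# Placeholder identifiers; the real files identify schools differently.
school_ids = [f"school_{i:02d}" for i in range(1, N_SCHOOLS + 1)]
cases = build_cases(school_ids, YEARS)
print(len(cases))  # 88
```

Because every school contributes exactly one case per year, the resulting panel is balanced across the two years.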
