*2.5. Ensemble Assessment*

The ability of the multi-model averages to estimate observed VLF frequencies was quantified across the temporal range of the extant MTBS fire occurrence record, using three non-overlapping time periods. Two of the three periods correspond to the tuning (1984–2005) and training (2006–2015) datasets that were used to bias-correct and fit the initial suite of PETs, respectively. The third is a testing dataset, constructed from 2016 MTBS occurrence data, that is independent of the information used to build the multi-model averages. For each time period, a sample of 100,000 probability time series was drawn from the relevant multi-model average posterior and used to simulate the distribution of VLF counts predicted for that period. Note that the 2016 fire data used to independently validate the multi-model averages come from an updated version of the MTBS data that was unavailable during the PET training and tuning stages, and that the two versions differ slightly in the total number of large (>404 hectares) incidents recorded between 1984 and 2015. Specifically, the original MTBS dataset reported 10,295 large incidents for 1984–2015, while the updated version reported 10,298 large incidents for the same period.
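
The count simulation described above can be illustrated with a minimal sketch. The text does not specify how counts were generated from the sampled probability time series, so the sketch assumes that each time step contributes an independent Bernoulli draw (at most one VLF per step) and that the count for a posterior draw is the sum over steps. The function names `simulate_vlf_counts` and `summarize_counts`, the 95% interval, the Beta-distributed toy probabilities, and the observed count of 3 are all hypothetical and are not taken from the study.

```python
import numpy as np


def simulate_vlf_counts(prob_draws, rng):
    """Simulate one VLF count per posterior draw.

    prob_draws: array of shape (n_draws, n_steps); each row is one sampled
    probability time series with values in [0, 1].
    Assumption: each time step is an independent Bernoulli trial.
    """
    prob_draws = np.asarray(prob_draws, dtype=float)
    if prob_draws.ndim != 2:
        raise ValueError("prob_draws must have shape (n_draws, n_steps)")
    if np.any((prob_draws < 0.0) | (prob_draws > 1.0)):
        raise ValueError("probabilities must lie in [0, 1]")
    # A step records an occurrence when a uniform draw falls below its probability.
    occurrences = rng.random(prob_draws.shape) < prob_draws
    # Sum occurrences over time steps to get the count for each posterior draw.
    return occurrences.sum(axis=1)


def summarize_counts(counts, observed):
    """Summarize the simulated count distribution against an observed count."""
    lower, upper = np.percentile(counts, [2.5, 97.5])
    return {
        "median": float(np.median(counts)),
        "lower": float(lower),
        "upper": float(upper),
        "covers_observed": bool(lower <= observed <= upper),
    }


# Toy demonstration with synthetic probabilities (not study data):
# 1,000 draws of a 12-step series; the study used 100,000 draws per period.
rng = np.random.default_rng(42)
toy_draws = rng.beta(2.0, 8.0, size=(1_000, 12))
toy_counts = simulate_vlf_counts(toy_draws, rng)
toy_summary = summarize_counts(toy_counts, observed=3)
```

Under these assumptions, comparing the observed VLF count for a period with the simulated distribution (for example, checking whether it falls inside the central 95% interval) gives one simple measure of how well the multi-model average reproduces observed frequencies.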
