*2.3. Probability Estimation Trees*

Two quantities are estimated for each month and region: the probability that at least one LF (>404 hectares) occurs, and the probability that a VLF occurs conditional on the occurrence of at least one LF. These probabilities are estimated using multi-model averages of a flexible and powerful type of binary classifier known as a probability estimation tree (PET). PETs use decision-tree structures to recursively divide the data with binary splits, eventually grouping all the data into mutually exclusive categories, or leaves. With respect to the response, the splits create increasingly homogeneous clusters of observations, which also occupy an increasingly specific portion of the covariate space. In this analysis, 12 meteorological predictors are available to form these categories, so that months in which certain fire events did or did not occur can be grouped into categories describing broadly similar environmental conditions. Prediction is performed by using the relevant covariates to identify the appropriate category and taking the empirical frequency of the binary responses in that category as a probability estimate (Provost and Domingos 2000 [49]). While predictions based on individual decision-tree algorithms are known to be highly variable, with significant structural instability (Wang et al., 2016 [50]), these pathologies are often lessened through model averaging (Provost and Domingos 2000 [49]). To that end, a suite of 100 PETs is generated for each region and for both probabilities of interest; each collection of 100 PETs is hereafter referred to as a 'forest'. Each individual PET within a forest is generated stochastically by applying the C4.5 learning algorithm without pruning (Quinlan 1993 [51]; Provost and Domingos 2000 [49]) to a random sample of the training dataset drawn via the Roughly Balanced Bootstrap algorithm (Hido et al., 2009 [52]).
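The forest construction described above can be sketched as follows. This is an illustrative simplification, not the authors' implementation: scikit-learn's CART learner stands in for C4.5 (both grow unpruned, axis-aligned binary trees, though their split criteria differ), the Roughly Balanced Bootstrap is condensed to its core idea of fixing the minority-class count and drawing the majority-class count from a negative binomial so the classes are balanced only in expectation, and all function and variable names are hypothetical.

```python
# Sketch: a "forest" of unpruned probability estimation trees fit to
# roughly balanced bootstrap samples, with probabilities taken as the
# average of each tree's leaf empirical class frequency.
import numpy as np
from sklearn.tree import DecisionTreeClassifier

rng = np.random.default_rng(0)

def roughly_balanced_sample(X, y, rng):
    """Simplified Roughly Balanced Bootstrap (after Hido et al., 2009):
    resample the minority class (y == 1) to its own size, and draw the
    majority-class count from a negative binomial whose mean equals the
    minority size, so the sample is balanced only on average."""
    minority = np.flatnonzero(y == 1)
    majority = np.flatnonzero(y == 0)
    n_min = minority.size
    # Negative binomial with p = 0.5 has mean n_min * (1 - p) / p = n_min.
    n_maj = int(rng.negative_binomial(n_min, 0.5))
    n_maj = max(n_maj, 1)  # guard against a degenerate zero-count draw
    idx = np.concatenate([
        rng.choice(minority, size=n_min, replace=True),
        rng.choice(majority, size=n_maj, replace=True),
    ])
    return X[idx], y[idx]

def fit_pet_forest(X, y, n_trees=100, rng=rng):
    """Fit n_trees unpruned trees, one per roughly balanced resample."""
    forest = []
    for _ in range(n_trees):
        Xb, yb = roughly_balanced_sample(X, y, rng)
        # Defaults leave the tree unpruned (grown until leaves are pure).
        tree = DecisionTreeClassifier(criterion="entropy")
        forest.append(tree.fit(Xb, yb))
    return forest

def forest_proba(forest, X):
    """Average the per-tree leaf frequencies of the positive class."""
    return np.mean([t.predict_proba(X)[:, 1] for t in forest], axis=0)
```

For the LF forests, y would flag months with at least one LF; for the conditional VLF forests, the same machinery would be applied to the subset of LF months, with y flagging VLF occurrence.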
The LF forests and the conditional VLF forests are constructed somewhat differently: the LF forests sample from all months in the training dataset, while the VLF forests sample only from months in which at least one LF occurred. In other words, the LF forests discriminate between LF and no-fire months, and the VLF forests discriminate between LF and VLF months. Important predictors within each forest are identified using two summary statistics: (1) the frequency with which a predictor is present in the PETs; and (2) the frequency with which a predictor is used in the first split of the PETs. The former identifies how often a given meteorological predictor is used at all within the forest, and the latter identifies how often a meteorological predictor is the best determinant of the response on a randomly resampled dataset.
