*2.2. DNA Barcoding and Sequence Alignment*

The silica-dried samples used for this study comprised 255 samples and a further 130 samples drawn from previous collections undertaken by the Shapcott lab and held at the University of the Sunshine Coast [16,17,67]. DNA was extracted from all 385 samples following the methods used by Shapcott [17]. PCR amplification and sequencing of three accepted plastid DNA barcode markers, rbcL, matK, and psbA-trnH, followed established methods [68]. The PCR product was purified with ExoSAP, and forward and reverse primers were used with the BigDye Terminator v3.1 Cycle Sequencing Kit (Thermo Fisher Scientific, Waltham, MA, USA) in a cycle sequencing reaction to attach dyes in preparation for sequencing [68]. This was followed by Sephadex purification and rehydration with Hi-Di formamide to prepare samples for sequencing on an AB3500 Genetic Analyser (Applied Biosystems, Foster City, CA, USA) at the University of the Sunshine Coast. Any unsuccessful samples were reprocessed. Contigs were assembled from the forward and reverse sequences in Geneious version 10.2.6 (Biomatters, Auckland, New Zealand; https://www.geneious.com (accessed on 27 March 2022)), edited for accuracy, and checked for quality and length. Consensus sequences were exported from contigs that met the following quality control guidelines: an HQ score of at least 65%, a sequence length of at least 300 base pairs, and a minimal number of ambiguous base calls. The rbcL alignment was completed using MUSCLE and the matK alignment using MAFFT, both in Geneious. The psbA-trnH markers were aligned using SATé [69]. All alignments were examined and manually adjusted to correct for homologies. Preliminary trees were constructed in Geneious 10.2.6 for each marker to check the phylogenetic placement of species; any species that were clearly incorrectly placed on the tree were discarded, as reflecting either contaminated DNA or a misidentification.
In rare instances, sequences with an HQ score below 65% or shorter than 300 base pairs were retained where they were placed correctly on the phylogenetic tree and no alternative sequence was available. Some samples were sequenced again for one or more loci to improve quality. For each plant species, at least two markers were used to construct the "barcode". Missing sequences were retrieved from the public database GenBank (www.ncbi.nlm.nih.gov/genbank/ (accessed on 27 March 2022)). In the few instances where no markers were obtained for a species, a congener was used.
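
The quality control thresholds and the two-marker rule described above can be illustrated with a short Python sketch. This is not the workflow used in the study, where filtering was carried out in Geneious; it only restates the stated thresholds (HQ of at least 65%, length of at least 300 base pairs) in code. The `Consensus` record, the function names, and in particular the `MAX_AMBIGUOUS` cut-off are hypothetical, since the text specifies only "a minimal number" of ambiguous base calls.

```python
from dataclasses import dataclass

MIN_HQ_PERCENT = 65.0   # threshold stated in the text
MIN_LENGTH_BP = 300     # threshold stated in the text
MAX_AMBIGUOUS = 5       # ASSUMPTION: the text gives no numeric limit

# IUPAC nucleotide ambiguity codes (anything other than A, C, G, T)
AMBIGUITY_CODES = set("RYSWKMBDHVN")


@dataclass
class Consensus:
    """Hypothetical record for one exported consensus sequence."""
    sample_id: str
    marker: str        # e.g. "rbcL", "matK", "psbA-trnH"
    sequence: str
    hq_percent: float  # percentage of high-quality bases


def count_ambiguous(sequence: str) -> int:
    """Count IUPAC ambiguity codes in a sequence."""
    return sum(1 for base in sequence.upper() if base in AMBIGUITY_CODES)


def passes_qc(record: Consensus) -> bool:
    """Apply the three quality control guidelines to one sequence."""
    return (
        record.hq_percent >= MIN_HQ_PERCENT
        and len(record.sequence) >= MIN_LENGTH_BP
        and count_ambiguous(record.sequence) <= MAX_AMBIGUOUS
    )


def markers_passing(records: list[Consensus]) -> dict[str, set[str]]:
    """Map each sample to the set of markers with a passing sequence."""
    passed: dict[str, set[str]] = {}
    for record in records:
        markers = passed.setdefault(record.sample_id, set())
        if passes_qc(record):
            markers.add(record.marker)
    return passed


def has_barcode(markers: set[str], min_markers: int = 2) -> bool:
    """A barcode requires at least two markers, as stated in the text."""
    return len(markers) >= min_markers
```

Samples that fail `has_barcode` would correspond to the cases handled manually in the study: re-sequencing, retaining a sub-threshold sequence that is correctly placed on the tree, retrieving a sequence from GenBank, or substituting a congener.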

To further improve the robustness of the phylogeny, the data for the 366 heath species were aligned with an existing dataset of south-east and central Queensland rainforest species that used the same three gene markers [16]. The final alignments for rbcL, matK, and psbA-trnH were trimmed and concatenated to create a three-gene alignment for the heath and rainforest species of south-east Queensland, resulting in a dataset of 1576 species.
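
As an illustration of the concatenation step, the following Python sketch joins per-marker alignments into a single three-gene alignment. It is a minimal example under stated assumptions, not the procedure used in the study: the function name `concatenate_alignments` is hypothetical, each alignment is assumed to be a mapping from species name to an aligned sequence of uniform length, and species lacking a marker are padded with `?`, a common convention for missing data that the text does not specify.

```python
MISSING = "?"  # ASSUMPTION: symbol used for a marker absent from a species


def concatenate_alignments(
    alignments: dict[str, dict[str, str]],
    marker_order: list[str],
) -> dict[str, str]:
    """Concatenate per-marker alignments into one supermatrix.

    alignments maps marker name -> {species name: aligned sequence}.
    Species missing a marker receive a block of MISSING characters
    equal in length to that marker's alignment.
    """
    # Every sequence within one marker alignment must have the same length.
    lengths: dict[str, int] = {}
    for marker in marker_order:
        seq_lengths = {len(seq) for seq in alignments[marker].values()}
        if len(seq_lengths) != 1:
            raise ValueError(f"{marker} alignment has unequal sequence lengths")
        lengths[marker] = seq_lengths.pop()

    # Union of all species present in at least one marker alignment.
    species = sorted(set().union(*(alignments[m] for m in marker_order)))

    concatenated: dict[str, str] = {}
    for name in species:
        blocks = [
            alignments[marker].get(name, MISSING * lengths[marker])
            for marker in marker_order
        ]
        concatenated[name] = "".join(blocks)
    return concatenated


# Toy example with three short alignments (not real data).
toy = {
    "rbcL": {"A": "ACGT", "B": "AC-T"},
    "matK": {"A": "GG"},
    "psbA-trnH": {"B": "TTT", "C": "T-T"},
}
supermatrix = concatenate_alignments(toy, ["rbcL", "matK", "psbA-trnH"])
```

Fixing the marker order keeps block boundaries identical across species, which is needed if the concatenated alignment is later partitioned by gene.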
