**Conservation Genetics for Management of Threatened Plant and Animal Species**

Editors

**Kym Ottewell Margaret Byrne**

MDPI • Basel • Beijing • Wuhan • Barcelona • Belgrade • Manchester • Tokyo • Cluj • Tianjin

*Editors* Kym Ottewell Conservation and Attractions Australia

Margaret Byrne Conservation and Attractions Australia

*Editorial Office* MDPI St. Alban-Anlage 66 4052 Basel, Switzerland

This is a reprint of articles from the Special Issue published online in the open access journal *Diversity* (ISSN 1424-2818) (available at: https://www.mdpi.com/journal/diversity/special issues/ threatened plant animal).

For citation purposes, cite each article independently as indicated on the article page online and as indicated below:

LastName, A.A.; LastName, B.B.; LastName, C.C. Article Title. *Journal Name* **Year**, *Volume Number*, Page Range.

**ISBN 978-3-0365-4441-0 (Hbk) ISBN 978-3-0365-4442-7 (PDF)**

Cover image courtesy of Rujiporn Thavornkanlapachai

© 2022 by the authors. Articles in this book are Open Access and distributed under the Creative Commons Attribution (CC BY) license, which allows users to download, copy and build upon published articles, as long as the author and publisher are properly credited, which ensures maximum dissemination and a wider impact of our publications.

The book as a whole is distributed by MDPI under the terms and conditions of the Creative Commons license CC BY-NC-ND.

## **Contents**



## **About the Editors**

#### **Kym Ottewell**

Kym Ottewell (Dr) is a Senior Research Scientist in the Department of Biodiversity, Conservation and Attractions, Western Australia. Her work focuses on the application of genetic research to conservation of threatened species, with a particular focus on mammals. Her work encompasses temperate and arid landscapes, with current projects covering a range of threatened marsupial, rodent and bat species investigating natural and anthropogenic impacts on genetic connectivity and genetic diversity.

#### **Margaret Byrne**

Margaret Byrne (Dr) is Executive Director, Biodiversity and Conservation Science in the Western Australian Department of Biodiversity, Conservation and Attractions where she leads a strong science group providing an evidence based approach to conservation management and policy. Her work informs biodiversity conservation strategies for management of landscapes and of rare and threatened species, and her phylogeographic studies have provided a greater understanding of the evolutionary history of the Australian biota, and its influence on current distributions, patterns of genetic diversity and location of refugia. Her current research is focused on applications of genomics in plant conservation and climate adaptation strategies.

## **Preface to "Conservation Genetics for Management of Threatened Plant and Animal Species"**

Genetic diversity is fundamental to the maintenance of species diversity and ecosystem resilience, and especially the capacity of species to adapt to changing and challenging environmental conditions. Globally, species and ecosystems continue to decline as more species are added to threatened species lists every year. The field of conservation genetics offers a range of techniques and statistical approaches that enable us to describe and monitor various aspects of genetic diversity and make inferences about the underlying ecological and evolutionary processes driving these patterns in threatened species. While conservation genetic analytical approaches are sophisticated and well-advanced, there is now growing interest from managers to incorporate management of genetic diversity into conservation programs.

In this book, we highlight conservation genetic (and genomic) papers that demonstrate applied outcomes that inform practical threatened species management. We cover a broad range of species and genetic approaches, but focus on how conservation genetic information is used to underpin management actions for species recovery. Through the exposition of a diversity of approaches, we aim to demonstrate to conservation managers and researchers how conservation genetics can inform on-ground species management.

We cover topics including:


#### **Kym Ottewell and Margaret Byrne** *Editors*

### *Editorial* **Conservation Genetics for Management of Threatened Plant and Animal Species**

**Kym Ottewell \* and Margaret Byrne \***

Biodiversity and Conservation Science, Department of Biodiversity, Conservation and Attractions, Locked Bag 104, Bentley Delivery Centre, Perth, WA 6983, Australia

**\*** Correspondence: kym.ottewell@dbca.wa.gov.au (K.O.); margaret.byrne@dbca.wa.gov.au (M.B.)

#### **1. Introduction**

Globally, species and ecosystems continue to decline, and the impact on threatened species is increasing. The ongoing loss of intraspecific genetic diversity is contributing to the erosion of species' adaptive potential and can hasten population declines, especially in the face of increasing ecological and anthropogenic disturbance [1]. The field of conservation genetics offers a range of techniques and statistical approaches that enable us to describe and monitor various aspects of genetic diversity and make inferences about the underlying ecological and evolutionary processes driving these patterns, in turn, informing approaches to conservation management [2–4]. In addition, given the strong empirical support for the relationship between small population size, reduced genetic diversity and reduced population fitness [5], it is clear that ongoing monitoring of the genetic diversity in threatened species is going to be crucial for sustaining species' health into the future. Research in this field has progressed enormously in recent years, alongside growing knowledge on the application of genetics in conservation and interest from managers in incorporating the management of genetic diversity into conservation programs.

There are many areas of conservation where genetic data can provide direct information to guide management actions for threatened species. In this Special Issue, we highlight conservation genetic studies that demonstrate applied outcomes that inform practical threatened species management. These studies present a diversity of research approaches, both in genetic methodology (microsatellites, chloroplast and mitochondrial DNA sequencing, single nucleotide polymorphisms) and in combination with complementary statistical approaches, taken from allied disciplines (e.g., population viability analysis, climate niche modelling), with application to specific questions relating to on-the-ground species management.

#### *1.1. Identification of Conservation Units*

Fundamentally, species are the primary units of conservation and delineation of species and subspecies boundaries, and the identification of management units within species, is important to ensure appropriate conservation listings are in place and that conservation actions are properly targeted. This is highlighted in a study of *Banksia nivea*, where Sampson and Byrne [6] found that genetic relationships among subspecies were consistent with the existing taxonomy for two subspecies (one common, one endangered), but not for a third (previously considered rare), indicating further taxonomic assessment is required for *B. nivea* subsp. Morangup. Phylogenetic analyses revealed evidence for a more recent divergence of the localised subspecies, associated with expansion from the dryer sandy soils inhabited by the widespread subspecies into the winter-wet ironstone soils in the southwest of Western Australia, consistent with progressive long-term climatic drying.

Understanding species delimitation is also critical to identifying hybridisation and putative taxa of hybrid origin, with a view to assessing their conservation value. A study by van Dijik et al. [7] investigated the status of a conservation-listed tree that is restricted to the

**Citation:** Ottewell, K.; Byrne, M. Conservation Genetics for Management of Threatened Plant and Animal Species. *Diversity* **2022**, *14*, 251. https://doi.org/10.3390/ d14040251

Received: 14 March 2022 Accepted: 23 March 2022 Published: 28 March 2022

**Publisher's Note:** MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

**Copyright:** © 2022 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https:// creativecommons.org/licenses/by/ 4.0/).

Fleurieu Peninsula and Kangaroo Island of South Australia and suspected to be of hybrid origin. Genetic analysis of *Eucalyptus paludicola* and its putative parental species identified two genetically distinct clusters, comprising *E. ovata* and *E. cosmophylla*, while *E. paludicola* individuals were admixed between these two species, consistent with a hybrid origin. Given hybrid class assignment tests indicate that the majority of *E. paludicola* individuals are F1 hybrids with a low incidence of backcrossing, these data support the hypothesis that *E. paludicola* is a transient hybrid entity rather than a distinct hybrid species. As such, the authors find little support for the ongoing conservation listing of *E. paludicola* or, indeed, recognition as a distinct species.

#### *1.2. Evaluating Population Structure and Genetic Diversity to Inform Management Approaches*

Dispersal is important for maintaining genetic connectivity amongst populations and, consequently, understanding patterns of gene flow and genetic differentiation is critical if managers are to use admixture to maximise diversity and adaptive potential in reintroductions and restoration projects. This is explored by Amor et al. [8], who used a genetic analysis to investigate the genetic relationship among disjunct groups of remnant populations of *Sclerolaena napiformis*, a perennial chenopod endemic to southeast Australia. They found genetic differentiation among the three regions, with low genetic diversity within populations and high levels of inbreeding. A decline in abundance through habitat fragmentation is compounded by climate modelling that predicts a reduction in suitable habitat for the species, under even the most conservative climate change scenario. The study shows the benefits of applying the knowledge of genetic diversity in restoration and recommended an admixed provenance approach to souring of seed for restoration, both within and across regions, to maximise genetic diversity and maintain dynamic evolutionary processes driven by individual plant fitness in response to the novel environmental conditions.

Assessment of gene flow and genetic diversity amongst remnant populations can assist in focusing conservation attention to those most in need. Thavornkanlapachai et al. [9] provided an example of this in an investigation into the genetic relationships amongst a complex of closely related bandicoot species (genus *Isoodon*), which are variously threatened by ongoing habitat loss and predation by introduced predators. Analysis of mtDNA identified three major clades that largely resolved existing taxa, although, with a pattern of 'intermediate polyphyly' [10] observed between South Australian (SA) and Western Australian (WA) populations. This highlighted ancestral connections between these groups that were resolved as distinct entities in analyses with nuclear markers. SA and Victorian populations of *Isoodon* bandicoots were identified as suffering genetic erosion, emphasising the prioritisation of these populations for specific conservation efforts to reduce further loss of genetic diversity.

#### *1.3. Managing Genetic Diversity in Translocations*

Genetic analysis provides a basis to guide translocations and inform options for undertaking genetic rescue. White et al. [11] demonstrated how genetic data can be used in an investigation into the genetic effects of past translocations in a once widespread mammal species that is now restricted to islands and fenced enclosures in Australia, to inform future translocations of the species. Genetic analysis of the banded hare-wallaby (*Lagostrophus fasciatus*) showed serial translocation from a single source population has led to a loss of genetic diversity in a translocated fenced reserve, whilst inbreeding is of concern in the translocated island population. Population viability analysis and gene retention modelling indicated founder population sizes of ~100 individuals and mixing of two source populations were optimal to maximise demographic resilience and genetic richness in a planned translocation to facilitate persistence in the face of various stochastic environmental events.

#### *1.4. Implementing Genetic Rescue to Manage Inbreeding*

The introduction of new genetic diversity into an inbred population to assist in genetic rescue must be balanced against the risk of 'genetic swamping', leading to the loss or disruption of local adaptive genotypes. Zilko et al. [12] combined genetic data with population viability analysis to guide genetic rescue in the inbred lowland population of Leadbeater's possum (*Gymnobelideus leadbeateri*), a threatened species in the forest of south-eastern Australia, whilst balancing the maintenance of local adaptation. Translocated animals were sourced from the outbred and genetically diverse highland population and this resulted in higher retention of local alleles in the supplemented lowland population due to the reduction in genetic drift. Nevertheless, carrying capacity in the lowland population is currently insufficient to enable population recovery. Consequently, the authors recommended the establishment of a new population of lowland possums, in a high-quality habitat, and gene exchange with highland populations, to alleviate inbreeding depression and maximise the retention of locally unique neutral genetic variation.

#### *1.5. Managing Genetic Diversity in Captive Breeding Populations*

Captive breeding populations can be important sources of diversity to support reintroduction efforts for threatened animals, although ongoing assessment is required to ensure diversity is captured in the founding populations and breeding is managed to maximise retention of diversity. In a study of the captive Mesoamerican scarlet macaw (*Ara macao cyanoptera*) population at Xcaret Park, Escalante-Pliego et al. [13] revealed that founding and current breeders showed high retention of genetic diversity, low inbreeding and low relatedness, and that the captive population has a similar level of genetic diversity to the wild population in the Mayan Forest. As a consequence, the captive breeding population at Xcaret is an important source of birds for the reintroduction program of this subspecies.

#### *1.6. Genetic Approaches to Managing Wildlife Disease*

Managing in-situ populations exposed to disease threats, based on genetic principles, is an integral component of recovery programs for highly vulnerable species. Glassock et al. [14] reviewed the underlying principles of genetic management and highlighted how managing gene flow, diversity and inbreeding assists in reducing the extinction risk of Tasmanian devil (*Sarcophilus harrisii*) populations, threatened by the deadly devil facial tumour disease. This disease poses a significant risk to the persistence of the species. The supplementation of populations establishes gene flow and genetic analysis and can inform how best to increase adaptive potential, whilst minimising any potential for outbreeding depression or loss of local adaptation. This review provides timely discussion on the issues faced by conservation managers for managing emerging diseases in threatened species, where disease eradication is not possible.

Similarly, Palmas et al. [15] used genetic analyses to investigate whether there was an underlying genetic cause of a reported record of pug-headedness in a critically endangered population of native Mediterranean trout *Salmo trutta* from Sardinia, Italy. The genetic analysis suggested that inbreeding or outbreeding depression are not contributing factors in the instance of the deformity in this population, and variation in environmental factors during larval development seemed the most likely factors influencing the deformity.

#### **2. Conclusions**

These studies are just a few examples of the ways in which genetic analysis can inform effective conservation actions for the management of threatened species. Importantly, in these exemplar papers, authors provide explicit recommendations to enable positive impact on the management of their species of interest. We hope that conservation managers not so familiar with genetic tools and techniques find value in the studies provided here, in demonstrating the link between the conservation questions of interest, analytical approaches and the ensuing conservation management outcomes. Further information and

exploration of the application of genetics and genomics for the conservation of threatened species is available in the book Conservation and the Genomics of Populations [16].

**Author Contributions:** Writing—original draft preparation, M.B. and K.O.; writing—review and editing, K.O. and M.B. All authors have read and agreed to the published version of the manuscript.

**Funding:** This research received no external funding.

**Conflicts of Interest:** The authors declare no conflict of interest.

#### **References**


## *Article* **Genetic Differentiation among Subspecies of** *Banksia nivea* **(Proteaceae) Associated with Expansion and Habitat Specialization**

**Jane Sampson and Margaret Byrne \***

Biodiversity and Conservation Science, Department of Biodiversity, Conservation and Attractions, Locked Bag 104, Bentley Delivery Centre, Perth, WA 6983, Australia; jane.sampson@dbca.wa.gov.au **\*** Correspondence: margaret.byrne@dbca.wa.gov.au

**Abstract:** Subspecies are traditionally defined using phenotypic differences associated with different geographical areas. Yet patterns of morphological and genetic variation may not coincide and thereby fail to reflect species' evolutionary history. The division of the shrub *Banksia nivea* Labill. into one widespread (*B. nivea* subsp. *nivea*) and two geographically localized subspecies (*B. nivea* subsp. *uliginosa* (A.S. George) A.R. Mast & K.R. Thiele and *B. nivea* subsp. Morangup (M. Pieroni 94/2)) in south-west Australia has been based mainly on variation in leaf shape and pistil length, although flowering time and habitat differences are also evident, and subsp. *uliginosa* occurs on a different substrate. To assess the genetic divergence of *B. nivea* subspecies, we genotyped representatives from each subspecies for nuclear microsatellite and non-coding chloroplast sequence variation. We used distance and parsimony-based methods to assess genetic relatedness. Patterns were consistent with the existing taxonomy of subsp. *nivea* and *uliginosa* but not subsp. Morangup. Phylogenetic analyses revealed evidence for a more recent divergence of subsp. *uliginosa* associated with expansion from dryer sandy soils into the winter-wet ironstone soils in the southwest of Western Australia, consistent with progressive long-term climatic drying. Nuclear microsatellites showed low to moderate diversity, high population differentiation overall, and genetic structuring of subspecies in different biogeographical areas. We propose this pattern reflects the predicted impact of a patchy distribution, small populations, and restrictions to gene flow driven by both distance and biogeographic differences in subspecies' habitats.

**Keywords:** climate drying; cpDNA; ecotype; Evolutionarily Significant Units; gene flow; geographic expansion; patchy abundance; phylogeography

#### **1. Introduction**

Many recognized species are not genetically uniform and may be highly structured into historically isolated populations that may warrant consideration as intraspecific units [1]. Taxonomically recognized subspecies are often based on geographically discontinuous morphological differences [2,3] or ecotypic differences [4]. Yet natural phenotypic or ecotypic diversity within species over wide distributions may not be consistent with genetic divergence representing the evolutionary processes within species [5,6]. Genetic divergence within species is influenced by gene flow [7] and affected by geographic (e.g., topography, distance) and environmental (habitat, climate, pollen, and seed dispersal) factors [8]. Although reciprocal monophyly is not expected for subspecies, some evidence of restricted gene flow between diverging taxa is expected in patterns of neutral genetic variation [6]. Given there are examples of lineage divergence associated with habitat specialization [9–12], widespread species containing subspecies that occupy different habitats might be expected to show genetic differentiation among habitats.

In Australian plant species, distinct population groups and divergent lineages have been identified within species of several genera that reflect disjunct and historically isolated

**Citation:** Sampson, J.; Byrne, M. Genetic Differentiation among Subspecies of *Banksia nivea* (Proteaceae) Associated with Expansion and Habitat Specialization. *Diversity* **2022**, *14*, 98. https:// doi.org/10.3390/d14020098

Academic Editor: Hong-Hu Meng

Received: 19 December 2021 Accepted: 27 January 2022 Published: 30 January 2022

**Publisher's Note:** MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

**Copyright:** © 2022 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https:// creativecommons.org/licenses/by/ 4.0/).

population systems, geological and edaphic complexities, and contrasting habitats in terms of vegetation and climate [13]. Genetically distinct populations are particularly evident within species from the South-West Australia Floristic Region (SWAFR) [14,15] in a range of plant genera and families, including several in genera in Proteaceae, e.g., *Banksia* [12,16] and *Hakea* [17]. Taxonomic resolution can be challenging in this region as phylogenetic lineages often differ from phenotypic variation [17], and the prevalence of highly structured populations and divergent lineages highlights a need to recognize organized layers of genetic diversity below the species level [1].

*Banksia nivea* Labill. is a common and widespread non-lignotuberous, woody, evergreen, flowering shrub endemic to south-western Australia. There is considerable morphological variation in *B. nivea* mainly in pistil length, leaf size, and shape, and three subspecies are recognized with differing geographic extent [18]. The more common *Banksia nivea* Labill. subsp *nivea* is widespread but patchily distributed in dry sandy soil on sandplains, forests, and mallee from Eneabba in the north, southeast to Cape Arid. Subspecies Morangup (M. Pieroni 94/2) is an informally named subspecies, that is hypothesized to be a unique subspecies but has not been formally assessed or described. It has a highly restricted distribution as it is found at one location in the center of subsp. *nivea*'s range. The rare subspecies *uliginosa* (A.S. George) A.R. Mast & K.R. Thiele is also found in patchy populations in shrublands, woodlands, and forests, and is isolated from subsp. *nivea* and Morangup by >100 km. This subspecies has a limited distribution within a relatively rare, specific edaphic habitat of winter-wet ironstone in two areas on the coastal plain around Busselton and the Scott River in the southwest corner of Western Australia. It has a conservation status of endangered. A dated molecular phylogeny [19] suggests divergence of subsp. *nivea* and *uliginosa* during the Pleistocene (<2.5 Mya).

Unusually for Western Australia flora, *B. nivea* has adaptation to promote gene flow through seed dispersal. Infructescences of *B. nivea* are serotinous containing seeds with a delicate papery wing [20]. Although some follicles open to release seeds periodically, many remain on the plant for a long period of time. If plants die after a disturbance, mass recruitment occurs from the store of genetically diverse seeds held on the plant. Most recruitment occurs after a fire, and the removal of the foliage by fire enables effective wind dispersal of the released seeds [20].

The ecological characteristics of *B. nivea* have been well described [18,21], yet less is known of the species' reproductive biology. Flowers are yellowish-brown in subsp. *uliginosa* and cream-yellow-orange-pink/red-brown in subsp. *nivea* and subsp. Morangup, with a maroon-colored style and green pollen presenter; they have a mousey odor, and are well hidden within the plant [18,22,23]. Recent studies of subsp. *uliginosa* found that plants are self-fertile, and the mating system is predominantly outcrossed with pollination primarily by small non-flying mammals, e.g., honey possums, with some contribution from birds and insects [20,23]. Subspecies *nivea* and *uliginosa* flower between July and September and, although records are few, subspecies Morangup has been recorded to flower in April, June, August, and September [24]. Current flowering periods reported for all subspecies overlap (Western Australian Herbarium) suggesting that any gene flow between subsp. *uliginosa* and the other subspecies is more likely to be restricted by distance and environment rather than temporal isolation. A review of the Australian flora [25] found abundance and population disjunction to be strong determinants of the distribution of contemporary genetic diversity. Given the patchy abundance of *B. nivea* subspecies and the allopatric distributions of ssp. *nivea* and *uliginosa* in different biogeographic areas, we predict that there will be a significant genetic divergence between populations and subspecies. Analysis of the general pattern of genetic variation would provide a phylogenetic context for understanding subspecies relationships and their morphological and ecotypic variation, assist in consideration of the status of subspecies Morangup and provide information for the conservation of the rare subspecies.

Here, we surveyed four cpDNA sequences and 10 nuclear microsatellite loci from multiple individuals of each of the three subspecies of *B. nivea* in populations representative of the species' range to describe the pattern of genetic variation and assess whether it is reflected in the current taxonomy based on the phenotypic and ecotypic variation. Specifically, we address the following questions: (1) is there evidence of genetic structure in the range of *B. nivea*, and (2) are the phenotypically and ecotypic based subspecies reflected in the genetic relationships among populations within the species?

#### **2. Materials and Methods**

#### *2.1. Sampling and Genotyping*

Leaves were sampled from four to 20 adult plants in seven (subsp. *nivea*), one (subsp. Morangup), and 10 (subsp. *uliginosa*) populations from the ranges of the subspecies (Figure 1). Genomic DNA was extracted from lysed, freeze-dried leaf material following the CTAB-PVP method [26]. The chloroplast *psb*A-*trn*H, ndhF, trnV, and *trn*Q-*rps*16 intergenic spacer regions were selected for amplification and sequencing in three random samples from all study populations. Sequence amplification and analysis were conducted according to [27] and sequenced via Macrogen Inc. (Seoul, Korea). SEQUENCHER 5.0 (Genecodes Corp., Ann Arbor, MI, USA) was used to edit miscalls and to align and trim sequences. All four cpDNA regions were concatenated in MESQUITE 3.04 [28] to a total sequence length of 2457 bp, and two indels were identified and coded. Chloroplast haplotypes were identified using DNASP 5.1.1 [29]).

**Figure 1.** Map showing: (**a**) the distribution of *Banksia nivea* in southwestern Australia. Dots indicate the locations of known populations based on records of the Atlas of Living Australia [30]. Subspecies indicated by color; red, subsp. *nivea*; green, subsp. *uliginosa*; blue, subsp. Morangup. (**b**) Distribution of cpDNA (*psb*A-*trn*H, ndhF, trnV, and *trn*Q-*rps*16) haplotypes overlaid on a geographical map of sampling sites. Pie charts show the proportion of sampled individuals with a given haplotype.

Genotypes at 10 nuclear microsatellites were determined in seven subsp. *nivea*, one subsp. Morangup, and seven subsp. *uliginosa* populations (Table 1). Microsatellite loci were amplified as previously described [23,31]. One additional microsatellite locus was also amplified: DnB003; forward primer sequence 5- -AAGCCCAATATGACCAATAACC-3- and reverse primer sequence 5- -GTCGGCTATATGACTGCATCAC-3- . Modifications to the cited methods were made for MgCl2 concentration and adjusted to 1.5 mM for DnC010 and 1.0 mM for DnB003.

**Table 1.** Diversity statistics based on 10 nuclear microsatellite loci for populations of *Banksia nivea* from south-western Australia.


*N*, mean sample size per locus; *A*, mean number of alleles per locus; *A*<sup>R</sup> allele number adjusted by rarefaction; *H*O, observed heterozygosity; *UH*e, unbiased expected heterozygosity; *F*, Wright's Inbreeding coefficient; standard errors in parentheses; \* Significantly different from zero, *p* < 0.05.

Microsatellite loci were separated on an Applied Biosystems (Foster City, CA, USA) 3730 capillary sequencer, and 260 individuals (115 subsp. *nivea*, 20 subsp. Morangup, and 125 subsp. *uliginosa*) were genotyped using GENEMAPPER version 5.0 (Applied Biosystems, Foster City, CA, USA). Tests for stutter bands, large allele dropout, and null alleles were conducted using MICROCHECKER 2.2.3 [32]. Tests of linkage disequilibrium among pairs of loci were performed with GENEPOP 4.2 [33].

#### *2.2. Chloroplast DNA Diversity and Relatedness*

We used ARLEQUIN 3.5.2.2 [34] to estimate cpDNA genetic diversity as nucleotide diversity (*π*), haplotype diversity (*H*D), population differentiation as *F*ST, and differentiation between subspecies (global and pairwise) as *F*ST using pooled data. Partitioning of diversity between subspecies, populations within subspecies, and within populations was examined by analysis of molecular variance (AMOVA). Estimates of population differentiation (*G*ST, *N*ST) and presence of phylogeographical structure (*N*ST > *G*ST) were estimated using Per-MUT 2.0 [35]. If *N*ST is significantly greater than *G*ST, haplotypes within populations are more likely to be closely related than haplotypes among populations.

In ARLEQUIN, tests for neutrality and population expansion were estimated using Tajima's *D* [36] and Fu's *F*<sup>s</sup> [37], and mismatch distribution analyses were also made to infer spatial and demographic history. Goodness-of-fit to models of spatial or demographic expansion were tested with Harpending's raggedness index (*H*Rag) and the sum of squared differences (SSD). These models test the deviation of observed values from distributions expected under population expansion.

To examine the evolutionary relatedness of chloroplast haplotypes, we constructed a median-joining maximum parsimony (MJMP) network in NETWORK 5.0 [38].

#### *2.3. Nuclear SSR Diversity and Structure*

We measured nSSR genetic variation for each subspecies as mean multi-locus parameters per population (number of alleles per locus, *A*; observed heterozygosity, *H*o, unbiased expected heterozygosity, *UH*e; Wright's inbreeding coefficient, *F*) using GENALEX v6.501 [39], and as allele number adjusted by rarefaction *A*R using HP-RARE [40]. We compared parameters between species using ANOVA.

We measured overall differentiation among populations (*F*ST) using FREENA with and without the Excluding Null Alleles method (ENA) that corrects for null alleles, with 1000 bootstraps to generate 95% confidence intervals [41]. Partitioning of diversity between subspecies, populations within subspecies, and within populations was examined by analysis of molecular variance (AMOVA) in GENALEX v6.501 [39]. We also estimated global and pairwise differentiation between populations and between subspecies from pooled data as *F*ST using GENALEX with statistical testing by random permutations. We used a Mantel procedure to test for a correlation between log10 pairwise geographic distances and linearized pairwise genetic distances estimated in GENALEX.

Genetic structure was examined using both direct phenetic and model-based Bayesian analyses. For phenetic analysis, we used PHYLIP 3.69 [42] to construct an unrooted neighbor-joining (NJ) tree based on CS Chord genetic distance calculated in MSA 4.05 [43] with clustering patterns validated with 1000 bootstraps. For Bayesian analyses, we used STRUCTURE 2.3.4 [44] to identify genetically similar clusters (*K*) and the proportions of individuals' genotypes belonging to clusters (*q*). To identify the optimum number of clusters (*K*) and the likelihood of sub-clusters, we used both the Δ*K* statistic of [45] and the median of estimated Ln probabilities of *K* values using CLUMPAK [46]. Two or more optimal *K* may be found if samples are taken from hierarchically structured groups such as species containing subspecies ([45,47]. We, therefore, used hierarchical analyses for (1) all samples and (2) two identified population groups (subsp. *nivea* together with subsp. Morangup, and subsp. *uliginosa* separate). We ran 20 replicates with a burn-in of 100,000 with 500,000 iterations for Markov chain Monte Carlo parameters for *K* = 1–15 possible clusters. We used parameter recommendations [47] including no prior knowledge, the alternative ancestry before separate alphas for each population, an initial ALPHA value of 0.1, and the correlated allele frequency models.

#### **3. Results**

#### *3.1. Chloroplast Diversity and Divergence*

The concatenated, aligned cpDNA sequence data (2457 bp) revealed 15 haplotypes from 54 individuals from 18 populations: eight haplotypes from subsp. *nivea*, one from subsp. Morangup, and six from subsp. *uliginosa* (Figure 1). Only one population had more than one haplotype; a subsp. *nivea* population (E) was found near the eastern margin of the species' distribution (Figure 1b) with two H06 samples, and one H07 sample. No haplotypes were found in more than one subspecies (Figure 1b) and there was no haplotype diversity in subsp. Morangup. Within the population, haplotype diversity was very low, and most haplotypes were population-specific (86.7%) although two haplotypes were found in multiple subsp. *uliginosa* populations (Figure 1b). Measures of haplotype diversity were therefore high overall, and high in subsp. *nivea*, although lower, although not significantly so, in subsp. *uliginosa* (Table 2). Nucleotide diversity was very low overall and within subspecies. Estimates of overall population differentiation were very high (*F*ST = 0.992, *G*ST = 0.961; Table 2). Tests revealed significant phylogenetic structure in subsp. *nivea* (*N*ST > *G*ST) but not in subsp. *uliginosa* or overall. AMOVA indicated there was significant differentiation among subspecies (23.42%) although most variation was

between populations (75.82%) with a very small proportion (0.76%) within populations (Table 3). Pooled pairwise differentiation was highest for subsp. *uliginosa* compared to subsp. Morangup (*F*ST = 0.678). Differentiation between subsps. *nivea* and Morangup (*F*ST = 0.225), and between subsps. *nivea* and *uliginosa* (*F*ST = 0.270) were lower and similar. This pattern reflects the prevalence of population-specific haplotypes and singlehaplotype populations.

**Table 2.** Diversity statistics, tests of neutrality, and mismatch analyses based on sequences *psb*A-*trn*H, ndhF, trnV, and *trn*Q-*rps*16 of chloroplast intergenic spacers regions in *Banksia nivea*.


Standard deviations are in parentheses. SSD = sum of squared differences. *H*Rag = Harpending's raggedness index. NS = not significant.

**Table 3.** Analysis of molecular variance (AMOVA) of *Banksia nivea* subspecies based on chloroplast haplotypes and nuclear microsatellite loci.


The haplotype network was an asymmetrical star structure and contained two closed loops indicative of homoplasy (Figure 2). One side of the network corresponds to subsps. *nivea* and Morangup and showed longer branches with more divergent haplotypes connected through the H01 haplotype to weakly diverged haplotypes found in subsp. *uliginosa*. The haplotype found in subsp. Morangup was divergent but no more so than other haplotypes from subsp. *nivea*. Only H01 was inferred to give rise to several other haplotypes. Therefore, there was a weak geographical pattern concordant with the taxonomy of the described subspecies with haplotypes from subsp. *uliginosa* found in the southwest of the species' range is closely related to haplotypes from subsp. *nivea* from the central part of the species range.

**Figure 2.** Median-joining maximum parsimony (MJMP) network of chloroplast haplotypes observed in *Banksia nivea* from south-western Australia. Circle sizes are proportional to haplotype frequency among samples. Black dashes represent a single nucleotide substitution (not bold) or indel (bold). Black boxes represent inferred unsampled haplotypes. Haplotype numbers and colors correspond to those in Figure 1.

Estimates of neutrality (Tajima's *D*, Fu's *F*s) that can predict population size changes were not significant for either the species overall or for the separate subspecies (Table 2). However, mismatch analysis revealed evidence of demographic and spatial expansion, both overall and for subsp. *nivea* and *uliginosa* separately, as SSD and HRAG values that did not deviate from models of sudden expansion (Table 2). Non-significant *p*-values do not deviate from the null model and therefore support an expansion scenario.

#### *3.2. Nuclear Diversity and Structure*

No evidence of stutter or large allele dropout was detected for nSSR loci and there was no evidence of significant linkage disequilibrium. We detected significant frequencies of null alleles at four loci (DnA011, DnC010, DnB003, and DnD007) but a comparison of *F*ST (95% CI) estimates with and without ENA adjustment (reported here) showed that null alleles did not cause significant bias and therefore loci were not excluded from analyses.

The average number of alleles per population was low (2.3–8.3; Table 1) even after rarefaction (2.3–12.63). Other diversity measures were moderate and although levels were generally lower in subsp. *uliginosa* than in subsp. *nivea*, there were no significant differences between subspecies. Significant inbreeding was not detected in subsp. *uliginosa* or Morangup although Wright's inbreeding coefficients (*F*) were positive and significant in four of seven populations of subsp. *nivea* (NN, W, BR, A; Table 1).

Genetic differentiation was high overall (*F*ST = 0.306). Differentiation levels between populations within subspecies were similar (ssp. *nivea F*ST = 0.254; ssp. *uliginosa F*ST = 0.295). Partitioning by AMOVA found only a small amount of variation between subspecies (3%; Table 3). Most variation was within populations (70.3%) with 26.7% between populations. This is reflected in low pooled global differentiation among subspecies (*F*ST = 0.095). Pairwise pooled comparisons showed the highest differentiation between subsp. Morangup and *uliginosa* (*F*ST = 0.178), while differentiation between subsp. *nivea* and both Morangup (*F*ST = 0.094) and *uliginosa* (*F*ST = 0.082) was lower and similar.

The phenetic analysis showed separation of subsp. *nivea* and subsp. *uliginosa* populations that, although weak, was concordant with taxonomy while subsp. Morangup

was nested within subsp. *nivea* (Figure 3a). Grouping was not strong except for the most northern (J, NN) and most south-western (B, GB) populations. There was a significant signal of increasing genetic distance with geographic distance (IBD) across all populations (*r*<sup>2</sup> = 0.128, *p* < 0.05) and also within subspecies (subsp. *nivea*, *r*<sup>2</sup> = 0.296, *p* < 0.01; subsp. *uliginosa*, *r*<sup>2</sup> = 0.308, *p* < 0.05).

**Figure 3.** The genetic structure of sampled populations of *Banksia nivea* in south-western Australia, based on individual nuclear microsatellite genotypes. (**a**) A neighbor-joining (NJ) tree of CS Chord distance. Support is shown on the branches as the number of bootstraps out of 1000. Values > 500 are shown. (**b**–**d**) Bar graphs inferred using Bayesian assignment in STRUCTURE 2.3.4 showing (**b**) structure at *K =* 2 for all subspecies (**c**) structure at *K* = 2 for subspecies *nivea* and Morangup, and (**d**) structure at *K* = 5 for subspecies *uliginosa*. Each individual is represented as a single line with colored segments representing the proportion of ancestry from *k* clusters (*q*). Results are the optimal alignment of replicates.

At the highest level, STRUCTURE analysis identified an optimum of two genetic clusters (Figure 3b) that reflected a geographic pattern and were partly aligned with the morphological subsp. *nivea* and *uliginosa*. As in the distance-based analysis, subsp. Morangup clustered with subsp. *nivea*. Admixture was evident in populations in the geographically central (A and BR) and eastern (CA, E) range while populations in the far north and south-west were more clearly differentiated. This may reflect IBD in the center of the range. At a lower level of structure, clusters reflected populations or groups of geographically proximal populations (Figure 3c).

#### **4. Discussion**

Patterns of genetic variation between subsp. *nivea* and *uliginosa* were concordant with the current taxonomy, while subsp. Morangup could not be distinguished from subsp. *nivea* in this study. Phylogeographic analyses suggested divergence of subspecies may have been associated with expansion into the southwest corner of Western Australia, a biogeographic area characterized by different substrates, climate, and vegetation. As predicted, patchy abundance was associated with high differentiation between populations and low to moderate nuclear variation reflecting the impact of small population size and restrictions to gene flow. Although genetic diversity was generally lower in the localized subsp. *uliginosa* than in the widespread subsp. *nivea*, differences associated with range were not significant, contrary to expectations based on meta-analysis across Australian plants [25]. This may be due to the stronger effects of patchy distribution in both subspecies.

#### *4.1. Distinction between Subspecies*

The morphological differentiation of allopatric subsp. *nivea* and *uliginosa* were reflected in patterns of cpDNA and nuclear microsatellite variation, while subsp. Morangup was not genetically differentiated from subsp. *nivea*. Evolutionary forces act on phenotypic traits differently from neutral genetic markers, and thus morphological divergence is not always associated with genomic divergence. Subspecies have traditionally been defined as phenotypically distinct, allopatric sets of populations that may intergrade at geographic boundaries [2,3] and are widely adopted in plant taxonomy, primarily using geographical and ecological differences to distinguish them [4]. Given the lack of genetic differentiation, we suggest a review of the morphological variation in subsp. Morangup compared to subsp. *nivea* is required to inform a taxonomic revision. A review of approaches to dealing with species-population continuums of genetic diversity for conservation in the age of genomics [1] proposed that, although they may not represent historically isolated populations and satisfy criteria for Evolutionarily Significant Units (ESUs), the recognition of phenotypically defined subspecies may be warranted.

Although differentiated by patterns in the haplotype network and structure plots of nuclear variation, the divergence between subsp. *nivea* and *uliginosa* was not strong. Investigations into congruence between genetic structure and morphological variation in widespread plant species in the SWAFR vary. For example, studies of two widespread and morphologically variable species complexes, *Melaleuca uncinata* R.Br. [48] and *Calothamnus quadrifidu*s R.Br. [49] have found general agreement between morphological variation and genetic structure. In contrast, deep lineage divergence in *B. sessilis* [12] was not aligned with morphological varieties and was associated with differences in habitat and substrate.

#### *4.2. Biogeographic Expansion*

Differentiation of subsp. *nivea* and *uliginosa* was consistent with other genetic and phylogeographic studies in Australian plants that have identified distinct lineages that reflect the geological and edaphic complexities of habitats that vary in terms of vegetation and climate [14,17,25]. In *B. nivea*, the patterns of geographic separation in different habitats (dry sandy soils contrasting with winter-wet ironstone soils), combined with haplotype diversity and morphological differences provide support for hypotheses of divergence driven by expansion into a different biogeographic area. Both subsp. *nivea* and *uliginosa* showed high haplotype diversity with low nucleotide diversity, signals of expansion, and a phylogeographic pattern consistent with subspecies differentiation. The phylogeographic structure was also observed in the more widespread subsp. *nivea*. These traits suggest that diversification associated with geographic isolation and habitat specificity are likely to have contributed to the divergence of subspecies following expansion into the distinct southwest Australian ironstone habitat. A similar scenario was proposed [49] for another ironstone endemic, the woody shrub *Calothamnus quadrifidus* subsp. *teretifolius* A.S. George & N. Gibson, which does not share haplotypes with other subspecies of *C. quadrifidus* found outside the ironstone habitat. Similar genetic differentiation of populations occurring in the specific ironstone habitat has also been observed in *Hakea oldfieldii* Benth. [50]. Expansion to occupy habitats with different substrate likely leads to differentiation through adaptation, and divergence associated with substrate has also been identified in other species in different ecosystems, e.g., [9–12].

In *B nivea*, a more ancient origin for subsp. *nivea* compared to subsp. *uliginosa* is suggested by higher diversity and mutational divergence of haplotypes compared to subsp. *uliginosa*. This would be consistent with branching patterns in the dryandra clade of the *Banksia* phylogeny [19]. Although major spatial contraction-expansion dynamics appear to have been rare in the mesic biota of Australia [51,52], range expansions have been associated with acceleration of the progressive drying of mesic environments that began in the late Pliocene (c. 3 Mya) [52,53]. Range expansions have also been associated with the southward progress of increasing aridity in the SWAFR from the mid-Pleistocene that opened up habitats within the wetter forests allowing for the expansion of species to these areas [49,54,55]. Expansion and divergence in *B. nivea* are consistent with this pattern providing another example of expansion among widespread, woody shrubs. Indeed, the pattern of southwestern expansion, as reflected in haplotype network relationships and divergence, is most similar to that found in the widespread wind-pollinated shrub, the dwarf sheoak *Allocasuarina humilis* (Otto & A. Dietr.) L.A.S. Johnson [54], although on a smaller scale. This phylogeographic pattern in *A. humilis* was also best explained by south-west expansion from populations with a more ancient history of persistence in dry shrublands (300–600 mm year<sup>−</sup>1) into areas previously occupied by higher rainfall forests (600–1400 mm year<sup>−</sup>1) as the climate dried progressively from the late Miocene [54].

#### *4.3. Contemporary Genetic Diversity and Population Structure*

Many reviews of nuclear genetic variation have found that genetic structure is influenced by mating systems, life-history traits, chromosomal variation, population distribution, and other ecological traits related to gene flow [7,56–59]. A specific analysis of diversity and population differentiation in the Australian flora [25] found strong associations with abundance, where patchy populations were significantly associated with low diversity and high differentiation. We found patterns consistent with this in *B. nivea* but, contrary to the strong association also expected for range and diversity, both localized and widespread subspecies of *B. nivea* had diversity levels that are expected for localized species. Over the widespread distribution of the species, low to moderate nuclear genetic diversity and moderate to high differentiation among populations is not unexpected. Patchy populations are often small, and population genetic theory predicts they may be prone to genetic drift and inbreeding leading to loss of genetic diversity and differentiation [56,60]. We found no significant evidence of drift or bottlenecks in *B. nivea*, although our ability to detect these impacts was limited by sample size. We did identify significant inbreeding in this study in some populations of subsp. *nivea* but not in subsp. *uliginosa*. More detailed studies of the mating system in subsp. *uliginosa* showed high outcrossing (95%) and little relatedness amongst adult plants within populations [23], and high production of seeds that was unaffected by population size [20].

Analysis of contemporary genetic relationships revealed evidence of restricted gene flow between populations across the species' range with geographic clustering, high pairwise *F*ST values, a significant proportion of genetic variation apportioned between populations, and significant IBD. We detected a genetic pattern in cluster analysis and in the distance-based tree that related to the current subspecies taxonomy and different biogeographic areas. This, together with significant IBD, is likely to reflect restricted gene flow across heterogeneous landscapes. Cluster analysis also detected a geographic pattern in genetic sub-structuring within both subsp. *nivea* and *uliginosa* that are likely to reflect gene flow patterns. Gene flow via pollen dispersal is achieved in *Banksia* species by birds, mammals, and insects [61,62]. Specific pollinators for *B. nivea* have not been determined and it is likely pollination is achieved primarily by small non-flying mammals that move within a small range (<30 m), even over several months, and are likely to achieve nearneighbor pollination, and by birds such as honeyeaters that are likely to facilitate some longer distance pollen dispersal between local populations. In a study on pollinators in

subsp. *uliginosa* [20] treatments open to birds and mammals produced high levels of fruit compared to those open to invertebrates only or closed to all pollinators, and treatments open to all pollinators produced 39% more fruit than those open to mammals but not birds. Effective pollination was also shown in a study of the mating system that found high levels of outcrossing (95%), and up to 30% of progeny produced in seed crops was attributed to mating with fathers outside small patchy populations likely due to bird pollination [23]. Unusually for *Banksia*, seeds of *B. nivea* have adaptations for dispersal, and the sharing of seed-dispersed haplotypes among geographically close subsp. *uliginosa* populations may reflect some localized inter-population dispersal, although this is difficult to distinguish from co-ancestry among local populations. The significant differentiation among populations and the biogeographic structure observed in *B. nivea* likely reflects generally localized pollen dispersal associated with habitat specificity of the predominant non-flying mammal pollinators [20,23], along with generally short-range gene dispersal by seeds.

#### **5. Conclusions**

Analysis of genetic relationships among the three subspecies of *B. nivea* in southwestern Australia supported the current taxonomy of subspecies *nivea* and *uliginosa*, and indicated clarification of the morphological traits and heritability in subsp. Morangup is warranted. The climatic history of the SWAFR appears to have had a significant influence on the genetic divergence within *B. nivea*. We found patterns of variation consistent with expansion into a new biogeographical area and onto different substrate followed by divergence into lineages concurrent with the taxonomic subspecies. The pattern of nuclear DNA diversity and differentiation likely reflects the influence of distribution and restricted gene flow between small and patchy populations.

**Author Contributions:** Conceptualization, M.B.; methodology, M.B. and J.S.; formal analysis, J.S.; writing—original draft preparation, J.S.; writing—review and editing, J.S. and M.B.; visualization, J.S. All authors have read and agreed to the published version of the manuscript.

**Funding:** This research received no external funding.

**Data Availability Statement:** The data presented in this study are available on request from the corresponding author. The data are not publicly available due to legislative protection for threatened species.

**Acknowledgments:** We thank Sarah Tapper, Ashney Shah, Shelley McArthur, Bronwyn Macdonald, and Margaret Hankinson for laboratory assistance. We thank Neil Gibson and Colin Yates for their assistance in the field and discussion on the biology of *Banksia nivea* and the habitat specialization of the flora.

**Conflicts of Interest:** The authors declare no conflict of interest.

#### **References**


## *Article* **Genomic Screening Reveals That the Endangered** *Eucalyptus paludicola* **(Myrtaceae) Is a Hybrid**

**Kor-jent van Dijk 1, Michelle Waycott 1,2,\*, Joe Quarmby 3,4, Doug Bickerton 3, Andrew H. Thornhill 1,2, Hugh Cross 1,5 and Edward Bi**ffi**n <sup>2</sup>**


Received: 12 November 2020; Accepted: 7 December 2020; Published: 10 December 2020

**Abstract:** A hybrid origin for a conservation listed taxon will influence its status and management options. Here, we investigate the genetic origins of a nationally endangered listed taxon—*Eucalyptus paludicola*—a tree that is restricted to the Fleurieu Peninsula and Kangaroo Island of South Australia. Since its description in 1995, there have been suggestions that this taxon may potentially be a stable hybrid species. Using a high throughput sequencing approach, we developed a panel of polymorphic loci that were screened across *E. paludicola* and its putative parental species *E. cosmophylla* and *E. ovata.* Bayesian clustering of the genotype data identified separate groups comprising *E. ovata* and *E. cosmophylla* while *E. paludicola* individuals were admixed between these two, consistent with a hybrid origin. Hybrid class assignment tests indicate that the majority of *E. paludicola* individuals (~70%) are F1 hybrids with a low incidence of backcrossing. Most of the post-F1 hybrids were associated with revegetation sites suggesting they may be maladapted and rarely reach maturity under natural conditions. These data support the hypothesis that *E. paludicola* is a transient hybrid entity rather than a distinct hybrid species. We briefly discuss the conservation implications of our findings.

**Keywords:** conservation; natural hybridisation; *Eucalyptus*; high throughput sequencing; NGS; South Australia

#### **1. Introduction**

Natural interspecific hybridisation is a common phenomenon in plants and is an important evolutionary process. The outcomes of hybridisation can be diverse, maintaining, reducing or increasing evolutionary divergence among taxa [1]. Hybridisation between well-differentiated species can lead to the origin of new species involving a change in base chromosome number (i.e., allopolyploid hybrid speciation) or without such a change (homoploid hybrid speciation). In the case of the former, a change in ploidy can lead to the spontaneous development of reproductive isolation, while in contrast successful homoploid hybrid species must overcome significant ecological, genetic and demographic obstacles and are thus considered rare [2]. Homoploid hybrid speciation has been suggested in a number of cases, although the majority of these have demonstrated a hybrid origin of taxa but lack conclusive evidence for hybrid speciation [1]. Key additional criteria include evidence of reproductive

isolation of hybrid lineages from the parental species and evidence that reproductive isolation arose as a consequence of hybridisation [2].

The genus *Eucalyptus L'Hér.* includes approximately 650 species of mostly trees that are endemic to Australia (with a few exceptions) and dominate the forests and woodlands of that continent. Taxonomists classify eucalypt species, including members of the three genera: *Angophora Cav.*, *Corymbia K.D.Hill & L.A.S.Johnson,* and *Eucalyptus*, into various subspecies, sections and series based on morphology and assumed relatedness (see Nicolle 2019 [3] for the most recent classification). Eucalypts are the worlds' most widely grown plantation hardwoods and there has been considerable interest in genetic improvement through manipulated hybridisation [4]. Natural interspecific hybridisation has also been widely reported within the group (e.g., [5–9]) and the propensity for hybridisation and its outcomes are strongly predicted by the degree of evolutionary divergence among species [5,10–13]. Natural hybridisation between species from different genera or subgenera is not believed to occur (but see [7] for a possible exception) and is relatively uncommon among sections within subgenera [5,13]. There are several well-documented cases of natural hybridisation amongst closely related eucalypts although according to Griffin et al. [5] only 15% of cases, where hybridisation was likely (intrasectional relatives with proximal distributions), resulted in hybrid formation. Known prezygotic barriers in eucalypts include pollen tube arrest, the frequency of which increases with evolutionary distance between parents [14] while species-specific variation in flower size presents a structural barrier to gene flow [15]. Postzygotic barriers may also contribute to reproductive isolation. In a recent study, there was strong evidence for outbreeding depression (negative epistasis) amongst two closely related *Eucalyptus* species [16], which is predicted to increase with evolutionary divergence because of "snowballing" epistatic interactions [17].

*Eucalyptus paludicola* D.Nicolle was described by Nicolle 1995 [18] and in light of its rarity and exposure to ongoing threatening processes [19] has been listed as endangered under the Environment Protection and Biodiversity Conservation Act of 1999 (Commonwealth of Australia) and the National Parks and Wildlife Act 1972 (State of South Australia). The known distribution includes approximately 34 populations with an estimated 720–750 individuals in total [20]. Its range is limited to the Fleurieu Peninsula and Kangaroo Island of South Australia, where it occurs on seasonal swampy sites, often in highly modified landscapes. *E. paludicola* is commonly coassociated with *Eucalyptus cosmophylla F. Muell*. and *E. ovata Labill*. and is in some respects morphologically and ecologically [18] as well as genetically (based on amplified fragment length polymorphisms [21]; and DART markers [22]) intermediate between these species, suggesting the likelihood of a hybrid origin. The relatively constant morphology of *E. paludicola* throughout its distributional range and evidence that the progeny breed true in an arboretum has been argued as supporting specific recognition [18,23]. While existing molecular evidence is consistent with a hybrid origin, a hypothesis of hybrid speciation is difficult to distinguish from alternatives [2], for instance, that *E. paludicola* individuals represent transient hybrids lacking reproductive isolation from the parents. Given that there are presently no known instances of hybrid speciation among eucalypts, a resolution of this issue may have significant bearing on our understanding of hybridisation and speciation within this ecologically and economically important lineage. Resolving the origins of *E. paludicola* is also important in clarifying its taxonomic and conservation status. In Australia, if a species is listed under the Federal Environment Protection and Biodiversity Conservation Act 1999 (EPBC Act), then it must be protected. Further, species that have arisen via hybridisation may be afforded full conservation protection while transient hybrid (i.e., lacking reproductive isolation from the parents and hybridisation not directly leading to the formation of a new taxon) entities are not presently recognised under relevant Commonwealth and State legislation. The EPBC Act does not, however, require a test to identify if that species might be a transient hybrid or hybrid species, meaning some currently listed taxa may be invalid.

To resolve the origins of *E. paludicola* we used a high throughput sequencing (HTS) approach to assess genetic diversity. Specifically, we leveraged the *E. grandis* genome sequence [24] and restriction-site based HTS to develop a panel of molecular markers. We used multiplex PCRs to screen across representatives of *E. paludicola* along with *E. ovata* and *E. cosmophylla* and applied Bayesian statistical approaches to hybrid assessment from these data. If *E. paludicola* is a hybrid species it will be indicated by factors including unique genetic diversity and coherence across its geographical range. The alternative hypothesis, that this taxon is a transient hybrid, would be supported by a predominance of early hybrid generations (e.g., F1 individuals) consistent with the absence of reproductive isolation.

#### **2. Materials and Methods**

#### *2.1. Study Species*

We sampled individuals of *E. paludicola, E. cosmophylla* and *E. ovata* across the region of geographic sympatry that includes the southern Fleurieu Peninsula and Kangaroo Island in the state of South Australia. Both *E. paludicola* and *E. cosmophylla* are endemic to this region, while *E. ovata* is widely distributed in temperate south-eastern Australia, with a disjunct range extending to our study region. All three species are placed within *Eucalyptus* subg. *Symphyomyrtus,* and *E. paludicola* and *E. cosmophylla* are considered to be close relatives (Section *Incognitae*; [18]) while *E. ovata* is placed within Section *Maidenaria. E. paludicola* and *E. cosmophylla* can be readily distinguished by their habit (taller and more upright, versus a shrubby to mallee habit, respectively) and inflorescence structure (usually seven flowered in *E. paludicola* versus consistently three flowered in *E. cosmophylla*), while *E. paludicola* has thinner leaves, longer peduncles and pedicels and smaller flowers and fruits. *E. ovata* has a tree habit, a consistently seven flowered inflorescence and has smaller flowers and fruit than the other two species. Both *E. paludicola* and *E. ovata* are associated with seasonally swampy sites, while *E. cosmophylla* grows on a range of soil types from infertile sands to poorly drained gravelly clays [18,23].

#### *2.2. Sample Collection and DNA Extraction*

Leaf material was collected from individuals of each species on both Fleurieu Peninsula and Kangaroo Island and dried using silica-gel. Genomic DNA was subsequently extracted from the leaf material using a commercial service (Australian Genome Research Facility, Adelaide). Sampling details are included in Table 1 (additional details in supplementary Table S1). Our sampling included two sites that were planted with *E. paludicola* as part of a restoration project (*Eucalyptus paludicola* recovery program, Department for Environment and Water).

#### *2.3. Marker Development and Sequencing*

We used methods described by Cross et al. [25] to generate a sequencing library for representative samples of *E. cosmophylla, E. ovata* and *E. paludicola.* The library was then sequenced using an Ion Torrent Personal Genome Machine (Ion PGM, Life Technologies, Carlsbad, California, USA) with 200 bp sequencing chemistry and a 318 v.1 sequencing chip. The raw sequences were imported into CLC Genomics Workbench 9 (Qiagen, Venlo, The Netherlands) for demultiplexing, adapter and quality trimming (ambiguous trim = 2, quality limit = 0.05, minimum read length = 50). The trimmed reads were then de novo assembled (indel and mismatch cost = 2; similarity fraction = 0.8, length fraction = 0.5) and we extracted contigs with coverage greater than 100×. The resulting contigs were compared to the *E. grandis* v2.0 genome sequence [24] using *blast-n* (default parameters in CLC) to identify regions that, from our sequencing library, had a single blast hit and could be considered as potential single copy regions in our target eucalypt species. From these, we targeted regions that had one or more SNPs in our assemblies and had suitable priming sites (based upon the *E. grandis* genome sequence; supplementary Table S2) to amplify a product of not more than 150 bp (excluding primer, adapter and barcode sequences), which is the approximate limit of the Ion Torrent PGM 200 bp sequencing chemistry. Primers were designed in Geneious v7 (Biomatters, Auckland, New Zealand) [26] using the Primer 3 [27] plug-in.


**Table 1.** Taxa and locations of tested samples.

*n* = number of samples; KI, Kangaroo Island; FL, Fleurieu Peninsula.

We used a fusion PCR approach to generate an amplicon sequencing library for each individual [28]. Briefly, in a first step, we amplified in multiplex two panels of primers (24 primer-pairs each at 2 μM per primer in mix, supplementary Table S2), In a second step, we added a sample specific barcode by fusion PCR using the universal adapters added to the locus specific PCR primers as priming sites.

In more detail; for the first round of multiplex PCR we used the Multiplex PCR kit of Qiagen (Venlo, Netherlands) using the suggested manufacturers cycling conditions, with annealing temperature at 60 ◦C and amplifying 20 cycles to avoid overamplification. The reactions were done in 10 μL reactions adding 1 μL of primer mix (2 μM equimolar primer concentrations, supplementary Table S2) and 1.5 μL undiluted DNA template. Each primer consisted of the original sequence obtained from the *E. grandis* genome and an adapter sequence extension on the 5 end to allow fusion PCR (*Eco*RI adaptor on forward and *Mse*I adapter on reverse, supplementary Table S4 in [25]). For the second round of PCR the same chemistry was used as in the first adding 1 μL EcoRI + CA and 1 μL *Mse*I + C fusion primers with internal barcodes (following supplementary Table S4 and Selective amplification in [25]) and 1μL of template (amplicons of first PCR). The reactions were done in 10 μL reaction with an annealing temperature of 60 ◦C and 15 cycles of amplification.

The resulting individual libraries were pooled, purified with AMPure XP (Agencourt, Beckman Coulter Inc., Brea, CA, USA) and then quantified using a 2200 TapeStation (Agilent, Santa Clara, CA, USA) with a high-sensitivity ScreenTape. The final library was diluted to 9.0 pMol and sequenced on an Ion PGM 318 v.1 chip. A total of 5 libraries were prepared to finalise the study.

#### *2.4. Data Processing*

The raw sequence data were imported into CLC Genomics Workbench 9 for demultiplexing, adapter and quality trimming (ambiguous trim = 2, quality limit = 0.05, minimum read length = 50). The resulting reads for each individual were then mapped to the reference sequences (indel and

mismatch score=2, similarity fraction=1.0, length fraction=0.5). With these mappings as input we used the fixed ploidy variant caller in CLC to identify SNPs (ploidy = 2, required variant probability = 0.95, minimum coverage = 10, minimum frequency = 0.2, filter homopolymer regions with minimum length = 3). Read mappings were then visually inspected and haplotypes were manually determined. For the final data we excluded loci that showed evidence of paralogy (more than 2 haplotypes per individual), amplified poorly (coverage <10 for >40% of individuals) or were invariant (minor allele frequencies <0.2). For all downstream data analyses, we treated each haplotype recovered as an allele for that locus.

#### *2.5. Data Analyses*

Summary statistics for species level genetic parameters including observed number of alleles (A) and allele frequencies, unbiased gene diversity (HE), observed heterozygosity (HO) and the fixation index (FIS) were calculated using the software GenAlEx version 6.5 [29]. For these analyses, individuals were assigned to species groups based upon their morphological identification with the exception of those which, based upon the STRUCTURE results with K = 2, had an assignment probability to a cluster of <0.9.

Bayesian clustering analysis implemented in the program STRUCTURE 2.3.4 [30] was used to identify the most likely number of genetic clusters (K) among the samples and to assign individuals to clusters (input data in supplementary Table S3). Using this approach, individual genotypes can assign to a single cluster or can have mixed assignment when ancestry is shared in more than one parental group due to hybridisation (i.e., admixture). For these analyses, the admixture model with correlated allele frequencies with no priors of individual identification (i.e., morphological species assignment) were used. Ten independent runs at K values one to eight were run with MCMC simulations having 900,000 steps following a burn-in period of 100,000. The delta-K statistic [31] as implemented in STRUCTURE HARVESTER [32] was used to determine the most likely number of clusters (K) in the data. We assessed the average proportion of membership (Q*i*) of samples to the inferred cluster and the individual membership proportion Q*<sup>i</sup>* of each sample to the K clusters.

The methods implemented in NEWHYBRIDS 1.1 [33] were used to assess the evidence of hybrid ancestry for individual samples (input data in supplementary Table S3). NEWHYBRIDS uses MCMC simulations to provide a posterior probability of an individual assignment to predefined genealogical classes. We ran these analyses assuming 2 generations of hybridisation resulting in six genealogical classes: one class for each parental type as well as first generation (F1), second generation (F2) and 2 backcross (F1 × parent 1; F1 × parent 2) hybrid classes. All analyses were run without prior information regarding individual membership. Analyses were run over 200,000 steps following 50,000 burn-in using "Jeffery's like priors" for mixing proportions and allele frequencies. Three replicate analyses were performed to assess consistency across runs. We used a posterior probability of 0.9 as a threshold for assigning an individual to a specific genealogical class.

Following Nielsen et al. [34] we conducted simulations for our empirical data in order to assess the power of these markers to distinguish among genealogical classes and assess the range of q values expected for admixed individuals. HYBRIDLAB [34] was used to simulate pure parental genotypes using real data from reference individuals that could be unambiguously assigned (individual assignment probability >0.95) to a genetic cluster in our STRUCTURE analyses (above) with K = 2. We simulated 200 genotypes for each parental type that were then used to successively generate F1, F2 and two backcross genotype classes comprising 200 simulated multilocus genotypes per genealogical class. The simulated data where then analysed using STRUCTURE, with K set to 2, in order to assess the efficiency and accuracy of these analyses to identify admixed genotypes. Similarly, simulated genotypes were analysed in NEWHYBRIDS to assess the efficiency with which this approach could allocate simulated individuals to the correct genotype class. For both sets of analyses, settings are as described above for the real genotype data. Finally, a Principal Coordinates Analysis was performed in

GenAlEx to visualise the separation between species and individuals relative to simulated data using 50 simulated genotypes per genotype class.

#### **3. Results**

#### *3.1. Molecular Data*

A suite of 30 gene regions that were consistently amplifying with high coverage were used from a set of 48 primer pairs developed. These 30 gene regions generated adequate sequence data, containing multiple SNP loci, for downstream analyses. Of the eighteen loci deemed not adequate, 13 amplified poorly or not at all, 4 were invariant, and for one locus more than 2 alleles were apparent in some individuals suggesting paralogy. From the 30 useable marker gene regions, the proportion of missing data per locus averaged <4%, with a single locus having a maximum of c. 40% while the remaining loci generally fell below 10% of missing values. Missing values per individual were on average <10% (range 0–70%). A final dataset of 151 individuals was used for further analyses (supplementary Table S1).

The number of alleles per locus ranged from 2 to a total of 21 alleles (supplementary Table S1). The average number of alleles across the 30 loci was highest for *E. paludicola* (7.17), followed by *E. cosmophylla* (3.1) and *E. ovata* (2.3). This presumably reflects differences in sample size included in this study across taxa, as well as a broader geographic sampling of *E. paludicola.* However, *E. paludicola* was heterozygous at all loci (observed heterozygosity, 0.017–0.968, average = 0.664) while *E. cosmophylla* and *E. ovata* were monomorphic at 5 and 9 loci, respectively, and had average observed heterozygosity of 0.319 (range, 0–0.833) and 0.301 (range, 0–1), respectively. Both *E. cosmophylla* and *E. ovata* showed complete segregation at 8 loci (c. 30%), while there were no markers that were diagnostic for *E. paludicola. E. cosmophylla* and *E. ovata* were strongly diverged with an FST of 0.453, in line with intersectional comparisons within subg. *Symphyomyrtus* [35], while FST values for *E. paludicola* were 0.131 (*E. cosmophylla*) and 0.167 (*E. ovata*).

#### *3.2. Admixture Analyses and Hybrid Class Assignment*

The delta-K statistic for the admixture analyses performed using STRUCTURE was clearly optimal for K = 2 (ΔK = 855.9) with the next best grouping being four clusters (ΔK = 10.9) (Figure S2 in supplementary Table S4). When 2 groupings were assumed, individuals morphologically identified as *E. cosmophylla* or *E. ovata* were each unambiguously assigned to a single distinct cluster. Individuals identified as *E. paludicola* were partially assigned to both clusters with individual membership proportions *Qi* < 0.9 (Figure 1). Significantly, when 3 clusters were assumed, the *E. paludicola* samples did not form a discrete group, as would be expected under the hypothesis that it is a distinct taxon (data not shown). Using simulated data with two parental and 4 hybrid categories, the average membership proportions for parental genotypes were >0.98 to a single K cluster while average assignment probabilities for each of the simulated hybrid classes was <0.91 (maximum individual assignment probability = 0.9 for *E. ovata* backcross) (supplementary Table S5). The *Qi* values of the actual genotype data for *E. paludicola* fell within the range of values found for simulated hybrids and below the values of simulated parental genotypes (Figure 1) and it is reasonable to conclude that most *E. paludicola* individuals can be classified as admixed.

The analyses of the simulated genotype data using NEWHYBRIDS assigned virtually all samples to their expected genealogical class with posterior probabilities exceeding 0.95 (supplementary Table S5) and in the few cases with lower confidence there were no misclassified individuals (i.e., probabilities were apportioned among classes). This suggests that the empirical genotype data contain adequate signal to correctly assign individuals to parental and various hybrid categories. Analyses of the empirical data using NEWHYBRIDS (Figure 2) closely reflect the results of the admixture analyses above (Figure 1) and in particular, individuals identified as *E. cosmophylla* were unambiguously assigned to a pure parental class (posterior probability >0.99), as were all but one *E. ovata* sample. The admixed

individuals (*E. paludicola* and one *E. ovata* individual) identified by STRUCTURE were assigned to a range of hybrid classes with most individuals being placed in a single category. Of the 126 *E. paludicola* individuals included in these analyses, 90 (*c.* 71%) were unambiguously assigned (posterior assignment probability >0.9, mean assignment probability >0.99) to an F1 class and 15 individuals (*c.* 12%) were identified as F2 hybrids. The three individuals (c. 2.5%) identified as backcrossed to *E. ovata* were associated with revegetation sites, as were c. 75% of the F2 individuals. Eighteen individuals (*c.* 14%) could not be confidently placed within a single genotypic class (assignment probabilities <0.9) but were fractioned between pure *E. cosmophylla* (2 individuals), F1, F2 and backcross classes (Figure 2). Assignment probabilities for each genealogical class were summed across all individuals to give an expected number of *E. paludicola* plants in each category (supplementary Table S3 Figure S3). These analyses indicate the predominance of F1 (~76%) and F2 genotypes (~17%) and a relatively low incidence of backcrossing (~3.5%), particularly in the direction of *E. cosmophylla* (~1.9%). Figure 3 plots the first two principal coordinates of 2 analyses (data in supplementary Table S1), one with the real data (all eucalypts used in study) and the 50 simulated genotypes for each genealogical class. The PCoA analysis showed a clear separation of *E. ovata, E. cosomophylla* and *E. paludicola*, which is placed in the intermediate position between them. For the simulated genotype data, the F1 and F2 classes cluster with *E. paludicola*, while the backcross classes are intermediate between *E. paludicola* and each parental taxon. More than 30% of the variation is explained by these two coordinates, the majority being on the horizontal axis (Figure 3). This supports the most important signal in the data being the relative placement of *E. ovata, E. cosomophylla* and *E. paludicola* and aligns the latter with the F1 hybrids as determined by simulations.

**Figure 1.** Proportion of ancestry for 151 *Eucalyptus* individuals with K = 2, inferred using STRUCTURE. Red and the green clusters correspond to *E. cosmophylla* and *E. ovata,* respectively, while *E. paludicola* shows admixture. The horizontal dashed lines indicate the range of *Q* values for simulated hybrid individuals (see text for details).

**Figure 2.** The proportion of individual assignment to 6 predefined genealogical classes for 151 *Eucalyptus* individuals, inferred using NEWHYBRIDS and 30 loci. Red and the green clusters correspond to "pure" *E. cosmophylla* and *E. ovata,* respectively, while *E. paludicola* is represented by a range of hybrid classes. The asterisks (**\***) correspond to *E. paludicola* individuals sampled from revegetation sites. BC in the figure stands for "back cross".

**Figure 3.** Principal Coordinates Analysis for real and simulated genetic data for *Eucalyptus paludicola*, *E. ovata*, *E cosmophylla*. First (F1) and second (F2) generation, back cross with *E. ovata* and back cross with *E. cosmophylla* were all simulated in HYBRIDLAB to depict the positioning of *E. paludicola*.

#### **4. Discussion**

The results of our study clearly indicate that *E. paludicola* is a hybrid taxon derived from *E. cosmophylla* and *E. ovata*, as has been previously suspected based upon morphological [18,23] and molecular evidence [21,22]. Despite the relatively high frequency of hybridisation among eucalypts, the formation of this hybrid would be unexpected given a relatively deep divergence between the progenitor species—*E. cosmophylla* and *E. ovata* are placed within different taxonomic sections of subg. *Symphyomyrtus*, which among eucalypts, strongly predicts the potential likelihood of successful hybrid formation [5,12,17]. Cases of natural intersectional hybridisation are relatively uncommon, and for example, Griffin et al. [5] reported intersectional hybrids within subg. *Symphyomyrtus* at a frequency of <5% amongst geographically proximal species. *E. paludicola* thus represents a rare case of natural

intersectional hybridisation. A recent study based upon crossing experiments between *E. globulus* and congeners spanning a broad range of evolutionary distances suggests that complete reproductive isolation among eucalypts may take in excess of 20 million years to develop [17] (divergence time estimates based upon [36], but see Thornhill et al. [37], who infer significantly younger divergence times for eucalypt lineages based on a more densely sampled phylogeny). According to Crisp et al. [36] the divergence of section *Maidenaria* (Clade I of Steane et al. [38]; *E. ovata*) and the closest relations of section *Incognitae* (*E. cosmophylla*) that were included in their study (sect. *Exsertaria* and *Bisectae,* Clade II of Steane et al. [38]) occurred at between 5–15 million years before present, providing an approximate timescale for the divergence of *E. cosmophylla* and *E. ovata.*

These results indicate that the proposal for a possible instance of *Eucalyptus paludicola* as a hybrid speciation, so far unknown among eucalypts, is unsupported. To confirm this hypothesis, it would need to be supported by evidence that hybridisation has led to the development of reproductive isolation [2]. In addition to a probable hybrid origin, circumstantial evidence that *E. paludicola* is reproductively isolated from the parental taxa [22] has included limited variation in adult and seedling morphology throughout its geographical range, arguing against introgression and the observation that progeny planted in an arboretum show levels of morphological variation within the range of that taxon [18,22]. However, our findings suggest a contrasting interpretation of the status and origins of *E. paludicola*. In particular, the predominance of F1 hybrids among *E. plaudicola* individuals included in this study (Figure 2) is consistent with a relatively uniform morphology that is intermediate to that of the parents but also with ongoing gene flow among the parental species, which occasionally results in hybrid formation. We have also found that hybridisation rarely proceeds beyond the F1 stage, as indicated by the low frequency of F2 and backcrossed individuals amongst our sample. This is despite the fact that many *E. paludicola* individuals are mature and produce viable seed, and herbarium collections observed at the State Herbarium of South Australia (AD) that are referred to *E. paludicola* date back to at least the 1920s, suggesting there has been opportunity for multiple generations of interspecific gene flow. Flowering asynchrony might limit the formation of later generation hybrids [39] although both parental species flower over a long period (autumn to spring i.e., April to November) that is coincident with *E. paludicola* (Spring i.e., August to November) [23] suggesting this is also unlikely. Interestingly, the majority of the F2 and all of the unambiguously assigned backcrossed individuals were associated with the revegetation sites that were included in our study. The production of these individuals included seed germination and on-growing seedlings before planting in restoration sites enabling survival of F2 recruits. In contrast, the low frequency of post-F1 hybrids among natural stands suggests low germination and/or survivorship under field conditions. Similar results have been found elsewhere among eucalypts, and for example, Shepherd and Lee [40] note an absence of mature post-F1 interspecific *Corymbia* hybrids in the field, despite the apparent vigour and fecundity of F1 individuals and their cultivated offspring.

Intrinsic outbreeding depression (OBD) provides a plausible explanation for the poor performance of hybrids reported in a number of controlled eucalypt crosses (e.g., [16,17,40–42]) and is manifested by factors including reduced seed viability, delayed germination and increased mortality throughout the life of a cohort. Compared to F1s, recombination and segregation in later generation crosses may result in advanced generation hybrid breakdown due to disruption of coadapted gene complexes, or loss or duplication of chromosomal segments [4]. Minor disadvantages in performance of the hybrids between seed formation and maturity may be expressed by low survivorship and reproductive isolation can be maintained between parent species despite weak barriers to gene flow [43]. A recent study of *E. globulus* × *E. nitens* controlled crosses found that OBD may not be evident in advanced generations until age 2 C4 years and increased with age at least up to 14 years [16]. Late acting postzygotic isolation has also been implicated as a barrier to the formation of later generation hybrids in the field among eucalypts [40] and other long-lived tree species. e.g., [43]. Importantly, the likelihood and intensity of OBD is strongly predicted by evolutionary/taxonomic distance between eucalypt species [11,17]. OBD may provide a reproductive barrier between *E. ovata* and *E. cosmophylla,* which is suggested by:

(1) the globally low numbers of *E. paludicola* individuals, indicating that hybrid formation is infrequent; and (2) the high proportion of F1 individuals within *E. paludicola,* consistent with selection against later generation hybrids.

Among eucalypts, the lack of evidence for hybrid speciation is perhaps surprising given their propensity for hybrid formation [22]. Eucalypts are remarkably constant in their base chromosome number [44–46] and this in part explains why so many natural hybrids occur, and why fertile synthetic interspecific hybrids are relatively easily obtained [45]. An implication of the above is that hybridisation leading to species formation would likely proceed without a change in chromosome number (i.e., homoploid hybrid speciation), a situation that has been considered vanishingly rare [2,47]. Hybrids can suffer reduced fertility and viability and may be intermediate in ecology and are therefore less fit than the parents in the parental habitat. Even in instances where the hybrid is fertile and locally adapted, weak barriers to gene flow between the hybrid and parental species present a substantial barrier to the development of reproductive isolation. The results of our study suggest that while reproductive isolation is incomplete, strong postmating isolation between *E. ovata* and *E. cosmophylla* is manifest by the production of a transient hybrid taxon, and there is no support for the recognition of a stable hybrid species.

#### *Implications for Conservation*

A hybrid origin for *E. paludicola,* which is currently listed under the EPBC Act as an endangered species, has implications for both its taxonomic and conservation status. Guidelines such as those outlined in the Biodiversity Conservation Act 2016 in the state of Western Australia [48] provide an objective basis to consider the status of *E. paludicola.* Under this scheme, hybrid entities must meet three criteria to warrant conservation listing: they are a distinct entity that is self-perpetuating and has arisen via natural processes. As a set of early generation, predominantly F1 hybrids that have arisen multiple times via rare spontaneous events, we find little support for the ongoing conservation listing of *E. paludicola* or indeed recognition as a distinct species. A better conservation action would be to provide ongoing possibilities for the formation of the hybrid offspring of the two progenitor species, between *E. ovata* and *E. cosmophylla*.

The outcomes of hybridisation are diverse, ranging from the generation of new stable entities through to the loss of diversity through the erosion of species boundaries [1]. While hybridisation may have negative impacts on biodiversity, it has also been argued that hybrids arising from natural processes—and that are not a threat to the integrity of the parental species—should be afforded protection on the basis of preserving genetic diversity and natural evolutionary processes [49,50]. We detected a low level of backcrossing in the *E. paludicola* hybrid system suggesting that hybridisation is unlikely to present a threat to the demographic viability or genetic integrity of either parental species, both of which are relatively common in the region. On the other hand, it has been suggested that hybridisation may have potential adaptive benefits in the context of rapidly changing environments [8] and may have important ecosystem consequences (e.g., [51]). As a relatively rare example of natural intersectional hybridisation within *Eucalyptus, E. paludicola* is of scientific interest, for example, in unravelling the nature of reproductive isolation and speciation within the genus (e.g., [35]). However, it is difficult to justify active and ongoing management for *E. paludicola* beyond the maintenance of the parental species and their habitat. Several conservation listed eucalypt taxa have previously been found to be hybrids in light of molecular evidence (e.g., [6,9]). The results of our study further highlight the limitations of morphological evidence alone in delimiting conservation priorities [48] and in particular for identifying eucalypt hybrids [14] and their systems [17]. It points to the need of thorough molecular genetic analyses as part of any conservation assessment process where hybridisation is likely.

**Supplementary Materials:** The following are available online at http://www.mdpi.com/1424-2818/12/12/468/s1, Table S1: Samples and Genetic Diversity. Table S2: Primers developed for *Eucalyptus paludicola*. Table S3: Input data for the Structure and Newhybs analyses. Table S4: Structure and Newhybs final results. Table S5: NewHybrids simulated data.

*Diversity* **2020**, *12*, 468

**Author Contributions:** Conceptualization, M.W. and D.B. in consultation with the Department for Environment and Water *Eucalyptus paludicola* recovery team; methodology, K.-j.v.D., E.B., D.B., J.Q., H.C.; formal analysis, K.-j.v.D. and E.B.; investigation, E.B., M.W., K.-j.v.D., resources, M.W., D.B., J.Q.; data curation, K.-j.v.D. and M.W.; writing—original draft preparation, M.W., E.B. and K.-j.v.D.; writing—review and editing A.H.T., M.W., E.B. and K.-j.v.D.; visualization, K.-j.v.D.; supervision, M.W.; project administration, M.W., K.-j.v.D.; funding acquisition, M.W. All authors have read and agreed to the published version of the manuscript.

**Funding:** This research received was funded by the South Australian Department for Environment and Water (*Eucalyptus paludicola* recovery program) and The State Herbarium of South Australia.

**Acknowledgments:** Project support provided by Kylie Moritz (Natural Resources Adelaide and Mount Lofty and Natural Resources Murray Darlin Basin, Department for Environment and Water; current address Landscape South Australia—Murraylands and Riverland) for support with collection of samples, liaison with the *Eucalyptus paludicola* recovery team, advice on management actions. Technical support was provided by Ainsley Calladine (State Herbarium of South Australia), Martin O'Leary, samples were provided by Luke Price and Kym Ottewell. We thank the two anonymous reviewers whose comments helped improve the final manuscript.

**Conflicts of Interest:** The authors declare no conflict of interest.

#### **References**


**Publisher's Note:** MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

© 2020 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).

## *Article* **Genetic Patterns and Climate Modelling Reveal Challenges for Conserving** *Sclerolaena napiformis* **(Amaranthaceae s.l.) an Endemic Chenopod of Southeast Australia**

#### **Michael D. Amor \*, Neville G. Walsh and Elizabeth A. James**

Royal Botanic Gardens Victoria, South Yarra 3141, Australia; neville.walsh@rbg.vic.gov.au (N.G.W.); elizabeth.james@rbg.vic.gov.au (E.A.J.)

**\*** Correspondence: michael.amor@rbg.vic.gov.au

Received: 14 October 2020; Accepted: 2 November 2020; Published: 4 November 2020

**Abstract:** *Sclerolaena napiformis* is a perennial chenopod endemic to southeast Australia. Humanmediated habitat loss and fragmentation over the past century has caused a rapid decline in abundance and exacerbated reduced connectivity between remnant populations across three disjunct regions. To assess conservation requirements, we measured the genetic structure of 27 populations using double digest RADseq). We combined our genetic data with habitat models under projected climate scenarios to identify changes in future habitat suitability. There was evidence of regional differentiation that may pre-date (but also may be compounded) by recent habitat fragmentation. We also found significant correlation between genetic and geographic distance when comparing sites across regions. Overall, *S. napiformis* showed low genetic diversity and a relatively high proportion of inbreeding/selfing. Climate modelling, based on current occupancy, predicts a reduction in suitable habitat for *S. napiformis* under the most conservative climate change scenario. We suggest that the best conservation approach is to maximise genetic variation across the entire species range to allow dynamic evolutionary processes to proceed. We recommend a conservation strategy that encourages mixing of germplasm within regions and permits mixed provenancing across regions to maximise genetic novelty. This will facilitate shifts in genetic composition driven by individual plant fitness in response to the novel environmental conditions this species will experience over the next 50 years.

**Keywords:** Camphorosmoideae; conservation genetics; disjunct distribution; population fragmentation; population structure; single nucleotide polymorphism (SNP) markers

#### **1. Introduction**

The International Union for the Conservation of Nature (IUCN) recently listed >1250 Australian species as being vulnerable to extinction, which places the risk to Australia's biodiversity as amongst the highest globally [1]. Currently, genetic factors are not used for IUCN listings, but their inclusion can improve the assessment of extinction risk and subsequent conservation of some species [2]. Developers of biodiversity conservation policy acknowledge the need to retain dynamic evolutionary processes [3] and thus the security of those processes remains a major goal for the management of genetic resources [4,5] including the conservation of population genetic diversity and individual fitness [6,7]. Reproductive strategies, including the breeding system and dispersal capacity of pollen and seed, influence the distribution of genetic diversity, which can have a strong effect on plant population fitness [8]. For many plant species, predictions of increased aridity and shifting seasonality compound the effects of recent human-mediated changes to their habitat [9]. These factors combine to escalate the extinction risk of threatened species by reducing habitat connectivity, thereby disrupting

patterns of gene-flow and potentially reducing effective population size and increasing levels of inbreeding [10].

Plant species respond to local changes in environmental conditions at the population level [11], ensuring ongoing homogeneity in genetic diversity across the landscape is increasingly important [12]. Ideally, spatial and temporal monitoring of intraspecific variation would inform species management where human impacts are likely to have negative impacts on phenotypic and genetic variation [13]. Collectively, evolutionary processes, high levels of heterozygosity [14], and allelic diversity [15] are understood to enable adaptation and resilience to environmental change; thus, conserving natural patterns of genetic variation and interaction should be prioritised to ensure species' longevity [5,16]—particularly for small plant populations subject to environmental stress [14]. If dispersal is insufficient for plants to track changing climate [17], human-assisted migration beyond the natural dispersal envelope may be justified [18]. Determining the amount and distribution of variation within a species and the minimum population size required to retain evolutionary potential are critical components in conservation planning [19,20], but the quantification of these factors remains elusive for many species.

The alluvial plains of the southern Riverina area of New South Wales and Victoria and the Victorian Wimmera riverine area of southeastern Australia [21] have been acknowledged for their productivity, both in cropping and livestock raising, since being established for European agriculture. The progressive introduction of a large-scale irrigation network from the late 1800s saw extensive conversion of the native shrublands and open woodlands to broad-scale agriculture, resulting in extreme reduction and fragmentation of the original vegetation. Subsequently, the overall community in this region was listed as Critically Endangered [22]—the direct result of weed incursion, salination from rising water tables, uncontrolled grazing, and human-induced climate change with associated drying and altered fire regimes [23]. Documentation of the original vegetation of this area (e.g., [24,25]) indicates that chenopods, particularly *Atriplex*, *Maireana*, and *Sclerolaena*, formed a significant component of the low shrub and ground layer.

The reduction in chenopod species in riverine habitats has led to the recommendation that ex situ germplasm and genetic restoration involving plant translocations should be undertaken to ensure their conservation in riverine habitats. Importantly, these actions are tempered by the phylogeographic history and distribution of intraspecific variation. Determining whether distinct genetic lineages exist within taxa [26] is necessary to avoid the unintended introduction of poorly adapted plants [27,28]. New, cryptic chenopod taxa are still being described from the region [29,30], which highlights a risk of accidental inclusion of non-target species in restoration actions. When planning these strategies, it is important to consider that artificially restricting the gene pool may have the unintended consequence of limiting population adaptability [31,32].

A chenopod of conservation concern in southeastern Australia, *Sclerolaena napiformis* Wilson (Turnip Copperburr) [33], is one of 66 species of *Sclerolaena*, an endemic genus restricted to mainland Australia. *Sclerolaena* is a member of the subfamily Camphorosmoideae within Amaranthaceae s.l., which includes the previously recognised families Amaranthaceae s.s. and Chenopodiaceae [34]. Australian Camphorosmeae comprises a monophyletic lineage of ≈150 perennial C3 species in contrast to the largely C4-dominant Amaranthaceae s.s [35]. The distribution of Australian plant species prior to colonisation is generally poorly documented, and due to large-scale habitat conversion, many extant species are believed to inhabit only a minute proportion of their former range [36]. It is likely that *S. napiformis* was widely distributed within its current extent of occurrence but through fragmentation is now confined to three disjunct areas. Extant populations are largely restricted to roadsides, travelling stock routes, and small reserves within an otherwise highly modified landscape [23]. The historical patterns of gene-flow within the metapopulation are now greatly disrupted and the species is considered nationally endangered [22].

There are two major concerns regarding the future of *S. napiformis*. One is the likely effect of population decline and loss of connectivity on the genetic health of the species. The second is the reduction in the availability of suitable habitat caused by changing climate and habitat conversion. Site degradation, a contributor to habitat loss, occurs primarily from weed invasion by species either better adapted to changing conditions and/or that can outcompete *S. napiformis* (e.g., overtopping by dense foliage of the non-native grasses *Phalaris aquaticum* and *Lophopyrum ponticum*). Some conservation work has been undertaken for the species, including translocation of plants into secure landholdings and augmentation of populations on road reserves. However, this work has been undertaken without knowledge of the distribution of genetic diversity across the geographic range of *S. napiformis*. A national recovery plan for the species [23] advocates in situ protection and the establishment of an ex situ seed collection, but provides no recommendation for assessing genetic attributes that may pinpoint important populations or guide further actions such as translocation, assisted migration, and genetic rescue.

In this study, we used a reduced representation genomic sequencing method (double digest restriction-site associated DNA sequencing-ddRADseq; [37]) to examine genetic patterns by comparing single nucleotide polymorphism (SNP) loci within and among populations across the geographic range of *S. napiformis*. We aimed to quantify the genetic structure and the partitioning of variation within *S. napiformis* at the population and regional scale, as the presence of genetic structure could influence the sourcing of germplasm for restoration. We also looked for the presence of allelic variants restricted to populations or regions and identified chloroplast haplotypes on the basis of ddRAD SNP loci. Finally, we modelled the future suitability of *S. napiformis* habitat using climate and soil variables. Using this combined approach, we recommend procedures for the collection and use of germplasm to refine the conservation strategy for *S. napiformis*.

#### **2. Materials and Methods**

#### *2.1. The Study Species*

*Sclerolaena napiformis* is restricted to three disjunct regions in the semi-arid Murray Darling Depression and Riverina regions of Victoria (Vic) and New South Wales (NSW) in southeast Australia. We designated the following regions: "northeast" (NSW only), where only two populations were found; "central", with populations north and south of the Murray River; and "southwest" (Victoria only), containing the greatest number and density of populations (Figure 1). The species grows as a procumbent multi-stemmed, perennial sub-shrub up to 40 cm high, mainly in native grassland and remnant Buloke (*Allocasuarina luehmannii*) woodland habitats. A thick rootstock develops within one year of germination and plants can re-shoot from the stem base following dieback in response to environmental conditions [23]. The floral morphology of *S. napiformis* is typical of wind pollination, and the fruiting perianth, a burr, has spines, suggesting dispersal by animals (zoochory) (Figure 2). The breeding system has not been studied but seed is produced if plants are bagged to exclude pollen from a different individual; thus, we assume *S. napiformis* to be self-compatible (James, unpublished data), as apomixis has not been recorded in the genus. At a few sites, the species commonly associated with at least low levels of salinity are present (e.g., *Atriplex semibaccata, Salsola tragus* subsp. *tragus*, *Spergularia brevifolia*), but overall, the associated vegetation carries little signal of salinity. All collection sites comprised primarily native species where weeds may have been present but did not dominate the site, suggesting that *S. napiformis* is not tolerant of significant modification of the original vegetation.

**Figure 1.** Distribution of *Sclerolaena napiformis* and location of sampled sites. (**a**) Geographic distribution based on herbarium records, indicated by blue dots, downloaded from Australasia's Virtual Herbarium (AVH 2019). (**b**) Location of the three disjunct regions with sampled sites labelled.

**Figure 2.** Habit, flowers, and fruit of *Sclerolaena napiformis*. (**a**) A thick rootstock develops within one year of germination and plants can re-shoot from base of stem following dieback. Developing fruit circled. The insignificant flowers, lack of petals, and exposure of stigmas and anthers to air currents are features typical of wind pollination. (**b**) Young flower with two stigmas (♀) and five unopened anthers (♂, two underdeveloped) visible. (**c**) An older flower with three stigmas (♀) and three dehisced anthers visible (♂). (**d**) Female reproductive organs excised from fruit showing stigma lobes (♀) and a single ovary (O). The timing of stigma receptivity is not known. (**e**,**f**) Developing fruit, with remnant of stigmas still visible (**f**). (**g**) Mature dry fruit collected from stems. Circular scar at base of one fruit shows the position of fruit attachment to the stem.

#### *2.2. Sampling*

For population analysis (ddRADseq and chloroplast haplotying), we visited 44 sites from February to April in 2018 (late summer to mid-autumn) following identification of sites using location information taken from herbarium specimens in the Australasian Virtual Herbarium (avh.ala.org.au) and observation records in the Atlas of Living Australia (ala.org.au). No plants were found at 15 sites and populations at two sites had insufficient leaf tissue to sample. The 27 sites sampled across the species' geographic range varied in area from 0.06 to 1.25 ha and contained discrete (sub)populations varying in size from approximately 40 to 400 individuals. Plants were sometimes cryptic, and thus these estimates, which were based on observed plants, were generally conservative. Sampled plants were geolocated and fresh leaf material was collected from 9–22 mature plants per population and desiccated in silica gel (Figure 1, Table 1).

**Table 1.** Site codes, region, sample numbers, estimated area, and number of plants per site (see Figure 1 for location of sites within regions).


In addition, two samples, grown from seed collected from Trevaskis Rd (SN18, central region) and South Corree Rd (SN20, northeast region), were used for whole genome Illumina shotgun sequencing to provide a genome scaffold for ddRADseq SNP loci. Chloroplast reference genomes [38] were assembled from the same sequence data and used for haplotyping.

#### *2.3. Chromosome Counts*

To assist in the analysis of ddRADseq data, we determined the ploidy of *S. napiformis* by chromosome counts. Actively growing root tips were collected from ex situ plants grown from seed collected at populations SN12, SN18, SN20 (at or very near the site of the type collection), and SN24. Root tips were prepared using a modification of the method of Murray and Young [39] by pre-treatment in a saturated 1,4-dichlorobenzene solution at 20 ◦C for 24 h, then fixation in 3:1 95% ethanol/glacial

acetic acid at 4 ◦C for 24 h. Root tips were rinsed several times in water and either stored in 70% ethanol at −20 ◦C or prepared immediately for counts by hydrolysing root tips in 10% HCl for 8 min at 60 ◦C, macerated in a drop of FLP orcein stain [40], then squashed and viewed under oil immersion. Chromosome counts were made from cells across several root tips per plant.

#### *2.4. DNA Isolation and Sequencing*

Leaf material was dried to maximise DNA preservation during extensive field surveys. Dried leaf material (≈20 mg) was ground to a fine powder in a QIAGEN TissueLyser II (two minutes) and genomic DNA was isolated using the CTAB protocol for ISOLATE II Plant DNA Kit (Bioline)—except that final elution was in a total volume of 40 μL elution buffer. DNA quality was confirmed via 1% agarose gel electrophoresis in 1xTBE buffer for 45 min at 100 V and stained with SybrSafe (Invitrogen, Thermo Fisher Scientific Australia). DNA isolations were quantified using a Qubit v3.0 fluorometer (Invitrogen Life Technologies) and stored at −20 ◦C.

#### 2.4.1. Whole Genome Sequencing

To improve identification of loci from the ddRADseq reads, we obtained a de novo reference scaffold using Illumina shotgun sequences. Genomic DNA (CTAB protocol for ISOLATE II Plant DNA Kit; Bioline Australia) was isolated separately from fresh leaf material sampled and vouchered from 2 cultivated plants grown using seed sourced from Jerilderie, New South Wales (MEL2470835A) and near Wyuna, Victoria (MEL2446779A). A genome library was prepared for each sample using a TruSeq Nano DNA Library Prep kit (Illumina, San Diego, USA). A minimum of 2.5 μg genomic DNA template of each sample was supplied to Australian Genome Research Facility (AGRF; Parkville, Aust) at 50 ng/μL and sequenced (150 bp PE on an Illumina NovaSeq 6000).

#### 2.4.2. ddRADseq

A modified version of the Peterson et al. ddRADseq protocol [37] was used to prepare DNA libraries—details are available at michaelamor.com/protocols. The final DNA library contained 480 samples, which included 13 technical replicate pairs. Individuals were randomly allocated to a plate per well and therefore received a random barcode and index. In summary, 100 ng of genomic DNA was digested for 18 h with EcoRI-HF and AseI restriction enzymes (New England Biolabs) and barcoded adapters were ligated to digested DNA fragments. Non-ligated adapters were removed before libraries were size-selected by magnetic bead purification using Jetseq Clean (Bioline)/PEG 8000 buffer solution at 0.5× then 0.9× of the DNA solution volume.

PCR-based indexing of the individual libraries was conducted using real-time PCR (rtPCR) in a Bio-Rad CFX96 thermocycler, with amplification stopped after 10 cycles. To assess whether amplification was adequate for sequencing whilst minimising PCR bias, we visualised individual fluorescence curves to ensure that they had not plateaued. Amplified libraries were pooled in equal concentrations based on relative fluorescence unit outputs from rtPCR and were concentrated/purified using Jetseq Clean beads/PEG 8000 buffer solution (1.8× DNA solution volume). The pooled library was size-selected at 300–500 bp in a Pippin Prep (Sage Science) using a 2% agarose (100–600 bp) cassette and quantified via qPCR using a Jetseq Library Quantification Hi-ROX kit (Bioline) on a Bio-Rad CFX96 thermocycler. Library QC and sequencing were performed at the AGRF.

#### *2.5. Quality Filtering and Bioinformatics*

#### 2.5.1. Nuclear DNA Assembly

Genome scaffolds for each sample were assembled as follows. Quality filtering of reads was performed using Trimmomatic 0.39 [41] with the following settings: LEADING:3 TRAILING:3 SLIDINGWINDOW:4:20 MINLEN:36. Paired reads were then normalised to ≈100× coverage using BBnorm (settings: mindepth = 5 target = 100). Normalised reads were assembled into scaffolds using SPAdes v3.13.0 [42]. Benchmarking Universal Single-Copy Orthologs (BUSCO v3.1.0; [43]) statistics were applied to assess the completeness of the assembly. To estimate the recovery of gene-coding regions in the resulting assembly, we assessed scaffolds with BUSCO using the embryophyta\_odb10 database (settings: -m genome -sp arabidopsis).

#### 2.5.2. Population Samples (ddRADseq)

Raw paired-end reads were trimmed if the quality dropped below a score of phred20, based on a sliding-window of four bases using Trimmomatic v0.38 [41]. Reads below our minimum length requirement of 100 bases were discarded. Finally, reads were trimmed if Illumina adapters were present and filtered for microbial and fungal contaminants using Kraken v2.0.6 [44]. Reads were demultiplexed into individual sample read sets using the "process\_radtags" feature of STACKS v1.4.6 [45].

Assembly of reads into loci was performed on (i) reads 1 and 2, and (ii) only read 1 using ipyrad v0.7.29 [46]. Reads were assembled using a stepwise approach by mapping reads to our most reliable genome scaffold (determined by size (gb) and BUSCO score) and performing de novo assembly on the remaining reads. Inclusion of read 1 only resulted in more loci in all cases, and therefore assemblies based only on read 1 were used downstream. Further sequence quality filtering was performed to convert base calls with a score of <30 into Ns, whilst excluding reads with ≥15 Ns. Our complete assemblies required minimum depth of 10x reads per locus. The resulting assemblies were used to generate variant call format (VCF) files for population genomic analyses and alignment (phylip) files for chloroplast (cp) DNA-based haplotyping.

The occurrence of outlier loci, interpreted as candidates for loci under selection, was assessed for all assemblies using Bayescan v2.1 [47] and a Principal Component Analysis (PCA) approach implemented via the "pcadapt" package [48] in R 3.5.3 [49]. Only sites identified by both methods were considered true outliers. BayeScan (10,000 pilot runs and 200,000 generations, with 50,000 initial generations discarded as burn-in) identified zero outliers compared to 104 outliers detected by "pcadapt". As no outliers were common across both approaches, we considered those identified by "pcadapt" to be false positives, and the following analyses were conducted using all loci.

#### 2.5.3. Population Samples (cp Haplotyping)

Reads from the ddRADseq libraries were mapped to a consensus sequence of two *S. napiformis* individuals (GenBank Accessions MT027236 and MT027237 [38]) using ipyrad [46]. The clustering similarity threshold was set to a minimum of 80% and the presence of ≥50 individuals per locus. An alignment including the two reference genomes was visualised in Geneious Prime v2020.1.2 (https://www.geneious.com). A summary table of variable sites was created manually, and a haplotype network was constructed in R 3.5.3 using the "pegas" package [50].

#### *2.6. Analysis of Genetic Diversity and Structure*

Genetic structure was investigated among and within regions using a step-wise approach via the "fastStructure" algorithm [51]. VCF files generated by ipyrad were converted to ped and map files using VCFtools v0.1.15 [52], then further converted to bed, bim, and bam files using PLINK v1.90b4 [53]. To identify genetic clusters, we performed 8 to 10 replicate runs for each K-value (number of sites minus 1) based on our complete assembly using the logistic model with 10 cross-validation steps for each run. FastStructure outputs were summarised with Clustering Markov Packager Across K (CLUMPAK) "main pipeline – admixture" and "best K" algorithms via the online server (http://clumpak.tau.ac.il/). We attempted 150 fastStructure replicate runs for each K-value for our regional analysis, however, a high failure rate resulted in us retaining >10 replicates for K = 3 but only two completed replicates for both K = 2 and K = 4. Due to low (and uneven) replicate numbers, we were unable to summarise using CLUMPAK. Therefore, regional analyses were summarised using fastStructures with "chooseK.py" and "distruct.py" scripts to enable presentation of the single "best" graphical representation and its associated likelihood values. To investigate potential sub-structure between the two northeast

sites, we performed fastStructure analyses using an assembly generated from individuals from the northeast and central sites. We compared the Bayesian-based genetic clusters obtained from fastStructure with those identified using K-means clustering and Bayesian information criterion (BIC) as implemented for discriminant analysis of principal components (DAPC [54]). To support the above findings, we investigated correlation between genetic and geographic distance using Mantel's R statistic via the "mantel" function in the "vegan" R package [55]. Geographic (Euclidean) and genetic distance (dissimilarity) matrices were calculated with the "dist" and "bitwise.dist" functions in R, respectively, using the package "poppr" [56].

For population genomic level analyses within and among regions, we imported VCF files into R and converted them to "genind" and "hierfstat" objects. Estimates of genetic differentiation were calculated via pairwise GST and Jost's D among our identified regions using the "mmod" R package. The degree of genetic differentiation among regions was estimated via 1000 permutations of analysis of molecular variance (AMOVA) using the "poppr" R package. Euclidean genetic distance was input as a covariate using the "pegas" algorithm. Global and regional estimates of genetic diversity were calculated based on the observations resulting from our population structure analyses. We estimated observed heterozygosity (Ho) and gene diversity (He) using the "basic.stats" function in the "hierfstat" R package. Global FIS was estimated via the "boot.ppfis" function in the "hierfstat" R package [57] to determine the proportion of species-wide and region-wide genetic variation reflected in individuals. Global FST and GST was estimated via the "wc" and "Gst\_Hedrick" functions in "hierfstat" and "mmod" [58], respectively.

#### *2.7. Distribution Modelling*

We investigated the potential for habitat/environment suitability to shift under a conservative climate change scenario using 20 variables (Table S1). A total of 8 soil variables (3 seconds resolution) were downloaded from the CSIRO soil and landscape database (http://www.clw.csiro.au/aclep/ soilandlandscapegrid/index.html). A total of 12 bioclimatic variables (30 seconds resolution) were downloaded from the WorldClim database (www.worldclim.org/bioclim) as "current conditions" (interpolations of observed data, representative of 1960–1990) and "future conditions" (IPCC5 climate projections for 50 years assuming the most conservative "representative concentration pathway" (RCP = 2.6) for greenhouse gases). Collection coordinates were used as species occurrence data, which is the most extensive collection to date and was informed by >30 years of field surveys.

The extent of the models was set to 10 degrees surrounding occurrences (site locations). Each model was fitted using the maximum entropy (Maxent) algorithm via the *dismo* package in R [59]. Model fit was evaluated by simulating 1000 random pseudoabsences within the model extent and tested how well our collection data compared to "random guessing" using a receiver operating characteristic (ROC) curve (AUC = 0.994; Figure S1a). We considered this to be a realistic approach to obtaining absences, as true absence data are difficult to obtain for *S. napiformis* because plants can die down to ground level under stressful conditions [23]. Variable contributions were plotted for (i) current climate, (ii) soil, and (iii) current climate and soil combined (Figure S1b). We used the raster package in R [60] to predict the suitable habitat for *S. napiformis* in 2050 using our "future climate" variables and including soil data with the assumption that soil and landscape will remain constant.

#### **3. Results**

#### *3.1. Chromosome Counts*

All counts were consistently 2n = 18 chromosomes (Figure 3). A diploid number of 2n = 18 was reported for *Sclerolaena birchii* (recorded as *Bassia birchii*, [61]), the only other species of *Sclerolaena* for which chromosome numbers are available. Therefore, we considered *S. napiformis* diploid, and ploidy was specified as such where required for analysis.

**Figure 3.** Root tip squash showing 2n = 18 chromosomes for an individual of *Sclerolaena napiformis* grown from seed collected at Cope Cope, Victoria (site SN24), viewed under oil immersion (×1000). Scale bar = 10 μm.

#### *3.2. Data assembly Statistics*

Our nuclear genome assembly (built using short reads) was ≈6.5 million nucleotides in length and consisted of 733,636 contigs (average length = 888.27, largest length = 154,606). Overall, 1145 complete genes were identified (83% of the 1375 total recognised genes), 137 genes (10%) were partially identified (fragmented sequence), and 93 (6.7%) were missing.

For ddRAD data, the final analysed dataset consisted of 452 individuals. Summary statistics are provided in Table 2. After filtering loci and informative sites for each assembly, the number of unlinked SNPs ranged from 2837 to 3367 with a mean depth of ≈15 reads per individual at every retained locus. For all assemblies, approximately 31% of reads were mapped to our assembled genome scaffold.

**Table 2.** Summary statistics for *Sclerolaena napiformis* assemblies generated using ipyrad. Statistics reported for "unlinked SNPs" and "mean locus depth" were obtained after filtering variant call format (VCF) files with VCFtools. SNP: single nucleotide polymorphism.


#### *3.3. Regional Genetic Diversity and Structure*

Measures of regional genetic diversity are summarised in Table 3. Genetic structure among regions was supported by AMOVA (see Tables S2–S5), where 8.7% of the total variation was observed among southwest, central, and northeast regions (*p* = <0.001). AMOVA also provided support for differentiation among sites within the southwest and northeast regions (*p* = <0.02), but not among sites within the central region (*p* = 0.8). Correlation between genetic and geographic distance (Mantel's R) was significant among regions (*p* = 0.001), and among sites within the central region (*p* = 0.04). There were no noteworthy correlations among sites within the southwest or northeast region. Overall, *S. napiformis* was characterised by relatively low levels of heterozygosity (Ho = 0.017), with a high proportion of the overall genetic diversity captured within a given individual (FIS = 0.617), pointing to notable levels of inbreeding/selfing within the species. This trend was consistent when looking within regions where heterozygosity levels (Ho) were relatively low but diversity within individuals (*F*IS) was high. All sites exhibited *F*IS values consistent with substantial levels of selfing, except SN24, where the *F*IS value was negative (Table S6), suggesting clonal reproduction was greatest at this site. There was no correlation between *F*IS and population size, population area, or the number of samples included in the estimation of *F*IS.

**Table 3.** Measures of genetic diversity (observed heterozygosity (Ho) and gene diversity (He)) and fixation indices, calculated to determine the proportion of genetic variation contained in sites, relative to the total observed variation (FST and GST) and the proportion of the overall variation contained in an individual (FIS). Analyses were performed on the entire dataset (*n* = 452). N = number of sites, n = number of samples. Statistics based on assemblies of each region were also calculated. Mantel's R statistic is also reported with significant correlation between genetic and geographic distances marked by an asterisk.


Analyses among the three disjunct regions showed that sites within the southwest, central, and northeast all had greater gene diversity than heterozygosity and were characterised by a high degree of inbreeding/selfing (FIS = 0.517, 0.652, and 0.830, respectively). Population differentiation among sites was similar within the southwest and central regions (GST = 0.304 and 0.288, respectively) but higher for the small northeast region (GST = 0.369). Comparing pairwise genetic difference between regions, we found the central and northeast regions to be the most distinct (pairwise GST = 0.228, Jost's D = 0.017; Table 4), despite being geographically closer than either was to the southwest.

**Table 4.** Pairwise genetic differentiation (lower left: GST, upper right: Jost's D) among three regions of *Sclerolaena napiformis* from Victoria and New South Wales, Australia. We report the greatest difference occurring between central and northeast regions.


#### *3.4. Population Genetic Structure*

#### 3.4.1. Overview of All Regions

Population structure was examined for the species across the geographic range. Marginal likelihoods resulting from fastStructure runs of the regional assembly of *S. napiformis* favoured four distinct genetic clusters with clear differences shown among geographic regions (Figure 4a). For K = 4, individuals assigned to blue and purple clusters were widely dispersed geographically but most common in the southwest region, whereas individuals from green and yellow clusters were mostly found in the central and northwest regions, respectively. DAPC identified the optimal number of clusters as three (Figure S2), and also showed differences among groups of individuals on the basis of their geographic region, but samples from the northeast were less tightly clustered than the majority of samples from the central and southwest regions (Figure 4b).

#### *Diversity* **2020**, *12*, 417

**Figure 4.** Group assignment and clustering outputs produced via (**a**) fastStructure showing marginal likelihoods and individual assignment to genetic clusters for K = 2–4, and (**b**) discriminant analysis of principal components (DAPC). K = 3 highlights distinction among southwest, central, and northeast regions and is consistent with fastStructure where some southwest populations show admixture with the yellow cluster dominating the northeast. Analyses were based on 3093 unlinked double digest RADseq SNPs obtained from an ipyrad assembly of 452 *Sclerolaena napiformis* individuals. Individuals are coloured according to region. No a priori locality information was input into either analysis.

#### 3.4.2. Central and Northeast Regions

For the combined central and northeast regions, the optimal number of clusters identified from both fastStructure and DAPC was K = 4, although K = 7 was also a possibility (Figure S2). The distribution of clusters was not always well correlated with geographic location. Visualising the fastStructure output (Figure 5), at K = 4, the two northeast sites (19 and 20) were delimited from other sites but could not be distinguished from each other (Figure 5a). At K = 7, three groups were evident, with site 18 (central) and sites 19 and 20 (northeast) distinct from all other sites, which were dominated by a single genetic cluster (blue—Figure 5b). The remaining 1–4 clusters (K4 vs K7) were made up by very few individuals dispersed seemingly randomly across localities. DAPC was less successful at delimiting sites within the combined central and northeast region (Figure S3).

**Figure 5.** Clumpak output summarising 10 replicate fastStructure runs of (**a**) K = 4 and (**b**) K = 7 for *Sclerolaena napiformis* individuals from central (sites 15, 16, 17, 18, 21, 22, and 23) and northeast (sites 19 and 20) regions combined.

#### 3.4.3. Southwest Region

The optimal number of clusters within the southwest region as identified by fastStructure was K = 4 (Figure 6), with populations showing variable levels of admixture. There was a loose geographic association with the more westerly sites being mainly cluster 1 (blue) and more easterly sites (10, 11, 12, 24, 26, and 27) being mainly cluster 2 (purple). Sites 5 and 25 had high membership probability to cluster 3 (green) that was not explained by habitat or proximity. Cluster 4 (orange) showed no geographic association. Although not definitive, DAPC suggested six genetic clusters as the lowest

likely value of K and no substantial geographic structure with most individuals assigned to a single widespread genetic cluster (Figure S4).

**Figure 6.** Clumpak output summarising eight replicate fastStructure runs for K = 4, which was optimal across a range of K2–K18. Genetic structure is evident between *Sclerolaena napiformis* sites to the east (sites 1–9 and 13–14) and west (sites 10–12 and 24–27) of Marnoo, Victoria.

#### *3.5. Chloroplast Genomes and Haplotyping*

We obtained 17 loci comprising 7 informative sites by mapping the ddRADseq reads to two chloroplast reference genomes (GenBank accessions MT027236 and MT027237). None of the informative sites mapped to the most variable chloroplast regions identified in the alignment of the reference genomes. Two informative sites corresponded to partial sequences of the matK gene, a variable site when comparing GenBank sequences for other species of *Sclerolaena*. The remaining informative sites did not correspond to any chloroplast sequences available for *Sclerolaena*. Although eight haplotypes were recognised, only six were differentiated if variable positions containing "N" were ignored, and while 59% of populations had more than one haplotype, it was not possible to identify relationships to sites or regions (Table 5). The star-shaped haplotype network (Figure S5a) is consistent with a single maternal lineage dominated by one haplotype, H1, and a few minor variants derived by mutation [62,63]. Haplotype H1, possibly ancestral, dominated overall and was recovered from 351/390 individuals spread across the 27 sites. The second most common haplotype, H2 (21/390 individuals), was found at 10/27 sites (37%) and in most individuals (9/14) at site 25, the only site where H1 was not the most common haplotype (Figure S5b).

**Table 5.** Haplotypes recovered from mapping ddRADseq reads to two reference *Sclerolaena napiformis* cpDNA genomes, SN20 and SnTR1 (151,530 bases long, 814 variable sites in total between the references). Six haplotypes were differentiated using unambiguous variable sites recovered in 390 individuals. The position of variable sites is reported relevant to the reference alignment (available via Mendeley Data; http://dx.doi.org/10.17632/664sh75jgz.1) and whether the variation occurs within a gene/coding region.


<sup>1</sup> Haplotype 4A collapses to haplotype 4 due to unresolved nucleotide at position 134,376; \* haplotypes 5A and 5B collapse into a single haplotype (H5) due to unresolved nucleotides at positions 134,376, 115,143, and 115,263. When compared only at ddRADseq loci, Sn20 and SnTR1 were equivalent to H1 and H4, respectively.

#### *3.6. Distribution Modelling*

Investigating the variables most associated with the occurrence of *S. napiformis* revealed that most variation in the model was accounted for by the presence of clay in the soil (≈50%) and the annual mean temperature (≈15%). Carbon and other climate variables associated with precipitation and temperature all contributed >5% towards the total observed variation (Figure S1b). Overall, a decrease in available suitable environment was predicted for the next 50 years using both models: climate variables only and a combination of climate and soil variables. Combining soil types with climate variables decreased predicted present and future suitable environment substantially when compared to the climate-only model. Additionally, the climate-only model highlighted a more southerly area that increased in suitability at approximately −38 degrees latitude. Including soil type almost entirely excluded this area, as the presence of clay soils played a major role in predicting present day occurrences (Figure 7).

**Figure 7.** Predicted suitable habitat for *Sclerolaena napiformis* based on collections performed during this study. Modelling was based on (**a**) current climate, (**b**) current climate and soil composition, (**c**) future climate, (**d**) future climate and current soil composition. We also show a shift in climate suitability (**e**) excluding and (**f**) including current soil variables (increasing suitability shown by green and decreasing shown by red). Modelling was performed using 12 bioclimatic and 8 soil composition variables (Table S1). Future climate predictions are based on a conservative climate scenario (RCP 2.6) projected 50 years.

#### **4. Discussion**

We provide evidence for genetic structure among three disjunct regions of *S. napiformis* in southeast Australia. We did not find genetic differentiation among sites within central and northeast regions, but clear genetic structure was evident in the southwest region (east and west of Marnoo, Victoria). Our findings of regional differentiation of *S. napiformis* are consistent with limited long-distance dispersal (likely impacted by range disjunction), but also point to some historical dispersal beyond extant populations. Lower than expected heterozygosity, as well as high *F*IS values, also indicates substantial inbreeding within populations of *S. napiformis*, a self-compatible, wind-pollinated species. Effective gene-flow is unlikely to be maintained across the distribution of this species, particularly given that human-mediated landscape changes have created barriers to gene-flow and access to suitable habitat. Even if pollen-mediated gene-flow is maintained among existing sites, the probability of new sites being colonised is low without human intervention. Importantly, we did not find geographically correlated cryptic lineages of *S. napiformis*, in contrast to findings for other disjunct species (e.g., [26]). We suggest that conservation management of *S. napiformis* utilises the genetic diversity found in all populations from the northeast, central, and southwest (divided into east and west sites) to maximise the adaptability of the species in increasingly novel environments.

We also show that the current distribution of *S. napiformis* is restricted by soil type and tightly linked to temperature and precipitation. As such, climate projections suggest that over half the existing habitat will become unsuitable for *S. napiformis* over the next 50 years. Whilst we predict a decline in habitat suitability across the entire species distribution, our projections suggest that populations in the southwest region are the most likely to persist under future climate projections. Overall, our findings highlight the complexities of managing the competing demands of human utilisation and conservation of native species in the face of climate change, but we outline some viable options below.

#### *4.1. Fragmentation, Genetic Structure, and Gene-Flow*

We recorded a single dominant cpDNA haplotype across the distribution of *S. napiformis*, which is consistent with historical broad-scale seed dispersal, although low coverage obtained via our non-cpDNA targeted approach may limit our ability to distinguish finer-scale patterns [64]. Our findings based on nuclear DNA show clear genetic differentiation among regions and highlight finer-scale structure in the southwest region. Therefore, dispersal in *S. napiformis* appears most common at relatively small scales (i.e., within 20–80 km), despite seeds having morphological adaptations that promote long-distance dispersal by birds and mammals. Relatively localised dispersal is considered common among plants [65,66], although genetic studies confirming this assumption are lacking [67]. Here, we provide valuable genetic evidence that seed dispersal does not maintain genetic homogeneity over the entire (≈370 km) linear range of *S. napiformis*.

Human activity (land clearing and fragmentation) has reduced suitable/available grassland habitat in this area by as much as 99% [23], which has clearly contributed to the isolation of extant populations and sites. However, we identified two genetic clusters in the southwest region, which were separated by ≈11 kms and displayed no obvious landscape barrier. As many of our sites are isolated by greater distances, we believe proximity alone cannot account for reduced connectivity among *S. napiformis* populations, despite extreme alteration to the landscape. Similarly, a sympatric species, *Pimelea spinescens*, was considered to have genetic structure that pre-dated fragmentation resulting from agriculture post-colonisation [68], as also observed in other plant systems [69,70]. We therefore suggest that the genetic structure observed in *S. napiformis* (northeast, central, and southwest—east and west division) was present prior to colonisation, although recent land conversion will likely have downstream consequences on genetic patterns.

Certainly, habitat loss and fragmentation displace biodiversity and biomass, and this is particularly problematic for grassland systems [71]. There is no doubt that the extreme alteration to the landscape has displaced organisms that historically assisted dispersal in *S. napiformis*. For example, the endangered plains-wanderer *Pedionomus torquatus* consumes the seed and leaves of *Sclerolaena* spp. [72]. Despite a

small home range of 7–21 ha, this bird may have historically assisted in seed dispersal among sites when both *P. torquatus* and *S. napiformis* were more common. Ants also assist in the short-distance dispersal of a closely related (and sometimes co-occurring) species, *Sclerolaena diacantha* [73,74], although this has not been shown explicitly for *S. napiformis*. Furthermore, vehicles and grazing stock may contribute to the dispersal of this common "roadside" species, although we believe these roadside populations are more likely to reflect the remaining suitable habitat for this species.

#### *4.2. Genetic Diversity and Inbreeding*

We observed high levels of inbreeding and homozygosity in discrete *S. napiformis* regions (and 26/27 sites), which suggests a high degree of self-pollination. Indeed, a closely related species, *S. diacantha,* has an estimated selfing rate of 70% with strong selection for outcrossed progeny considered unlikely [73]. *Sclerolaena napiformis* may display similar selfing rates to *S. diacantha*, given that these species largely co-occur and thus experience similar environmental pressures. Mixed mating systems, whereby plants undergo both selfing and outcrossing, can minimise the accumulation of deleterious alleles [75], even following inbreeding depression [76] or under novel or changing environments [77]. This mechanism may be used by *S. napiformis* as a short-term counterbalance to the negative fitness consequences associated with selfing. It may also enable recruitment into new sites from a single or few related individual/s [13,69]. For example, the inbreeding species *Geum urbanum* showed high levels of homozygosity and selfing without any negative consequences for fitness [78]. We note, however, that negative genetic consequences associated with a reduced gene pool may still develop over a longer timeframe [13] and, considering the extreme habitat degradation in this system and the potential establishment of populations from few individuals, we believe there is a strong requirement for further genetic monitoring of *S. napiformis*.

#### *4.3. Considerations for Persistence of S. napiformis*

Our modelling suggests that climate change will substantially decrease the suitability of remaining habitat for *S. napiformis*, even under the most conservative projections. As the climate warms, organisms are required to adapt and/or shift their distribution to track suitable climate envelopes [79]. However, most plants are unlikely to migrate fast enough to keep pace with the current rate of climate change [17]. Furthermore, the reliance of *S. napiformis* on clay soil compounds this, as suitable soil is rare outside of its current distribution. We therefore believe it is unlikely for *S. napiformis* to undergo a range-shift naturally, leaving it exposed to increasingly unsuitable environmental conditions. Furthermore, climate change is predicted to favour an increase in weediness, particularly within an agricultural setting [80]. We expect this to further reduce habitat quality for *S. napiformis* as competition with invasive species increases. Therefore, we consider the persistence of *S. napiformis* into the future unlikely without human intervention, even if an effort is made to ensure remnant patches are protected. Active restoration and rehabilitation of converted land would thus go a long way towards ensuring the persistence of *S. napiformis* and the grasslands of southeast Australia.

The seeds of *S. napiformis* tolerate a wide range of temperatures, similar to other congeneric taxa (e.g., *Sclerolaena bicornis* [81] and *S. birchii* [61]). They are produced in high volume; may remain viable while dormant for several years [82,83]; and germinate rapidly once the physical barrier of the persistent, woody perianth is removed [81,84]. Furthermore, germination of *S. napiformis* seed is linked to sufficient rainfall and removal of the perianth, rather than being tied to a specific time of year [61,85]. This strategy of "germinating rapidly when most advantageous" may be favourable when rainfall is erratic, and might explain why *S. napiformis* alters its phenology to avoid water stress at sensitive stages of its life cycle [35]. *Sclerolaena napiformis* also appears to flower and reproduce year-round, and again, this is partly determined by the availability of water (personal observation). This finding, and the above-mentioned germination triggers, support the notion that water availability, rather than temperature, may have a greater influence on *S. napiformis* establishment and successful persistence upon encountering a new site. For example, drought sensitivity is known to limit plant

distribution and recruitment, particularly if it impacts early-life stages [86]. Recruitment into new areas is likely to be more restricted than germination because of the additional sensitivity during seedling establishment; however, the volume of seeds contained in the natural soil seedbank allows for mass germination when conditions are suitable [84,87]. This greatly increases the likelihood of establishing a viable population in a new area once a small number of recruits reach maturity.

Finally, we observed that the northeast region contained the lowest number and poorest quality of mature plants in terms of plant size and seed production. However, we observed good survival and healthy growth of northeast seedlings cultivated under optimal ex situ conditions. Throughout the distribution of *S. napiformis*, reduced available habitat, altered fire regimes, hotter/drier conditions with decreasing precipitation, and increased biomass of weeds threaten populations with local extinctions. This is particularly likely in the northeast region, and therefore retention of the genetic diversity in this area is particularly important as plants here may harbour adaptations to the predicted future conditions for central and southwest regions. We believe all populations of *S. napiformis* will require assisted restoration in the future, and therefore consider seed-banking from each region (particularly the northeast) a conservation priority.

#### **5. Conclusions and Implications for Conservation**

For *S. napiformis* to persist, conservation efforts must balance the rate of local extinction (measured as population loss) with the colonisation of new sites. Successful conservation will rely partly on management strategies that facilitate connectivity among *S. napiformis* sites and consider the projected effect of climate change when identifying candidate localities. High fecundity of mature plants and long-term storage capacity of germplasm will benefit conservation, and even though the development of an ex situ seedbank is underway, continued collections from diverse populations over multiple time periods is recommended.

Small populations are more vulnerable if their genetics and evolutionary biology are not integrated into the conservation strategy [88]. Although most remaining populations of *S. napiformis* have <200 individuals, they also have an unusually high proportion of reproductive plants [2], which lessens the risk of diversity loss in translocated or artificial populations [89]. Both inbreeding and outbreeding depression are potential risk factors when combining germplasm or augmenting gene-flow [32,88]. Outbreeding depression is considered to be less of a risk [31,32] as it is most likely when combining lineages with greater genetic distinction than we observed here [32]. We consider the greatest risk for *S. napiformis* to be associated with losing overall genetic variation if entire sites/regions are extirpated [27].

On the basis of our observation of sub-optimal gene-flow, we recommend the establishment of intermediate sites, with a particular focus on the southwest region, which we identified as the most likely to persist beyond our 50-year projection. As introduction/translocation only impacts genetic structure or adaptation when migrants establish and reproduce, the choice of recipient sites and ongoing monitoring is crucial to maximise the long-term success of any conservation efforts. Patterns of genetic diversity are not necessarily related to adaptive traits; therefore, quantitative approaches are likely to improve conservation success (see [6,7]).

Wide road verges (including some travelling stock routes (TSR) in New South Wales) often consist of high-quality native grassland/open woodland vegetation and some contain remnant *S. napiformis* populations. Roadside areas where *S. napiformis* is currently absent should be considered as candidate sites for establishment of corridors to facilitate gene-flow, ideally comprising plants with mixed genetics from each of the three regions. The provision of dispersal corridors by maximising the use of substantial TSRs and road verges in the area is a feasible and cost-effective conservation action. However, we believe the most beneficial conservation measure would be to open converted land for restoration of native grasslands, which would have wider benefits for the entire southeast Australian grassland system and the organisms that rely on it.

*Diversity* **2020**, *12*, 417

**Supplementary Materials:** The following are available online at http://www.mdpi.com/1424-2818/12/11/417/s1, Figure S1: All regions. DAPC: Bayesian information criterion (BIC) and assignments; fastStructure: DeltaK/ProbK. Figure S2: Central + northeast. DAPC: BIC, DAPC plot, assignments; fastStructure: DeltaK/ProbK. Figure S3: Southwest. DAPC: BIC, DAPC plot, assignments; fastStructure: DeltaK/ProbK. Figure S4: Haplotype network and geographic distribution of haplotypes based on mapping of ddRADseq loci to chloroplast reference genomes. Figure S5: Percentage contributions of soil and climate variables when modelling the distribution of e on the basis of occurrence data. Table S1: List of variables used for modelling suitable habitat for *Sclerolaena napiformis*. Table S2: Analysis of molecular variance among disjunct southwest, central, and northeast regions of *Sclerolaena napiformis*. Table S3: Analysis of molecular variance among southwestern *Sclerolaena napiformis* sites. Table S4: Analysis of molecular variance among central *Sclerolaena napiformis* sites. Analysis includes investigations of variance (a) between Victoria and New South Wales (states are divided by the Murray River), and (b) among all sites without consideration of state variable. Table S5: Analysis of molecular variance among northeast *Sclerolaena napiformis* sites. Table S6: Lower, mean and upper FIS estimates for each collection site of *Sclerolaena napiformis*.

**Author Contributions:** Conceptualisation, M.D.A., E.A.J. and N.G.W.; methodology, M.D.A., E.A.J., and N.G.W.; formal analysis, M.D.A.; investigation, M.D.A., E.A.J., and N.G.W.; resources, E.A.J. and N.G.W.; data curation, N.G.W. and M.D.A.; writing—original draft preparation, M.D.A. and E.A.J.; writing—review and editing, M.D.A., E.A.J., and N.G.W.; visualisation, M.D.A. and E.J.; supervision, E.A.J.; project administration, E.A.J.; funding acquisition, E.A.J. and N.G.W. All authors have read and agreed to the published version of the manuscript.

**Funding:** This research was funded by Department of the Environment and Energy, Australian Government, grant number TSRF-161 and the Royal Botanic Gardens Victoria (RBGV). The article processing charge was funded by the RBGV.

**Acknowledgments:** Gareth Holmes (RBGV) for preparing DNA libraries for genome sequencing; Chris Jackson (RBGV) for advice and assistance with data handling; Mirinda Thorpe, Iestyn Hosking, and Karly Learmonth (Victorian Department of Environment Land Water and Planning) for locality information, landholders, and local stakeholders who provided information, transport, and arranged access to sites; David Hines and Deanna Marshall (Trust for Nature) for incorporating plants from our collection into conservation plantings on private landholdings on the Patho Plains; and Hayley Cameron (Monash University) for valuable input into an earlier version of the manuscript.

**Conflicts of Interest:** The authors declare no conflict of interest. The funders had no role in the design of the study; in the collection, analyses, or interpretation of data; in the writing of the manuscript; or in the decision to publish the results.

**Data Accessibility:** Raw sequence data, assemblies and associated scripts can be accessed via Mendeley data (http://dx.doi.org/10.17632/664sh75jgz.1).

#### **References**


**Publisher's Note:** MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

© 2020 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).

## *Article* **Disentangling the Genetic Relationships of Three Closely Related Bandicoot Species across Southern and Western Australia**

**Rujiporn Thavornkanlapachai 1, Esther Levy 1, You Li 2, Steven J. B. Cooper 3,4, Margaret Byrne <sup>1</sup> and Kym Ottewell 1,\***


**Abstract:** The taxonomy of Australian *Isoodon* bandicoots has changed continuously over the last 20 years, with recent genetic studies indicating discordance of phylogeographic units with current taxonomic boundaries. Uncertainty over species relationships within southern and western *Isoodon*, encompassing *I. obesulus*, *I. auratus*, and *I. fusciventer*, has been ongoing and hampered by limited sampling in studies to date. Identification of taxonomic units remains a high priority, as all are threatened to varying extents by ongoing habitat loss and feral predation. To aid diagnosis of conservation units, we increased representative sampling of *I. auratus* and *I. fusciventer* from Western Australia (WA) and investigated genetic relationships of these with *I. obesulus* from South Australia (SA) and Victoria (Vic) using microsatellite markers and mitochondrial DNA. mtDNA analysis identified three major clades concordant with *I. obesulus* (Vic), *I. auratus*, and *I. fusciventer*; however, *I. obesulus* from SA was polyphyletic to WA taxa, complicating taxonomic inference. Microsatellite data aided identification of evolutionarily significant units consistent with existing taxonomy, with the exception of SA *I. obesulus*. Further, analyses indicated SA and Vic *I. obesulus* have low diversity, and these populations may require more conservation efforts than others to reduce further loss of genetic diversity.

**Keywords:** golden bandicoot; southern brown bandicoot; quenda; phylogenetic; taxonomy; evolutionarily significant unit; ecosystem engineer

Received: 23 November 2020 Accepted: 7 December 2020

Published: 22 December 2020

**Citation:** Thavornkanlapachai, R.; Levy, E.; Li, Y.; Cooper, S.J.B.; Byrne, M.; Ottewell, K. Disentangling the Genetic Relationships of Three Closely Related Bandicoot Species across Southern and Western Australia. *Diversity* **2021**, *13*, 2. https://dx.doi.org/10.3390/

**Publisher's Note:** MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

d13010002

**Copyright:** © 2020 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/ licenses/by/4.0/).

**1. Introduction**

Threatened species often require active management to ensure their persistence. An important first step in threatened species conservation is identifying and defining appropriate units of management to direct conservation effort to where it is needed the most [1,2]. Confusion about taxonomic boundaries of subspecies and other conservation units has the potential to lead to inappropriate management decisions that may have detrimental consequences. For example, mixing populations that are genetically and evolutionarily distinct could result in the loss of local adaptation or outbreeding depression [3,4]. Conversely, managing populations as separate units, particularly when population sizes are small, may lead to increased levels of inbreeding and loss of adaptive potential due to random genetic drift [5]. Developing an appropriate taxonomic classification that reflects the actual nature of species or subspecies is therefore essential to guide management decisions to improve conservation outcomes.

The concept of "evolutionarily significant units" (ESU) has been used to aid management of populations in recent decades. The concept was introduced by Ryder [6] to resolve "subspecies" into manageable units where an ESU is defined as a group of organisms with high genetic and ecological distinctiveness. With increasing use of molecular tools, Moritz [7] proposed a genetic criterion for defining an ESU as a group of organisms that are reciprocally monophyletic for mtDNA haplotypes and have diverged allele frequencies at nuclear loci, a definition that places emphasis on historical isolation of populations. A sub-category under an ESU is "management unit" (MU), which recognises populations that do not show reciprocal monophyly for mtDNA haplotypes but have diverged haplotype and allele frequencies at mitochondrial or nuclear loci to accommodate more contemporary population structuring [7]. The use of reciprocal monophyly as a criterion for defining an ESU, whilst popular, has been criticised as being too stringent [8,9]. Fraser and Bernatchez's [8] concept of "adaptive evolutionary conservation" offers a more flexible approach, which defines an ESU as "a lineage demonstrating highly restricted gene flow from other such lineages within the higher organization level (or lineage) of the species". These lineages are sufficiently isolated that each lineage has limited or no impact on the evolution, genetic variance, and demography of other lineages [8]. This approach integrates multiple ESU concepts, including Moritz's, to fulfil conservation goals, which under differing circumstances will warrant the use of some concepts over others.

The short-nosed bandicoots, genus *Isoodon* (family Peramelidae)*,* is a small-to-mediumsized Australian marsupial that can be found across Australia. Bandicoots are terrestrial, omnivorous marsupials with a bounding gait and forelimbs adapted for digging for food [10]. As digging mammals, they play an important role as ecosystem engineers, contributing to altered soil nutrient and moisture dynamics that enhance plant germination and growth [11]. Their "rat-like" appearance and cryptic nature make them less attractive for conservation efforts. However, they are known for their complex biology of adaptive traits and evolutionary diversity [11,12]. These traits enable some *Isoodon* species to adapt to urban environments [13,14]. However, facing pressure from introduced predators and human disturbances, many *Isoodon* species have dramatically reduced in distribution since European settlement and persist in remnants of their formal ranges.

The taxonomy of the genus *Isoodon* has changed multiple times over the last 70 years and is currently still in flux. Up to eleven different species of *Isoodon* have been recognised at various times [15]; however, only three are formally recognised currently: *I. macrourus, I. obesulus*, and *I. auratus* [16,17], with additional taxa *I. peninsulae* [18] and *I. fusciventer* [19] recently proposed, both raised from subspecies (of *I. obesulus*) to full species rank. Most recently, comprehensive geographic and taxonomic sampling in a phylogeographic study of *I. obesulus* and *I. auratus* has indicated further complexity in the relationships of species and subspecies within these taxa [12]. One major finding was that the geographic distribution of *I. o. obesulus* is significantly more restricted than previously recognised, with specimens of *I. o. obesulus* from South Australia (SA; encompassing Mount Lofty Ranges, Fleurieu Peninsula, and Kangaroo Island) genetically differentiated from *I. o. obesulus* from south-eastern Australia and more closely allied with *I. fusciventer* from Western Australia (WA; [12,20]). Further, consistent with early molecular studies [15,21], Cooper et al. [12] failed to resolve a clear pattern of reciprocal monophyly between *I. fusciventer* (southern WA) and *I. auratus* (northern WA) in mitochondrial DNA, evidence that has previously been used to suggest the species be synonymised [15,21]. This suggestion based on genetic data contrasts with morphological data indicating that *I. obesulus* and *I. auratus* can be readily distinguished by a range of characters, including size, skull and teeth characters, and fur colour [15,19,22,23]. While monophyly could not be resolved, Cooper et al. [12] identified a lack of mitochondrial DNA haplotype sharing between the taxa as well as fixed allozyme differences and private nuclear gene haplotypes, suggesting a putative pattern of paraphyly of these taxa.

Here, we focus on identifying conservation units within the "southern and western" group of *Isoodon* bandicoots, WA taxa *Isoodon auratus, I. a. barrowensis* and *I. fusciventer,* and southern Australian taxa *I. obesulus* (Vic and SA), all of which are of conservation concern. *Isoodon* species have dramatically declined in their abundance and distribution in the last 220 years since European settlement of Australia [24,25], primarily as a result of introduced predators such as feral cats, foxes and black rats, altered fire regimes, disease, and habitat loss from clearing of native vegetation and habitat modification [26–28]. *Isoodon obesulus, I. fusciventer*, and *I. auratus* are listed as Near Threatened, Least Concern, and Near Threatened, respectively (International Union for Conservation of Nature red list/Environment Protection and Biodiversity Conservation (EPBC) Act, Australia). The subspecies *I. o. obesulus* in eastern and south-eastern Australia is listed as Endangered, although re-circumscription as suggested by analyses in Cooper et al. [12] means this subspecies is now more geographically restricted requiring re-evaluation of its conservation status; whilst other subspecies, *I. a. auratus* and *I. a. barrowensis*, are listed as Vulnerable under the EPBC act. These species/subspecies are currently subjected to different levels of conservation action to mitigate threats [26], with *I. a. auratus* and *I. o. obesulus* particularly targeted for translocations and reintroductions to island and mainland feral predator-free areas to increase their population size [29]. Clarification of taxa and conservation units is required to support conservation assessment and listing across the group and to ensure future conservation management activities are appropriately applied.

In this study, we expand sampling of representative *I. auratus* from northern WA and *I. fusciventer* from southern WA, as well as introduce additional molecular data from hyper-variable nuclear loci (microsatellites), to provide further resolution to the relationships amongst southern and western *Isoodon* bandicoots as raised in Cooper et al. [12]. Incorporating existing molecular data from Cooper et al. [12] and Li et al. [20] with our expanded sampling, we use a combination of phylogenetic and hierarchical population genetic approaches to identify putative ESUs across the southern and western *Isoodon* bandicoots, *I. auratus*, *I. fusciventer*, *I. obesulus* SA, and *I. o. obesulus* Vic. Within each regional population, we investigate geographical sub-structuring and assess patterns of mitochondrial and microsatellite genetic diversity to inform priorities for conservation genetic management.

#### **2. Materials and Methods**

#### *2.1. Sample Collection, Microsatellite Genotyping, and Mitochondrial DNA Sequencing*

We collated microsatellite and mitochondrial DNA (mtDNA) profiles from previously published sources with new sampling for a total of 731 and 218 additional samples, respectively. *Isoodon obesulus, I. fusciventer*, and *I. auratus* samples were distributed across multiple locations in Western Australia, South Australia, and Victoria and across multiple years from 2002 to 2018 (Figure 1, Table S1). Microsatellite profiles for WA specimens of *I. auratus* and *I. fusciventer* (*n* = 172) were predominantly obtained from previously published datasets in Ottewell et al. [30] and Ottewell et al. [13], respectively, with some additional new sampling from southwest WA (*I. fusciventer*) and the Kimberley region (*I. a. auratus*) (Table S1). Ear biopsy samples were opportunistically obtained by environmental consultants during urban fauna relocations and during population monitoring by the Australian Wildlife Conservancy (Kimberley samples). A small number of samples were obtained from roadkill animals by citizen scientistsunder the Department of Biodiversity, Conservation and Attractions license to use animals for scientific purposes (permit no. U10/2018). DNA extraction and microsatellite genotyping were conducted as described in Ottewell et al. [30] using 10 microsatellite primers (B3-2, B20-5, B34-2, Ioo2, Ioo4, Ioo6, Ioo7, Ioo8, Ioo10, and Ioo16). Briefly, amplification reactions were carried out using the Qiagen Multiplex Kit, following the manufacturer's instruction (Qiagen, Hilden, Germany). PCR products were separated on an Applied Biosystems 3130xl capillary sequencer (Foster City, CA, USA). Fragment sizes were determined with GeneScan-500 LIZ size standard (Applied Biosystems) and scored in Genemapper v5.0 software (Applied Biosystems). Microsatellite profiles from published datasets were re-scored with new samples to ensure

consistency in allele scoring. Microsatellite genotypes from SA and Vic were obtained from Li et al.'s [20] study (*n* = 559). Due to the addition of M13 tags to primers used in this study, which added 30 bp to tandem repeat sizes, we re-amplified a subset of 15 samples (*n* = 5 from each of Mount Burr, Mount Lofty Ranges, Grampians populations) in our WA laboratory to calibrate genotype scores between SA/Vic and WA datasets.

**Figure 1.** Map of current and historical distribution of *Isoodon* taxa included in the current study (adapted from Zenger et al. [21]) and geographic distribution of mitochondrial DNA haplotypes. Grey shading represents contemporary distribution of taxa formerly or currently recognised within *I. obesulus* and black shading represents current distribution of *I. auratus*. Dotted line indicates historical limits of *I. obesulus* and dashed lines designate historical *I. auratus* limits. S indicates sampling locations. Each pie chart represents haplotype frequencies within each sampling site, and size of pie chart represents numbers of haplotypes. Colour codes indicate the sites where haplotypes were mostly found. Pattern codes represent individual haplotypes found within each site.

Sequences from the highly variable non-coding mitochondrial control region (D-loop) were obtained from Li et al. [20] (*n* = 49) and Cooper et al. [12] (*n* = 23, Table S1) with additional sequences obtained for a subset of the WA samples used in microsatellite genotyping (*n* = 146). New specimens were sequenced using primers L15999M and H16498M [31] with amplifications carried out in 25 μL reactions, which included ~10–20 ng of template DNA, 0.5 μM of each primer, 1.25 mM MgCl2, 0.2 μM of each dNTPs, 0.05 μL of 10% BSA, 1× Reaction Buffer, and one unit of Invitrogen Taq polymerase. Reactions were run under the following conditions: 94 ◦C for 3 min followed by 35 cycles of denaturation 94 ◦C for 30 s, annealing at 55 ◦C for 45 s, and extension at 72 ◦C for 60 s, with a final extension of 72 ◦C for 7 min. Products were visualised on 1% agarose gels and Sanger-sequenced at a commercial service (Australian Genome Research Facility). Sequences were edited and aligned using SEQUENCHER v5.2 (Gene Codes Corporation, Ann Arbor, MI, USA) with consensus sequences from each of the datasets then aligned using CLUSTAL W with default parameters [32] and sequences trimmed to the same length in MEGA v10.0.5 [33].

#### *2.2. Data Analysis*

#### 2.2.1. Phylogenetic Analysis

Phylogenetic analyses of mtDNA sequences were carried out using maximum likelihood in RAXML v7.0.3 [34], neighbour-joining in MEGA v10.0.5 [33] and Bayesian inference in BEAST v1.10.4 [35]. In RAXML, we applied a single model of evolution, General Time Reversible (GTR) model [36] with unequal variation at sites modelled using a Gamma (G) distribution [37] to the sequence data. Rapid bootstrap analysis was applied to

search for the best-scoring maximum likelihood (ML) tree. In MEGA, a neighbour-joining tree was constructed using the proportion (*p*) of nucleotide site changes as a distance estimate and a Gamma (G) distribution as the rate among sites. The robustness of the nodes in the tree was assessed by 1000 bootstrap replicates. For BEAST analysis, we ran JMODEL-TEST v2.1.10 [38] to identify the most likely substitution model for our dataset. In BEAST, we used HKY+I+G model as inferred by the previous step with an uncorrelated lognormal relaxed clock. Four independent runs of 50 million generations and sampling every 5000 generations were carried out. The log files were analysed in TRACER v1.7.1 [39] to ensure stationarity had been reached by visually inspecting the traces and each log file had the effective sample sizes >200 for all parameters. Tree files were combined in LOGCOM-BINER v1.10.4 (BEAST package) with the burn-in period set to 2500 applied to each tree. The combined tree file was annotated using TREEANNOTATOR v1.10.4 (BEAST package). ML, neighbour-joining, and Bayesian trees were visualised using functions from the R packages "APE" [40] and "PHYTOOLS" [41] in R version 3.6.2 [42] using *I. macrourus* as an outgroup.

#### 2.2.2. Population Level Analysis

To investigate genetic relationships between *Isoodon* from different clades, principal component analysis was carried out for both nuclear and mitochondrial DNA using the "ADE4" package [43] and the "ADEGENET" package [44] in R based on Euclidean genetic distances using the *dudi.pco* command. We did not scale allele or haplotype frequencies and left missing genotypes as is. The first two principal components were retained.

Multiple methods were used to infer population structure between and within major geographic regions (WA, SA, Vic) from nuclear DNA. Firstly, we used a Bayesian clustering in STRUCTURE 2.3.4 [45] based on a model that assumed admixture of ancestry and correlated allele frequencies. In each analysis, individuals were assigned a membership coefficient, which is the fraction of the genome with membership to a particular genetic cluster. Ten independent runs were performed using 500,000 iterations, with a burn-in period of 50,000 iterations. We compared the likelihood values for different K values (1–20) and selected K based on the largest decrease in Delta K value [46,47]. Cluster membership coefficients were permuted over multiple runs using the Greedy algorithm in CLUMPP to obtain a mean across replicates [48]. Secondly, we used spatially explicit Bayesian clustering implemented in TESS 2.3.1 [49,50]. TESS differs from STRUCTURE in that it seeks genetic discontinuities in continuous populations and estimates individual admixture proportions using spatial prior information [49]. We used the admixture model with 50,000 sweeps and 10,000 burn-in for K ranging from 2–20 and 10 replicates per K. The optimal K was chosen where the value of the deviance information criterion (DIC) stabilises. The mean estimated cluster membership coefficient of the chosen K in TESS was obtained using CLUMPP as previously. Thirdly, unlike STRUCTURE and TESS, discriminant analysis of principal components (DAPC) does not assume Hardy–Weinberg equilibrium of clusters. DAPC is a multivariate method that partitions genetic variation into principal components to find groups using K-means that minimize within group variation [51]. We ran the *find.cluster* command using the ADEGENET package in R. We used the number of components that allowed 90% of cumulative variance to be retained. We selected the optimal K based on a combination of K with the lowest Bayesian information criterion (BIC) and after the largest decrease in BIC [51]. In resolving population structure, we undertook a hierarchical approach, firstly analysing all geographic regions together to identify the broadest genetic structuring, then successively removing the most distinctive cluster and repeating analyses to identify sub-structuring and hence putative conservation units down to the local population scale. Note that for regional analyses, we included the Mt Burr population, located in the far south-eastern South Australia (Figure 1), with other Victorian populations, as this population was shown to cluster with *I. o. obesulus* Vic [12,20].

Analysis of molecular variance (AMOVA) of mitochondrial DNA was applied in ARLEQUIN version 3.0 [52] to identify the distribution of genetic variation between and within hierarchical clusters identified by STRUCTURE. The locus-by-locus variance component was estimated from 16,000 permutations. Genetic divergences between pairs of population samples were quantified for nuclear DNA using Jost's [53] estimate of differentiation (Dest) using GENALEX version 6.5 [54]. While pairwise φST values for mitochondrial DNA were quantified using ARLEQUIN version 3.0 with 16,000 permutations [52].

To test whether the observed haplotype patterns within *Isoodon* spp. populations indicate recent bottlenecks or, conversely, demographic expansion, we used Tajima's *D* and Fu's *FS* neutrality tests and mismatch distribution carried out in ARLEQUIN version 3.0 [52]. In general, Tajima's *D* and Fu's *FS* neutrality tests were used to assess whether the data were consistent with the population being at neutral mutation-drift equilibrium [55,56]. Departure of Tajima's *D* or Fu's *FS* from zero, either negative or positive, indicates that the population of interest is deviated from the assumption of neutrality and/or equilibrium. If *D* or *FS* < 0, the population size may be increasing or an indication of purifying selection. If *D* or *FS* > 0, the population size may be decreasing (or bottleneck) or an indication of overdominant selection. We used 1000 simulated samples to calculate the level of significance for both tests. Mismatch distribution employs the distribution of pairwise differences between haplotypes from which spatial population expansion can be estimated. Multi-modal mismatch distribution testifies the population's demographic equilibrium or subdivided populations, while uni-modal suggests recent demographic expansion [57,58], or range expansions with a high level of gene flow [59,60]. We used 1000 bootstrap pseudoreplicates to test the model of spatial expansion by comparing the sum of squared deviations (SSD) between observed and simulated data. The Harpending's raggedness index (*r*) was also used to test for deviation from unimodality. Smaller values indicate a better fit. The significance of the estimate was also obtained from the corresponding *p* value.

#### 2.2.3. Genetic Variation Analysis

We assessed the genotypic error rate of microsatellite scores between laboratories by calculating the allele- and locus-specific genotypic error rates [61]. We then tested for Hardy–Weinberg equilibrium (HWE) among loci for each population with sample size ≥10 using GENALEX version 6.51 [54]. We tested for the presence of null alleles at each locus using MICROCHECKER [62]. Estimates of the allelic richness (an estimate of the number of alleles per locus corrected for sample size), number of alleles, gene diversity, inbreeding coefficient (*F*IS) were calculated using FSTAT version 2.9.3.2 [63]. The significant deviation of *F*IS values from Hardy–Weinberg Equilibrium was determined by randomization tests in FSTAT. Differences in gene diversity and allelic richness among regions were statistically tested using a non-parametric Friedman's test with locus as a blocking factor and post-hoc multiple comparisons made using the Conover test with Bonferroni correction in the R package "PMCMR" [64]. Mitochondrial DNA diversity was quantified by calculating the number of haplotypes, gene diversity, and nucleotide diversity in ARLEQUIN version 3.0 [52].

#### **3. Results**

#### *3.1. Genetic Data Quality Assessment*

Across 218 mtDNA sequenced samples, 96 polymorphic sites, including sites with gaps, were found in the 497 bp D-loop region, giving a total of 77 haplotypes. Across the microsatellite dataset, the amplification success rate was 0.926 per locus. The allele-specific genotypic error rate from 15 re-genotyped samples was 0.107 (0.030 due to allelic drop-out and 0.077 due to false alleles). All loci deviated from HWE at least in one location, but none consistently deviated from HWE across all locations. Likewise, MICROCHECKER detected putative null alleles in all loci, but they were not consistent in all locations. All loci were retained for further analysis.

#### *3.2. Phylogeographic Structure across Southern and Western Australia*

Phylogenetic analysis of mtDNA using Bayesian methods resolved three major evolutionary groups within our sample (Figure 2). At the highest level, *I. o. obesulus* from Victoria was distinguished from the remaining SA and WA entities with very high support (PP 1.0; Figure 2). Additional sampling helped us to further resolve two evolutionary groups within the latter clade of southern and western *Isoodon,* with northern WA taxa (*I. a. auratus*, *I. a. barrowensis*), for the most part, distinct from southern WA (*I. fusciventer*, PP 0.93). However, mtDNA haplotypes from SA *I. obesulus* were polyphyletic, occurring across both clades. While *I. obesulus* on Kangaroo Island and Fleurieu Peninsula clustered with southwest WA *I. fusciventer*, *I. obesulus* in Mount Lofty Ranges was paraphyletic in both *I. fusciventer* and *I. auratus* clades, although with low internal posterior support (PP < 0.90, Figure 2). Similarly, a small number of *I. auratus* and *I. fusciventer* haplotypes were placed within opposing clades, again with low support (PP < 0.90, Figure 2). Maximum likelihood and neighbour-joining trees showed a similar pattern of phylogenetic clustering as the Bayesian tree with three major evolutionary groups formed, however, with differing relationships amongst clades (Figures S1 and S2). In both ML and NJ trees, SA *I. obesulus* were polyphyletic across *I. fusciventer* and *I. auratus* in a similar pattern to the Bayesian tree.

Nuclear (microsatellite) markers further confirmed support for the distinction of Mount Lofty Range, Fleurieu Peninsula, and Kangaroo Island *I. obesulus* in South Australia from Mount Burr in south-east SA and Victorian *I. obesulus*, with two main clusters (Mount Burr/Victoria *I. obesulus* and SA/WA *Isoodon*) evident in the PCoA (Figure 3a). This latter cluster indicated a geographic cline and genetic overlap from South Australia (*I. obesulus*), southern Western Australia (*I. fusciventer*) to northern Western Australia (*I. auratus*) (Figure 3b), which was similarly supported by PCoA analysis of mtDNA (Figure 3c,d).

At the highest level, all three clustering analyses (Structure, TESS, and DAPC) resolved K = 3 genetic clusters supporting the clear distinction of Mount Burr and Victorian *I. o. obesulus* from SA/WA (Figure 4a and Figure S3), but which failed to resolve distinction between the SA and WA taxa as currently recognised (i.e., *I. auratus, I. fusciventer,* SA *I. obesulus*). SA and northern WA, the two most geographically distant populations, formed the basis of two additional genetic clusters with a pattern of admixture occurring in southern WA (Figure 4a). This admixture pattern was also retained when Victorian *I. o. obesulus* was removed from analyses (Figure 4b and Figure S4).

Structuring within WA taxa based on microsatellite data was only revealed upon hierarchical analysis. With SA samples removed, structure analysis recovered K = 3 populations representing southern WA (*I. fusciventer*), Barrow Island (*I. a. barrowensis*), and the Kimberley (*I. a. auratus*) (Figure 4c). These clades are largely represented in phylogenetic analyses with subspecies *I. a. barrowensis* forming a distinct clade in maximum likelihood and neighbour-joining trees (Figures S1 and S2) but nested within a group comprising other Kimberley mainland samples in Bayesian analyses (Figure 2).

**Figure 2.** Results of Bayesian phylogenetic analysis in BEAST of D-loop mitochondrial sequences from Isoodon obesulus (Vic, SA), *I. fusciventer*, and *I. auratus*. Posterior probability values above 0.9 are indicated by grey circles overlaid on branches. *Isoodon macrourus* was used as an outgroup. Photo credits: *I. auratus* Australian Wildlife Conservancy [65], *I. fusciventer* Dr Kenny Travouillon, Western Australian Museum, *I. obesulus* Museums Victoria [66].

**Figure 3.** Stepwise principal co-ordinates analysis (PCoA) of allele frequencies based on ten microsatellite loci (**a**,**b**) and D-loop mtDNA haplotypes (**c**,**d**) including (**a**,**c**) and excluding (**b**,**d**) Victorian *I. obesulus* samples. Relative PCoA scores have been plotted for the first two dimensions (percentage variation explained on each axis). Individuals are represented in dots, and colours indicate sampling sites.

**Figure 4.** Summary of the hierarchical clustering analysis using STRUCTURE based on 10 microsatellite loci. The analyses include (**a**) all samples, (**b**) excluding Victoria, (**c**) excluding Victoria and South Australia, and (**d**) regional populations. Pie charts on Australia maps illustrate individuals' estimated membership to each genetic cluster relative to their geographic locations. Barplots in (**d**) show the most likely K value for each region. Regions include northern Western Australia (NWA), southern Western Australia (SWA), South Australia (SA), and Victoria (VIC). Black lines in barplots separate samples from different sampling locations. Each coloured bar represents an individual with different colours reflecting estimated proportional membership to particular clusters.

#### *3.3. Population-Level Analysis*

At the population level, we detected substantial genetic sub-structuring within each geographic region, with the exception of southern WA (Figure 4d). Three clusters were identified in northern WA separating Kimberley islands, Kimberley mainland and Barrow Island (Figure 4d). We detected multiple admixed clusters in southern WA and the modal K value varied between three and four clusters (Figure S3); in each, however, inland southwest WA was consistently distinct from the Perth population. In South Australia, Kangaroo Island was differentiated from Mount Lofty Ranges, but Fleurieu Peninsula was either assigned to Kangaroo Island (STRUCTURE), admixed between the two (TESS), or assigned to its own cluster (DAPC). In Victoria, the Grampians was differentiated from

Mount Burr in south-east SA, while Lower Glenelg either assigned to the same cluster as Mount Burr (STRUCTURE and TESS) or formed its own cluster (DAPC) (Figure S3).

AMOVA also supported geographic structuring of mtDNA haplotypes when each region was analysed separately and revealed that a large proportion of genetic variation occurred between sampling sites, ranging from 27.5–73.3% (*p* < 0.001 in all cases, Table 1). In South Australia, 73.3% of genetic variation was found among populations, while approximately half of the genetic variation was found among populations in northern WA (48.3%) and Victoria (48.4%). In southern WA, more genetic variation was found within population (72%) indicating greater common shared genetic variation among populations than other regions (Table 1), consistent with patterns identified in cluster analyses.

**Table 1.** Hierarchical AMOVA of mitochondrial DNA performed by grouping sampling sites into regions. Barrow Island (BWI), Kimberley Island (KIMI), Kimberley mainland (KIMM), south west Western Australia (SW WA), Mount Lofty Ranges (MLR), Fleurieu Peninsula (FP), Kangaroo Island (KI), Lower Glenelg (LG).


Pairwise population *D*est values based on microsatellite and φST values based on mtDNA data indicated a substantial genetic divergence between *Isoodon* in different regions (Table 2). The highest differentiation was recorded between SA and Victoria samples with *D*est and φST values ranging from 0.87–0.97 and 0.75–0.98, respectively, and comparable to differentiation between Victoria and WA samples (*D*est 0.69–0.88 and φST 0.71–0.88). SA and southern WA showed lower genetic differentiation (*D*est 0.38–0.65 and φST 0.36–0.61), while *Isoodon* from northern WA and southern WA indicated significant allelic differentiation (*D*est 0.62–0.82) and comparable differentiation in mtDNA (φST 0.38–0.68).

Substantial genetic differentiation was also observed amongst populations within geographic regions, particularly of Kangaroo Island from remaining SA populations (Table 2).

#### *3.4. Population Genetic Diversity*

Overall, mtDNA diversity was high and comparable between southern WA, northern WA, and SA *Isoodon* populations (Table 3). Across all diversity metrics, Victorian populations had low diversity, with lower haplotype diversity than all other populations (Table 3) and similarly low nuclear diversity to SA (*H* = 0.44–0.61 and 0.42–0.60, respectively; Friedman's test, *p* < 0.05) indicating these populations may be in genetic decline. Multi-locus *F*IS values had significantly positive *F*IS values indicating a possible Wahlund effect (samples were sourced from multiple populations) or non-random mating patterns within all sites with the exception of Kimberley mainland (randomization tests, *p* < 0.0042, Table 3). Kimberley mainland had the highest number of unique haplotypes, yet most haplotypes occurred at low frequencies. This pattern is concordant with evidence of recent population expansion according to the unimodal mismatch distribution and correlation with the spatial expansion model (Tajima's D = −1.87, Fu's Fs = −5.90, *p* < 0.05 in both cases; Figure S5). Similarly, the Kimberley mainland retains high microsatellite diversity relative to other populations.



Significant*D*estvaluesandφSTvaluesafterapproximately1000permutationsaredenotedwithbold font. Coloursrepresentlevelsofdifferentiationfromlighttodark:0.0–0.2,0.2–0.4,0.4–0.6,0.6–0.8,and0.8–1.0.

*Diversity* **2021**, *13*, 2


**Table 3.** Nuclear and mitochondrial DNA genetic diversity characteristics of *Isoodon* populations.

mtDNA sample size (ND-loop), number of haplotypes (*Nh*), number of unique haplotypes (*uh*), haplotype diversity (*h*), nucleotide diversity (∏), microsatellite sample size (*N*), gene diversity (*H*), allelic richness (*A*r), allele number (*A*), and inbreeding coefficients (*F*IS). Standard errors are given after mean values. *F*IS estimates significantly greater than zero after correction for multiple comparisons are denoted with bold font. Statistical differences amongst geographic regions for nuclear diversity parameters were tested using a Friedman's test and superscripts identify statistically significant groups as determined by post-hoc multiple comparison of group means. Populations with different lettered superscripts are statistically different at *p* < 0.05.

#### **4. Discussion**

In conservation biology, a taxonomic classification system that closely reflects the biology of a species is essential to guide conservation management actions and increase the chance of success. Incorrect designation of species and subspecies could lead to inappropriate management decisions that can have detrimental effects such as outbreeding depression, maladaptation, or separating populations when genetic augmentation could benefit inbred populations [3,5,67]. Through increased spatial sampling and addition of hyper-variable nuclear markers, our study aimed to provide further resolution to the genetic relationships of *Isoodon* species in Western Australia and South Australia, where current taxonomy does not reflect reported phylogeographic relationships [12]. At the broadest level, increased sampling enabled us to resolve two well-supported mtDNA clades representing WA entities *I. auratus* (northern WA) and *I. fusciventer* (southern WA); however, *I. obesulus* in South Australia remained polyphyletic with SA haplotypes represented in both WA clades, suggesting that incomplete lineage sorting may be confounding phylogenetic analyses across this group. Further highlighting the close relationship amongst these three taxa, analyses of nuclear microsatellite markers indicated a geographic cline stretching from northern WA to South Australia rather than resolving distinct taxonomic entities, with sub-structuring amongst taxa only becoming evident in hierarchical clustering analyses. In all analyses, we detected highly localised patterns of mitochondrial and nuclear genetic diversity indicating limited contemporary gene flow. Taken together, our analyses indicate a complex phylogeographic history for *Isoodon* sp. in southern and western Australia. We discuss identification of conservation units below, as well as some of the underlying processes that may have contributed to the observed results.

#### *4.1. Distinction between I. fusciventer, I. auratus, and I. obesulus (SA)*

The relationship of WA taxa *I. auratus* and *I. fusciventer* has long been queried, with multiple mtDNA studies over the past 20 years reporting the taxa as polyphyletic, evidence used to suggest the two species be synonymised [12,15,21]. This observation has been controversial given the striking morphological differences between *I. auratus* and *I. fusciventer*, with *I. fusciventer* more similar in size and pelage colour to *I. obesulus* than *I. auratus* (Figure 2; [19]), although several case-studies have indicated morphological variation in WA bandicoots can be strongly environmentally determined [68–71].

Here, with expanded geographic sampling of both *I. auratus* (northern WA) and *I. fusciventer* (southern WA), we largely resolved two monophyletic groups in mtDNA (putative ESUs) with strong Bayesian support (PP 0.93). However, a very small number of haplotypes from *I. fusciventer* were clustered with *I. auratus* and vice versa, rendering these groups polyphyletic. Further, to add complexity to this relationship, haplotypes from South Australian *I. obesulus* were polyphyletic with *I. auratus* and *I. fusciventer*, with Kangaroo Island and Fleurieu Peninsula samples clustered within *I. fusciventer* and Mount Lofty Ranges samples polyphyletic between *I. auratus* and *I. fusciventer*. Even though SA mtDNA haplotypes were unique, they were most closely related to haplotypes of either *I. auratus* and *I. fusciventer* indicating a relictual connection between these lineages. For species undergoing allopatric speciation, coalescent theory predicts that the process is detected in molecular phylogenies through the progression from a pattern of polyphyly to paraphyly to reciprocal monophyly, the timing of which is a function of effective population size with large populations taking longer to achieve reciprocal monophyly than small ones [72,73]. *Isoodon auratus*, in particular, was historically abundant and widespread across much of central and western Australia, possibly extending into New South Wales [17] before massive range collapse following European settlement, and *I. auratus* and *I. fusciventer* currently retain large population sizes relative to other eastern *I. obesulus* [28]. The close relationships of various SA *I. obesulus*, *I. auratus*, and *I. fusciventer* mtDNA haplotypes likely reflects historical connection amongst these taxa and suggests that time since geographic isolation has been insufficient to allow complete lineage sorting at this locus. This is perhaps unsurprising given the recent diversification of bandicoots in Australia within the

late Pleistocene (3.1 Mya), with more recent speciation between *I. auratus* and *I. fusciventer* estimated at approximately 1.5 Mya [19]. An opportunity to genetically sample extinct populations from the previous range of these species (e.g., via museum specimens or sub-fossil deposits [74]) could help to more fully illuminate the historical demography and connectivity across this group.

Analyses of nuclear (microsatellite) allele frequency data showed inconsistency with mtDNA results for the relationship of SA *I. obesulus* with WA taxa. PCoA of microsatellite allele frequencies indicated a geographic cline ranging from northern WA, through southern WA to South Australia reflecting *I. fusciventer* as a connecting population according to their geographic relationship. In contrast, PCoA of mtDNA showed SA *I. obesulus* as closely related to both *I. auratus* and *I. fusciventer* with the latter two species at the ends of the cline. Clustering analyses of microsatellite data (TESS, Structure, DAPC) consistently resolved K=2 clusters separating WA (*I. auratus, I. fusciventer*) from SA (*I. obesulus*), albeit with a pattern of admixture between the two clusters occurring in southern WA (Structure, TESS) that is often indicative of a pattern of genetic isolation by distance [75,76]. Such a pattern of IBD in microsatellite data may reflect greater male-biased dispersal, whereas restricted female dispersal has led to greater structuring in mitochondrial DNA that is maternally inherited (e.g., [74]). Hierarchical structuring present within the WA cluster was revealed upon stepwise removal of divergent groups in clustering analyses, which then resolved K=3 clusters within WA representing *I. a. auratus* (Kimberley), *I. a. barrowensis* (Barrow Island), and *I. fusciventer* (southern WA).

Conflicting signals from mitochondrial and nuclear DNA analyses in our study thus make diagnosis of ESUs difficult under Moritz's genetic criteria of reciprocal monophyly, particularly in relation to SA *I. obesulus*. Under the less restrictive Adaptive Evolutionary Conservation concept of Fraser and Bernatchez [8], requiring only demonstration of highly restricted gene flow amongst lineages with a fundamental goal to preserve adaptive genetic variance within species, our data suggest *I. auratus, I. fusciventer*, and SA *I. obesulus* represent separate ESUs, with additional hierarchical substructure identified within *I. auratus* separating *I. auratus auratus* (northern WA, Kimberley) and *I. a. barrowensis* (northern WA, Barrow Island). The high genetic distinction of *I. auratus, I. fusciventer* and SA *I. obesulus* in mitochondrial and nuclear DNA is reflective of highly reduced gene flow, likely as a result of their allopatric distribution, and is in line with previous results from allozyme data [12] and morphological differences [15,19,22,23]. Multiple lines of evidence suggest these are distinct evolutionary lineages that warrant separate taxonomic status. While SA *I. obesulus* appeared related to both *I. fusciventer* and *I. auratus* in mtDNA, cluster analysis of nuclear data indicated that SA *I. obesulus* is more genetically distinct from the remaining WA taxa, which may warrant recognition of this entity at species level, particularly if *I. auratus* and *I. fusciventer* are retained as such. Allozyme data presented in Cooper et al. [12] shows a much closer relationship of *I. fusciventer* and SA *I. obesulus*, clearly distinct from *I. auratus,* which is consistent at least with the superficial morphology of the two entities. Further genetic and morphological investigation is required to clarify taxon boundaries amongst these entities, particularly as the SA population is highly restricted and declining and likely to require urgent conservation attention.

#### *4.2. Lack of Clarity on the Taxonomic Relationship of SA I. obesulus from Current Genetic Data*

Previous studies of the phylogenetic relationships of *Isoodon* bandicoots have relied on small numbers of samples and/or limited numbers of phylogenetic markers (mitochondrial or nuclear sequencing markers) [12,15,20,21]. Here, with expanded sampling of problematic taxa (*I. auratus*, in particular) and with the addition of high mutation rate nuclear markers (microsatellites), we aimed to provide greater resolution to the relationships between purported taxa in the "southern and western" clade of closely related *Isoodon* bandicoots identified in Cooper et al. [12]. Our analyses recovered ESUs and sub-hierarchical ESUs that are largely reflective of current taxonomic designations, although ambiguity remains over the taxonomic identification of SA *I. obesulus*. There is

growing recognition that multi-locus and multi-species coalescent approaches are required to overcome the limitations of current species delimitation methods, particularly when single gene trees are poorly resolved or in conflict with the actual species tree, as may arise due to ancestral lineage sorting, hybridisation, and a range of other population genetic processes [77,78]. Here, we find a disconnect in the results obtained for mitochondrial and nuclear DNA analyses for SA *I. obesulus* that may be attributed to a mismatch in the tempo of mutation rates in the markers we have used to the evolutionary processes occurring in the group. For example, mtDNA analysis is impacted by incomplete lineage sorting, and microsatellites may have mutation rates that are too rapid to reflect speciation patterns, as they largely reflect contemporary population genetic processes. Genomic sequencing approaches, for example, targeted exon capture [79] enable concurrent sampling of multiple, genome-wide markers with a range of mutation rates and could provide sufficient resolution to resolve speciation patterns across southern and western Australia. Given the relatively recent pattern of speciation and massive range collapse experienced by these taxa investigation of the demographic history of the group will be particularly fascinating, especially as the current distribution boundaries of *Isoodon* populations are coincident with biogeographic barriers that have been identified elsewhere (e.g., Nullarbor barrier, Murravian barrier; [80]).

#### *4.3. Genetic Diversity and Population Substructure*

Bandicoot species, particularly across southern Australia, are highly impacted by habitat fragmentation, habitat loss, and declining population size. Here, we found that both Victorian and South Australian populations of *I. obesulus* showed consistently low genetic diversity across nuclear markers relative to WA populations, and that Victorian populations had low mitochondrial diversity. In contrast, the Kimberley mainland population of *I. auratus,* which is less impacted by anthropogenic fragmentation effects, retains high genetic diversity with signals suggesting population expansion in the region.

Consistent with this, we also found evidence of restricted gene flow with very high numbers of unique local mtDNA haplotypes, significant and high pairwise *D*est and φST values, and a significant proportion of genetic variation apportioned between, rather than within, populations. We also detected significant genetic sub-structuring within each region via cluster analyses, although populations in southern WA were more admixed than elsewhere. Such a pattern is consistent with the expected limited dispersal range of bandicoots (~3 km; Li et al. [81,82]), but connectivity is also likely moderated by the amount (and fragmentation) of native vegetation in the landscape [13].

#### *4.4. Recommendations for Conservation Management*

Our mtDNA analysis, haplotype uniqueness, and significant divergence among populations in different regions suggested that *I. auratus, I. fusciventer*, and *I. obesulus* in South Australia should be managed as separate units, and that SA *I. obesulus* requires urgent consideration to assess its taxonomy and conservation status. Several populations included in this study showed evidence of genetic decline, particularly those in Victoria and South Australia. Ongoing habitat fragmentation and degradation in these regions may be contributing to limited gene flow amongst patches and subsequently genetic drift. Improving landscape-scale connectivity will be of high priority for these populations with evidence that low, shrubby vegetation can provide a structural habitat for bandicoots to avoid predation as well as connectivity, even in degraded environments [13,14,83]. Management of introduced predators (foxes, cats) and habitat quality improvement will also be critical factors [84–86]. As one of Australia's important digging mammals, bandicoots are also targeted for translocation to restore ecological function in ecological restoration projects and to secure "safe haven" sites to improve their conservation status [87–93]. Golden bandicoots (*I. auratus*) are of particular importance for arid zone translocations with those undertaken to date considered successful [30,70]. There is evidence that bandicoots show morphological adaptation to local conditions [68–70], which has been shown to have a genetic, as well as an environmental component [71], suggesting that translocation to similar habitats is important. However, all translocations to date have been in north-western Australia and have involved the Barrow Island population of *I. auratus*, which we demonstrate here to have lower genetic diversity (being an island population) than mainland Kimberley *I. auratus*. It may be prudent when planning future translocations to consider whether it is feasible to source animals from Kimberley populations to ensure genetic diversity within the species is maintained across multiple sites to reduce risks to further genetic decline.

**Supplementary Materials:** The following are available online at https://www.mdpi.com/1424-281 8/13/1/2/s1.

**Author Contributions:** Conceptualization, R.T., K.O., S.J.B.C., and M.B.; methodology, R.T. and K.O.; software, R.T. and K.O.; validation, R.T. and K.O.; formal analysis, R.T.; investigation, R.T. and E.L.; resources, S.J.B.C., Y.L., and K.O.; data curation, R.T.; writing—original draft preparation, R.T. and K.O.; writing—review and editing, K.O., S.J.B.C., and M.B.; visualization, R.T. and K.O.; supervision, K.O. and M.B.; project administration, R.T.; funding acquisition, K.O. All authors have read and agreed to the published version of the manuscript.

**Funding:** No specific funding was provided for this research.

**Acknowledgments:** We thank A. Wayne, P. Glen, G. Harewood, P. Townsing for provision of additional *I. fusciventer* samples for this analysis and J. Smith and staff from the Australian Wildlife Conservancy for additional *I. auratus* samples from the Kimberley. Contributors to original datasets obtained from [12,13,20,30] are indicated in Acknowledgements of the respective publications.

**Conflicts of Interest:** The authors declare no conflict of interest.

#### **References**


## *Article* **Genetic Consequences of Multiple Translocations of the Banded Hare-Wallaby in Western Australia**

**Daniel J. White 1,\*, Kym Ottewell 1,2,3, Peter B. S. Spencer 3, Michael Smith 4,5, Je**ff **Short 3,6, Colleen Sims <sup>2</sup> and Nicola J. Mitchell <sup>1</sup>**


Received: 9 October 2020; Accepted: 24 November 2020; Published: 27 November 2020

**Abstract:** Many Australian mammal species now only occur on islands and fenced mainland havens free from invasive predators. The range of one species, the banded hare-wallaby (*Lagostrophus fasciatus*), had contracted to two offshore islands in Western Australia. To improve survival, four conservation translocations have been attempted with mixed success, and all occurred in the absence of genetic information. Here, we genotyped seven polymorphic microsatellite markers in two source (Bernier Island and Dorre Island), two historic captive, and two translocated *L. fasciatus* populations to determine the impact of multiple translocations on genetic diversity. Subsequently, we used population viability analysis (PVA) and gene retention modelling to determine scenarios that will maximise demographic resilience and genetic richness of two new populations that are currently being established. One translocated population (Wadderin) has undergone a genetic bottleneck and lost 8.1% of its source population's allelic diversity, while the other (Faure Island) may be inbred. We show that founder number is a key parameter when establishing new *L. fasciatus* populations and 100 founders should lead to high survival probabilities. Our modelling predicts that during periodic droughts, the recovery of source populations will be slower post-harvest, while 75% more animals—about 60 individuals—are required to retain adequate allelic diversity in the translocated population. Our approach demonstrates how genetic data coupled with simulations of stochastic environmental events can address central questions in translocation programmes.

**Keywords:** genetic diversity; population viability analysis; allele retention; translocation; conservation management; threatened marsupial; remnant

#### **1. Introduction**

Translocation, the anthropogenic movement of a group of organisms from one location to another, is an increasingly necessary tool for conservation management [1]. Conservation translocations take many forms (reintroductions, reinforcements, genetic rescue, assisted colonisation) and in all cases should be carefully planned, as they are costly and their success is hard to predict [2–7]. For example, in

fauna translocations, animals may be naïve to predators at translocation sites [6,8], and the harvesting of source populations may disrupt existing social dynamics [9–11]. Nevertheless, translocations are often the only option outside of ex situ conservation, to spread demographic risk and prevent extinctions [12–14].

As translocated populations often arise from a small number of founders, they risk losing genetic diversity via bottlenecks and/or genetic drift, and inbreeding depression can occur due to mating between related individuals [15]. It is therefore important to quantify the genetic diversity in source populations in order to manage these risks and to increase the evolutionary potential of translocated populations [7,16–19]. Individual-based population viability analysis modelling is a powerful tool for making predictions about potential outcomes of various translocation scenarios and can be used to optimise inherent, and often sensitive, trade-offs [18,20]. The incorporation of genetic data into population viability modelling is an important component of conservation decision making, and there has been a marked uptake of these types of analyses for Australian mammals [4,20–23].

Australia has the world's highest rate of mammal extinction [24]. Since European colonisation in 1788, 29 species have gone extinct and the western long-beaked echidna (*Zaglossus bruijnii*) is now extinct in Australia, equating to a rate of loss of around 0.13 species per year [24]. Major threats include predation from exotic introduced pests, in particular the European red fox (*Vulpes vulpes*) and feral cat (*Felis catus*), as well as habitat degradation. Habitat degradation includes both the loss of habitat and habitat fragmentation, both of which can exacerbate predation issues. In addition, smallto medium-sized mammals within a critical weight range (35–5500 g) are more prone to extinction than those outside this range [25–28]. For many of the endemic mammals of Australia that remain, their distributions have contracted to isolated populations, including offshore islands that remain feral predator free. To compound matters, climate change increases the risks of starvation, drought stress, and hyperthermia [29,30]. Accordingly, it is widely agreed that interventions are essential to secure Australia's remaining mammal diversity [24].

The banded hare-wallaby, *Lagostrophus fasciatus*, is a medium-sized (approx. 1700 g), critical weight range herbivorous and nocturnal macropod, the sole member of the Lagostrophinae sub-family. It is currently listed as Vulnerable by the IUCN and under Australia's environmental legislation (the Environment Protection and Biodiversity Conservation Act). Their pre-European range stretched from the coast of central Western Australia to southern South Australia, but the species now survives only in the Shark Bay region of Western Australia on Bernier Island (4267 ha) and Dorre Island (5163 ha) (Figure 1a). The last recorded sighting on the mainland was in 1906 [31]. A 2016 survey suggested there were 2790 individuals on Bernier Island and 2440 individuals on Dorre Island [32], however, populations cycle through boom and bust phases, triggered by rainfall and drought, respectively, and may be reduced by as much as 75% before subsequent recovery [33,34]. These small island populations are expected to have low genetic diversity, which could be further reduced if used as source populations for conservation translocations.

**Figure 1.** (**a**) Distribution of *L. fasciatus* in Western Australia (inset shows the approximate historical distribution); (**b**,**c**) measures of genetic diversity based on seven microsatellite markers; (**d**) translocation history (failed translocations not shown). Two recent translocations to Mt. Gibson and Dirk Hartog Island are underway; (**e**) STRUCTURE analysis of the two remnant wild populations (Bernier and Dorre Islands), two translocated populations (Faure Island and Wadderin), and two historic captive breeding populations (Peron Captive Breeding Centre (CBC) and Dryandra). Sampling periods are indicated at top of plot; \* sampling period for Dryandra is 1999–2002; (**f**) population-level principal coordinates analysis based on genetic distance. The first two axes explain 91.6 of the variation. The *L. fasciatus* image was designed by Creazilla (creazilla.com).

Since 1974, four translocations of *L. fasciatus* have been attempted, and two captive breeding colonies have been established but since discontinued (Figure 1d). The captive colony at Peron Captive Breeding Centre (hereafter Peron CBC) was managed by the Western Australian Department of Biodiversity, Conservation and Attractions (DBCA), and was sourced from Bernier Island. In contrast, a captive breeding colony at Dryandra was sourced from Dorre Island. Two translocation attempts were unsuccessful—specifically the movement of 21 animals from Dorre Island to southern Dirk Hartog Island from 1974 to 1978, and a movement of 18 animals from the Peron CBC to Francois Peron National Park in 2001. The failure of these translocations was attributed to predation by feral cats, drought (Dirk Hartog Island translocation), and the impact of livestock [35–37]. The two successful translocations were each founded from the Peron CBC, with 91 individuals released onto Faure Island between 2004 and 2013, and 12 individuals into Wadderin Sanctuary in 2013. Faure Island is a 4561-hectare island in Shark Bay managed by the Australian Wildlife Conservancy (AWC) where all feral predators and livestock have been removed and was reported to support a population size of approximately 300 in 2017 [38]. Wadderin Sanctuary on the mainland (Figure 1d) is a 430-hectare predator-free enclosure managed by a community group [39] and now holds approximately 30 animals [40].

Currently, two new populations of *L. fasciatus* are being established. From 2017 to 2018, a population of *L. fasciatus* was reintroduced to a 7832 ha safe haven (a fenced area from which introduced predators have been removed) at Mount Gibson Wildlife Sanctuary. A total of 119 individuals were translocated to Mt. Gibson from Bernier (*n* = 40), Faure (*n* = 19) and Dorre (*n* = 60) Islands. In 2017, a translocation programme began for *L. fasciatus* to Dirk Hartog Island (58,640 ha), forming part of DBCA's "Return to 1616" project following the removal of exotic predators and herbivores [41]. Fifty-six individuals from both Bernier and Dorre Islands were proposed to be moved over 3 years (total *n* = 112). In this study, we aim to determine how the translocation history of *L. fasciatus* has affected the genetic health of all extant populations, particularly as serial translocations via intermediary captive populations have led to the possibility of genetic bottlenecks. We use genetic and population viability analysis models to explore the impact on source populations of concurrent harvesting scenarios and how the retention of genetic diversity can be maximised, and we predict growth rates and genetic diversity of the new Dirk Hartog Island and Mount Gibson populations. Most analyses are conducted under a baseline assumption of regular drought cycles that reduce reproductive success and survivorship, and we test the sensitivity of our results to both reduced and increased drought frequencies, as both scenarios are possible under climate change [42].

#### **2. Materials and Methods**

#### *2.1. Tissue Samples*

Ear biopsies were sampled over 19 years, from 1998 to 2017, from six locations: Dorre Island (*n* = 79), Bernier Island (*n* = 51), Faure Island (*n* = 10), the Peron CBC (*n* = 73), the Return to Dryandra Captive Facility (hereafter Dryandra, *n* = 6), and Wadderin Sanctuary (*n* = 17) (Figure 1d). All available *L. fasciatus* samples (*n* = 236, Spreadsheet S1) were analysed.

#### *2.2. DNA Extraction and Nuclear Microsatellite Amplification*

Genomic DNA was extracted from ear tissue using a standard "salting out" protocol [43]. Ten microsatellite loci were initially amplified using primers derived from the tammar wallaby (*Macropus eugenii*; Me14, Me17; [44]), yellow-footed rock-wallaby (*Petrogale xanthopus*; Y105, Y175, Y151, Y148; [45]), and allied rock-wallaby (*P. assimilis*; Pa593, Pa297, Pa385, Pa55; [46]). Briefly, for samples prior to 2016, polymerase chain reactions (PCRs) were carried out for individual microsatellite primers in a total volume of 30 μL with ~100 ng DNA, 1X PCR buffer, 400 μM of dNTPs, 2 mM MgCl2, 0.2 μM of each primer, and 0.825 U Taq or, after 2016, in three PCR multiplexes using the Qiagen Multiplex PCR kit for a total volume of 7.5 μL per multiplex with ~5–10 ng DNA, 1X Qiagen buffer, 0.2 μM primer mix, and water (Table S1). Fluorescently labelled DNA fragments were separated using an ABI373xl

capillary sequencer (Applied Biosystems, Foster City, CA, USA) and scored manually with the aid of GENEMARKER software (v1.5, Soft Genetics, State College, PA, USA). Allele size was determined by co-running a Genescan500 standard (Applied Biosystems, Melbourne, Australia). Data were checked for input errors and duplicate genotypes using the Excel Microsatellite Toolkit add-in [46]. Deviations from Hardy-Weinberg equilibrium and linkage disequilibria were tested using GENEPOP v4.1.4 [47]. For markers with fewer than five alleles, a complete enumeration algorithm was used to estimate the exact *p* value; and for markers with five or more alleles, the Markov chain algorithm of Guo and Thompson [48] was used to generate an unbiased estimate of the exact *p* value. The presence of any null alleles was tested with MICROCHECKER v2.2.3 [49].

#### *2.3. Statistical Analyses of Genetic Data*

The programme STRUCTURE v2.3.4 [50] was used to estimate the number of genetically distinct populations and to assign individuals to populations. STRUCTURE uses a Bayesian clustering method to assign individuals to one of k populations and to estimate the degree of inter-population admixture. While some assumptions made by the software will likely not be met by the island populations in our system, such as the distribution of genotypes under Hardy Weinberg Equilibrium (HWE) and marker linkage equilibria, this approach still provides a useful assessment of population genetic divergence. Our models assumed no admixture between locations, but since in some cases locations represent recently sub-sampled source populations, and therefore could have shared allele frequencies, we ran models that assumed either correlated or uncorrelated allele frequencies between groups for comparison. After preliminary assessment of convergence times for the Monte Carlo Markov chain, a burn-in period of 100,000 steps was chosen, followed by 1,000,000 steps of the chain. To estimate k, four replicate runs at each value of k from 1 to 8 were performed, and the most likely value was estimated from the plot of ln Pr (X|k) vs. k, as well as from Evanno's method [51] which plots Δ k (a second order rate of change of ln Pr (X|k)) vs. k, using CLUMPAK beta version [52]. STRUCTURE figures were generated using DISTRUCT v1.1 [53] in CLUMPAK [52]. To assess genetic similarity between populations using an approach that does not rely on the same biological assumptions, a principal coordinates analysis was conducted based on genetic distance, run in GENALEX v6.503 [54,55]. The level of genetic differentiation among all populations was determined by estimating pairwise FST and Jost's Dest, also in GENALEX v6.503. Probabilities were calculated as the proportion of times the observed data generated values greater than values generated from 999 random permutations.

Standard genetic diversity metrics, including mean number of individuals per marker, mean number of alleles per marker, and observed and expected heterozygosities, were estimated for each sampling site, in GENALEX v6.503. Allelic richness, a measure of allelic diversity that controls for variable sample size, was estimated in HP-RARE [56] based on rarefaction to ten individuals. As marker Me17 was monomorphic except for one individual in the Faure population, in which it was homozygous for the alternate allele, allelic richness was calculated with and without this marker. Population inbreeding coefficients were measured using Wright's FIS in GENEPOP v4.1.4 [47]. Expected and observed heterozygosities, allelic richness, and FIS measurements were made for source populations separated into year cohorts. We estimated the genetic effective population size (Ne) using the linkage disequilibrium method on populations with a sample size of 25 or more [57–59], excluding singleton alleles (those that occur in one copy in one heterozygote) to prevent an upward bias in Ne estimation, as implemented in NeESTIMATOR v2.1 [60].

To assess whether any of the managed or natural populations underwent a genetic bottleneck, we compared the heterozygote distribution for each marker with the number of alleles for each population in BOTTLENECK v1.2 [61], using the two-phase substitution model with default proportions of the infinite alleles and single stepwise mutation models. A one-tailed Wilcoxon signed-rank test was used to estimate the likelihood that the observed data deviate from what is expected under mutation-drift equilibrium, the most appropriate test with fewer than 20 loci [62,63]. We also reported whether the allele frequency distribution deviated from an L-shaped mode, which can be a qualitative assessment of whether a population has passed through a genetic bottleneck [64].

Queller and Goodnight's [65] estimator was used to measure the mean pairwise relatedness within each population, as an indicator of genotype diversity and identity by descent within source populations. Whether the mean pairwise relatedness of a population was statistically different from 0 (defined as mean pairwise relatedness across the entire data set) was estimated by randomly permuting the data in the population pairwise matrix 999 times, and determining the proportion of permutations that gave a relatedness value greater than the observed value, as calculated in GENALEX v6.503.

#### *2.4. Population Modelling*

To estimate the number of founders needed to retain low-frequency alleles within newly established populations, scenarios were simulated in ALLELERETAIN v1.1 [66,67] implemented in R v3.6.2 [68]. ALLELERETAIN is an individual-based model that simulates population growth using a user-defined suite of parameters based on life-history traits. It estimates the probability of retention of selectively neutral alleles in founder populations over generations, with the starting frequency of these alleles set by the user. We were interested in assessing the impacts of translocating individuals of varying founder group size (N) on allele retention. To do this, we tested founder N values from 20 to 200 in increments of 20 individuals with allele frequency set to 0.05. We then repeated each scenario but considered the translocation to have occurred during a drought year. This was simulated by setting the initial survival after translocation to 0.4 and excluding reproduction for the first breeding cycle. For all simulations, we assumed that one *L. fasciatus* breeding cycle lasts 9 months [69]. Details and justification of the demographic parameters used in the model are provided in Table 1 and Table S2.


**Table 1.** Life history parameters used for *L. fasciatus* population modelling in VORTEX and ALLELERETAIN. Parameter values were obtained from data in [33,34,41,70] and refined by C.S. and J.S. Where a parameter is not listed, defaults are used. V: VORTEX; A: ALLELERETAIN; EV: environmental variation; BHW: banded hare wallaby; DDR: density-dependent reproduction function.


**Table 1.** *Cont.*

To estimate the impact of current translocation programmes on extinction probabilities and heterozygosities of the newly established *L. fasciatus* populations and the source populations, population viability analysis (PVA) simulations were run in VORTEX v10.17.2 [71,72]. VORTEX is an individual-based model that uses Monte Carlo simulations to estimate how factors intrinsic to individuals within populations alter growth rates, birth rates and extinction probabilities [72]. Initially, we developed a baseline PVA for a closed population with a starting N of 2000 individuals and a carrying capacity of 3000, where droughts at an average frequency of 1 in 6.25 calendar years reduced survival and reproduction by 50%, similar to empirical observations [33,34]. Although there is no empirical evidence for inbreeding depression, we included it in our models as there has been as yet no formal effort to detect it and populations do regularly pass through extreme population bottlenecks [33,34]. However, as populations appear to persist through these bottlenecks without obvious signs of depression, we halved the default number of lethal equivalents. Other island populations of marsupials are known to survive small effective population sizes and high levels of inbreeding [73]. Baseline model parameters are provided in Table 1 and Table S3.

Various translocation scenarios (each of 1000 replicates) were then simulated that tested the number of founders and source populations needed to maximise genetic diversity and population growth for a translocation programme that, due to the logistical cost to the management agency, runs for a maximum of two years (Table 2). Bernier Island was chosen for single source population translocations for two reasons. Firstly, as this population is smaller in size and has lower genetic

diversity, it represents a conservative, worst case scenario, and secondly, this island is easier to access due to protected landing beaches. To quantify the impact of drought on translocation success, the best performing of these scenarios was re-run with both increased and decreased drought frequencies, and with no drought. Finally, a case study (scenario 8) was simulated based on the recent movements of *L. fasciatus* from Bernier and Dorre Islands to Dirk Hartog Island, and from Bernier, Dorre, and Faure Islands to the Mount Gibson Wildlife Sanctuary. Although the carrying capacity of Faure Island could theoretically be similar to Bernier and Dorre Islands (predicted here to be 3000) based on land area, there is currently no empirical evidence for this and the census population size remains well below 1000. We therefore tested two different carrying capacities for Faure Island, 3000 and a more conservative 1000. To further assess the impact of drought cycles in translocation planning, both the current and post-drought estimates of source population sizes were used as inputs in case study models.


**Table 2.** Hypothetical and case study translocation scenarios \* examined using population viability analyses.

**\*** Different founder numbers were trialed for scenarios 1–4, ranging from 60 to 140 every 20 individuals. Scenario 8 is a case study representing current translocations. Unless otherwise stated, a drought is simulated at a frequency of 1 in 6.25 years.

#### **3. Results**

#### *3.1. Marker Performance*

Of the ten microsatellite markers amplified, three were monomorphic in this species. The informativeness of the seven polymorphic markers varied, with the number of alleles ranging from two (Y175 and Me17) to eight (Pa593) (Table S4). Expected heterozygosity (mean ± SE) ranged from 0.03 ± 0.03 (Me17), which was polymorphic in only one population, to 0.66 ± 0.02 (Pa593). The mean percentage of samples with missing genotype data across all populations ranged from 0.0% (markers Pa593, Me14 and Y105) to 5.1% (marker Y148). Marker Y175 violated the Hardy–Weinberg equilibrium (*p* < 0.05) in three populations out of six. Analysis in MICROCHECKER suggested the unexpectedly high homozygote frequency was caused by a null allele in all three populations. The percentage of samples with missing genotype data for this marker ranged from 0.0% (Dryandra) to 12.3% (Peron CBC), with a mean of 3.8%. After assessing the impact of null alleles on the estimation of global and pairwise FSTs in FreeNA [74], no significant differences were observed between estimates calculated with and without correcting for null alleles. Considering this, and the low number of markers, Y175 was retained for further analysis.

#### *3.2. Genetic Diversity*

Structure analysis revealed that the most likely number of genetically distinct clusters of *L. fasciatus* is two (k = 2; Figure 1e), representing the two remnant wild populations on Bernier and Dorre Islands. There was very good agreement in results whether allele frequencies were assumed to be correlated or uncorrelated between locations, and the absolute probability for k = 2 was slightly greater for correlated allele frequencies. Of the contemporary translocated populations, Faure Island and Wadderin are predominantly of Bernier Island origin, with negligible genomic contribution from Dorre Island. Of the extinct captive populations, Peron CBC was an extension of Bernier Island, whereas Dryandra was predominantly of Dorre Island origin. A small amount of potential admixture was detected between Bernier and Dorre Islands. In total there are ten individuals (four from Bernier and six from Dorre) that have an 80% or more probability of assignment to the other source population. Principal coordinate analysis agreed with the STRUCTURE results (Figure 1f). Both pairwise FST and Jost's Dest measurements were similar (FST = 0.026, *p* < 0.001; Dest = 0.029, *p* < 0.001), reflective of weak, significant divergence between the two parental populations. A summary of all pairwise FSTs is provided in Table S5, which shows that greatest distance exists between Bernier Island-derived populations and Dryandra (the sole Dorre Island-derived population). Pairwise relatedness analysis revealed that one source population (Bernier Island, *p* < 0.005), one extinct captive population (Peron CBC, *p* < 0.002), and one contemporary translocated population (Wadderin, *p* < 0.014) have relatedness values statistically greater than zero (Figure S1).

Overall, genetic diversity was low across all populations (mean expected heterozygosity, HE = 0.34–0.45). Of the source populations, Dorre Island exhibited higher heterozygosity and lower inbreeding than Bernier Island, although allelic richness was comparable. Dorre Island had one private allele (alleles present in one population only) for marker Y105 (q = 0.02), Faure Island had two (Pa593, q = 0.05; Me17, q = 0.10), and Peron CBC had one (Y148, q = 0.01). When the two parental populations were compared only to each other, Dorre Island had four private alleles and Bernier Island had one. Diversity estimates within year cohorts for both Bernier and Dorre Islands showed a similar trend of higher than expected heterozygosities in the late 1990s, and lower than expected heterozygosities in more recent cohorts. Both expected heterozygosity and allelic richness have declined over the sampling period (Supplementary Figure S2). There were no significant differences in genetic variation between translocated and source populations except for Faure Island which showed a significant decrease in observed heterozygosity compared to Peron CBC using a one-tailed Wilcoxon signed-rank test (*p* < 0.05), and this island also indicated significant inbreeding due to non-random mating. Considering the two parental populations as one remnant *L. fasciatus* metapopulation, Faure has retained 93% of allelic diversity, while Wadderin has retained 87%. Calculating allelic richness without marker Me17 increases its value in all populations, but Faure no longer has the highest allelic diversity (Table S6). Caution is needed when interpreting all diversity results, however, as sample sizes for diversity measurements were often very low and varied between sites and temporal data.

Estimates of effective population sizes in extant populations reflected overall diversity measurements (Table 3), as Dorre Island had the highest Ne in the 2016–2017 cohort (Ne = 140, 95% CI 29-∞), which was 71% higher than for Bernier Island *L. fasciatus* sampled in the same period (Ne = 82, 95% CI 12-∞). Values of infinity for upper confidence limits arise from a lack of evidence for variation in the genetic characteristic which, in this case, is likely due to the small, homogeneous marker panel [59,60]. Our estimates of effective population size should therefore be considered cautiously and await confirmation with a more powerful dataset.



 heterozygosity; FIS: inbreeding population Diversity samples polymorphic markers that violate HWE at *p* < 0.05; c value obtained only with no critical minor allele frequency threshold. Heterozygosity metrics and inbreeding coefficients are alsocalculated for different time periods for Bernier and Dorre Islands, when at least five individuals were sampled for that time period.

of

Combining samples from all years, only one source population, Dorre Island, had statistical support (*p* = 0.039) for passing through a genetic bottleneck (Table 4). However, when Bernier and Dorre Islands were separated into year cohorts, both populations showed evidence of genetic bottlenecks in the earlier cohorts (1990s), and for Dorre Island also in 2013. Of the managed populations, Wadderin was the only population showing evidence of a genetic bottleneck (Table 4). Thirty individuals and 10 polymorphic loci are recommended to achieve power of at least 0.8 with the Wilcoxon signed-rank test however, so these results should be considered with caution. Both Wadderin and Peron CBC had an allele frequency distribution shifted towards more common alleles, consistent with a bottleneck, although this qualitative test requires at least 30 individuals and 10 to 20 polymorphic loci to be reliable [61].


**Table 4.** Results of two tests for a genetic bottleneck of source and translocated populations.

Integers represent *p* values, with significant values in bold. *n*: sample size; t: translocated population size; y: year of translocation. The two-phase mutation model of microsatellite evolution is assumed with a 70:30% ratio of single stepwise mutation to infinite allele models. Year cohorts with fewer than five individuals were omitted from analyses. Estimates based on sample sizes of fewer than 30 should be considered tentative.

#### *3.3. Modelling Conservation Translocations*

In the absence of periodic droughts, ALLELERETAIN demonstrated the strongest increases in the probability of retaining low-frequency alleles when the founder population size approached 20, and began to plateau after 40 (Figure 2). Sixty founders are needed to retain 90% of alleles at a frequency of 0.05, and 80 founders are needed to retain 95% of alleles (recommended thresholds) at a frequency of 0.05. These numbers rose to 120 and 140 founders, respectively, when realistic impacts of drought were simulated.

**Figure 2.** Probability of retaining a selectively neutral allele at a starting population frequency of 0.05 after 50 calendar years for translocated *L. fasciatus* populations with various founder sizes, with and without impacts of drought. Shaded ribbons represent 95% confidence intervals.

Population viability analysis modelling showed that when contrasting founder population sizes between 60 and 140, survival probabilities increase with increasing founder number until this relationship begins to asymptote from around 100 individuals (Table S7a–d). This level of harvesting also had limited impact on the source populations. Mixing the two source populations of Bernier and Dorre Islands increased heterozygosity in the translocated population, relative to only using founders from Bernier Island, and there was minimal impact of moving all individuals in one year versus splitting the translocation over two years (Table 5). Considering the additional logistical cost of a second year of translocation activities, the best scenario in terms of maximising survival probability, genetic diversity (heterozygosity and allelic diversity), and the final population size of the translocated population was to harvest the two source populations in one year (Scenario 4, Table 5).


**Table 5.** Viability of a translocated *L. fasciatus* population after 50 calendar years, testing four scenarios. One hundred individuals translocated in one or over two years, including a drought with a mean frequency of one every 6.25 calendar years. Carrying capacity was set at 10,000 and scenarios were run for 1000 replicates.

P (surv): mean probability of survival; HE = mean expected heterozygosity using empirical estimates of allele frequencies to determine starting heterozygosities; N: final mean population size of extant populations.

Simulations that varied drought frequency indicated the likelihood of an appreciable impact on the survival probability of a translocated population after 50 calendar years, dropping from 100% under the assumption of no drought, to 79% with a realistic drought frequency of 1 in 6.25 years, and 58% with a drought frequency of 1 in 5 years (Figure 3). There was also predicted to be a profound effect on the final N, as predicted carrying capacity (set to 10,000) was reached with no drought, whereas final N was estimated at 2277 individuals with a 1 in 6.25-year drought and 1064 at 1 in 5 years (Figure 3).

**Figure 3.** Performance over 50 years for 100 translocated *L. fasciatus* to DHI with drought frequencies of 1 in 5 years, 1 in 6.25 years, 1 in 10 years, or no drought. (**a**) Survival probability. (**b**) Population size.

Simulation of translocations recently conducted by management agencies between 2017 and 2018—harvesting Bernier and Dorre Islands for translocation to Dirk Hartog Island, and concurrent harvesting of Bernier, Dorre, and Faure Islands for translocation to Mount Gibson—including the impacts of 1 in 6.25-year droughts, predicted good survival probabilities over 50 years for the translocated populations of 83% and 84%, respectively. This result was also achieved even if harvesting occurred with a reduction in the size of source populations following drought (Table 6). In contrast, the timing of harvesting had a more significant effect on the source populations. Harvesting at current census population sizes (Nc) resulted in high survival probabilities (99%, 98%, and 93% for Bernier, Dorre, and Faure Islands, respectively; Table 6), assuming Faure Island to have a carrying capacity (K) of 1000. Bernier and Dorre Island population sizes stabilise at ~82% of current estimates, whereas the Faure Island population stabilises after ~65% growth from current estimates to a census size of around 500. If harvesting were to occur immediately after a drought, predicted survival probabilities reduced after 50 calendar years, most markedly to 60% on Faure Island, and there was a further reduction in predicted Nc in Bernier and Dorre Island populations to around 74% of current estimates, whereas growth on Faure Island was limited to 29% (Nc ~386). Interestingly, when Faure Island was assumed to have a K of 3000, there was little improvement in survival probability post-harvest if a drought census size was assumed (63% with K of 3000 vs. 60% with K of 1000). However, there was a dramatic increase in predicted Nc as 1400 was reached when harvesting occurred with a non-drought Nc and 900 with a drought impacted Nc, compared to 495 and 386 with a K of 1000, respectively.

**Table 6.** Comparison of population viability analysis (PVA) outputs after 50 calendar years when harvesting source populations using either current conservative source census sizes, or likely source census sizes following droughts. Models are based on scenario 8, describing recent and ongoing movement of *L. fasciatus*, and assuming an average drought frequency of 1 in 6.25 years.


P (surv): mean probability of survival; N: final mean population size of extant populations; HE = mean expected heterozygosity. \* Numbers to the left of the slash are for a carrying capacity of 1000, and to the right are for a carrying capacity of 3000. Scenarios were run for 1000 replicates.

#### **4. Discussion**

As the remnant range of *L. fasciatus* consists of just two adjacent offshore islands, the species-level impact of detrimental stochastic events affecting one or both islands could be profound. Establishing insurance populations and restoring the species to other suitable habitats free from introduced predators is a fundamental management objective, as is the retention of remnant genetic diversity to maintain adaptive capacity. While it is difficult to make conclusive summaries from the limited genetic data we provide here, we nonetheless attempt to explain observed trends that justify a more extensive genetic study. For example, our data suggest that both remnant island populations underwent genetic bottlenecks in the mid to late 1990s, potentially linked to the boom and bust cycles of *L. fasciatus* that occur in response to periodic droughts and thus a marked reduction in primary productivity [33,34]. Further, we show that the impact of past conservation actions (captive breeding and translocations) has possibly manifested in different ways in the two surviving translocated populations (Faure Island and Wadderin). Based on the limited samples available, Faure Island may be inbred due to non-random breeding but has relatively high allelic diversity, whereas Wadderin does not show signs of genetic inbreeding due to non-random breeding but may have passed through two selectively stronger bottlenecks (Bernier Island to Peron CBC, and Peron CBC to Wadderin). Our population viability and genetic modelling has revealed the importance of considering periodic fluctuations in population size when planning translocations and informed on the number of founders needed to avoid genetic bottlenecks. Taken together, these lines of evidence point to a critical need for genetic management and guidance for future translocations.

#### *4.1. Genetic Diversity in Remnant Populations*

Island populations are expected to lose genetic diversity as a result of genetic drift, especially in the absence of migration (i.e., gene flow) that acts to increase allelic diversity [75,76]. Here, we found that genetic diversity (allelic richness and heterozygosity) of *L. fasciatus* on Bernier and Dorre Islands was approximately two-fold lower relative to values reported for mainland populations of other Australian marsupials [77,78], and reviewed in [79], which is unsurprising considering the low number of polymorphic markers used in this study. Interestingly, values were similar to the Shark Bay island population of another hare-wallaby, the rufous hare-wallaby (*Lagorchestes hirsutus*), which was measured with shared markers and also had lower diversity than its mainland counterpart [80]. We also identified extremely low effective population sizes for *L. fasciatus* (Table 3) given census sizes of ~2800 for Bernier Island and ~2500 for Dorre Island [32,33], leading to good agreement in Ne/Nc ratios of 0.05 and 0.07, respectively. These estimates are several times smaller than those for other mammals, e.g., 0.2 to 0.8 reported in [81,82] and a mean of 0.75 reported in [83]. However, due to our low power to estimate Ne from genetic data, these trends require confirmation.

Bernier and Dorre Islands have been separated from mainland Australia for 8000 years (around 4000 generations), and from each other for around 5000 years [84,85], providing substantial opportunity for stochastic loss of allelic variation through genetic drift. The finding that around 10 remnant source population individuals (out of 130) were assigned with 80% or more probability to the other source population is therefore surprising. This observation could be consistent with recent gene flow, leading to migrant populations that have remained relatively isolated from the resident populations. However, this is unlikely, especially as it would have needed to have happened reciprocally. A more likely explanation is either a lack of discriminatory power in our markers, perhaps due to being developed in sister species [86], or there has been a technical issue, such as mis-labelling. Adequate sampling to properly determine genetic structure and the use of more powerful markers (e.g., single nucleotide polymorphic markers generated through next-generation sequencing), would resolve these alternative hypotheses.

#### *4.2. Impact of Past Translocations on Genetic Diversity*

The expected heterozygosities of either of the established translocated populations were not less than their source population (Peron CBC) or the parental population (Bernier Island). Despite describing two examples of serial translocations occurring from a single source population via a captive breeding programme, and hence providing opportunity for two bottleneck events, in only one case did we find evidence for a bottleneck—the Wadderin population translocated in 2013 with 12 founders. Wadderin was founded from the Peron CBC population, which itself was founded from Bernier Island (Figure 1d), and each translocation contributed a 2.4% and 5.7% loss of allelic diversity, culminating in a total loss from the Bernier source population of 8.1%. Faure, on the other hand, did not undergo a genetic bottleneck according to our analysis, and had a relatively high allelic diversity. The high allelic diversity is surprising considering its demographic history, and we found it to be driven by the discovery of an additional allele (marker Me17), present in one individual in homozygous form. While this allele could have arisen in the translocated Faure Island population, its independent verification would benefit from expanding sampling of Faure and Bernier Islands. While we were not able to detect a genetic signal of a bottleneck in the Peron CBC population, founded by 25 individuals, this does not preclude the possibility that a bottleneck has occurred that we were not able to detect or from which the population later recovered.

Since translocations began to Faure Island, however, observed heterozygosity appears to have decreased below equilibrium expectations and inbreeding coefficients have increased, possibly due to limited dispersal and increased breeding between related animals post-translocation. Another possibility is an increase in mating success heterogeneity [78]. These factors could also have contributed to the island's notably low effective population size estimate. This is of concern, as small populations are prone to higher rates of loss of genetic variation via genetic drift, especially if inbreeding depression leads to lower rates of reproduction and further reductions in population size [17]. However, the apparently high inbreeding does not appear to be associated with poor health of the Faure Island population, as it is well established with ongoing recruitment [87]. One explanation for this apparent paradox is that the boom and bust population cycles characteristic of Shark Bay *L. fasciatus* populations have led to a purging of lethal alleles, negating any detrimental effect of inbreeding [34]. A more prosaic explanation is that the Faure Island sampling cohort was both small (*n* = 10) and consists of a disproportionately high number of related individuals, leading to an overestimate of the inbreeding coefficient. A more comprehensive genetic analysis and a future study of the social dynamics of the Faure Island population could resolve this uncertainty.

While our bottleneck results require independent replication, a lack of bottleneck signal for the Peron CBC population with 25 founders supports results from our allele retention modelling. We show that a 70% probability of retaining a low frequency allele is maintained with a founder size of 25, but that retention probabilities drop dramatically with founder sizes less than 20. Other populations that have passed through bottlenecks tend to have greater founder numbers. For example, in birds, a bottleneck occurred when the Hawaiian gallinule (*Gallinula galeata sandvicensis*) was reduced to approximately 60 individuals [88], and reductions to 100 individuals may be sufficient to observe bottleneck effects in other avian species [89]. In the Maud Island frog (*Leiopelma pakeka*), simulations indicate that a sustained increase in irreversible allele loss begins when populations are reduced to 140 [90]. The apparent low power to detect genetic bottlenecks in translocated *L. fasciatus* populations could reflect relatively small changes in allelic diversity between source and founder populations (due to low effective population sizes and genetic variation in the source), compounded by our small panel of only moderately polymorphic markers. Further, the relatively rapid demographic recovery inherent to *L. fasciatus* during the boom phase of natural population cycles could mask any bottleneck signal that depends on heterozygosity excess [62]. Therefore, ongoing monitoring, including genetic monitoring, of translocated populations during their establishment would provide useful data with which to detect any detrimental genetic effects that might be initially masked.

#### *4.3. Towards an Optimal Translocation Protocol for L. fasciatus*

A clear trade-off exists in conservation translocation programmes between maximising viability of a new population and minimising the negative impact on critical source populations [90]. For example, sufficient founders are required to ensure a translocated population is buffered from some post-establishment mortality, as well as to retain rare alleles that bolster evolutionary potential [67,90]. Smaller founder numbers, on the other hand, reduce any negative impacts on source populations, but translocated populations will be more sensitive to mortality and the loss of rare alleles. Our PVA models for *L. fasciatus* show that for newly translocated populations, survival probabilities increase with increasing founder number until this relationship asymptotes at around 120 individuals, and the release of 100 individuals ensures good growth and survival in a hypothetical haven (island or fenced sanctuary with a carrying capacity of 10,000). That carrying capacity did not significantly impact 50-year predicted survival probabilities for the Faure Island population when using reduced drought-influenced census sizes, indicates the sensitivity of the successful establishment of *L. fasciatus* translocated populations to the founder number, and suggests that sufficient numbers are required to withstand regular population reduction caused by stochastic events. Importantly, harvesting the numbers recommended here has no apparent impact on the critical source populations, either in terms of population growth or survival probability over a 50-year period.

Our allele retention models predict that 80 founders are needed to retain 95% of low-frequency alleles, noting that retaining 90 or 95% genetic variation are recommended minimum thresholds for maintaining the evolutionary potential of populations [17,91]. Hence, aggregating the results of both modelling processes suggests that at least 100 individual founder animals are needed to maximise viability and retention of allelic diversity, a comparable figure to other taxa (e.g., 120 in frogs [90], 60 to 120 in passerines [19]). Further, by simulating the effects of periodic drought on population

growth and survival, we have provided predictions about when best to translocate. For example, if harvesting were to occur immediately after a drought, we predict a 10–22% reduction in the census size of the source population after 50 years, relative to harvesting when populations are at their peak. In turn, we predicted that 140 founders would be required to retain 95% of alleles with a population frequency of 0.05 in a new population under drought conditions (as opposed to 80 founders under no drought); a 75% increase in the number harvested from the critical source populations. Therefore, climatic cycles appear to be an important consideration in translocation programmes, and we advise that the movement of animals should occur outside periods of low rainfall and when populations have recovered demographically from their impacts.

One way to increase genetic diversity of species that exist as only small and genetically depauperate remnants is to manage the species as a metapopulation [92]. However, whether to combine genetically distinct populations is contentious [1,7,78,93–95]. While there is an immediate and obvious benefit of maximising genetic diversity and heterozygosity, thereby reducing the chance of inbreeding depression, mixing comes with risks, including outbreeding depression [1]. For example, if source populations derive from different habitats or climates, and have been long isolated, local adaptation can cause the rise of locally adapted alleles and for gene complexes to be in gametic disequilibrium. Locally adapted alleles can be diluted and gametic associations can be broken by recombination after mixing between differently adapted individuals, resulting in hybrid individuals with low fitness [96–99]. Other risks include competition for resources when there are morphological and size differences between source populations, and behavioural differences could affect mate choice, potentially reducing Ne and leading to a bias in representation of future generations (e.g., as occurs in the boodie *Bettongia lesueur*, [78]). Here, however, the habitats of the two adjacent *L. fasciatus* source populations have been similar across the 4000–5000 years (approx. 2000 generations) of their separation [79]. Further, there are no known differences in size, morphology, or mating behavior between islands, suggesting a low probability of outbreeding depression [95]. Considering the overall low genetic diversity maintained in this species, and by mixing we achieve a 10–12% increase in HE of the recipient population compared to using only Bernier Island as a source population, we advocate mixing the two source island populations in future translocations and predict a low risk of any adverse consequences.

#### **5. Conclusions**

Our work shows how an integrated analysis of genetics and population modelling can be used to inform management planning by simulating sensitive trade-offs involved in translocating threatened species. While much work has been done on the numbers of founders needed to establish new populations, less is known about the impact of harvesting for translocations on source populations [89], which is highly relevant for species with few remnant populations remaining. We suggest that the optimal translocation protocol for *L. fasciatus* is to mix the Bernier and Dorre Island populations in the same year, harvesting 60 individuals from each island source population (total *n* = 120). This gives a good probability of retaining low-frequency private alleles from each island, maximises survival probability and heterozygosity of translocated populations, limits logistical cost, and, crucially, has no major impact on the source populations. As the translocations currently underway to Dirk Hartog Island and Mt. Gibson approximate these recommendations, it will be valuable to monitor these populations over time to compare performance to model predictions. It is also worth considering Faure Island as an alternative source population to Bernier Island for future translocations, as was done with the Mt. Gibson translocation, to relieve pressure on one of the two remnant wild populations. However, in addition to possible inbreeding, our modelling shows Faure Island is more sensitive to stochastic events due to a smaller census population, further compounded if its carrying capacity is less than Bernier and Dorre Islands, and so the PVA models developed here should be used to assess the impact of any future harvesting. Wherever feasible, and after consideration of potential over-harvesting, future supplementations of the established translocated populations on Faure Island and Wadderin Sanctuary should derive from Dorre Island which, as a genetic mixing event, would

be expected to increase gene diversity of the recipient population. Further, Dorre Island has a higher heterozygosity, more private alleles and lower inbreeding than Bernier Island. Finally, while we show that *L. fasciatus* seems resilient to harvesting and is well suited to translocation programmes, including multiple harvests from source populations, regular droughts, and limited carrying capacity, which have a substantial impact on population viability, particularly for populations with census sizes less than 500. We recommend that translocations are avoided during extended periods of drought and subsequent demographic recovery, due to a notably lower growth rate of the source populations after harvesting, and reduced capacity of new populations to retain allelic diversity.

**Supplementary Materials:** The following are available online at http://www.mdpi.com/1424-2818/12/12/448/s1, Spreadsheet S1: Genetic samples; Table S1: PCR multiplexes; Table S2: Details and justification of the demographic parameters used in ALLELERETAIN; Table S3: Details and justification of the demographic parameters used in VORTEX; Table S4: Performance of seven polymorphic microsatellite markers; Table S5: Population pairwise FSTs and probability values; Table S6: Comparing allelic richness with and without marker Me17; Table S7a–d: Summary of PVA outputs for translocation scenarios; Figure S1: Pairwise relatedness; Figure S2: Temporal change in genetic diversity for the two remnant source populations.

**Author Contributions:** Conceptualization, N.J.M., K.O. and D.J.W.; formal analysis, D.J.W.; investigation, D.J.W. and K.O.; resources, K.O., P.B.S.S., M.S., C.S. and J.S.; data curation, K.O. and P.B.S.S.; writing—original draft preparation, D.J.W.; writing—review and editing, N.J.M., K.O., J.S., P.B.S.S., M.S., C.S. and D.J.W.; visualization, N.J.M. and D.J.W.; project administration, D.J.W. and N.J.M.; funding acquisition, N.J.M. and K.O. All authors have read and agreed to the published version of the manuscript.

**Funding:** This research was supported by the Australian Government's National Environmental Science Program through the Threatened Species Recovery Hub.

**Acknowledgments:** We thank Keith Morris and Manda Page for comments on earlier versions of the manuscript, John Kanowski for valuable comments on late versions of the manuscript, Mark Eldridge, Anthony Friend, and Mia Hillyer for help with conceptualisation, Saul Cowen for discussion on translocation strategies, Laura Ruykus and Kim Branch for collection of samples, and Shelley McArthur for assistance in the laboratory. The authors would also like to thank three anonymous reviewers for their helpful comments.

**Conflicts of Interest:** The authors declare that they have no conflict of interest.

#### **References**


**Publisher's Note:** MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

© 2020 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).

## *Article* **Applying Population Viability Analysis to Inform Genetic Rescue That Preserves Locally Unique Genetic Variation in a Critically Endangered Mammal**

**Joseph P. Zilko 1,\*, Dan Harley 2, Alexandra Pavlova <sup>1</sup> and Paul Sunnucks <sup>1</sup>**


**Abstract:** Genetic rescue can reduce the extinction risk of inbred populations, but it has the poorly understood risk of 'genetic swamping'—the replacement of the distinctive variation of the target population. We applied population viability analysis (PVA) to identify translocation rates into the inbred lowland population of Leadbeater's possum from an outbred highland population that would alleviate inbreeding depression and rapidly reach a target population size (*N*) while maximising the retention of locally unique neutral genetic variation. Using genomic kinship coefficients to model inbreeding in Vortex, we simulated genetic rescue scenarios that included gene pool mixing with genetically diverse highland possums and increased the *N* from 35 to 110 within ten years. The PVA predicted that the last remaining population of lowland Leadbeater's possum will be extinct within 23 years without genetic rescue, and that the carrying capacity at its current range is insufficient to enable recovery, even with genetic rescue. Supplementation rates that rapidly increased population size resulted in higher retention (as opposed to complete loss) of local alleles through alleviation of genetic drift but reduced the frequency of locally unique alleles. Ongoing gene flow and a higher *N* will facilitate natural selection. Accordingly, we recommend founding a new population of lowland possums in a high-quality habitat, where population growth and natural gene exchange with highland populations are possible. We also recommend ensuring gene flow into the population through natural dispersal and/or frequent translocations of highland individuals. Genetic rescue should be implemented within an adaptive management framework, with post-translocation monitoring data incorporated into the models to make updated predictions.

**Keywords:** genetic swamping; genetic rescue; locally unique alleles; translocation; population viability analysis; population genetics; marsupial; adaptive potential; Leadbeater's possum; *Gymnobelideus leadbeateri*

#### **1. Introduction**

Many threatened species are restricted to small and isolated populations, which places them at elevated risk of inbred matings and loss of beneficial genetic variation by genetic drift [1,2]. Small and isolated populations of diploid species that usually outbreed nearly always exhibit reduced fitness as a consequence of inbreeding (i.e., inbreeding depression) [3]. Usually concurrent with inbreeding depression, the loss of beneficial variation and the accumulation of harmful variation further reduce the population mean fitness [4]. Together with the loss of genetic variation that would otherwise facilitate adaptation to future environmental conditions, these issues reduce population viability [1,5]. If inbred populations remain isolated, extinction becomes increasingly likely due to the 'extinction vortex' effect, wherein a shrinking population size is reinforced by increasing inbreeding depression and fitness loss [6,7].

Translocations of individuals from other populations can increase genetic diversity of isolated populations and elevate population fitness by reducing inbreeding depression

**Citation:** Zilko, J.P.; Harley, D.; Pavlova, A.; Sunnucks, P. Applying Population Viability Analysis to Inform Genetic Rescue That Preserves Locally Unique Genetic Variation in a Critically Endangered Mammal. *Diversity* **2021**, *13*, 382. https:// doi.org/10.3390/d13080382

Academic Editors: Kym Ottewell and Margaret Byrne

Received: 30 June 2021 Accepted: 12 August 2021 Published: 16 August 2021

**Publisher's Note:** MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

**Copyright:** © 2021 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https:// creativecommons.org/licenses/by/ 4.0/).

and alleviating the harmful effects of beneficial variation being lost from the population (i.e., causing genetic rescue), thus increasing the persistence and adaptive capacity of populations [2,8,9]. Such genetic management is a highly effective conservation tool; when appropriate sources of genetic augmentation are used, outcrossing increases fitness in the vast majority of inbred populations that usually outbreed, and an enhanced ability to adapt can greatly increase population persistence in changing environments [10].

Despite its demonstrated effectiveness, genetic rescue has not been widely implemented, partly owing to concerns about 'genetic swamping' and a consequent loss of local adaptation and distinctiveness [11–13]. Genetic swamping can be defined as the replacement of a population's locally adaptive genetic variation with that of immigrants [13]. This definition may be extended to include apparently neutral alleles within populations because distinguishing adaptive from non-adaptive variation can be extremely challenging, and such variation can contribute to future adaptive potential [14]. An extreme example of genetic swamping occurs in some hybridising species, in which hybrid vigour (heterosis) causes the extirpation of the original locally adapted population and their alleles [15]. A subtler form of genetic swamping occurs when locally unique genetic variation is replaced during the immigration of conspecifics [16]. Concerns about genetic swamping tend to promote conservation strategies that maintain genetic distinctiveness of isolated populations rather than evaluating alternative management options in a risk-benefit framework, but a different strategy might have a better outcome for the stated conservation goals [17]. On one hand, species-wide genetic variation could be lost if translocation strategies cause the replacement of locally unique alleles, potentially reducing the adaptive potential and increasing extinction risk [13,18–20]. Conversely, the genetic distinctiveness of small and isolated populations is often an artefact of genetic stochasticity rather than local adaptation, and if 'uniqueness' reflects harmful variation that would have been removed by selection in larger and/or more connected populations, maintaining the distinctiveness of small populations can even exacerbate extinction risk [21].

Effective population sizes (*Ne*) >100 are needed to limit the mean inbreeding coefficient (the probability that any two alleles are identical by descent) to *F* < 0.1 over five generations [10]. Smaller and more inbred populations need high supplementation rates to reverse inbreeding, which could elevate the risk of swamping [1]. Nevertheless, alleviating inbreeding depression while avoiding the replacement of the local gene pool may be achievable by appropriately managing gene flow during genetic rescue. Even under considerable gene flow during intensive genetic rescue, selection can maintain locally adaptive genetic variation in the face of potential swamping [22]. However, selection is generally weaker in small populations because genetic drift has a greater influence on allele frequencies [23]. Preserving genome-wide variation, including variation that is currently neutral, also enhances future adaptive potential [14,24,25]. Thus, it is of interest to estimate how much locally unique genome-wide variation will persist in a population under genetic rescue, especially if candidate populations for genetic rescue are small. Accordingly, poor understanding of swamping during genetic rescue is a significant knowledge gap [13].

Population demographic and genetic models that account for inbreeding depression and the genetic contribution of outbred immigrants could be useful for designing genetic rescue regimes that help retain locally unique alleles. Forward-based computer simulations can project the changes in the genetic profile of populations under different demographic and population genetic scenarios [26]. Such simulations have been used to estimate the impacts of inbreeding depression and predict changes in the viability of inbred populations after genetic rescue [14,27]. PVAs can be used to predict how much of the genetic variation from a source population can be retained when founding a new population or supplementing an existing population [28–30]. However, the preservation of alleles unique to a local population that is supplemented with unrelated individuals has been less scrutinised, and fitness disparities between outbred immigrants and inbred locals can cause the replacement of local variation [31]. Incorporating inbreeding and inbreeding depression into models accounts for the reduced contribution to the gene pool of inbred locals relative to immigrants. Models that consider locally unique variation and the lower fitness of inbred locals relative to immigrants could therefore be useful for designing genetic rescue programs that improve population viability without replacing the local gene pool.

Leadbeater's possum (*Gymnobelideus leadbeateri*) is a critically endangered Australian marsupial and a strong candidate for genetic rescue. Genetic rescue of this species could be informed by population viability analysis (PVA) modelling the effect of various translocation scenarios on the retention of unique alleles when the recipient gene pool has reduced fitness due to severe inbreeding. The species has suffered extensive loss and deterioration of habitat caused by vegetation clearing, bushfires, and logging, and has a highly restricted distribution (Figure 1) within a 70 × 95 km area within the Central Highlands of Victoria, Australia, with a neighbouring lowland isolate [32,33]. The single population of Leadbeater's possum outside of the Central Highlands is localised entirely within a narrow 6 km stretch of lowland swamp forest habitat in the Yellingbo Nature Conservation Reserve [34]. The Yellingbo population is genetically distinct from highland populations and is thought to represent the sole remnant of a previously widespread evolutionary lineage adapted to lowland swamp [32,35,36]. The geographic isolation of the Yellingbo population makes contemporary gene flow with other populations highly unlikely [36]. This population had greatly reduced genetic diversity compared to all conspecific populations and exhibited substantial inbreeding depression for survival to sexual maturity, longevity, probability of breeding during lifetime, and lifetime reproductive output [37]. Inbreeding depression likely contributed to the decline from ~120 individuals in 1997 to 35 individuals in 2017. Heterozygosity concurrently declined by 12%, putatively limiting the capacity of the population to adapt to environmental change and prevent the expression of deleterious variation [37]. Furthermore, suitable habitat for Leadbeater's possum has declined in extent and quality, mainly because of habitat succession and altered hydrology [32]. Based on monotonic decline in population size, presence of inbreeding depression, and deteriorating habitat condition, the extinction of the Yellingbo population is highly likely without intervention [37,38]. Translocation of lowland Yellingbo individuals to high-quality habitat beyond Yellingbo and outcrossing with genetically diverse highland animals is planned to prevent the extinction of this population. Sourcing migrants for genetic rescue of the lowland population from highland populations is the only available option, which has been assessed to have a low risk of outbreeding depression [37]. Although genetic variation unique to the lowland population at Yellingbo has been characterised with microsatellite markers and short mitochondrial sequences [36], it was not assessed with genome-wide markers, which should better approximate functional and standing variation [24,25]. Doing so would be an important step in planning translocation strategies that do not replace the local gene pool.

We aimed to design a translocation regime of lowland and highland Leadbeater's possums into new habitat that would rapidly increase population size, demographic stability, and genetic diversity, while providing for the retention of unique lowland alleles. We predicted the level of retention of alleles unique to the lowland population, incorporating inbreeding depression to reflect the potential fitness advantage of translocated outbred individuals and their descendants over inbred locals. We used genome-wide SNP markers to detect genomic variation unique to the lowland population, which was then monitored across simulated scenarios. We also used genome-wide SNPs to model inbreeding and genetic diversity within simulated populations. The following objectives were targeted:


Our results will inform genetic rescue of the last extant population of the lowland lineage of this critically endangered species. More generally, our approach for design-

ing genetic rescue regimes to retain locally unique variation should be useful for other conservation programs.

**Figure 1.** Map of sampling locations for genetic samples used in this study (red dots). The blue dots are observations of the species from 2010–2020, indicating the contemporary range. The extent of the sampling area is indicated by the red rectangle on the inset. Contour lines indicate elevation above sea level in metres.

#### **2. Materials and Methods**

#### *2.1. Study System and Sampling*

The lowland population at Yellingbo was extensively surveyed from 1997–2019, with mark–recapture data available for >70% of the population in any given year [37,39]. These data provided estimates of key demographic parameters, such as mortality and reproduction rates (see below). Genetic samples (*n* = 304) were collected from Yellingbo during surveys from 1997 to 2001 and 2011 to 2019 [36,37]. A further 216 individuals were genetically sampled throughout the species' range in the Central Highlands (Figure 1), against which to estimate the locally unique variation at Yellingbo. These included 113 individuals from Lake Mountain—the thoroughly sampled population used to represent allele frequencies of a donor highland population in simulations. For most samples, a small (~2 mm2) tissue biopsy of skin and cartilage was taken from the outer pinna of the ear and stored in ethanol prior to extraction. Skin samples were also taken from animals found dead or from the pelts of preserved museum specimens.

#### *2.2. Defining Conservation Targets*

Conservation programs guided by PVAs require clearly defined criteria of success [40]. We defined a strategy as successful if it resulted in a population of at least 110 individuals. This target was chosen to recreate apparent demographic stability observed at Yellingbo in the 1990s before the rapid deterioration of habitat conditions and concurrent population decline. The limited availability of lowland habitat also means that founded populations must necessarily be <120 individuals in size, with translocations maintaining population size and genetic diversity. This target represents a first (and urgent) milestone to prevent the extinction of the lowland population; however, further population expansion would

be required thereafter. We assumed that genetic rescue would begin with 20 Yellingbo founders, which represents 57% of the population in 2019.

#### *2.3. General Assumptions of the PVA Model*

All PVAs were performed using the software package Vortex v.10.3.5.0 [41], an individual-based Monte Carlo simulation program that models the impact of deterministic and stochastic forces on animal populations. Demographic, environmental, and genetic processes were incorporated by inputting parameter estimates for mortality and fecundity, along with environmental variability. For a range of scenarios (Table 1), the program tracked individuals from birth to death, with reproduction and mortality occurring at pre-defined probabilities, derived from actual population data (below). Environmental catastrophes were not included in the model. Due to the uncertainty of environmental effects on fitness, environmental effects were minimised by setting the standard deviation for mortality and reproduction parameters to 10% of their mean value (Table 2). This was possible only for the parameters modelled using mean and standard deviation values.

**Table 1.** Scenarios simulated in Vortex. Initial lowland population size (*N*), carrying capacity (*K*), whether or not the starting population was made inbred by providing a kinship matrix (Inbred), whether inbreeding depression was modelled, and whether estimated or adjusted values for juvenile (0–1 y) mortality were used is indicated. Supplementation rate is the number of individuals supplemented (by translocation) into the population per year. Supplementation duration is the number of years the population was supplemented. Simulation duration is the number of years simulations were run. Source is the population that animals were sourced from for supplementation.


**Table 2.** Biological parameters used in Vortex simulations. The study that provided each parameter estimate is given.


**Parameter Value (SD) Source** 1 litter 7 [44] 2 litter 93 [44] Distribution of offspring per litter 1 offspring 45 [39] 2 offspring 55 [39] Mortality 0–1 years (males) 42 (4.2) † [44] 0–1 years (females) 41 (4.1) † [44] 1–2 years (males) 17 (1.7) [44] 1–2 years (females) 22 (2.2) [44] Annual >2 years (males) 29 (2.9) [37] Annual >2 years (females) 29 (2.9) [37] Percent males in breeding pool 100

**Table 2.** *Cont.*

\* Up to three litters per year can be produced; however, this only occurs if a previous litter is lost. † 6% was added to early mortality estimates because these did not include partial litter loss, except for *1990s Condition* and *2019 Trajectory* (main text).

> Genetic data were incorporated into all but one analysis (*1990s Condition* being the exception) as the input allele frequencies and pairwise kinships among individuals (detailed methods below). Alleles were assumed to be neutral and randomly assorting. Vortex uses pairwise kinships to calculate inbreeding coefficients for individuals and their offspring. Inbreeding depression was included in some models as a reduced survival probability in the first year of life (the default in Vortex). Inbreeding depression is caused by an increased expression of deleterious recessive alleles and a loss of heterozygote advantage with increasing autozygosity [45]. Lethal equivalents represent recessive alleles that compromise fitness when homozygous in an individual, with a single homozygous lethal equivalent imposing a fitness cost equivalent to the death of an individual [3]. Decreases in individual survival probability are determined by the number of lethal equivalents, the relative contribution of lethal recessive alleles and the heterozygote advantage to inbreeding depression, and the individual inbreeding coefficient. The default value of 6.29 diploid lethal equivalents for first year survival was used. This value was used in preference to estimates of lethal equivalents in the lowland Leadbeater's possum population obtained in a previous study because these estimates were found to have wide confidence intervals and varied between different inbreeding coefficients [37]. Vortex randomly assigns lethal equivalents to each founder at the beginning of simulations. In Vortex, homozygosity for any lethal recessive always results in mortality in the first year of life. The relative contribution of lethal and sub-lethal recessive alleles to inbreeding depression is not well understood, and therefore, we specified 50% inbreeding depression due to lethal recessive alleles, a classic empirical estimate from fruit flies [43]. The reduction in first year survival caused by reduced heterozygote advantage is calculated by applying an exponential equation that includes the inbreeding coefficient of an individual.

#### *2.4. Genotyping*

DNA was extracted by salting-out [46] or using the DNeasy Blood and Tissue Kit following Qiagen's (the manufacturer) instructions. The preparation of reduced-representation genome libraries, sequencing, and the discovery of genome-wide single nucleotide polymorphism loci (SNPs) was carried out using the DArTseq complexity reduction methodology of Diversity Arrays Technology [47] and described in a previous study [37]. This generated 13368 SNP loci that were then filtered for downstream analysis using the *R* package *dartR* [48], as explained below. The function *gl.sexlinkage* was used to detect

putatively sex-linked loci, which were removed from the dataset. Loci containing more than one SNP were thinned to retain the SNP with the highest polymorphism information. Locus reproducibility was determined during marker discovery from technical replicates representing 25% of all data, and loci with average reproducibility <100% were removed from the dataset to minimise error rates.

Three genetic datasets were then created by sub-setting the data by sampling location and applying different quality-filtering criteria in *dartR* to meet the requirements of each downstream analysis. The first dataset of 4218 loci, created by retaining individuals from all sampling locations and removing loci with a call rate of <85%, was used to detect alleles unique to the lowland population at Yellingbo. The second dataset of 1066 loci, created by retaining only the Lake Mountain and Yellingbo populations and removing loci with a call rate of <100%, was used during simulations to represent genetic variation at Yellingbo and a thoroughly sampled representative highland population. The third dataset of 1076 loci, created by retaining only Yellingbo individuals and removing loci with a call rate of <99%, was used to calculate pairwise kinships at Yellingbo.

#### *2.5. Detecting Alleles Unique to the Lowland Population*

Alleles occurring only within the lowland population at Yellingbo (locally unique alleles) were detected by comparing the allele frequencies of all highland individuals pooled together with allele frequencies of lowland individuals sampled at Yellingbo from 1997–2019. The package *poppr* [49] was used to report unique alleles and counts of each locally unique allele in the population. Loci with locally unique alleles found as a single copy in only one Yellingbo individual were removed, as these could represent genotyping errors or rare deleterious mutations. Allele frequencies calculated for loci with locally unique alleles for Yellingbo individuals sampled in 2011–2019 were used in genetic rescue simulations.

#### *2.6. Pairwise Kinship Estimation*

Inbreeding among individuals in the population was estimated using pairwise kinships, which measure the proportion of the genome that is identical by shared ancestry between pairs of individuals [50]. Kinships calculated using genetic markers often better represent genomic similarity than do pedigree-derived estimates [51]. Pairwise kinships were calculated with the *beta.dosage* function in the *R* package *hierfstat* [52] using 1076 loci and all Yellingbo genotypes from 1997 to 2019. This kinship estimator has a mean of zero for the focal population, with negative values representing below-average kinship and positive values representing above-average kinship [53]. Vortex requires positive kinship values that represent the probability that any two homologous alleles drawn from a pair of individuals are identical by descent [41]. Therefore, we used the transformation described previously [54] to convert calculated kinships to proportions, with the resulting pairwise kinship value for the least-related pair equal to zero. Kinship values for the 2019 population were incorporated into simulations as a matrix. By default, Vortex regards non-genotyped individuals as completely outbred and unrelated, which is unrealistic. Accordingly, five non-genotyped individuals were represented in the kinship matrix with pairwise kinship values set to the mean calculated for the 2019 individuals.

#### *2.7. Survival and Reproduction Parameters Used in Vortex Models*

The Vortex models were populated with realistic parameter values (Table 2) using published estimates of reproduction and mortality at different life stages [39]. The age classes for which that study calculated mortality differed from those required by Vortex. As such, mortality from 0 to 1 years was calculated as the sum of total litter loss (9%) for both sexes, plus mortality for each sex from when juveniles first emerged from the pouch to 12 months old. The mean annual adult (>2 years old) mortality rate was estimated by averaging the annual estimates from 1996 to 1999.

Based on the observation of a relatively stable Yellingbo population at ~110 individuals in the 1990s, simulations using realistic demographic parameters were expected to yield a demographically stable population—that is, one that did not change in size by more than 10% over 50 years in the absence of inbreeding depression or environmental catastrophes (see scenario *1990s Condition* below). However, simulations using published mortality estimates resulted in a rapidly growing population (Supplementary Figure S1). To achieve demographic stability, we increased the mortality rates for individuals aged 0 to 1 years by 6% for both sexes to 42% in males and 41% in females. Actual mortality at <3 months old was likely underestimated in the empirical studies because the initial number of young present in the pouch at birth (one or two) could not readily be established, preventing a reliable estimation of partial litter loss [39]. Accordingly, increasing mortality at age 0–1 years by 6% to achieve demographic stability was deemed a realistic adjustment.

#### *2.8. Sensitivity Analysis*

The sensitivity of PVAs to input parameter estimates was investigated using perturbation analysis—using simulations of scenario *1990s Condition* (with no inbreeding depression) while increasing or decreasing a single input parameter by 10% relative to its baseline value (Table 2). The average growth rate (*r*) was calculated from 200 replicate runs. The growth rate was used to assess model sensitivity because other parameters, such as survival probability and population size, have theoretical limits of zero and/or one, which limits their usefulness for estimating effect size. The relative sensitivity of the model to each parameter was calculated with the following formula: (*r*<sup>+</sup> – *r*−)/(0.2 × *r0*), where *r*<sup>+</sup> is the mean *r* with the parameter increased by 10%, *r*<sup>−</sup> is the mean *r* with the parameter decreased by 10%, *r*<sup>0</sup> is the mean *r* using the baseline parameter value, and 0.2 is equal to the perturbation in the parameter value. These estimates indicated whether certain parameters had a disproportionate effect on models, which can make predictions less reliable if the parameter estimates are inaccurate.

#### *2.9. Simulated Scenarios*

Five general scenarios were tested, which are outlined in Table 1 and described in detail below. For all scenarios except *1990s Condition*, the starting population was specified using a studbook file (a list of founder individuals and their characteristics in Vortex) that included the age and sex of each individual at Yellingbo in 2019. Relationships among individuals were not provided in the studbook file. Instead, a kinship matrix was also provided for all scenarios except *1990s Condition* to estimate individual inbreeding. For scenarios in which the lowland population was supplemented with highland individuals, the population started with the same set of 20 (10 male and 10 female) Yellingbo founders that were randomly sampled from the studbook containing 35 Yellingbo individuals in 2019. Translocated animals had the same mortality and reproduction rates as the recipient population once added to the population (not including inbreeding depression effects), and all were assumed to survive translocation. For each simulated scenario, the extinction probability was estimated as the proportion of replicate runs that went extinct, with extinction defined as only individuals of one sex remaining; this definition was used in all subsequent simulations. Mean and standard deviation for output parameter estimates were generated using 200 replicate iterations for each scenario. Increasing the number of replicates above 200 did not decrease the standard error of estimates. Simulation outputs included observed heterozygosity, population size (*N*), growth rate across all years (*r*), and time to reach *N* = 110.

(i) 1990s Condition: The effect of limited carrying capacity and inbreeding on population size at Yellingbo

The influence of the limited carrying capacity and inbreeding on the lowland population size at Yellingbo since the 1990s was determined. A population of 110 individuals with all pairwise kinships equal to zero was simulated to represent the population in the 1990s, varying the carrying capacity to *K* = 120 and *K* = 1000. These scenarios were run

using actual juvenile mortality estimates rather than those increased by 6% to achieve a demographically stable baseline in other scenarios. Simulations at these carrying capacities were run with and without inbreeding depression.

(ii) 2019 Trajectory: The forward projection of the population size of lowland possums at Yellingbo without genetic rescue

The time until extinction of the lowland population at Yellingbo without genetic rescue was estimated with simulations, starting with 35 lowland individuals alive in 2019 in the absence of supplementation and specifying pair-wise kinships, including inbreeding depression and as actual juvenile mortality rates (as for the *1990 Condition* above). These simulations were carried out using two different carrying capacities; *K* = 40, representing the currently degraded habitat conditions, and *K* = 120, representing higher quality habitat present in the 1990s.

(iii) Demographic Rescue: The effect of numerical population reinforcement on population growth

Scenarios were run to separate the effect of supplementation on population growth (i.e., demographic rescue) from the effect of genetic rescue. The effect of supplementation on population size was estimated by introducing migrants from a donor population genetically similar to the lowland population. The donor population was made genetically similar to that of Yellingbo by setting non-zero kinships between Yellingbo and donor individuals. Kinship values were set at the beginning of each run by randomly sampling from a Poisson distribution with a mean equal to the average pairwise kinship between individuals at Yellingbo (mean = 0.5, the kinship matrix is provided in the Supporting Data). This was also done for pairwise kinships between individuals within the donor population to make it similarly inbred to Yellingbo. Input allele frequencies for this donor population were identical to those of the lowland population.

(iv) Genetic Rescue: The supplementation of a new lowland population with highland possums

All genetic rescue scenarios were carried out assuming that a new population would be founded using 20 lowland individuals relocated from Yellingbo. To represent the founding of a new population in high-quality habitat, a carrying capacity of 1000 was specified in genetic rescue scenarios. Scenarios were run to determine the minimum supplementation rate required to establish a demographically stable (*N* changing by <10% over 50 years) population of at least 110 individuals, starting with 20 lowland founders (Table 1). This target size was chosen to recreate the apparent demographic stability observed at Yellingbo in the 1990s and early 2000s [39]. The number of supplemented individuals increased by increments of two (one of each sex) to maintain equal sex ratios during simulations.

For genetic rescue scenarios, a highland population was created using allele frequencies calculated for the Lake Mountain population. These highland individuals were specified as unrelated to each other, and unrelated to lowland individuals, by setting pairwise kinships to zero. This enabled the simulation of genetic rescue by introducing outbred individuals. The size of the highland population was set to 1000 individuals with the mean and standard deviation set to 1 for survival at all life stages to ensure that the population remained sufficiently large for translocation and maintained genetic diversity for the duration of simulations. All other parameters were the same as those used for the lowland population at Yellingbo.

Genetic rescue effects were measured as changes in the mean observed heterozygosity and the mean population size from the first to the last year of simulation, the mean growth rate for all years (*r*), and the mean time to reach *N*=110. Changes in genetic diversity in the rescued population were modelled by specifying the observed allele frequencies for 1066 loci for the highland population and lowland individuals alive in 2019, and using these to calculate heterozygosity for each year of simulation.

(v) Genetic Swamping Test: Determining whether a higher supplementation rate risks the replacement of locally unique alleles

Vortex outputs summary statistics for loci only at the end of simulations. Thus, allele frequencies and allele retention probabilities were estimated in a time series; genetic rescue scenarios were repeated with simulations that ran for 10, 20, 30, and 40 years. These values were compared between different supplementation rates to determine whether higher rates of gene flow reduced the retention of locally unique alleles. Whether unique lowland allele retention probabilities and frequencies differed significantly among supplementation rates was determined using aligned rank transformation analysis of variance (ART-ANOVA) in the *R* package ARTool v.0.10.7 [55]. This test is suitable for non-parametric data with paired observations with categorical predictors. Generalised linear models were fit with ARTool with the mean retention probability or frequency as the response, the translocation rate as a predictor, and with data paired by specifying a random intercept for each allele.

Only 23 loci with locally unique lowland alleles were detected (see Section 3.4). Because such a small number of locally unique alleles would make it difficult to predict the effect of the starting allele frequency on retention probability, we used 230 alleles with frequencies replicating those of the 23 observed unique alleles. Frequencies for these unique alleles were set to zero in the highland population. All alleles were assumed to be independently assorting and selectively neutral. The retention probability for each unique lowland allele was given by the proportion of replicate runs in which it did not go extinct.

#### **3. Results**

#### *3.1. Sensitivity Analysis*

Perturbation analysis indicated that no single input parameter had a greatly disproportionate effect on the population growth rate (Table 3). Annual adult mortality (2+ years) had the greatest relative impact on the population growth rate, closely followed by the litter size (average number of individuals per litter). Point values for *r* at each parameter value are given in Supplementary Figure S4.

**Table 3.** Sensitivity analysis output. The relative sensitivity of PVAs to different input parameters based on perturbation analysis.


#### *3.2. Trajectory of the Lowland Population without Genetic Management*

In the absence of inbreeding depression, simulations using demographic rates estimated in the 1990s showed that population growth was limited by the carrying capacity (Figure 2). A large carrying capacity of 1000 resulted in positive growth (*r* = 0.02, SD = 0.08) for the 50-year simulated timeframe. Limiting the carrying capacity to a more realistic value of 120 resulted in population decline (*r* = −0.013, SD = 0.10) for the duration of the simulations. When inbreeding depression was incorporated into the models that assumed that individuals in the starting population are unrelated (MK = 0), the population declined regardless of whether the carrying capacity was 1000 (*r*= −0.028, SD = 0.12) or 120 (*r* = −0.039, SD = 0.15), although in the former case, it initially increased for half of the simulation period (Figure 2).

**Figure 2.** Simulated change in the mean population size ± 1 SD for 200 replicate simulations using mortality estimates for lowland individuals at Yellingbo in the 1990s, with inbreeding depression either excluded or included in models. Simulations were carried out with the carrying capacity (*K*) set to 120 to reflect the limited habitat availability in the reserve, and 1000 to determine whether greater habitat availability could yield a demographically stable population without inbreeding depression. Simulation settings are detailed in Table 1 (*1990s Condition*).

Simulations of the Yellingbo population from 2019 onwards without genetic rescue and with the level of inbreeding informed by genetic estimates (kinship matrix and 2019 allele frequencies) resulted in extinction probabilities of 50% within 11 years, and 100% within 23 years (Figure 3). The mean time to extinction was 11.7 years (SD = 3.4) with a growth rate (*r*) of −0.20 (SD = 0.24) regardless of the carrying capacity (*K* = 40 or *K* = 120). This suggests that the lowland population will likely be extinct within two decades without genetic rescue, even if habitat quality is improved.

**Figure 3.** The mean population size (±1 SD) and probability of extinction averaged across 200 replicate simulations, beginning with 35 individuals in Yellingbo with inbreeding depression effects included in the models. The extinction probability is given as the proportion of simulations that had gone extinct by each year, with extinction defined as only individuals of one sex remaining. Simulation settings are given in Table 1 (*2019 Trajectory*).

#### *3.3. Effect of Genetic and Demographic Rescue on Lowland Population Recovery*

Mean population size at the end of simulations of genetic rescue scenarios increased with the supplementation rate (Figure 4). Supplementation with 10 highland individuals per year was the only scenario to reach the target size within 10 years, taking an average of 8.27 (SD = 2.0) years with a growth rate of *r* = 0.07 (SD = 0.1). This was closely followed by supplementation with 8 individuals per year, which took 11 years to reach the target size with a growth rate of *r* = 0.06 (SD = 0.09).

**Figure 4.** The mean population size ± 1 SD and heterozygosity ± 1 SD for simulations translocating two to ten individuals per year into the inbred lowland population, beginning with 20 lowland founders. Inbreeding depression as included in the models and K=1000. For genetic rescue (GR) scenarios, unrelated individuals were translocated from a simulated highland population. For demographic rescue (DR) scenarios, individuals were introduced with the same allele frequencies and mean pairwise kinships as individuals in the lowland population in 2019. Simulation settings are given in Table 1 (*Genetic Rescue* and *Demographic Rescue*).

Alleviation of inbreeding had an effect distinguishable from merely increasing population size; demographic rescue scenarios (supplementation from the genetically Yellingbolike population) failed to produce positive population growth in simulations regardless of the supplementation rate (Figure 4). The highest mean population size for the demographic rescue scenarios was 102 individuals and was reached after 49 years at a supplementation rate of 10 individuals per year. Varying the contribution of lethal recessive alleles from 0–75% did not substantially alter the simulation outcomes, although increasing the contribution of lethal alleles unrealistically to 100% resulted in much higher population viability and larger genetic rescue effects (Supplementary Figure S2).

Mean heterozygosity across replicate runs also increased with the supplementation rate for genetic rescue. The highest mean heterozygosity reached in simulations was 0.177 (SD = 0.001), at a supplementation rate of 10 individuals per year. This was the highest value possible, given the genetic diversity of the highland population (heterozygosity = 0.18). Maximum heterozygosity was reached sooner with higher rates of gene flow. For genetic rescue, all translocation rates reached at least 90% of the source population's heterozygosity within ten years. As with the population size, the increase in genetic diversity from genetic rescue was distinguishable from that of demographic rescue. The highest heterozygosity for any

demographic rescue scenario was 0.092, which was the highest possible, matching the genetic diversity at Yellingbo.

#### *3.4. Locally Unique Alleles in the Lowland Population at Yellingbo*

Out of 4218 loci examined, 23 loci (0.55%) were found to harbour alleles that occurred only in the lowland population. This excluded 70 alleles that occurred in the heterozygous state in a single individual and were removed; including these alleles would increase the proportion of loci with unique lowland alleles to 2.2%. A large proportion of the locally unique alleles were very rare; out of 23 locally unique alleles, 5 (22%), were found at frequencies below 1% in the lowland population (Figure 5a). When all Yellingbo samples from 2011–2019 were considered together, the highest recorded frequency for any unique allele was 57.7%. The highest frequency of any unique allele in 2019 was 19.9%. Of the 23 alleles detected from 2011–2019, 13 (57%) were not detected in 2019 (noting that 14% of individuals were not genotyped in this year).

**Figure 5.** (**a**) Frequency distribution of locally unique alleles in the lowland population in 2019. Counts of alleles are expressed as a proportion of all alleles in the dataset (*n* = 4218). Frequencies of locally unique alleles from 2011 to 2019 are shown in Supplementary Figure S3. (**b**) Mean ± 1 SD retention probability for locally unique alleles of any initial frequency during 50 years of simulated genetic rescue. Simulations began with 20 Yellingbo individuals, supplemented with different numbers of highland individuals per year (Rate). Probability and frequency values were output every ten years, with values calculated from 200 replicate runs of each scenario. A carrying capacity of 1000 was specified.

#### *3.5. Risk of Decreased Retention of Locally Unique Alleles from Genetic Rescue*

The overall mean retention probability for locally unique alleles was 0.29 (SD = 0.01) at 10 years of genetic rescue, declining to 0.06 (SD = 0.07) at 50 years (Figure 5b). When considered separately, the retention probability for alleles with starting frequencies >10% (Figure 6a) during genetic rescue were somewhat higher, with the highest mean retention probability after 10 years of genetic rescue being 0.68 (SD = 0.11) and the lowest 0.54 (SD = 0.10). After 50 years of genetic rescue, the highest retention probability at any translocation rate was 0.21 (SD = 0.05). For locally unique alleles with starting frequencies <10% (Figure 6b), the highest mean retention probability at 10 years of genetic rescue was much lower at

**Figure 6.** Mean frequencies ± 1 SD and the probability of persistence ± 1 SD of locally unique alleles with starting frequencies of >10% (**a**) and <10% (**b**) in the lowland population. Simulations began with 20 Yellingbo individuals, supplemented with different numbers of highland individuals per year (Rate). Frequency and probability of retention values were output every ten years, with values calculated from 200 replicate runs of each scenario. A carrying capacity of 1000 was specified.

Higher supplementation rates did not produce lower retention probabilities of locally unique alleles during genetic rescue but instead yielded overall higher probabilities (Figure 5). After 20 years of genetic rescue, retention probabilities were substantially higher when the translocation rate was increased from 8 to 10 individuals per year. The difference in retention probability at different translocation rates was statistically significant for all years: 10 years (F = 169.02, df = 4, *p* < 0.0001), 20 years (F = 238.2, df = 4, *p* < 0.0001), 30 years (F = 221.8, df = 4, *p* < 0.0001), 40 years (F = 312.75, df = 4, *p* < 0.0001), and 50 years (F = 393.9, df = 4, *p* < 0.0001). Differences between the translocation rates were also statistically significant for alleles with starting frequencies of <10%: 10 years (F = 64.52, df = 4, *p* < 0.0001), 20 years (F = 212.76, df = 4, *p* < 0.0001), 30 years (F = 115.78, df = 4, *p* < 0.0001), 40 years (F = 157.18, df = 4, *p* < 0.0001), and 50 years (F = 179.6, df = 4, *p* < 0.0001).

Despite higher retention probabilities, locally unique alleles were lower in frequency at higher rates of gene flow (Figure 6). After 10 years of genetic rescue, the highest mean frequency for alleles with a >10% starting frequency was 4.0% (SD = 1.0), which was observed at the lowest supplementation rate of two individuals per year. Using a rate of 10 individuals per year yielded a substantially lower mean frequency of 1.6% (SD = 1.6). For alleles starting at <10% frequency, the highest mean frequency was 0.8% (SD = 0.7), which was also observed at the lowest supplementation rate of the individuals per year. Using a rate of 10 individuals per year yielded a somewhat lower mean frequency of 0.4% (SD = 0.3). The difference in allele frequency at different translocation rates was statistically significant for all years: 10 years (F = 551.21, df = 4, *p* < 0.001), 20 years (F = 268.1, df = 4, *p* < 0.0001), 30 years (F = 143.72, df = 4, *p* < 0.0001), 40 years, (F = 30.5, df = 4, *p* < 0.0001), and 50 years (F = 8.9, df = 4, *p* < 0.0001). Differences between the translocation rates were also statistically significant for the initially rare alleles (<10%): 10 years (F = 64.52, df = 4, *p* < 0.001), 20 years (F = 105.2, df = 4, *p* < 0.0001), 30 years (F = 35.35, df = 4, *p* < 0.0001), 40 years, (F = 8.43, df = 4, *p* < 0.0001), and 50 years (F = 12.31, df = 4, *p* < 0.0001).

#### **4. Discussion**

Many threatened wildlife species have highly disjunct distributions with little to no natural gene flow between populations, and translocations are critical for their long-term demographic and genetic stability [21]. Reintroducing populations within the historical range of a species or establishing populations beyond the historical range in low-threat areas (such as predator-free islands or reserves) are increasingly common [30,56,57]. Using predictive models to guide the founding of isolated populations with ongoing translocations is critical to maximising species-wide genetic variation [58]. We demonstrated how PVAs that incorporate inbreeding and inbreeding depression can predict how much locally unique variation can be retained while genetic rescue increases overall genetic variation.

PVAs predicted a very high probability that the lowland population of Leadbeater's possum will go extinct within two decades (100% in 23 years) without any intervention. As environmental impacts on mortality and reproduction were minimised in simulations to make inferences based primarily upon genetics, this is likely an overestimate of time to extinction. Inbreeding depression strongly increased population decline in simulations, although a low carrying capacity was shown to curtail population size, even in the absence of inbreeding depression. This highlights the importance of providing lowland individuals with high-quality habitat to achieve population growth under genetic rescue. Reinforcing the population with only Yellingbo individuals (demographic rescue) did not sufficiently support population growth to reach our target size of 110 individuals, even using the largest translocation rate of 10 individuals per year. We demonstrated that a modest portion of neutral locally unique variation detected in the lowland population can likely be preserved through a combination of population expansion and outcrossing. Approximately 26% of unique lowland alleles occurred at frequencies above 10%, which had a >50% probability of persistence after 10 years of rescue.

There were several key assumptions in our PVAs that could bias estimates of genetic rescue and allele retention. Random mating among highland and lowland individuals was assumed in all scenarios. Preferential mating among individuals originating from similar habitat or populations can impede outcrossing attempts [59]. For Leadbeater's possum, this could cause translocated highland individuals to increase in abundance and supplant lowland individuals, increasing the risk of genetic swamping. This may be prevented by the careful choice of translocation sites, maximising the probability of highland-lowland pairings. Assortative mating among lowland individuals could also subvert genetic rescue efforts. However, females of some species become more selective in their mate choice if they are inbred, developing a preference for outbred males [60]. For lowland possums, this would result in a preference for highland mates, potentially

increasing admixture rates and improving locally unique allele retention rates by increasing the fitness of individuals carrying such alleles, as well as increasing population growth and thus, reducing genetic drift. Our PVA models also assumed that the currently observed unique genetic variation is selectively neutral. If locally unique alleles are favourable, selection will increase their likelihood of persistence and frequency more so than if they are neutral. The effect of selection is expected to increase over time with increasing population size [22]. Alternatively, lowland-specific genetic variation may be maladapted if lowland individuals are relocated to a habitat that differs from Yellingbo, in which case, this variation would be selected against. All alleles were also assumed to be unlinked, whereas linkage can promote allele retention by genetic hitch-hiking of neutral alleles linked to loci under selection [31].

PVAs predicted that the supplementation rates of up to 10 individuals per year for 8 years will likely be necessary to meet population recovery targets. Although this supplementation rate is substantial, lower rates and less frequent supplements will likely be sufficient in reality to achieve the target population size. This is because the demographic rates modelled were estimated from data collected from the Yellingbo population during the 1990s, at which time the population was already inbred, having roughly half the genetic diversity of highland populations [36]. Therefore, the positive effects of genetic rescue and immigration will likely be greater than those estimated in this study, and fewer translocations might be necessary to provide the desired population benefits. Furthermore, Vortex does not account for a fixed genetic load in recipient populations [41], which means that fitness could be further improved by increasing heterozygosity at loci with fixed (100% homozygous) deleterious alleles. Using assumptions similarly optimistic as ours, such as completely unrelated migrants, population growth projections were still underestimated by Vortex relative to the real data for translocations in another marsupial species (*Bettongia penicillata ogilbyi*) [61]. Thus, it is conceivable that the required rate of translocation estimated in our study is an overestimate. Conversely, a higher rate of supplementation may be necessary given that all individuals in our simulations were assumed to have the same mortality rate as local individuals. This is unrealistic, given that elevated post-translocation mortality is common in wildlife [62]. Genetic rescue should be implemented within an adaptive management framework, with post-translocation monitoring data incorporated into the models to make updated predictions [63,64]. Ongoing monitoring of fitness will also enable incorporation into future models of unlikely but possible effects of outbreeding depression and maladaptation. The genetic diversity and viability of source populations must also be considered, and further PVAs can help predict and mitigate the negative impacts of harvest [58,65].

The sensitivity of our PVAs to different parameter estimates suggests that management interventions that improve survival and reproductive output could facilitate more rapid population growth during genetic rescue. In particular, the proportion of adult females that were breeding and adult mortality had somewhat larger effects on growth rates than other parameters. Establishing colonies with translocated individuals and unpaired local possums could increase the proportion of females breeding in any given year. Adult mortality could be decreased by feral predator elimination measures, given that predation by feral cats and foxes is a known cause of mortality in Leadbeater's possum [66], and a common cause of translocation failure in Australian mammals [67]. Such measures could reduce the amount of supplementation required to reach a target population size.

Lower retention of locally unique lowland alleles at higher rates of supplementation with highland individuals was not observed in any of the simulated scenarios. However, locally unique alleles did decrease in frequency at higher supplementation rates, which is likely an effect of higher proportions of immigrants diluting neutral allele frequencies. A higher retention probability under more intensive gene flow was presumably caused by weaker genetic drift for scenarios where the population size increased more quickly. Thus, under the conditions used in simulations for the lowland population of Leadbeater's possum, genetic drift poses a greater threat to the preservation of unique genetic variation

than dilution at higher rates of translocation. The number of lowland and highland founders will be a strong determinant of the number of alleles retained. Based on allele retention models in another translocated marsupial, using 20 founders is expected to retain ~60% of rare (frequency = 0.05) alleles and using at least 60 founders is expected to retain ~90% of rare alleles [30]. Minimising genetic and demographic stochasticity in small populations by rapid supplementation has been recommended in other species to maximise the effectiveness of population reinforcement [68]. Our study suggests that this principle could also apply to genetic rescue and the preservation of locally unique variation. Although the frequency of these locally unique alleles may be reduced at higher supplementation rates, this would be offset by the benefits of a larger population size (such as reduced stochasticity), and selection will increase their frequency if they present a fitness advantage.

Previous experimental investigations into genetic swamping effects following genetic rescue have focused on locally adaptive alleles [22,69]. However, genetic variation that is not currently subject to selection (such as cryptic variation that only alters phenotype under atypical circumstances) can become useful if environments change; such enhancement of evolutionary potential would preserve apparently neutral variation in the face of uncertain futures [14,70]. Although only a small proportion of alleles (0.55–2.2% of loci) were unique to the lowland population at Yellingbo and none were fixed, it is possible that they could contribute to local adaptation or evolutionary potential. Much unique variation will have been lost from the lowland population owing to the small population size over decades, with an observable reduction in the number of locally unique alleles from 2011 to 2019 (Supplementary Figure S4). Nonetheless, locally adaptive genetic variation can persist in the presence of strong genetic drift [71]. The detection of locally unique variation is also dependent on marker type and density, with SNPs representing only one form of genetic variation. Other types of variation that contribute to local adaptation include structural and copy number variants [72]. Such variation is unlikely represented in SNP datasets, making the amount of unique variation among populations an underestimate when using SNPs alone. Deleterious alleles that are recent mutations tend to be prevalent at low frequencies [73], and many locally unique alleles detected were very rare (17% with <1% frequency). However, it is unclear whether the locally unique alleles detected could be deleterious, because intense genetic drift in the lowland population will have strongly influenced allele frequencies.

The contribution of ongoing post-rescue gene flow to demographic and genetic stability is another important management consideration. Although there is some potential that supplementing the lowland population with individuals from large populations will increase the genetic load, maintaining heterozygosity and minimising inbreeding via occasional gene flow will mask recessive genetic load. Some studies have cited pronounced population declines following the immigration of as few as a single individual as evidence that genetic rescue can further imperil small populations by introducing deleterious alleles that are more numerous in larger populations [74]. Fitness decline with insufficient genetic rescue is unsurprising because inbreeding (identity-by-descent) inevitably increases to pretranslocation levels in very small populations without ongoing gene flow [75]. Founding a new lowland population near existing Central Highlands populations could enable dispersal between populations, which would increase long-term population viability. Natural dispersal could also facilitate the spread of lowland genetic variation that is potentially adaptive to warmer conditions, improving the fitness of Central Highlands populations under climate change. The establishment of self-sustaining populations is a highly soughtafter goal in conservation management [76]. However, populations that are genetically isolated with *Ne* < 1000 will inevitably require at least occasional assisted gene flow if natural gene flow is not possible [1,58]. Early interventions that prevent extreme reductions in *Ne* and mitigating the underlying causes of a small population size are the best means of reducing dependency on conservation management.

#### **5. Conclusions**

We estimate that without intervention, all unique lowland Leadbeater's possum genetic variation will be lost due to extinction of the last population of the lineage within approximately two decades, whereas modest amounts of this unique variation can be preserved with genetic rescue. A higher proportion of locally unique variation is predicted to be retained by adding a greater number of highland animals each year with genetic rescue of the lowland population. Ongoing gene flow from highland populations will also be critical to maintaining long-term demographic and genetic stability within the lowland population. Although PVAs are based on many assumptions, they can form an integral component of adaptive genetic management and offer flexibility to incorporate post-translocation data to resolve uncertainties such as relative fitness among outbred and inbred individuals.

**Supplementary Materials:** The following are available online at https://www.mdpi.com/article/10.3 390/d13080382/s1, Figure S1. Population growth using values of mortality at 0–1 years estimated in the 1990's and using estimated values +6% to produce a demographically stable population in the absence of inbreeding depression. Figure S2. Effect of percent inbreeding depression due to lethal recessive alleles (as opposed to heterozygote advantage) on genetic rescue. Effects are shown as change in population size (N) with genetic rescue by supplementing Yellingbo founders with two highland individuals per year. Percentages of 0–75 inbreeding due to lethal recessives produced comparable changes in population size. Inbreeding depression 100% due to lethals produced much more positive population growth. Figure S3. Frequencies of locally-unique alleles at Yellingbo from 2011–2019, and 2019 alone. Figure S4. PVA input parameter sensitivity tests. Mean growth rate (r) from 200 replicate runs of 50-year simulations. All scenarios began with 110 individuals, excluded inbreeding depression, and used a carrying capacity of 1,0000. A baseline scenario was run using actual input parameter values (except with 0–1 yr mortality increased by 6%), which gave a stable population of approximately 110 individuals. Output values for this baseline scenario are shown by the dashed line. For each scenario shown on the x-axis, a single input parameter was either increased (Low) or increased (High) by 10% of its baseline value to determine its relative influence on simulation outcomes.

**Author Contributions:** Conceptualization, J.P.Z. and P.S.; methodology, J.P.Z., D.H., A.P. and P.S.; software, J.P.Z.; validation, J.P.Z., D.H., A.P. and P.S.; formal analysis, J.P.Z.; investigation, J.P.Z.; resources, P.S. and J.P.Z.; data curation, J.P.Z.; writing—original draft preparation, J.P.Z.; writing review and editing, J.P.Z., D.H., A.P. and P.S.; visualization, J.P.Z.; supervision, P.S and A.P.; project administration, J.P.Z. and P.S.; funding acquisition, P.S. and J.P.Z. All authors have read and agreed to the published version of the manuscript.

**Funding:** This research was funded by an ARC Linkage Grant LP160100482 with Partner Organizations Victorian Department of Environment, Land, Water, and Planning (DELWP), Diversity Arrays Technology, Zoos Victoria, Australian Capital Territory Environment, Planning & Sustainable Development Directorate, University of Canberra, and Western Australian Department of Biodiversity, Conservation and Attractions. Joanne Antrobus from Parks Victoria has provided substantial support to the long-term monitoring program for Leadbeater's possum at Yellingbo. J.P.Z. and D.H. were supported by Monash University internal funds and Research and Training Program Scholarships (formerly Australian Postgraduate Award), and awards from the Holsworth Wildlife Research Endowment. This research drew on materials from an Australian Academy of Sciences Research Award for the Conservation of Endangered Australian Vertebrate Species to Andrea Taylor.

**Institutional Review Board Statement:** All population monitoring and tissue sample collection procedures were approved by the Zoos Victoria Animal Ethics Committee (project ZV14004 approved 16/04/2014 and project ZV18001 approved 15/03/2018).

**Informed Consent Statement:** Not applicable.

**Data Availability Statement:** Supporting data for this publication are publicly available at the Monash University online data repository (https://doi.org/10.26180/15019797, accessed on 12 August 2021).

**Acknowledgments:** We wish to thank Bob Lacy for extremely useful advice regarding PVAs. We thank the editors and *Diversity* for the invitation to submit a manuscript for consideration in this special issue, three anonymous reviewers for their comments, and Carolyn Hogg and Steve Cooper who assessed an earlier version.

**Conflicts of Interest:** The authors declare no conflict of interest.

#### **References**


### *Article* **Genetic Diversity and Population Structure of Mesoamerican Scarlet Macaws in an Ex Situ Breeding Population in Mexico**

**Patricia Escalante-Pliego 1,\*, Noemí Matías-Ferrer 2, Patricia Rosas-Escobar 1, Gabriela Lara-Martínez 3, Karol Sepúlveda-González <sup>3</sup> and Rodolfo Raigoza-Figueras <sup>3</sup>**


**Abstract:** Given the interest in the conservation of the Mesoamerican scarlet macaw (*Ara macao cyanoptera),* the Xcaret Park formed an initial reproductive population about 30 years ago, which has progressively grown to a considerable population in captivity. In this work, we focus on the evaluation of the genetic diversity of the captive population, taking two groups into account: its founding (49) and the current breeding individuals (166). The genetic analysis consisted of genotyping six nuclear microsatellite loci that are characterized by their high variability. Tests for all loci revealed a Hardy–Weinberg equilibrium in four loci of the founders and in no loci of the breeding groups. The results showed that the genetic variation in the Xcaret population was relatively high (founders He = 0.715 SE = 0.074, breeding pairs He = 0.763 SE = 0.050), with an average polymorphism of 7.5 (4–10) alleles per locus in founders and 8.3 (4–14) in breeding pairs. No significant differences in the evaluated genetic diversity indexes were found between both groups. This indicates that the genetic variability in Xcaret has been maintained, probably due to the high number of pairs and the reproductive management strategy. Bayesian analysis revealed five different genetic lineages present in different proportions in the founders and in the breeding pairs, but no population structure was observed between founders and breeding individuals. The analyzed captive individuals showed levels of genetic diversity comparable to reported values from *Ara macao* wild populations. These data indicate that the captive population has maintained a similar genetic diversity as the metapopulation in the Mayan Forest and is an important resource for reintroduction projects, some of which began more than five years ago and are still underway.

**Keywords:** ex situ conservation; Psittacidae; *Ara macao*; conservation genetics; Xcaret; captive breeding

#### **1. Introduction**

A recommendation made by the Species Survival Committee of the International Union for the Conservation of Nature [1] concerning reintroduction projects mentions the need to include genetic studies, since it is important to try to introduce sufficient genetic variability in the founding individuals of a new population to avoid bottlenecks, greater inbreeding, and possible problems of local adaptation to diseases or environmental changes [2,3]. Genetics is therefore an important aspect in the conservation or recovery program of any species, though by no means the only one [4].

The Mesoamerican scarlet macaw *(Ara macao cyanoptera)* is classified as endangered in Mexico [5] because it has disappeared from most of its original distribution, which used to extend from Tamaulipas through Veracruz, Oaxaca, Tabasco and Chiapas, and as far south as Costa Rica [6–8]. The IUCN received a proposal to consider this subspecies as

**Citation:** Escalante-Pliego, P.; Matías-Ferrer, N.; Rosas-Escobar, P.; Lara-Martínez, G.; Sepúlveda-González, K.; Raigoza-Figueras, R. Genetic Diversity and Population Structure of Mesoamerican Scarlet Macaws in an Ex Situ Breeding Population in Mexico. *Diversity* **2022**, *14*, 54. https://doi.org/10.3390/ d14010054

Academic Editors: Miguel Ferrer, Kym Ottewell and Margaret Byrne

Received: 27 September 2021 Accepted: 5 January 2022 Published: 14 January 2022

**Publisher's Note:** MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

**Copyright:** © 2022 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https:// creativecommons.org/licenses/by/ 4.0/).

endangered [9]. The drastic decline in its populations is caused by the poaching of nestlings for the pet trade and loss of its natural habitat: the high evergreen forest [10]. In Mexico, only small remnants are left of what were once abundant populations. Experts estimate that this subspecies may be lost forever if no preservation action is taken in the next 10 years [11]. At present, we can do more for the conservation of this subspecies by reintroducing it in areas where it can be viable in the mid and long term with the use of captive breeding.

Wiedenfeld [6] found that wild populations of scarlet macaws distributed from Mexico to Central America and to the Amazon River basin may in fact be divided into two subspecies, one distributed from Costa Rica to southern Brazil *(Ara m. macao)* and the other from Mexico to Costa Rica *(Ara m. cyanoptera).* García-Feria [12] analyzed the genetic variation of four contemporary populations of *A. m. cyanoptera* in Chiapas (Mexico), Guatemala, Belize, and Honduras, using two fragments of mitochondrial DNA and a nuclear gene. His conclusions were that there is no genetic break between the studied populations, and that they comprise a cohesive reproductive unit. He observed that 91% of genetic variation was found within populations and only 9% between populations. A more complete phylogeographic study of *Ara macao* using sequences of mitochondrial genes, based on museum specimens, confirmed the existence of two lineages, which must indeed be recognized as subspecies [13]. Their findings in relation to the Mexican and Central American populations indicated that populations from separate sites from the Isthmus of Tehuantepec to the Caribbean slope in Belize and Guatemala did not present any significant substructure at separate sites, forming a single demographic unit or panmictic population, the Mayan Forest metapopulation (Isthmus Tehuantepec-Lancadona-Guatemala-Belize), but found a second demographic unit in the area, the Honduras–Nicaragua–NE Costa Rica metapopulation.

The Xcaret Ecoarchaeological Park is a private institution whose income comes from tourism, and that has been conducting ex situ reproduction of the species for the past 30 years [14]. It is registered as an UMA (Management Environment Unit) in the Wildlife Federal office in Mexico (Dirección General de Vida Silvestre, DGVS), under permit INE/CITES/DFYFS-ZOO-P-0011-99-Q.ROO. This permit does not allow commercial use of the macaws regarding direct sales, since this activity was banned in 2008, but allows the Park to manage its captive population for exhibition/education, and conservation purposes. Within this restriction, the Park has built many facilities for the macaws depending on the use of the specimens; some facilities are only used for night housing the macaws that are imprinted with humans and fly and return to the aviary and are not of reproductive age. Other facilities are used for macaws that are used for exhibition in close encounters with the public, and other facilities are for macaws that are reproducing.

Breeding at Xcaret is managed in aviaries of different sizes. At the beginning of the breeding program, pairs were artificially made by placing two adults together in the same cage, but later, bigger cages were built to include several adults of both sexes to let them choose their mates by themselves. Once the pair is formed, they are put in a cage of mid-size with a nesting box. Raising is performed both by hand and by their parents until they are three months old. There are climate-controlled facilities to raise nestlings by hand and more open facilities to feed youngsters until they feed themselves. All the macaws born in the population are banded with a closed marked steel ring according to the permit at 1–1.5 months old because, later, it would not be possible to insert the ring anymore. The ring has a unique key number and the Xcaret name. The reproductive output is reported annually to the DGVS with the list of rings placed with the specimens.

In a study of ex situ conservation genetics assessment, Witzenberger and Hochkirch [3] recommended that the number of founders must be a minimum of 15 and the population size at least 100 in order to minimize inbreeding problems in ex situ programs. The macaw population started with 29 pairs obtained from other captive populations in Mexico and from seized illegal specimens given to Xcaret by PROFEPA (Procuraduría Federal de Protección al Ambiente) for an approximate total of 60 individuals of unknown kinship. Breeding in captivity started in 1994 with six nestlings. The captive population grew to the goal of 100 breeding pairs, and more. The maximum number of macaws in the population

was 946 (2012). As a result of different donations for reintroduction programs, the total number today is 596. Given these numbers, the captive population is expected to meet the criteria to avoid inbreeding.

Over the course of the first 20 years of management, decisions were made to maximize reproductive success by avoiding reproductive pairing of close relatives, using the record keeping system. Such decisions showed improvement regarding reproductive output and population growth. Although records were not kept at the beginning of the Park's operation, breeding data were recorded starting in 2007.

The actual breeding capacity now is 200 newborns per year but, due to space limitations, many nests are kept closed. All macaws born in captivity are issued a birth certificate with their parents' IDs and ring numbers. Available pedigree information has been used to avoid inbreeding.

Evaluation of the maintenance or loss of genetic diversity in the current ex situ population is very important, since this population is the foundation for reintroduction programs in Mexico (Palenque 2013–2018, Los Tuxtlas 2014–2021). This information could help optimize strategies to select individuals for reintroduction by maximizing genetic variation, thus avoiding bottlenecks and negative effects of inbreeding in the new populations to be established.

The objectives of the present study were to estimate the genetic diversity of the founders and breeding pairs of the Xcaret population in order to contribute to the conservation of this subspecies and to compile previous genetic studies to compare them with this captive population. To achieve this objective, we genotyped the founding individuals and breeding pairs with six microsatellite loci to therefore compare the diversity of both groups (founders and breeding pairs).

#### **2. Materials and Methods**

#### *2.1. Sample Collection*

Blood samples were obtained from 49 macaws identified as founders that are still alive, though they are no longer breeders. We also sampled the 166 scarlet macaws comprising the 83 breeding pairs of 2015, descendants of these founders. A drop of blood was collected from each individual and placed on labeled FTA cards (WhatmanTM, Florham, NJ, USA). All samples were collected at the Xcaret aviary and then sent to the laboratory for analysis and storage at room temperature. All samples were confirmed as *Ara macao cyanoptera* based on mitochondrial DNA (unpublished data).

#### *2.2. DNA Extraction*

DNA was extracted from blood samples using the proteinase K digestion technique with the kit and DNeasy Blood & Tissue® (Qiagen Valencia, Santa Clarita, CA, USA). Two snips of blood were used for lysis, 60 μL of PBS 1X, and 7 μL of proteinase K (1 mg/mL), adjusted at 220 μL of lysis buffer.

The extracted DNA was then quantified in a nucleic acid spectrophotometer (Nanodrop® Thermo, Wilmington, DE, USA) and visualized on 2% agarose gel. All samples had a final DNA concentration between 20 and 58 ng/μL at a purity of 1.5–1.9 (260/280 absorbance). Such quantity and purity are suitable to amplify microsatellite loci [15,16].

#### *2.3. Amplification and Genotyping*

Primers for six microsatellite loci designed for different species of *Ara* [17] were previously standardized in our laboratory (Table 1) and identified as variable and informative for *Ara macao cyanoptera.* Amplification of the loci was performed using the forward primer labeled with a fluorescent dye (VIC, FAM, PET, and NED, (Applied Biosystems® Foster City, CA, USA). PCR amplification was carried out with the Platinum® Taq DNA Polymerase kit (ThermoFisher, Waltham, MA, USA) with 1X Buffer, 2 mM MgCl2, 0.8 μM of each primer, 0.2 μM dNTPs, 0.25 U taq polymerase, 1–2 μL of DNA (5–25 ng/μL), and finally bidistilled water to adjust a reaction volume of 12.5 μL. The reaction was performed with

an initial denaturation of 95 ◦C for 30 s followed by 14 touchdown cycles and an annealing temperature of 60 ◦C for 30 s (with 0.5 ◦C per cycle decreases to 51 ◦C), 1 min at 72 ◦C, 30 cycles of 95 ◦C for 30 s, 51 ◦C for 30 s, and 72◦ C for 1 min, and a final extension of 7 min at 72 ◦C.

**Table 1.** Estimates of population genetic parameters for founders and breeding pairs from the Xcaret scarlet macaw population, including number of individuals (N), number of alleles (Na), effective number of alleles (Ne), observed heterozygosity (Ho), expected heterozygosity (He), private alleles (P), Nei's gene diversity (D), allelic richness (AR), and inbreeding coefficient (F). \* Significant (*p* < 0.05). \*\* Significant (*p* < 0.05) after correction using Bonferroni procedure.


The DNA amplicons obtained were sent to the Biodiversity and Health Genomic Sequencing Laboratory of the Institute of Biology of the Universidad Nacional Autónoma de México for genotyping. Each sample was prepared with 500 LIZ® Size Standard (Applied Biosystems) according to the manufacturer, and 1 μL of DNA amplicon per individual per loci, and then subsequently analyzed on an Applied BiosystemsTM 3500xL laser sequencer (Life TechnologiesTM, Carlsbad, CA, USA). Allele scoring was performed with GeneMapper v. 4.1 (Applied BiosystemsTM). All the fragments were analyzed twice to confirm the obtained results. Tandem v. 1.09 [18] was used to correct mobility problems and possible artifacts, which also confirmed the allelic assignment according to the repetition units of each locus. To detect null alleles, large allele dropout, and scoring errors, all loci were analyzed with MICRO-CHECKER v2.2.3 [19]. In GENEPOP Web [20], deviations from the Hardy–Weinberg equilibrium (HWE) caused by an excess or deficit of heterozygotes were analyzed using Fisher's exact test [21]. In GENALEX v. 6.5 [22], the linkage disequilibrium of all loci was assessed using the exact probabilities test. We estimated the significance level values of HWE and linkage disequilibrium by applying a Bonferroni correction (α = 0.01). The following diversity indices were also obtained: average number of alleles per locus (Na), effective number of alleles (Ne), observed (Ho) and expected (He) heterozygosity, number of private alleles (P), and Nei's genetic diversity (D). Allelic richness (AR), standardized with the smallest sample size (46), was obtained in FSTAT v. 2.9.4 [23]. Each index was calculated per locus and per scarlet macaw group studied (founders and breeding pairs).

#### *2.4. Population Structure*

The population structure was examined through a principal coordinates analysis (PCoA) based on Nei's genetic distances matrix between individuals using GENALEX ver 6.5. In addition, population structure was examined using STRUCTURE v.2.3 software [24] to perform a Bayesian clustering method in order to infer the number of clusters and structure. An admixture model with the LOCPRIOR option was used. The number of tested populations (K) ranged from 1 to 10 using 20 independent Markov Chain Monte Carlo (MCMC), by sampling 200,000 iterations, and a 200,000 burn-in period. The most likely number of clusters was inferred using STRUCTURE HARVESTER [25]. To infer the optimal K-value, both ln Pr (K) and the ΔK statistic [26] were calculated. To further search for genetic clusters without assuming an underlying population genetic model such as HW equilibrium or linkage disequilibrium, we applied the discriminant analysis of principal components (DAPC, [27]), performed with the package ADEGENET v 2.2.1 [28] of RStudio v 1.4.1717 (©2009–2021 RStudio, PBC) in R v. 4.1.1 [29]. We explored the data with K = 10, K = 20 and K = 30. The lowest value of the Bayesian information criterion (BIC) was used as the number of clusters that best reflect the population structure of the data.

To estimate the degree of kinship (r) between individuals and populations (founders and breeding pairs), we evaluated different "relatedness r" estimators available in the GENALEX software. To compare the informativeness and the power of relatedness detection of available estimators, we used the reciprocal of the mean squared deviations (RMSD) of estimators provided in the KinInfor program v.2 [30]. Our objectives were to detect full sibs (Δ1 = 0.5, Δ2 = 0.5), paternal halfsibs (Δ1 = 1, Δ2 = 0), maternal half sibs (Δ1 = 0.5, Δ2 = 0), and unrelated (Δ1 = 0, Δ2 = 0) individuals. We ran simulated pairs of genotypes and set the confidence level at 0.05. For each estimator, the mean and standard deviation were calculated. The COANCESTRY program [31] was used to compute the R values for each pair of individuals (dyads) and evaluate the statistical errors associated with these estimates using bootstrapping (1000 replicates).

#### **3. Results**

Genotypes were obtained for six microsatellite loci from a total of 49 founders and 166 breeding individuals of Mesoamerican scarlet macaws. No evidence of null alleles or allelic dropout was found. Linkage disequilibrium (LD) was observed between 11 of 30 loci comparisons and was observed only in the breeding population (Supplementary Materials Table S1). Deviations from the Hardy–Weinberg equilibrium were detected only in AgGT17 and UnaCT74 in the founder population, but in all loci, except UnaCT74 in the breeding pairs group, after Bonferroni correction (Table 1 and Table S2). Based on the overall test, the founder population showed a deficit of heterozygotes (*p* = 0.0250), while the breeding pairs showed an excess of heterozygotes (*p* = 0.0164).

In terms of genetic diversity, the six loci were polymorphic for both populations. The number of private alleles was greater in the breeding population. (8 vs. 3). The diversity (D) and allelic richness (AR) differed very slightly (Table 1). The comparison of allelic richness was not significantly different between founders and breeding populations (AR= 7.466 vs. AR = 7.375, *p* = 0.96), neither in observed heterozygosity (Ho = 0.717 vs. 0.763, *p* = 0.8325) nor gene diversity (D = 0.731 vs. D = 0.717, *p* = 0.496).

The pattern of genetic structure determined by PCoA displays the overlap between founder and breeding pair groups (Figure 1, Supplementary Materials Figure S1). The Bayesian analysis of population structure revealed that the maximum value of ln P (K) obtained was K = 5 (Figure 2a), which is concordant with the maximum Delta K (Figure 2b) (Supplementary Materials Table S3). The individuals of the founder group are formed by different proportions of the five genetic lineages observed (Figure 2c); however, there is a higher prevalence of lineages 1 (red) and 2 (green), both in the founders and in the breeders (Figure S1).

**Figure 1.** PCoA analysis of 215 individuals of founders (*n* = 49) and breeding pairs (*n* = 116) in the Xcaret captive population of Mesoamerican scarlet macaw (*Ara macao cyanoptera*) based on genotyping results of six microsatellites markers.

In contrast to STRUCTURE results, eight gene clusters were obtained with DAPC (Figure 3) for K = 10, K = 20, and K = 30 analysis (Supplementary Materials, Figure S1). However, the separation of the clusters is not entirely clear (Supplementary Materials, Figure S1); clusters 1, 2, 4, 5, and 7 overlap extensively, but 3, 8, and specially 6 are more clearly defined.

Comparisons among the informativeness of different relatedness estimators yielded Ritland [32] as the best estimator (Tables S4–S6). Ranking of the loci according to informativeness was: AgGT42, AgGT17, AgGT21, UnaCT21, UnaCT74, and AgGT19. Mean and variance of the different relatedness estimators are provided in Supplementary Materials Tables S4 and S5. The averaged relatedness values are no greater than −0.004 in founders and −0.002 in breeding pairs (Figure 4). The relatedness values in both groups are within the expected estimated confidence intervals (Figure 4). Mean relatedness was lower in the founders but slightly increased in the breeding pairs, perhaps due to the presence of siblings in the group (although R is close to zero, indicating most individuals are unrelated).

**Figure 2.** (**a**) Mean ln P (K) graph, (**b**) Evannos ΔK graph, (**c**) Q-membership proportion of K = 5 genetic clusters of founders (black dots), and breeding individuals of the Xcaret Mesoamerican scarlet macaw (*Ara macao cyanoptera*) based in six microsatellites markers. The length of the colored segment indicates the proportion of the individual's composition in specific clusters showing admixture in the population.

**Figure 3.** Gene clusters found with the DAPC analysis showing some structure in the captive population of scarlet macaws of Xcaret.

**Figure 4.** Ritland's relatedness (r) index of the Xcaret Mesoamerican scarlet macaw (*Ara macao cyanoptera*) in founders and in breeding pairs groups.

#### **4. Discussion**

The estimates of the population genetic parameters for founders and breeding pairs from the Xcaret scarlet macaw population were very similar. The slightly larger number of alleles in breeding pairs probably reflects the fact that we were not able to obtain samples from all of the founder individuals, since some had died before our study was carried out. Reproduction in the species extends until they are 35 years old, with a longevity record of up to 64–65 years [33]; thus, the number of generations in the population is small. Reproduction begins when individuals are approximately four years old (Xcaret, pers. com.) and generations overlap; this feature has helped preserve the original genetic diversity of the population up to the present time. Populations in captivity have often been observed to

depart from the HW equilibrium [34], and in this ex situ population, significant departures were noticeable in the breeding pairs in comparison with the founders. These results are perhaps accounted for by nonrandom mating management in captivity in this population.

The genetic diversity of some wild populations of *A. macao* have already been studied using the same microsatellites. In addition to trying microsatellites for different species of *Ara* and *Amazona*, Gebhardt and Waits [17] tested *Ara m. macao* from Peru. From a wild population of *A. m. macao,* scarlet macaws were sampled in Pará (Brazil) with 10 microsatellites [35]. From a population of the Mayan Forest in Guatemala, 11 microsatellites were tested [36]. Additionally, two populations from Costa Rica *(A. m. macao)* were studied [37] using six microsatellites (Table 2).


**Table 2.** Population origin, sample size (N), number of alleles (A), and expected (He) and observed (Ho) heterozygosities of microsatellite loci in the two subspecies of scarlet macaws studied to date.

Moderate levels of genetic diversity were found in the Costa Rican wild populations, similar to those of Guatemala, with indications of imbalance possibly due to genetic erosion caused by anthropogenic factors that are demographically affecting them. More stable populations in the Amazon (Peru and Brazil) have more genetic variability, leading to the interpretation that the scarlet macaw is a generalist species and until recently was widely distributed, exhibiting high genetic diversity in relation to other species of more specialized macaws with less diversity [35,38].

Comparing these data on wild populations with those of the Xcaret founders and breeding pairs, heterozygosity and number of allele values are very similar, but wild populations in Guatemala (*cyanoptera*) and Costa Rica (subspecies *macao*) have slightly lower values, whereas those of *macao* populations in Brazil and Peru, where populations are still large, are slightly higher. This similarity in the genetic variability values of the founding and reproductive populations of Xcaret with those of wild populations indicates that the ex situ population maintains levels of genetic diversity comparable to wild populations of the subspecies. However, these comparisons are not tested statistically.

Relatively large population sizes [39], high dispersal [40], and longevity [33,41] may have helped buffer scarlet macaw populations against genetic erosion. In fact, the Xcaret population is now two times larger than the Mayan Forest population, as estimated by Boyd and McNab [11].

Between the captive groups of founders and breeding pairs, there is a lack of population structure, which was evident with the PCoA and the Bayesian analysis, in which admixture is evident in the genetic composition of the individuals; hence, there is no genetic differentiation between founding and current breeding pairs in the captive population. However, the statistics obtained with STRUCTURE indicate that there are five genetic lineages that may be the result of a previous genetic structure in the wild populations where the founders came from. A similar population structure was found with the DAPC analysis, a method that emphasizes the differences between clusters. Of the eight clusters identified, five were not distinguishable in the graphic but three others were, perhaps reflecting a historical population structure from the original wild populations. The genetic

diversity of the ex situ populations is determined by the gene pool of the founders and their reproductive success [42]. Using mitochondrial DNA sequence data, Schmidt et al. [13] showed there was geographic clustering of haplotypes that might suggest genetic structuring of the wild populations of *A. m. cyanoptera* whom the founders of the Xcaret population came from.

Ex situ conservation programs strive to maintain genetic diversity that is comparable to a wild population by capturing a sufficient number of founders and managing matings to select individuals with underrepresented genes [43]. The objective of any reintroduction program is to provide enough genetic diversity to circumvent negative effects of natural selection within the new population, assuming that the reintroduced population can grow rapidly. If a source population has low genetic diversity, the reintroduced population that is derived from it will have similarly low diversity [44]. Our data show that the captive macaw population at Xcaret has captured and maintained similar levels of genetic variability to wild macaw populations of the Selva Maya. An important next step would be to compare the breeding population to the wider natural population. This analysis might reveal that there are unsampled populations in the wild such as The Chimalapas (Oaxaca, or Selva Maya W in [36]); as such, the captive breeding program may be improved by bringing in further birds from the wild to increase the genetic representation of the captive breeding population. Another possibility would be that representatives of this western population survive in captivity in some other aviaries and could be traced.

With limited pedigree information, at least initially, the reproductive management of the captive reproductive population has followed the strategy of reducing kinship to favor reproductive success and, unintentionally, conserving maximum genetic diversity within the population [45]. Hence, with the reproductive management carried out in Xcaret, it has been possible to maintain the gene pool of the founders in the reproductive population. Therefore, it is not surprising that an underlying structure was also found with microsatellites with five clusters that are well marked but intertwining.

The inbreeding coefficient (F) was low (<0.03) for both groups (founders and breeding pairs), which indicates that matings between closely related individuals is infrequent in both populations; however, the pairwise relatedness value in the breeding population showed a slight increase. Although this increase was not significant, it could indicate that ongoing careful management of breeding pairs will be required to avoid inbreeding. Although the breeding pairs group have generally avoided inbreeding, and the population has maintained levels of genetic variability comparable to those found in wild populations, it is recommended to identify individuals with a high degree of kinship in order to prevent breeding attempts by closely related individuals. The kinship information obtained in this study with the breeding pairs, complemented by pedigree information for the descendants of these pairs, should be used for further growth of the captive colony, as well as for the selection of group compositions for reintroduction projects with the aim of providing maximum genetic variability in the new populations, avoiding bottlenecks and promoting the success of reintroductions from a genetic viewpoint.

In order to secure the long-term persistence of reintroduced populations, it is also important that ex situ breeding programs endeavor to minimize time in captivity. If programs exceed the limit of 10–15 generations, relaxed selection could incur genetic costs [46]. With these guidelines, the Xcaret and its allies' opening of reintroduction projects of the subspecies in two separate sites in Mexico (Palenque, Chiapas and Los Tuxtlas, Veracruz) is timely.

**Supplementary Materials:** The following supporting information can be downloaded at: https: //www.mdpi.com/article/10.3390/d14010054/s1, Figure S1: Genetic clusters obtained from the STRUCTURE analysis; Table S1: Genotypic linkage disequilibrium; Table S2: Summary ofor Hardy-Weinberg Equilibrium; Table S3: Evanno summary; Table S4: RMSD in KinInfor; Table S5: Relatedness estimates in Coancestry; Table S6: Ritland's relatedness estimator; Table S7 Table of alleles.

**Author Contributions:** P.E.-P. designed the study, coordinated the work, and participated in the analysis and interpretation of results as well as the writing of the manuscript. N.M.-F. performed the laboratory assays, analyzed the data and participated in the interpretation of results and writing of the manuscript. P.R.-E. carried out part of the laboratory assays and participated in the interpretation of results and writing of the manuscript. G.L.-M., K.S.-G. and R.R.-F. maintained the reproductive population, helped collect samples, and participated in the interpretation of results and writing of the manuscript. All authors have read and agreed to the published version of the manuscript.

**Funding:** Part of the funding for this work was granted by the Priority Species Program of the National Commission of Natural Protected Areas (SEMARNAT, grant PROCER/DGOR/26/2014).

**Institutional Review Board Statement:** Not applicable.

**Data Availability Statement:** The data presented in this study are available in Supplementary Table S7: Table of alleles.

**Acknowledgments:** We wish to thank Bosque Antiguo AC for its support to obtain and handle the resources of the project, as well as Jonathan Morales-Contreras and Pilar Luna for their committed assistance in the lab and the aviary. We also thank Laura Márquez-Valdelamar for her assistance with the genotyping of the data. We likewise thank Guadalupe Amancio Rosas for her support in performing part of the analysis. We also thank comments from three reviewers that improved the original manuscript substantially.

**Conflicts of Interest:** There is no conflict of interest from the authors.

#### **References**


### *Opinion* **Reducing the Extinction Risk of Populations Threatened by Infectious Diseases**

**Gael L. Glassock 1, Catherine E. Grueber 1, Katherine Belov <sup>1</sup> and Carolyn J. Hogg 1,2,\***


**Abstract:** Extinction risk is increasing for a range of species due to a variety of threats, including disease. Emerging infectious diseases can cause severe declines in wild animal populations, increasing population fragmentation and reducing gene flow. Small, isolated, host populations may lose adaptive potential and become more susceptible to extinction due to other threats. Management of the genetic consequences of disease-induced population decline is often necessary. Whilst disease threats need to be addressed, they can be difficult to mitigate. Actions implemented to conserve the Tasmanian devil (*Sarcophilus harrisii*), which has suffered decline to the deadly devil facial tumour disease (DFTD), exemplify how genetic management can be used to reduce extinction risk in populations threatened by disease. Supplementation is an emerging conservation technique that may benefit populations threatened by disease by enabling gene flow and conserving their adaptive potential through genetic restoration. Other candidate species may benefit from genetic management via supplementation but concerns regarding outbreeding depression may prevent widespread incorporation of this technique into wildlife disease management. However, existing knowledge can be used to identify populations that would benefit from supplementation where risk of outbreeding depression is low. For populations threatened by disease and, in situations where disease eradication is not an option, wildlife managers should consider genetic management to buffer the host species against inbreeding and loss of genetic diversity.

**Keywords:** genetic rescue; genetic restoration; supplementation; disease; genetic diversity; inbreeding; conservation

#### **1. Introduction**

Species extinction is a serious and pressing environmental challenge [1]. Loss of biodiversity disrupts ecosystem functioning, damages ecosystem services, and impacts human wellbeing [2,3]. Along with habitat fragmentation, small population sizes, invasive species, and climate change, emerging infectious diseases (hereafter diseases) contribute to species' decline [4,5]. Whilst there are no known instances of species extinction solely due to disease, black-footed ferrets (*Mustela nigripes*), Polynesian tree snails (*Partula nodosa*), Hawaiian honeycreepers (*Carduelinae*), African lions (*Panthera leo*), and American chestnuts (*Castanea dentata*) are some of the species that have experienced population crashes due to disease [6,7]. By decreasing the size of host populations, disease outbreaks can reduce their fitness and adaptability, predisposing them to extinction from other threats [4,5,8]. The reduced viability of disease-affected populations occurs because small populations are susceptible to inbreeding and genetic drift, which can result in increased homozygosity, fixation of deleterious alleles, and loss of genetic diversity [9–12]. Fragmented habitats can further exacerbate the impact of disease by reducing the movement of individuals between populations [13]. Whilst disease is a known threatening process that contributes to the endangerment and extinction of species [4,5], it can be difficult to mitigate.

**Citation:** Glassock, G.L.; Grueber, C.E.; Belov, K.; Hogg, C.J. Reducing the Extinction Risk of Populations Threatened by Infectious Diseases. *Diversity* **2021**, *13*, 63. https:// doi.org/10.3390/d13020063

Academic Editor: Michael Wink Received: 30 November 2020 Accepted: 2 February 2021 Published: 4 February 2021

**Publisher's Note:** MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

**Copyright:** © 2021 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https:// creativecommons.org/licenses/by/ 4.0/).

Current efforts to reduce the impact of diseases are generally focused on direct interventions, such as vaccines and/or host treatments [14]. Human intervention using these management practices can be successful in preserving populations affected by disease [5,14]. For example, the Arctic fox (*Vulpes lagopus*) variant of rabies was eradicated in eastern Ontario, Canada, following delivery of an oral rabies vaccine [15]. In contrast, some diseases are difficult to eradicate [16]. It is currently not possible to eliminate the devastating amphibian pathogen, chytrid fungus (*Batrachochytrium dendrobatidis*), due to environmental reservoirs and multiple host species sustaining the pathogen within the environment [17]. As a result, current management practices may limit the impact of a disease, but not eradicate it. Therefore, what should happen when it is not possible to directly reduce the prevalence and impact of a disease? Host populations may recover if a treatment is developed or pathogen–host coevolution occurs [18,19]. However, until some degree of pathogen immunity emerges, whether naturally or via human intervention such as a vaccine, many populations that are suffering significant disease outbreaks remain vulnerable and at low densities. Repopulating an area with disease-free individuals following disease-induced local population extinction has been successful in some instances [20]. However, this method is often unrealistic because it is dependent on the existence of a healthy source population. We suggest that, when pathogen eradication is not a viable management strategy, an alternative to preventing population extinctions is to genetically manage host populations until the disease can be more effectively controlled.

#### **2. The Tasmanian Devil**

#### *2.1. Devil Facial Tumour Disease (DFTD)*

Most infectious diseases are caused by pathogens, an organism capable of being transmitted from host-to-host where it causes disease. However, disease threats come in many forms. Tasmanian devils (*Sarcophilus harrisii*) are an example of a species that has suffered widespread decline following the outbreak of a disease. Endemic to the island state of Tasmania, devils are the world's largest surviving carnivorous marsupial. They are threatened by devil facial tumour disease (DFTD), which is a transmissible cancer, where the cancer cell line is the infectious agent and is passed as an allograft between individuals [21]. DFTD is characterised by large, ulcerating facial tumours [22], and is nearly always fatal [23,24]. Since DFTD was first detected in the mid-1990s, devil numbers have declined by 80 percent in the wild [25]. Precocial breeding by one-year-old females appears to be maintaining devil populations, despite the presence of DFTD [25,26]. However, contemporary modelling has shown that many of these populations are susceptible to small population pressures, including inbreeding [27], making them less adaptable and more vulnerable to other threatening processes. Tasmanian devils are listed as endangered both internationally on the International Union for Conservation of Nature (IUCN) Red List [28], and nationally under the Environment Protection and Biodiversity Conservation Act 1999, and require management to mitigate the threat of DFTD and to maintain the species in the landscape. Although immunisation trails have successfully induced an anti-DFTD antibody response in disease-free devils [29], research into a vaccine against DFTD is ongoing. As such, it is currently not possible to directly alleviate the negative impact DFTD has on devil populations in the wild.

#### *2.2. Genetic Managment*

The Save the Tasmanian Devil Program (STDP) is the official government response to DFTD. Established in 2003, one of its aims is to maintain devils and their ecological function in the wild. In the absence of effective treatments or vaccines for DFTD, the decision was made to manage devil populations in the presence of disease, rather than the disease itself as per general disease management strategies [30]. The remnant devil populations are fragmented, with most showing low connectivity [27,31], low genetic diversity [32] and population decline [25]; therefore, they are considered a candidate for genetic management.

Translocations involve the human-mediated movement of individuals from one location to another, with the aim of supplying a population-level or species-level benefit [33]. Supplementation is a specific type of translocation where individuals are released into an existing population of conspecifics [33]. Supplementation offers a potential mechanism to artificially implement gene flow into disease populations and reverse the negative genetic changes that have occurred due to low population densities [34–38]. Introducing novel alleles via supplementation can improve allelic richness and increase genetic diversity, boosting the adaptability of the population and preventing deleterious alleles from becoming fixed [34,37,39–43]. Supplementation may also increase the number of individuals in the recipient population, buffering it against genetic drift and stochastic events [44,45]. Some have argued that the persistence of a few small, isolated populations in the absence of gene flow is evidence against the need for supplementing some populations with low genetic diversity [46,47]. However, these arguments are not supported by substantial empirical evidence and may be damaging to future conservation efforts [48]. Supplementation may be particularly beneficial for populations threatened by disease, because of gene flow into a host population, providing additional host alleles upon which selection can act [49].

The STDP established the Wild Devil Recovery Project in 2015 to investigate the use of supplementations to genetically manage declining wild devil populations [30]. Modelling has predicted that ongoing supplementation of DFTD-affected wild devil populations will prevent the loss of neutral rare alleles from the population, and increase the probability of population persistence over 50 years [27]. To date, six separate supplementations of four DFTD-affected wild devil populations have occurred, using devils sourced from a disease-free insurance metapopulation [50]. If available, genetic data should be used to select individuals for translocation to minimize inbreeding and maximize diversity [12]. Most devils in the insurance metapopulation have tissue samples taken which undergo reduced representation sequencing. These genetic data have been used to select devils for release; devils that best complement the wild sites by introducing novel alleles into the supplemented population [50]. Currently, the STDP and the University of Sydney are undertaking a large research study to investigate the impacts of these supplementations on population fitness and the prevalence of DFTD, with preliminary data showing positive signs [30]. Part of this research project includes measuring immune gene diversity, such as diversity at major histocompatibility complex (MHC) loci, in admixed devils following supplementation. Diversity at immune genes is important, because low MHC diversity in Tasmanian devils is believed to have contributed to the emergence and rapid transmission of DFTD [51].

Whilst the supplementations performed by the STDP aim to increase overall genetic variation, Kelly and Phillips (2015) suggest using a more aggressive approach, called targeted gene flow. Targeted gene flow introduces resistant alleles into a population by translocating individuals with beneficial adaptations. This technique has not been adopted into devil conservation efforts primarily because: (1) it is a relatively new concept with some aspects still speculative [52]; (2) investigations into genomic regions associated with DFTD resistance are still under investigation [53]; and (3) there are concerns that translocating individuals selected specifically for disease resistance may increase adaptability towards the disease but reduce future adaptive potential.

#### *2.3. Benefits of Supplementing Populations*

Performing supplementations to genetically manage populations threatened by disease is an emerging concept. Currently, there is little direct evidence regarding the ability of supplementations to reduce the extinction risk of diseased populations. Supplementation to preserve non-disease threatened populations [34,40–42] may provide some insight into the potential benefits of supplementing DFTD-affected devil populations.

The mountain pygmy-possum (*Barramys parvus*) is a prominent example of a threatened species, without specific known disease threats, which has benefitted from supplementation. The mountain pygmy-possum is a small, Australian marsupial, located

within three, genetically distinct populations that have limited gene flow [42]. The Mount Buller population was under serious threat of extinction due to habitat clearing, which had reduced the size, genetic diversity, and fecundity of the population. To improve the demographic and genetic integrity of this population, it was supplemented with a small number of males (*n* = 11) sourced from one of the two other remaining populations, between 2011 and 2014 [44]. Following supplementation, the population exhibited a 68 percent growth in population size, increased genetic diversity, and integration of novel alleles into the gene pool over a five-year period [42]. There was evidence of hybrid fitness, seen in admixed individuals being larger and producing more offspring [42]. The response of the population to supplementation demonstrated that supplementation provided distinct genetic benefits to the Mount Buller population, as well as boosting population fitness and demographic parameters.

The response observed in the mountain pygmy-possum is indicative of genetic rescue an increase in population fitness (growth) of a population which is suffering inbreeding depression owing to the immigration of new alleles [43]. Genetic rescue occurred, partially due to the supplementation being performed in conjunction with efforts to restore the possum's degraded habitat [42]. These actions align with IUCN guidelines that, when performing supplementations, the threat that caused the population's decline should be minimized [33].

However, a key point of comparison between supplementing disease and non-diseaseaffected populations is that, whilst many threats to populations can at least be partially addressed, as noted above, disease threats can be difficult to mitigate. As such, population growth in diseased populations following supplementation is likely to be slower, resulting in genetic restoration. Genetic restoration is an increase in genetic variation and relative, but not absolute, fitness owing to the immigration of new alleles [43]. Whilst genetic restoration does not necessarily produce the same demographic recovery achieved by genetic rescue, it nevertheless boosts the diversity of the population and helps ensure its future adaptive potential. Measuring genetic change in a population following supplementation (e.g., inbreeding, and neutral and functional diversity) is a useful measure of supplementation success in populations threatened by disease.

For Tasmanian devils, this means that supplementing populations under the Wild Devil Recovery Project may not lead to significant demographic recovery due to the ongoing presence of disease, but may lead to genetic benefits for populations which are showing signs of increasing inbreeding [27]. At this time, it is undetermined if these supplementations will benefit or hinder the species' ability to co-evolve with the disease. Although evidence of a selective response in devils to DFTD has been inconsistent [53,54], the suggestion of possible evolutionary change in devil populations exposed to DFTD [53] has prompted concern that sourcing the released devils from the DFTD-free insurance population will be counter-productive for the evolution of host resistance to the disease [55]. It should be noted, however, that the insurance metapopulation has been acquiring orphan devils from diseased populations since 2009 [56,57], so much post-DFTD wild diversity is likely to be represented in the insurance metapopulation.

#### **3. Extending Principles Adopted in Tasmanian Devil Conservation Efforts to Other Populations Suffering from Disease**

*3.1. Genetic Management of Disease-Affected Populations*

Tasmanian devils are one of the few species primarily suffering from decline due to disease that have genetic management via supplementation incorporated into their conservation program. It is unfortunate that the genetic management of disease-threatened populations has not been more widely accepted, because other species impacted by disease could benefit from supplementation. For example, mountain yellow-legged frogs (*Rana muscosa*) have declined, in part, due to chytrid fungus [58]. They are currently listed as endangered by the IUCN [59]. There are nine small populations of frogs, persisting in southern California, U.S.A., that require management to avoid extinction [60]. These populations are structured and possess low within-population variation, indicating that the populations are genetically isolated [61]. Currently, there is no evidence of inbreeding within these populations [61]; however, given that gene flow between populations is limited, it may emerge. Implementing strategies, similar to those adopted in devil conservation efforts, could avoid loss of alleles due to genetic drift and inbreeding. Captive frogs involved in San Diego Zoo's successful breeding program [61] could potentially be an appropriate source of individuals for supplementation. Augmenting gene flow may reduce extinction risk in the mountain yellow-legged frog, and other species subject to significant disease-induced population decline, by maintaining genetic diversity and providing alternate alleles upon which selection can act.

#### *3.2. Fear of Failure*

There is a hesitancy to accept supplementation as a conservation strategy even in non-diseased populations [12]. From a genetic perspective, a primary perceived risk associated with supplementation is outbreeding depression [12]. Outbreeding depression occurs when the mixing of two genetically distinct populations leads to reduced fitness in hybrid offspring, due to the disruption of co-adapted gene complexes [9,43]. However, outbreeding depression is rare [62], and mainly seen when the mixed populations are highly divergent or show high genetic structure [12,62]. An example is the Tanta Mountain ibex (*Capra ibex ibex*), where a population was supplemented with individuals sourced from two related subspecies. Hybrids of the incumbent and introduced ibex showed altered reproductive habits that led to offspring death and population extinction [63]. To combat this, a decision-making framework has been developed to predict the likelihood of outbreeding depression occurring [64]. The outcome of mixing populations cannot be definitively known until post-supplementation, and therefore there is no guarantee that a supplementation will not have negative impacts. However, existing knowledge surrounding the probability of outbreeding depression [12,62,64] and the characteristics of a suitable source population [9,35] allows for better selection of candidates for supplementation. Coupled with the ever-increasing global production of wildlife genomes and modelling methods, we are in a better position than ever to predict the impacts of supplementation on threatened populations, both with and without disease.

Outbreeding depression is not the only risk that needs to be considered when making the decision about whether to supplement disease-threatened populations. Other concerns include the loss of local adaptations, loss of species purity due to mixing of gene pools, genetic replacement, and disease spread [12,44]. In situations where the disease threat is difficult to mitigate, conservation managers may be hesitant to release healthy individuals into the wild where they may be exposed to the disease, especially if the size of the disease-free source population is small. As conducted with the Tasmanian devil, modelling and simulations, such as population viability analyses, which incorporate the demographic effects of disease, can be useful to weigh risks and options when considering supplementation [27]. In addition, wildlife disease risk analyses offer a structured method to identify, prevent, and mitigate disease risks associated with supplementation prior to implementation [65].

A well-intentioned fear of doing harm may prevent supplementation being used to genetically manage populations threatened by disease. However, a decision to not implement genetic management, or other conservation actions, is still an active conservation action with its own ramifications [12,17,36,42]. For example, the decision was made to preserve the taxonomic integrity of the dusky seaside sparrow (*Ammodramus maritimus nigrescens*) rather than outcross the remaining individuals to a closely related subspecies [66]. This decision resulted in a detrimental outcome: the species became extinct, and the unique diversity that otherwise might have been preserved was lost. This does not necessarily mean that this outcrossing event would have prevented the extinction of the dusky seaside sparrow, but we shall never know. Instead, this example encourages us to recognize that species have become extinct after active decisions not to intervene were made.

There will always be some degree of risk associated with implementing a management strategy. However, if supplementation is done to serve a conservation purpose in response to genetic isolation, inbreeding, and low genetic diversity, the rewards will often outweigh the risks, especially when facing the ultimate risk of species extinction. For a diseased population that is declining, the fear of failure needs to be weighed against the potential consequences of inbreeding, loss of genetic diversity, and decline in adaptive potential if the decision to not supplement the population is made.

#### **4. Conclusions**

There are a range of threatened species, and subsequently a range of management actions to ensure their persistence. For many species suffering infectious diseases, conservation management actions are further complicated by the disease. Here, we suggest that genetically managing disease-affected populations may assist in reducing their extinction risk when the disease threat cannot be easily mitigated. This is not a definitive solution but may buy the species time to co-evolve with their disease. As has been implemented with the Tasmanian devil, supplementation may lead to genetic restoration of these diseasethreatened populations, alleviating loss of genetic diversity and maintaining their adaptive potential. Although it is possible that not all populations are suitable for this type of genetic management [64], these populations can generally be identified by preliminary screening [12]. In addition, the primary genetic risk of outbreeding depression is rare, usually manageable, and often less threatening than the extinction risk these populations face in the absence of augmented gene flow [12]. Future supplementations should be performed in conjunction with long-term monitoring to expand the existing knowledge of genetic rescue/restoration activities and gather empirical evidence of its success in diseased populations. Rather than delaying action due to a fear of provoking harmful consequences, genetic management should be recognized as a potentially beneficial conservation technique, allowing for the development of more effective management practices where action rather than inaction is favoured.

**Author Contributions:** Conceptualization, G.L.G., C.E.G., K.B. and C.J.H.; investigation, G.L.G.; resources, G.L.G.; writing—original draft preparation, G.L.G.; writing—review and editing, C.E.G., K.B. and C.J.H.; supervision, C.E.G., K.B. and C.J.H.; project administration, C.J.H.; funding acquisition, C.E.G., K.B. and C.J.H. All authors have read and agreed to the published version of the manuscript.

**Funding:** This research was funded by the Australian Research Council Linkage (LP180100244).

**Acknowledgments:** Thanks to members of the Australasian Wildlife Genomics Group who participated in the many discussions regarding genetic rescue in the presence of disease.

**Conflicts of Interest:** The authors declare no conflict of interest.

#### **References**


## *Communication* **Pug-Headedness Anomaly in a Wild and Isolated Population of Native Mediterranean Trout** *Salmo trutta* **L., 1758 Complex (Osteichthyes: Salmonidae)**

**Francesco Palmas 1,\*, Tommaso Righi 2, Alessio Musu 1, Cheoma Frongia 1, Cinzia Podda 1, Melissa Serra 1, Andrea Splendiani 2, Vincenzo Caputo Barucchi <sup>2</sup> and Andrea Sabatini <sup>1</sup>**


Received: 28 August 2020; Accepted: 12 September 2020; Published: 15 September 2020

**Abstract:** Skeletal anomalies are commonplace among farmed fish. The pug-headedness anomaly is an osteological condition that results in the deformation of the maxilla, pre-maxilla, and infraorbital bones. Here, we report the first record of pug-headedness in an isolated population of the critically endangered native Mediterranean trout *Salmo trutta* L., 1758 complex from Sardinia, Italy. Fin clips were collected for the molecular analyses (D-loop, *LDH-C1\** locus. and 11 microsatellites). A jaw index (JI) was used to classify jaw deformities. Ratios between the values of morphometric measurements of the head and body length were calculated and plotted against values of body length to identify the ratios that best discriminated between malformed and normal trout. Haplotypes belonging to the AD lineage and the genotype *LDH-C1\*100*/*100* were observed in all samples, suggesting high genetic integrity of the population. The analysis of 11 microsatellites revealed that observed heterozygosity was similar to the expected one, suggesting the absence of inbreeding or outbreeding depression. The frequency of occurrence of pug-headedness was 12.5% (two out of 16). One specimen had a strongly blunted forehead and an abnormally short upper jaw, while another had a slightly anomaly asymmetrical jaw. Although sample size was limited, variation in environmental factors during larval development seemed to be the most likely factors to trigger the deformities.

**Keywords:** small isolated population; Mediterranean native trout; morphological deformities

#### **1. Introduction**

Skeletal anomalies are commonplace characteristic in farmed fish all over the world [1–3]. In contrast, naturally originated malformations show a smaller incidence in wild fish populations due to the decreased viability of the abnormal fish in natural habitats [4,5]. The extent of the skeletal deformities in fish species has been reported to affect different anatomical body parts, such as the vertebral column, fins, and skull [1,6,7]. In particular, skull malformations involve mainly the splanchnocranium, hyoid arch, and gill cover [1,3].

Among these deformities, the pug-headedness anomaly is an osteological condition that results in the deformation of the maxilla, pre-maxilla, infraorbital bones, and ethmoid region. This condition determines bulging eyeballs, acutely steep foreheads, and incomplete closure of the mouth due to projection of the lower jaw [8,9]. Pug-headedness can lead to starvation and rising mortality, due to the jaw deformity, especially during the larval stages [10]. Several causes have been suggested to explain the pug-headedness anomaly, including low genetic variability and epigenetic factors, such as embryonic development disorders and aberrations induced by environmental factors variation [11,12]. Among these, traumatic shock caused by daily variations in temperature, light cycles, salinity, and dissolved oxygen concentration during the early development seem to be the most important factors [2,9]. A higher incidence of pug-headedness has been found in polluted waters [12,13]. The pug-headedness deformity is widely documented in different captive fish species of both marine and freshwater habitats [14–18], but it is rare among adults in wild populations [19,20]. Among natural populations, jaw deformities are mostly restricted to different families of marine fish species, such as Moronidae [21], Pomacanthidae [22], Rachycentridae [23], Sparidae [24], Epinephelidae, and Cichlidae [20,25].

The pug-headedness malformation has also been reported in wild salmonid populations in only a few known cases [26]. To the best of our knowledge, this is the first documented occurrence of pug-headedness in a native Mediterranean trout population. Although the taxonomy of the Mediterranean trout has not yet been resolved [27], the species is listed with the name of *Salmo cettii* Rafinesque, 1810, and is considered to be critically endangered by the International Union for Conservation of Nature [28].

In particular, in this paper we report (1) a brief morphological characterisation of the head deformities, (2) the genetic characterisation of the population using nuclear (*LDH-C1\** and 11 microsatellites) and mitochondrial (entire mtDNA control region, ~1 Kb) markers, and (3) a comparison of malformed and normal specimens.

#### **2. Materials and Methods**

During the monitoring programme for the compilation of the official Fish Inventory of the Sardinia Region [29], among 116 native trout analysed in 10 rivers, two malformed specimens were found only in the Furittu Stream in June 2017 and June 2019 (39◦26.248 N, 9◦22.036 E) (Figure 1). The stream is located in the south-eastern sector of Sardinia, running for about 10 km to merge with the River S'Acqua Callenti to form the Flumendosa River. The Furittu stream is located in the mountainous area of Monte Genis, which is an area with very low anthropogenic pressure, is hard to reach and is not subjected to wind coming from polluted areas. The Furittu stream watershed area is 21 km2, and the land use is largely natural and composed of forests, Mediterranean shrubland and very little pasture land. Bi-seasonal climatic features, with hot arid summers and a rainy autumn/winter season, determine a periodic of hydrographic isolation (up to four months per year) for the upper part of the stream. The stream is 0.5–4.5 m wide and 0.10–2.00 m deep with an average slope of 3.8% throughout its whole length. It has the typical geomorphologic characteristics of a Mediterranean mountain stream, with a complex and fragmented mesohabitat dominated by pools (72%), riffles (20%) and small cascades (8%). The streambed consists of rocks, boulders, gravel, rubble, and woody debris, with riparian vegetation composed of trees adjacent to the stream characterised by holm oak (*Quercus ilex*) and oleander (*Nerium oleander*).

A total of 16 trout were captured in the upper region of the stream using low-frequency, pulsed DC electrofishing and stored in cool, aerated water. All specimens were measured for total weight (TW, g) and total length (TL, cm), placed in a narrow transparent tank filled with water, and photographed from the left side. From each fish, a small fin clip was removed and conserved in 95% ethanol until DNA extraction. After processing, the fish were placed in large containers and released in the stream. Estimates of the total number of fish in a 100 m section of stream and trout estimated densities (N fish/m2) were obtained using two-pass depletion method [30]. The Sardinian specimens were compared with another trout from Neia River (42◦38.31 N, 13◦14.96 E; Tronto basin of Italian Apennine area) that showed similar jaw anomalies.

**Figure 1.** Geographic position of the sampling station in Furittu Stream, south-eastern Sardinia (Italy). Lower right panel: location of Furittu stream (-) and Neia River (•).

Genetic analyses were performed by using both mitochondrial (D-loop) and nuclear markers (*LDH-C1\** and 11 microsatellite loci). The entire mitochondrial control region (D-loop) was PCR amplified and sequenced using the primers 28RIBa [31] and HN20 [32]. PCR amplifications were performed in 20 μL of reaction mixtures (approximately 80 ng of template DNA, 0.15 units MyTaq polymerase (Bioline GmbH, Luckenwalde, Germany), 1X MyTaq Buffer (Bioline GmbH, Luckenwalde, Germany) and 5 pmol of each primer) for 30 cycles (95 ◦C, 45 s; 53 ◦C, 30 s; 72 ◦C 90 s). Cycling was preceded by a 3 min denaturing step at 95 ◦C and followed by a 7 min final extension at 72 ◦C. The mtDNA sequences were aligned using the computer program ClustalW [33] and compared with reference *S. trutta* CR sequences from GenBank using BLASTn (https://blast.ncbi.nlm.nih.gov/Blast.cgi). The specimens were genotyped at the *LDH-C1*\* gene, coding the LDH enzyme [34].

A 440 bp segment of the *LDH-C1*\* nuclear locus was PCR-amplified with the primers Ldhxon3F and Ldhxon4R [34] and digested with a BseLI restriction enzyme. *LDH-C1*\* allows the discrimination of north-western European populations, characterised by the \*90 allele, from native Mediterranean population (\*100 allele), and hybrids that present both alleles [34]. PCR amplifications were performed in 25 μL of reaction mixtures (approximately 200 ng of template DNA, 0.15 units MyTaq polymerase (Bioline GmbH, Luckenwalde, Germany), 1× MyTaq Buffer (Bioline GmbH, Luckenwalde, Germany)and 5 pmol of each primer) for 30 cycles (95 ◦C, 60 s; 64 ◦C, 60 s; 72 ◦C 60 s). Cycling was preceded by a 5-min denaturation step at 95 ◦C and followed by a 10-min final extension at 72 ◦C. Amplicon digestion was performed using BseLI restriction enzyme (Thermo Fisher Scientific, Waltham, MA, USA) following the manufacturers' protocol and visualised on 2% agarose gel. Following [35], eleven non-coding microsatellite loci were labelled with fluorescent dye and multiplexed in two separate reactions. PCR amplifications were performed in 15 μL of reaction mixtures containing approximately 80 ng of template DNA, 0.4 unit MyTaq polymerase (Bioline GmbH, Luckenwalde, Germany), 1X MyTaq Buffer (Bioline GmbH, Luckenwalde, Germany), and 2.5 pmol of each primer. A Touchdown protocol was used to optimise the amplification. PCR amplicons were electrophoresed using an ABI-PRISM 3130xl Genetic Analyzer (Applied Biosystems, Foster City, CA, USA) [36]. ARLEQUIN 3.5.1.3 [37] was used to calculate the observed (Ho) and expected (He) heterozygosity and deviations from the Hardy–Weinberg equilibrium, while the inbreeding coefficient (FIS) was estimated in FSTAT [38]. The software COLONY software [39] was used to determine family structure in the Furittu stream population (length of run = 3, level of precision likelihood = high). The pair likelihood score (PLS) /full likelihood (FL) combined (=FPLS) algorithm was selected to establish only full-sibs listings. Finally, individual admixture coefficient (*q*) were estimated for Furittu population using the software STRUCTURE [40] A domestic sample from two hatcheries was used for comparison. The domestic

ancestry in Furittu population was calculated performing a run assuming a K = 2 (i.e., domestic vs. native) adopting a burn-in of 1,000,000 iterations, followed by 500,000 iterations leaving the other parameters as default.

To verify jaw deformities, we used a modified version of the jaw index (JI) [41] (Figure 2a). We categorised the specimens to different degrees of upper jaw index (UJI) as follow: values above 1 were categorised as normal jaw deformity (NJ), from 0.99 to 0.90 as mildly jaw-deformed (MJD) and below 0.89 severely jaw-deformed (SJD).

**Figure 2.** Specimens from Furittu stream. (**a**) Locations of L1 and L2 measurements used to calculate the upper jaw index (UJI), body length (BL) and body depth (BD), (**b**) skull of trout and in red bones involved in the pughead deformity (pmx = premaxilla; mx = maxilla; deth = dermethmoid; nas = nasal; pf = prefrontal; la = lacrimal (suborbital 1); ju = jugal (suborbital 2); so3 = suborbital 3; so4 = suborbital 4), (**c**) measurements taken for morphometric comparison (Table S1, Supplementary Materials), (**d**) examples of normal jaw (NJ), middle jaw deformity (MJD) and severe jaw deformity (SJD). Pughead specimen of Mediterranean trout (**e**).

Since the affected species often show other skull bones deformities (Figure 2b), morphological variables of the head [42] (Figure 2c) were collected from each trout after setting the scale factor using TPSDIG2 v2.31 [43]. Ratios between values of morphometric measurements of the head and body length (BL) were calculated and plotted against values of BL, to identify the ratios that best discriminated between the malformed (MJD, SJD) and normal trout (NJ). To estimate the changes in nutritional conditions among normal and deformed specimens, we used the residuals analysis in linear regression between LT and TW after logarithmic transformation of the data [17].

#### **3. Results**

In the Furittu stream, a total of 16 adult trout were examined (12.5–25.5 cm and 24.9–183.1 g) (Table S1, Supplementary Materials). The estimates of the total number of fish in the 100 m of section varied between 15 ± 4 and 18 ± 0.5 in June 2017 and 2018, respectively. Estimated densities were relatively low in both campaigns and ranged from 0.06 ind/m2 to 0.09 ind/m2.

The incidence rate of jaw deformities of Furittu stream trout was 12.5% (2 of 16 native trout, 95% confidence interval [CI] = 1.6–38.6%). One specimen, which showed a strongly blunted forehead and an abnormally short upper jaw (UJI = 0.81) was categorised as SJD, while the another one manifested a slightly asymmetrical jaw (UJI = 0.98, categorised as MJD). The remaining specimens showed values of UJI above 1 and were considered normal (NJ) (Figure 2d). The Neia trout with an UJI of 0.95 was categorised as MJD (Table S1, Supplementary Materials).

The mtDNA analysis showed that all Sardinian trout shared the same haplotype (AD-Tyrr7) belonging to the AD lineage (Adriatic) (sensu [44]), which was observed for the first time in this population. The new sequence was deposited in GenBank under accession number MT503201. In all samples, we observed the genotype *LDH-C1\*100*/*100* fixed in the Mediterranean native population. The trout population from the Neia River (central Apennine) showed high frequency of the non-native allele *LDH-C1\* 90* (0.77) and the presence of AT non-native mitochondrial lineage with a frequency of 0.23. The 11 microsatellites showed higher levels of observed heterozygosity in the Neia population (0.80) compared to the Furittu stream population (0.50) (Table 1). In Furittu stream, the mean *q* values were close to 1 with a very narrow CI (mean = 0.99, CI= 0.97–1).

**Table 1.** Mitochondrial lineage and *LDH-C1\** observed frequencies, observed (HO) and expected (HE) heterozygosity and inbreeding coefficient (FIS) estimated from 11 microsatellite loci in the Furittu and Neia populations. AD (Adriatic), ME (Mediterranean) and AT (Atlantic) lineages of *Salmo trutta*. Standard deviation (s.d.).


No significant departure from the Hardy–Weinberg equilibrium was found in the studied populations. Analysis of family structure suggested that the Furittu stream population was composed of eight families of generally one or two individuals (Table S2, Supplementary Materials). The two abnormal specimens were found to be unrelated.

In the Furittu stream, one trout showed anomalies in skull measurements typical of pug-headedness osteological malformation (SJD), with a shortened neurocranium and upper jaw (Figure 2e), while another specimen (MJD) had slightly shorter upper jaw and greater head. In particular, the pug-headed trout (SJD) showed a greater depth of the head (DH) (Figure 3a), while the specimen with the middle jaw deformity (MJD) had the longest head (LHL) (Figure 3b). Surprisingly, the SJD individual also showed a longer LHL compared to normal specimens (NJ) (Figure 3b). The longer LHL found in the specimen affected by the pug-headedness (SJD) was thought to be a consequence of a slightly greater operculum length (LO) (Figure 3c). In the SJD specimens, the snout was almost absent due to the curving of the ethmoid region (deth, nas), and maxillary bones (mx) (Figure 2b). These malformations were confirmed by the smallest snout (LS) in SJD, while the MJD trout showed the longest LS (Figure 3d). The anomalies of infraorbital bones (pf, la, ju, so3, so4) in SJD trout determined larger and bulbed eyeballs compared to MJD and NJ specimens (Figure 3e,f) despite the fact that the ratio between HO and VO didn't show differences from the regular circular shape of infraorbital bones. As shown in the MJD specimen, the deformed trout from the Neia River exhibited a slightly prominent lower jaw in comparison to normal trout from Sardinia (Figure 3b).

**Figure 3.** Relationship between body length (BL) and the ratios among values of head depth (DH) (**a**), lower head length (LHL) (**b**), operculum length (LO) (**c**), snout length (SL) (**d**), horizontal orbital length (HO) (**e**) and vertical orbital length (VO) (BL) for specimens with normal jaw deformity (NJ), middle jaw deformity (MJD) and severe jaw deformity (SJD) from Furittu stream and one specimen from Neia river (**f**).

Despite this deformity, the residuals generated by linear regression (LT/TW) revealed no relevant differences between normal and deformed specimens, indicating that the deformed trout were robust and healthy (Table S1, Supplementary Materials).

#### **4. Discussion**

The pug-headedness deformity in adults wild brown trout was first described in 1929 by the American ichthyologist Eugene Wills Gudgers [26]. Here, we present the first scientific report of head skeletal deformities in a wild population of Mediterranean native trout and one of the few cases reported for adult specimens of the genera *Salmo*.

In the Furittu stream, the head malformations were observed in two specimens out of 16 Mediterranean trout (12.5% of occurrence) captured during two sampling campaigns conducted in June 2017 and June 2018. One specimen (SJD) showed the typical malformation of pug-headness, with the antero-posterior compression of the ethmoid region and upper jaw, while the other specimen (MJD) had a slightly shorter upper jaw and a longer head. The Neia trout (MJD) also exhibited an asymmetry between the lower and upper jaw.

In unpolluted natural habitats, the occurence of pug-headedness in other genera is generally less than 1% [45]. The occurrence of head deformities in the Furittu Stream exceeded the rates observed in the polluted habitats (from 0.5% to 3%) [13], while it was comparable with that of hatchery-reared fish larvae, which have a much greater frequency of skeletal skull abnormalities, including pug-headedness, compared to their wild populations [1,3,46].

The periodic hydrographic isolation that occurs in the downstream part of the river could represent an impassable barrier that prevents the connectivity with domestic trout that have been stocked in the Flumendosa River since the beginning of the 1960s. In fact, the trout population of the Furittu stream is immune by genetic introgression with non-native traits, as detected by the genetic analysis. Before this analysis, in Sardinia, ancestral trout populations (AD lineage, *sensu* [44]) were found in two watersheds of southern Sardinia (Cixerri and Pula basins) [47–50]. Although these populations exist in small headwater habitats isolated by artificial or natural barriers, morphological analysis revealed no presence of trout with head deformations [49]. Furthermore, in a comparative study in morphological characterisation of Corsican and Sardinian trout, it was highlighted that relatively larger heads are present in native trout populations compared to Atlantic and *S. macrostigma* specimens, while no head skeletal aberrations were found [51].

Even though many studies have reported a wide range of factors that trigger such deformities, the exact cause may be difficult to determine. Possible sources of such an aberration could include a wide range of epigenetic factors, such as temperature and oxygen fluctuation during egg incubation [9,52], prenatal stress of mature females induced by environmental changes [53], influences of diet composition on larval phases [54,55], and environmental pollution [13,14]. However, endogamy in small populations is indicated as the most likely factor to trigger the deformity [17]. Low genetic variability, as a result of inbreeding depression, is known to occur in fragmented and small salmonid populations [56,57]. Salmonid populations with morphological deformities, due to the loss of genetic variation and inbreeding depression, show a lower survival rate compared to the normal conspecific populations [58,59]. In this context, similar morphological deformities were detected in two watersheds showing significant differences in levels of genetic introgression with non-native traits and genetic diversity. High levels of non-native variants and genetic variability (HO = 0.80) characterise the Neia River population. Contrarily, lower genetic variability (HO = 0.44) was observed in the Furittu stream population, similarly to what has been observed in other isolated populations on Sardinia [49] and elsewhere (e.g., [35,36]). Additional analysis performed with the colony seems to suggest that the abnormal fish are unrelated. Moreover, in both populations, the observed heterozygosity (HO; Furittu = 0.49; Neia = 0.80) was similar to the expected heterozygosity (HE; Furittu = 0.45; Neia = 0.83), suggesting the absence of inbreeding or outbreeding depression. Ultimately, though the small number of markers used does not allow us to exclude that the pugheaded malformation may have a genetic basis, our preliminary analyses suggest inbreeding or outbreeding depression is not likely to be the cause of the observed deformities.

However, we can also exclude the presence of pollutants and pathogens as possible causative factors of this malformation. In fact, the Furittu stream is a well-preserved river with very low anthropogenic pressure. Although sample size was limited, unfavourable abiotic conditions, such as variations in environmental factors such as hypoxia, solar radiation, and temperature during larval development seem to be the most likely factors to trigger the observed deformities. In this context, the Mediterranean streams are often subject to prolonged droughts, which reduce and fragment the trout habitat, with a possible increase in water temperature. The results of this study indicate a need for investigation into the causes and control of head malformation in Mediterranean trout population. Practical conservation management measures should include long-term monitoring programmes in order to estimate the population size and abundance, ecological requirements, and protection of stream habitats. The dissemination of information regarding native trout conservation status and the involvement and education of local people and regional authorities are also crucial for conservation of the species.

*Diversity* **2020**, *12*, 353

**Supplementary Materials:** The following are available online at http://www.mdpi.com/1424-2818/12/9/353/s1, Table S1: Morphometric data, Upper Jaw Index (UJI) and residuals of sixteen native Mediterranean trout of the Furittu stream (south eastern Sardinia) and one specimen from Neia river. L1. and L2 measurements used to calculate the Upper Jaw Index (UJI), (1) upper jaw depth (DUJ), (2) snout length (LS), (3) orbital horizontal diameter (HO), (4) head depth (HD), (5) orbital vertical diameter (VO), (6) length of maxilla (LM), (7) upper jaw length (LUJ), (8) lower jaw length (LLJ), (9) premaxilla to preoperculum length (LPP), (10) head length at upper jaw (LHU), (11) head length at lower jaw (LHL) and (12) operculum length (LO). Table S2: Parentage analyses performed with COLONY for Riu Furittu population. We have chosen the Pair-Likelihood-Score (PLS)/Full-Likelihood (FL) combined (=FPLS) algorithm and used only full-sibs listing to give the results below.

**Author Contributions:** Conceptualization, F.P., A.M. and A.S. (Andrea Sabatini); methodology, F.P., A.S. (Andrea Sabatini), A.M., T.R., A.S. (Andrea Splendiani) and V.C.B.; software F.P., A.S. (Andrea Sabatini), A.M., T.R., A.S. (Andrea Splendiani) and V.C.B.; validation, A.S. (Andrea Sabatini), F.P., A.M., C.F., C.P., M.S., T.R., A.S. (Andrea Splendiani) and V.C.B.; formal analysis, F.P., A.S. (Andrea Sabatini)., A.M., T.R., A.S. (Andrea Splendiani) and V.C.B.; investigation, F.P., A.M., C.F., M.S., C.P. and A.S. (Andrea Sabatini); resources, A.S. (Andrea Sabatini); data curation, F.P., A.M., A.S. (Andrea Sabatini), T.R., A.S. (Andrea Splendiani) and V.C.B.; writing—original draft preparation, F.P., A.M., A.S. (Andrea Sabatini), T.R., A.S. (Andrea Splendiani) and V.C.B.; writing—review & editing, F.P., A.M., A.S. (Andrea Sabatini), C.P., T.R., A.S. (Andrea Splendiani) and V.C.B.; visualization, F.P. and A.M.; supervision, A.S. (Andrea Sabatini) and V.C.B.; project administration, A.S. (Andrea Sabatini); funding acquisition, A.S. (Andrea Sabatini). All authors have read and agreed to the published version of the manuscript.

**Funding:** The research was a part of the Regional Fish Inventory supported by the Regione Sardegna (Redazione di una Carta Ittica delle acque dolci della Sardegna con particolare riferimento ai siti di popolamento della forma geneticamente pura della trota sarda (*Salmo cettii* ex *Salmo trutta macrostigma*), specie autoctona a grave pericolo di estinzione; Grant No. 27002-1 18/12/2015).

**Conflicts of Interest:** The authors declare no conflict of interest.

#### **References**


© 2020 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).

MDPI St. Alban-Anlage 66 4052 Basel Switzerland Tel. +41 61 683 77 34 Fax +41 61 302 89 18 www.mdpi.com

*Diversity* Editorial Office E-mail: diversity@mdpi.com www.mdpi.com/journal/diversity

MDPI St. Alban-Anlage 66 4052 Basel Switzerland

Tel: +41 61 683 77 34 Fax: +41 61 302 89 18

www.mdpi.com

ISBN 978-3-0365-4442-7