Transcriptome-Wide Genetic Variations in the Legume Genus Leucaena for Fingerprinting and Breeding

Han, Yong; Abair, Alexander; van der Zanden, Julian; Nageswara-Rao, Madhugiri; Vasan, Saipriyaa Purushotham; Bhoite, Roopali; Castello, Marieclaire; Bailey, Donovan; Revell, Clinton; Li, Chengdao; Real, Daniel

doi:10.3390/agronomy14071519

by Yong Han¹,²,*, Alexander Abair³, Julian van der Zanden¹, Madhugiri Nageswara-Rao⁴, Saipriyaa Purushotham Vasan², Roopali Bhoite¹, Marieclaire Castello¹, Donovan Bailey³, Clinton Revell¹, Chengdao Li¹,² and Daniel Real¹

¹ Department of Primary Industries and Regional Development, South Perth, WA 6151, Australia

² Western Crop Genetics Alliance, College of Environmental and Life Sciences, Murdoch University, Perth, WA 6150, Australia

³ Department of Biology, New Mexico State University, Las Cruces, NM 88003, USA

⁴ USDA Agriculture Research Service, Subtropical Horticulture Research Station, Miami, FL 33158, USA

* Author to whom correspondence should be addressed.

Agronomy 2024, 14(7), 1519; https://doi.org/10.3390/agronomy14071519

Submission received: 30 May 2024 / Revised: 8 July 2024 / Accepted: 10 July 2024 / Published: 12 July 2024

(This article belongs to the Special Issue Advances in Crop Molecular Breeding and Genetics)

Abstract

Leucaena is a versatile legume shrub/tree used as tropical livestock forage and in timber industries, but it is considered a high environmental weed risk due to its prolific seed production and broad environmental adaptation. Interspecific crossings between Leucaena species have been used to create non-flowering or sterile triploids that can display reduced weediness and other desirable traits for broad use in forest and agricultural settings. However, assessing the success of the hybridisation process before evaluating the sterility of putative hybrids in the target environment is advisable. Here, RNA sequencing was used to develop breeding markers for hybrid parental identification in Leucaena. RNA-seq was carried out on 20 diploid and one tetraploid Leucaena taxa, and transcriptome-wide unique genetic variants were identified relative to a L. trichandra draft genome. Over 16 million single-nucleotide polymorphisms (SNPs) and 0.8 million insertions and deletions (indels) were mapped. These sequence variations can differentiate all species of Leucaena from one another, and a core set of about 75,000 variants can be genetically mapped and transformed into genotyping arrays/chips for population genetics, diversity assessment, and genome-wide association studies in Leucaena. For genetic fingerprinting, more than 1500 variants with even allele frequencies (0.4–0.6) among all species were selected for marker development and testing in planta. Notably, SNPs were preferable for future testing as they were more accurate and displayed higher transferability within the genus than indels. Hybridity testing of ca. 3300 putative progenies using SNP markers was also more reliable and highly consistent with the field observations. The developed markers pave the way for rapid, accurate, and cost-effective diversity assessments, variety identification and breeding selection in Leucaena.

Keywords: crossing; genetic variants; hybridity; molecular marker; next-generation sequencing (NGS); phylogeny

1. Introduction

The mimosoid legume genus Leucaena includes a diverse set of New World shrub and tree species with a long history of human use [1]. A subset of these versatile species has been widely spread in tropical regions across the globe. In particular, the value of Leucaena leucocephala (Lam.) de Wit. as a vigorous nitrogen-fixing species has been broadly recognised, resulting in its widespread usage in erosion management, multipurpose cropping systems [2], phytoremediation, as a source of paper [3] and even biofuel [4]. L. leucocephala foliage is highly nutritious, with a 15–18% higher protein content than tropical grasses and cereal straws [5]. Therefore, the crop has been extensively planted for livestock feeding in some regions, such as Queensland in Australia [6]. Nevertheless, L. leucocephala is regulated as an invasive weed in over 50 countries [7], and in northern Western Australia, its use is constrained by its weediness status [8].

Developing non-flowering or sterile Leucaena by the interspecific crossing of species has long been of interest [8,9]. One of the more promising approaches involves crosses between parents of different ploidy levels, creating odd-numbered polyploid offspring that can be seed sterile. Recently, over 1400 interploidal crosses were made to generate seedless triploids and thus promote the forage crop in Australia and elsewhere [10]. Massive efforts are going into this work (and related efforts elsewhere), including the process of crossing, seed collection, seeding, transplantation to the field, nursing plants for at least 2.5 years prior to the onset of flowering [10], and the subsequent phenotyping of flowering traits across individuals. Hence, an accurate, reliable, rapid, and high-throughput diagnostic approach for hybrid detection is required to identify Leucaena triploids in the early seedling stage to avoid wasting downstream efforts on non-hybrid individuals.

Molecular markers and marker-assisted selection (MAS) have been successfully implemented for plant breeding over the decades to rapidly identify favourable species/genotypes and pinpoint the genomic regions in plants that express traits of interest, thus driving from phenotype to genotype-based selection [11]. Markers developed against chloroplast, nuclear ribosomal DNA internal transcribed spacers [12], sequence-characterised amplified regions within nuclear-encoded loci [13], and simple sequence repeats (SSRs; [14]) have been applied for phylogenetic and evolutionary studies in Leucaena. However, such sets of markers do not represent the genome broadly and are therefore of limited utility in modern high-throughput plant genotyping and breeding practices. Moreover, most species of Leucaena are likely to have many heterozygous loci because of self-incompatibility and therefore an out-crossing habit (i.e., L. diversifolia, [15]), hindering the selection of homozygous and polymorphic markers between parental lines for interspecific hybrid detection.

The advent of next-generation sequencing (NGS) technology allows the detection of millions of sequence variations within the genome and transcriptome of a plant genotype, which can be utilised to assess and characterise genetic diversity within populations, species, or germplasm collections. Molecular markers based on NGS have been developed in many tree species worldwide and used as valuable genetic tools for non-model organisms [16,17]. Indeed, SNP (single nucleotide polymorphism) markers are the most abundant, with genome-wide coverage, high stability and Mendelian inheritance [18], and they are compatible with high-throughput genotyping technologies such as microarray and KASP (kompetitive allele-specific PCR). However, such a platform has not been developed for Leucaena to aid in genetic studies and breeding practices.

In the present study, mRNA-based transcriptome-wide genetic variants among 21 Leucaena taxa have been identified for the first time. The large-scale sequence variations revealed the molecular identity of Leucaena species. As a proof of concept, a core set of SNPs with an even allele frequency among Leucaena species was selected for KASP marker development and genotyping. Over 3000 Leucaena individuals from interspecific crossings were tested for hybridity using polymorphic markers between parental lines. The developed high-throughput and cost-effective platform can rapidly identify triploids at the seedling stage, saving time and cost for breeding programs. Moreover, the variant database can be adopted for future genetic diversity, mapping and fingerprinting studies, MAS and breeding in Leucaena.

2. Materials and Methods

2.1. Plant Materials

For sequencing and variant calling, 20 diploids (L. collinsii, L. cruziana, L. cuspidata, L. esculenta, L. greggii, L. lanceolata, L. lempirana, L. macrophylla ssp. macrophylla and ssp. istmensis, L. magnifica, L. multicapitula, L. matudae, L. pueblana, L. pulverulenta, L. retusa, L. salvadorensis, L. shannonii, L. trichandra, L. trichodes, and L. zacapana) and one tetraploid species, L. diversifolia, were used (Table 1). Plant samples were collected from the Leucaena living collection at New Mexico State University, USA.

At the Western Australia Department of Primary Industries and Regional Development (DPIRD), 224 accessions of Leucaena from nine diploids (L. pulverulenta, L. collinsii, L. zacapana, L. shannonii, L. macrophylla, L. retusa, L. greggii, L. trichandra, and L. trichodes) and three tetraploids (L. leucocephala ssp. leucocephala and ssp. glabrata, and L. diversifolia) were selected for interspecific crossing in a glasshouse in 2018 (Generation 1) and 2019 (Generation 2), respectively [10]. About 3000 seedlings of the cross progeny were raised in a naturally lit glasshouse at DPIRD, Perth, and subsequently sampled for genotyping with molecular markers.

2.2. RNA-seq and Variant Calling

Given the evolutionary history of hybridisation/polyploidisation in Leucaena, a next-generation transcript sequencing-based approach [19] was employed, designed to account for the problem of paralogy by selecting SNPs only from orthologous genes. For all species except L. pueblana, total RNA was extracted from whole seedlings. Whole seedlings with three true leaves were sampled to obtain a wide range of expressed genes. RNA quality was assessed using Nanodrop (Thermo Fisher Scientific Inc., Waltham, MA, USA), Qubit (Thermo Fisher Scientific Inc., USA), and agarose gel electrophoresis before library preparation. Illumina RNA-TruSeq libraries (Illumina Inc., USA) were generated for each sample using 4 µg total RNA following the manufacturer’s instructions. Libraries were quantified using the Qubit and checked using Bioanalyzer 2100 (Agilent Technologies, Santa Clara, CA, USA). Each library was sequenced to a depth of 40 M read pairs (100 bp each) using 101 bp reads (Axeq Technologies, Seoul, the Republic of Korea). No living material was available for L. pueblana, so Illumina-based gDNA genome skimming was employed using the same approach, but with DNA extracted from silica-dried leaves with a Qiagen DNeasy Kit (Qiagen, Germantown, MD, USA), followed by Illumina TruSeq library preparation and sequencing on an Illumina HiSeq 2000. For all samples, paired-end 100 bp reads were trimmed of their Illumina adaptors and filtered for a minimum read length of 65 bases using Trimmomatic v0.34 [20].

This study utilised a chromosomal-scale genome assembly for Leucaena trichandra (Bailey et al., unpublished) as the reference genome. Trimmed sequence data were aligned to the reference genome using STAR (v2.4.0.1, [21]). Using SAMtools view (v1.9; [22]) with the “-q10” option, uniquely mapped reads were extracted from the total set of accepted hits. This step was intended to exclude reads from paralogous genes. The SAMtools varFilter.pl utility was used for variant calling. The first step in this process was the generation of an index from the genome using the SAMtools faidx command. All accepted hit files were then set up for variant calling using the SAMtools mpileup command. VCFtools [23] generated a BCF file that was converted to a human-readable VCF file containing all the variants detected.
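As a rough illustration of the read-filtering step described above, the sketch below mimics what a mapping-quality threshold of 10 does to SAM records. It is not the authors' pipeline: the function name filter_by_mapq and the example records are hypothetical, and it assumes that STAR, under default settings, assigns MAPQ 255 to uniquely mapped reads and values of 3 or lower to multi-mapped reads.

```python
def filter_by_mapq(sam_lines, min_mapq=10):
    """Yield SAM header lines plus alignments whose MAPQ is >= min_mapq.

    Hypothetical helper for illustration; MAPQ is the fifth
    tab-separated column of a SAM alignment record.
    """
    for line in sam_lines:
        if line.startswith("@"):
            # Header lines are passed through unchanged.
            yield line
            continue
        fields = line.rstrip("\n").split("\t")
        if int(fields[4]) >= min_mapq:
            yield line


# Made-up records: read1 is uniquely mapped (MAPQ 255),
# read2 maps to two loci (MAPQ 3) and is discarded.
example = [
    "@SQ\tSN:tig1\tLN:1000",
    "read1\t0\ttig1\t10\t255\t100M\t*\t0\t0\t*\t*",
    "read2\t0\ttig1\t50\t3\t100M\t*\t0\t0\t*\t*",
]
kept = list(filter_by_mapq(example))
```

Under the stated assumption about STAR mapping qualities, any threshold between 4 and 255 would retain the same set of uniquely mapped reads.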

Preliminary filtering on the VCF file was conducted using VCFtools. Variant records with mapping quality Phred scores below 30 were removed from the VCF file. A fraction of the results was manually verified in Geneious (Version R11). A visual inspection of the first 15 positions from a randomly selected contig (tig00000853) from L. collinsii in the original STAR mapping showed positive results. All positions compared between the visual output from the Geneious alignment viewer and the filtered VCF file were concordant.

Variant statistics were visualised in charts by SigmaPlot 14.0.

2.3. Variant Filtration, Marker Development, and Phylogenetic Analysis

To test the differentiation power in Leucaena species, sets of variants were selected by PLINK 1.9 (https://www.cog-genomics.org/plink/, accessed on 14 February 2019) analysis according to allele frequency and genome location. These included a pool of 4,492,396 variants remaining after trimming monomorphic variants with an allele frequency below 0.1 and missingness over 0.25 from the whole variant database; a pool of 75,001 variants evenly distributed in each 10 kb region within the L. trichandra reference genome; a pool of 48 indels with a minimum 6 bp difference from a random 100 Mb sequence interval (tig4612_575177–tig5770_674183; Table S1); and a pool of 1533 homozygous SNP variants with an allele frequency of 0.4–0.6 (Table S2). Principal component analysis (PCA) was performed using PLINK 1.9, and plots were drawn with the R package “ggplot2” [24]. Molecular markers were developed based on the indel/SNP variations and then validated across species of Leucaena. Indel markers that resulted in different amplicon sizes were visualised by 2% agarose or 6% polyacrylamide gel electrophoresis. For SNP screening, the KASP genotyping assay, a novel competitive allele-specific PCR for SNP scoring based on dual FRET (fluorescence resonance energy transfer), was employed. The highly conserved variants in Western Australian crosses were further selected (Table S3) and KASP markers were designed for about 100 bp length amplicons with sequences referring to L. trichandra using Geneious Prime (Version 2019.1).
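The allele-frequency and missingness thresholds above can be made concrete with a small sketch. This is an illustrative reimplementation under stated assumptions, not the PLINK commands used in the study: genotypes are assumed to be coded as diploid alternate-allele dosages (0, 1, 2, or None for missing), the frequency threshold is assumed to apply to the minor allele, and all function names are hypothetical.

```python
def allele_stats(genotypes):
    """Return (alt_allele_frequency, missing_rate) for one variant.

    Genotypes are alt-allele dosages 0/1/2; None marks a missing call.
    """
    called = [g for g in genotypes if g is not None]
    missing_rate = 1 - len(called) / len(genotypes)
    if not called:
        return None, missing_rate
    alt_freq = sum(called) / (2 * len(called))
    return alt_freq, missing_rate


def passes_basic_filter(genotypes, min_maf=0.1, max_missing=0.25):
    """Keep variants with minor allele frequency >= 0.1 and missingness <= 0.25."""
    alt_freq, missing_rate = allele_stats(genotypes)
    if alt_freq is None or missing_rate > max_missing:
        return False
    return min(alt_freq, 1 - alt_freq) >= min_maf


def is_fingerprint_candidate(genotypes, low=0.4, high=0.6):
    """Keep variants where every called taxon is homozygous and the
    alt-allele frequency is 'even', i.e. within [0.4, 0.6]."""
    called = [g for g in genotypes if g is not None]
    if not called or any(g == 1 for g in called):
        return False
    alt_freq, _ = allele_stats(genotypes)
    return low <= alt_freq <= high


# Made-up genotype vectors across taxa, for illustration only.
informative = [0, 0, 2, 2, 0, 2, None, 2]   # alt freq 8/14, one missing call
monomorphic = [0] * 8                        # no alternate allele at all
with_het = [0, 1, 2, 2]                      # contains a heterozygous call
sparse = [0, None, None, 2]                  # half of the calls are missing
```

A variant with an even allele frequency splits the taxa into two groups of similar size, which is why such variants are attractive for distinguishing arbitrary pairs of parental species.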

An SNP-based phylogenetic tree was developed to illustrate the ability of the marker system to reflect the phylogenetic history of the genus. The all-species VCF file was first filtered for biallelic SNPs with in-house scripts and then converted into phylip format using vcf2phylip.py [25]. The interspecific hybrid tetraploid taxon L. diversifolia was removed from the matrix to avoid violating the assumption of a bifurcating phylogenetic tree. A maximum likelihood tree was created using raxml-ng (v0.9.0, [26]) with GTR + G to build both the best tree and to conduct convergent bootstrap analysis.
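The in-house scripts for biallelic SNP filtering are not described, so the following is only a minimal sketch of what such a filter might look like. It assumes tab-separated VCF data lines with REF in the fourth column and ALT in the fifth; the function name and example records are hypothetical.

```python
BASES = {"A", "C", "G", "T"}


def is_biallelic_snp(vcf_line):
    """Return True if a VCF data line describes a biallelic SNP.

    REF and ALT must each be a single nucleotide. A multi-allelic ALT
    such as "G,T" or an indel such as "AT" is not in BASES and is rejected.
    """
    fields = vcf_line.rstrip("\n").split("\t")
    ref, alt = fields[3], fields[4]
    return ref in BASES and alt in BASES


# Made-up records for illustration (only the first five columns shown).
records = [
    "tig1\t101\t.\tA\tG",      # biallelic SNP
    "tig1\t250\t.\tC\tG,T",    # multi-allelic site
    "tig1\t300\t.\tAT\tA",     # deletion
]
snps = [r for r in records if is_biallelic_snp(r)]
```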

2.4. Genomic DNA Isolation

Leucaena leaves can be rich in polyphenols and polysaccharides that inhibit polymerase chain reaction (PCR), so different DNA extraction methods were used according to the experimental purpose (see “Results and Discussion”). For polymorphic marker screening between parental accessions, high-quality genomic DNA was extracted from about 400 mg of soft root tissue using the cetyltrimethylammonium bromide (CTAB) method [27] after sampling, cleaning and crushing with beads using TissueLyser II (Qiagen, USA).

Quick DNA isolation from the young leaves was also conducted for all F₁ seedlings using the AquaGenomic kit (MoBiTec, Göttingen, Germany), following the manufacturer’s instructions, or a modified sodium dodecyl sulphate (SDS) method described below.

For the SDS method, about 100 mg of fresh leaf tissues were collected into 1.2 mL strip tubes (SSIbio, Lodi, CA, USA) assembled in 96-well racks and lysed by a tissue lyser with beads in 300 μL of AP1 solution (100 mM Tris-HCl, 50 mM EDTA, 0.5 M NaCl, and 1.5% SDS), incubated at 65 °C for 30 min, and then mixed well with 100 μL of AP2 solution (5 M potassium acetate, 2% polyvinyl pyrrolidone mol. wt. 10,000, pH 6.2). After 5 min on ice, the 96-well extraction plates were centrifuged at 4000 rcf for 10 min at 4 °C (Allegra X-15R, Beckman Coulter, Brea, CA, USA). DNA was precipitated by mixing 100 μL of supernatant with an equal volume of isopropanol and incubated at −20 °C for 15 min. After centrifugation (4000 rcf), the DNA pellet was washed with 70% ethanol twice and finally dissolved in 50 μL of Milli-Q water. DNA concentration and quality were checked using NanoDrop One (Thermo Scientific, USA) at the Western Australia State Agricultural Biotechnology Centre.

2.5. Polymerase Chain Reaction (PCR) and Scoring

Indel and KASP markers were screened for polymorphisms and picked to differentiate the expected genotype for each cross (Table S4). Markers were then applied to both a set of positive control DNA (consisting of the two parental accessions and an artificial DNA mix (1:1 by weight)) and the corresponding F₁ individuals (test cases).

For indel markers, the PCR reaction mixture was composed of 1 μL of 10× Bioline buffer, 1 μL of GC buffer, 0.8 μL of Bioline 50 mM MgCl₂, 0.2 μL of 10 mM dNTPs, 0.5 μL of forward and reverse primer at 10 μM, 0.04 μL of BioTaq DNA polymerase, 3 μL of Cresol red dye, 2.5 μL of MilliQ water, and 50 ng (1 μL) of template DNA in a 10 μL reaction volume. PCR was carried out in the Applied Biosystems Veriti Dx 96-Well Fast Thermal Cycler, starting with a denaturation step at 94 °C for 3 min, followed by 35 cycles at 94 °C for 30 s, 56 °C for 30 s, 72 °C for 30 s, and finishing with an extension step at 72 °C for 5 min. PCR products were analysed on 6% polyacrylamide gel (PAGE).

For SNP markers, a KASP master mix (LGC Biosearch Technologies, Hoddesdon, UK) consisting of Taq polymerase, nucleotides, MgCl₂, universal FRET (fluorescence resonance energy transfer) cassettes, and ROX™ passive reference dye was used. The forward primers are allele-specific, each accommodating a unique tail sequence corresponding to the FRET cassette. The PCR reaction and procedure were performed as described by Real et al. [10]. Results were scored using QuantStudio Real-Time PCR software (Version 1.3).

2.6. Field Trials and F₁ Phenotyping

Young seedlings germinated from two crossing generations were transplanted to the field plots at Kununurra (latitude −15.65, longitude 128.72) and Carnarvon (latitude −24.86, longitude 113.73) in 2020 (Generation 1) and 2021 (Generation 2), respectively. The trial arrangement, management, and irrigation were described by Real et al. [10]. The flowering and fruiting behaviour of plants were observed one or two years after transplanting at both sites.

3. Results and Discussion

3.1. Transcriptome-Wide Genetic Variants in Leucaena Detected by RNA Sequencing

Overall, 16,396,328 unique SNPs and 816,282 indels were identified from 21 taxa, including 20 diploids, L. collinsii, L. cruziana, L. cuspidata, L. esculenta, L. greggii, L. lanceolata, L. lempirana, two subspecies of L. macrophylla, L. magnifica, L. multicapitula, L. matudae, L. pulverulenta, L. retusa, L. salvadorensis, L. shannonii, L. trichandra, L. trichodes, L. zacapana, L. pueblana (genomic DNA), and one tetraploid, L. diversifolia (Table 1). Variants were relatively evenly distributed across species, with an average of 5.6 million SNPs and 0.47 million indels per accession, respectively. The alleles were displayed at diverse frequencies, and the results showed that variants at the 0.68–0.77 window only occupied about 0.2% of the total for both SNPs and indels (Table S5). Interestingly, total RNA-seq mapping rates ranged from 87 to 95% with uniquely mapped read rates ranging from 69 to 77% among all species, indicating no distinct difference between samples. The mapping parameters of RNA-seq reads to the genome vary in plants, depending on the genome complexity, reference availability and quality, and the mapping quality threshold. The high mapping rate in Leucaena was comparable to the values in the model plants Arabidopsis [28] and wheat [29], suggesting the high quality of RNA-seq and thus the reliability of variant calling.

Among the point mutations, transitions (Ts) were more abundant than transversions (Tv), with a Ts/Tv ratio of 1.22, which was similar to the findings in Rhododendron species [30] and sunflower [31] based on RNA-seq. In particular, interchanges between purines (A/G) and pyrimidines (C/T) accounted for 27.5% each among the SNPs, while the transversions of C to G or vice versa were the rarest and had a percentage of only 3.7% (Figure 1a). For indels, changes of 1–3 bp were dominant and made up about 74% of the total variations (Figure 1b). Relative to the L. trichandra reference genome, the number of indel sites generally decreased with increasing indel lengths up to ±20 base pairs. Notably, larger deletions (over 15 bp) were about six times more common than insertions of the same length in the Leucaena population. Indels are more prevalent than other structural variants such as SSRs, and the genotyping of indel markers is technologically less demanding compared with SNP detection, which usually requires expensive reagents and equipment [32]. Consequently, indels are preferred for cost-effective gel electrophoresis-based testing where small sequence polymorphisms are less likely to score accurately. However, both sequencing techniques and bioinformatics tools used for NGS analysis affect the sensitivity and specificity of indel detection in silico [33]. Hence, the reliability of diagnostic markers needs to be further confirmed through experimentation.
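The transition/transversion figures reported at the start of this paragraph can be cross-checked with a short sketch. The classification rule (purine to purine or pyrimidine to pyrimidine is a transition) is standard; the function names and the toy SNP list are hypothetical, and the final line simply recomputes the ratio from the two 27.5% transition classes quoted above.

```python
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def is_transition(ref, alt):
    """True for A<->G or C<->T substitutions; False for transversions."""
    pair = {ref, alt}
    return ref != alt and (pair <= PURINES or pair <= PYRIMIDINES)


def ts_tv_ratio(snps):
    """Return the Ts/Tv ratio for an iterable of (ref, alt) pairs.

    Raises ZeroDivisionError if the input contains no transversions.
    """
    ts = tv = 0
    for ref, alt in snps:
        if ref == alt:
            continue  # not a substitution; ignore
        if is_transition(ref, alt):
            ts += 1
        else:
            tv += 1
    return ts / tv


# Toy example: three transitions and two transversions.
toy_ratio = ts_tv_ratio([("A", "G"), ("C", "T"), ("G", "A"), ("A", "C"), ("G", "C")])

# Consistency check on the reported percentages: A/G and C/T changes at
# 27.5% each give 55% transitions and therefore 45% transversions.
reported_ratio = round((27.5 + 27.5) / (100 - 27.5 - 27.5), 2)
```

The recomputed value agrees with the Ts/Tv ratio of 1.22 stated in the text.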

In crop science and breeding, genomic resources can significantly promote the identification of genetic controls of key agronomic traits and the development of breeding tools to accelerate the selection process [34]. For Leucaena, L. trichandra is the common putative progenitor of four tetraploid species, L. confertiflora, L. pallida, L. involucrata and L. diversifolia [35], and its draft genome has been completed by the Bailey Lab at New Mexico State University based on PacBio and Illumina sequencing (Bailey et al., unpublished). Together with other transcriptomic data from a few species and a mitochondrial genome of L. trichandra (summarised by Abair et al. [36]), such resources pave the way for future applied and basic research on Leucaena. Notably, this study presented near genus-wide RNA-seq data, which provides comprehensive molecular information on the species identities and, as proof of concept, can be used for phylogenetic analysis and breeding selection in Leucaena.

3.2. The Genus-Wide Sequence Variations Reflect the Phylogenetic History of Leucaena and Serve as a Genotyping Reference for Research and Breeding

Species diversity and diversification in Leucaena have been investigated through a variety of phylogenetic and population genetic analyses that include Leucaena and relevant outgroup analyses. Most recently, these include markers for SCAR-based nuclear loci, nrDNA ITS, cpDNA regions, AFLPs, and SSRs [13,14,35]. Previous results for Leucaena phylogenetics recovered three diploid clades (Clades 1, 2, and 3) that correlate well with morphology and geographic distribution (reviewed by Abair et al. [36]). Here, an SNP-derived variant phylogeny was developed to confirm its consistency with Leucaena’s current phylogeny, population biology, geography, and morphology and the suitability of generating breeding markers based on genus-wide sequence variation. The phylogeny, rooted internally with L. cuspidata, is interpretable as consistent with prior results, with each of the three clades potentially diagnosable along the tree (Figure 2a). Each of those putative groups has the same species and nearly all of the same potential species’ relationships therein. The reconstruction of relationships among species using the SNPs developed here supports the idea that these markers are appropriate for fingerprinting species in plant breeding studies.

The ability to correctly discriminate every species from one another was investigated through PCA analysis of detected variations (Figure 2b). Principal components 1 and 2 were sufficient to separate the three clades with allopatric geographic distributions (sensu Govindarajulu et al. [13]). Removing the monomorphic variants with frequencies less than 0.1 and missing values over 0.25 helped distinguish the most widespread Clade 1 species (Figure 2c). To investigate whether there were conserved or highly polymorphic genomic regions across the genus, we filtered out 75,001 variants spanning 10 kb each and regrouped all species (Figure S1). Even though less than 0.5% of total variants were captured, the species grouping by PC1 and PC2 was identical to that using the whole dataset, indicating that the core set of variants is comprehensive for further genetic analysis and fingerprinting. This set of variants can be further developed into molecular markers that differentiate every sampled species with the correct capture of identities and speciation for breeding practices.
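One simple way to obtain a core set with roughly one variant per 10 kb region is to keep the first variant encountered in each fixed window. The sketch below illustrates that idea only; the exact thinning procedure used in the study is not specified, and the function name, window convention, and example coordinates are assumptions. The last lines recompute the share of the total represented by 75,001 variants, assuming the total is the sum of the reported SNPs and indels.

```python
def thin_by_window(variants, window=10_000):
    """Keep the first variant seen in each (contig, window) bin.

    `variants` is an iterable of (contig, 1-based position) tuples,
    assumed to be sorted by position within each contig.
    """
    seen = set()
    kept = []
    for contig, pos in variants:
        key = (contig, (pos - 1) // window)
        if key not in seen:
            seen.add(key)
            kept.append((contig, pos))
    return kept


# Made-up coordinates: positions 100 and 9000 on tig1 share a window,
# so only the first of them is retained.
example = [("tig1", 100), ("tig1", 9000), ("tig1", 10001), ("tig1", 25000), ("tig2", 500)]
core = thin_by_window(example)

# Share of all reported variants captured by the core set.
core_fraction = 75_001 / (16_396_328 + 816_282)
```

Under this assumption the core set is about 0.44% of all variants, in line with the "less than 0.5%" stated above.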

The results revealed the similarities and relationships among the clades and species. Notably, taking advantage of the reference genome, a core set of about 75,000 variants can be genetically mapped and transformed into genotyping arrays/chips for population genetics, diversity assessments, and genome-wide association studies (GWAS), as has been done in other crops such as biennial caraway (Carum carvi [37]). Despite the limited population size in this study and the heterozygous background in Leucaena, the detected variants help predict loci in linkage disequilibrium across the genus/species, which is vital for prediction accuracy in genome-wide association studies [38].

3.3. SNP Markers Outperformed Indels for Leucaena Genotyping

It has long been known that extracting high-quality DNA from plants can be seriously complicated by the presence of various secondary metabolites and phenolics [39]. In this study, we tested different methods to purify genomic DNA from Leucaena tissues (Figure S2). DNA extracted from leaves with the AquaGenomic kit or CTAB was not of high quality and had polysaccharide/polyphenol contamination (Figure S2a). The presence of those mucilaginous exudates led to inaccurate NanoDrop and flow cytometric readings and inhibited PCR amplification efficiency. Therefore, the leaf tissues were replaced by roots to obtain pure and clean DNA after CTAB extraction (Figure S2b). However, digging up roots is tedious, and sometimes the woody tissues cannot be completely lysed. A quick and high-throughput SDS method was developed to isolate DNA with acceptable quality for PCR from young Leucaena leaves (Figure S2c), by which 800 samples were processed by a single person daily.

Indels can be easily visualised and scored through agarose gel electrophoresis. An indel pool with 48 variants over 5 bp sequence differences in a random 100 Mb interval was filtered, which can distinguish all 21 sequenced taxa in silico (Figure S3; Table S1). The PAGE analysis showed that selected indel markers can distinguish species such as L. diversifolia and L. pulverulenta (Figure 3a indel 30). However, they were not necessarily selective for accessions in the same species, such as in L. pulverulenta using indels 9 and 30 (Figure 3a, Lane 10–15). Although it was expected that the hybrid tetraploid species L. diversifolia displayed bands consistent with higher heterozygosity (Figure 3a indel 30), some L. pulverulenta accessions also showed multiple bands on the gel (Figure 3a indels 6 and 9), in line with the out-crossing character of diploid Leucaena, thus posing challenges in hybridity scoring. Given that the indels have lower levels of variation among closely related species (L. pulverulenta and L. diversifolia), this finding is consistent with previous reports that suggest larger indels tend to be conserved at higher phylogenetic levels [40,41].

For hybridity testing, the heterozygous nature of the genus makes it complex to determine the parental origin of triploid samples, as both parents need to be homozygous. A very limited number of indels with polymorphisms between individuals (i.e., indel 6) were further selected from the pool of 48 indels for hybridity screening to confirm the heterozygous characteristics of both male and female parent bands (Figure 3b). Generally, indels were not highly polymorphic between the species, and the sizes were not consistent with in silico predictions. Possible reasons include the short-read nature of NGS, which can produce false positives when reads are aligned to the reference genome, and the fact that the bioinformatic tools used for variation mining, such as SAMtools [22] and GATK [42], are not tailored for indel discovery, with algorithms restricted to identifying indels shorter than 10 bp [32]. Notably, long indels (>15 bp) and structural variations can be more effectively detected through high-coverage whole-genome resequencing or pan-genome sequencing to facilitate the fine mapping of loci for traits of interest, as evident in another perennial tree plant, tea (Camellia sinensis (L.) Kuntze, [43]). However, this remains costly and calls for international research collaboration to achieve in Leucaena.

RNA-seq offers a high coverage of SNP discovery compared to whole-genome or whole-exome sequencing [44]. Therefore, genotyping by SNPs was pursued as the more accurate and reliable approach, while short-read Illumina sequencing was used for variant detection and calling. In the present study, a total of 1533 homozygous SNP variants with an allele frequency of 0.4–0.6 were selected for genotyping (Table S2). The core set can effectively distinguish most Leucaena taxa, especially L. diversifolia, which was intensively used as a tetraploid parent in Western Australia’s crosses (Figure 3c). Overall, the 57 most popular SNPs were further selected for the crossings between nine diploid (L. pulverulenta, L. collinsii, L. zacapana, L. shannonii, L. macrophylla, L. retusa, L. greggii, L. trichandra, and L. trichodes) and two tetraploid species (L. leucocephala ssp. leucocephala and ssp. glabrata, and L. diversifolia) in Western Australia (Table S3). Among those, 39 KASP markers were successfully developed based on the L. trichandra reference sequence (Table S4).

3.4. Reliability of Marker Fingerprinting Validated by Field Observations

KASP markers were classified as effective when they distinguished the parental lines used in crossings and grouped the triploids either with heterozygous genotypes from both parents or close to the pollen donor allele (Figure 4a and Figure S4). For example, KASP marker 9 (variant tig2946_pilon_26087) successfully distinguished the female parent L. pulverulenta accession 84.3, which was specific to allele 2, from the male parent L. diversifolia accession 80.1, which was specific to allele 1 (Figure S4). Triploids clustered towards allele 1 because, owing to their polyploid nature, they received more gene copies from the tetraploid pollen donor. An artificial hybrid that mixed an equal amount of DNA from both parents was used as a genotype scoring reference. The DNA mix was also dominated by allele 1, like the triploids generated through the crossing. KASP genotyping also identified F₁ individuals carrying female alleles only (L. leucocephala ssp. glabrata) that could be selfings (Figure 4a). Every offspring generated from a crossing and the corresponding parental lines were treated as a unique entity.
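The scoring logic described above can be summarised as a simple decision rule. The sketch below is a simplified illustration rather than the scoring procedure implemented in the genotyping software: it assumes each sample has already been reduced to a discrete cluster label ("A1", "A2", "HET", or None for no call), and the function name and labels are hypothetical.

```python
def classify_offspring(female_call, male_call, offspring_call):
    """Classify one F1 individual at one KASP marker.

    Calls are cluster labels: "A1" or "A2" (homozygous for allele 1 or 2),
    "HET" (both alleles detected), or None (no call).
    """
    homozygous = {"A1", "A2"}
    # A marker is only diagnostic if the parents are homozygous
    # for different alleles.
    if (female_call not in homozygous or male_call not in homozygous
            or female_call == male_call):
        return "uninformative marker"
    if offspring_call is None:
        return "no call"
    if offspring_call == female_call:
        # Only maternal alleles detected: consistent with selfing.
        return "putative selfing"
    # Heterozygous, or clustering with the pollen donor allele.
    return "putative hybrid"


# Pattern of KASP marker 9 in the text: female parent specific to
# allele 2, male parent specific to allele 1.
hybrid = classify_offspring("A2", "A1", "HET")
donor_like = classify_offspring("A2", "A1", "A1")
selfed = classify_offspring("A2", "A1", "A2")
bad_marker = classify_offspring("HET", "A1", "HET")
```

Treating a heterozygous parent as uninformative reflects the requirement, noted in this study, that both parents be homozygous at a marker before offspring heterozygosity can be read as evidence of a successful cross.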

From the 285 (Generation 1) and 3144 (Generation 2) plants derived from two generations of crosses, 235 and 1444 F1 individuals were confirmed as triploids, respectively (Figure 4b). They accounted for 82.5% and 45.9% of the total tested plants and the values were very close to the field observations of their flowering characteristics. The average percentages of sterile or partially sterile F1s planted at Kununurra and Carnarvon, Western Australia, were 81.7% and 48.5% for two crossing generations, respectively (Figure 4b). The results indicated that the selected SNPs and developed KASP markers are robust and reliable for Leucaena identification and fingerprinting.
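The percentages above follow directly from the reported counts, as the short check below shows. The dictionary names are illustrative only; the counts and field percentages are those given in the text.

```python
# Counts reported in the text for each crossing generation.
tested = {"Generation 1": 285, "Generation 2": 3144}
triploids = {"Generation 1": 235, "Generation 2": 1444}
# Average percentage of sterile or partially sterile F1 plants in the field.
field_sterile = {"Generation 1": 81.7, "Generation 2": 48.5}

# Marker-confirmed triploid rate per generation, in percent.
triploid_rate = {
    gen: round(100 * triploids[gen] / tested[gen], 1) for gen in tested
}

# Absolute gap between marker-based and field-based estimates,
# in percentage points.
gap = {
    gen: round(abs(triploid_rate[gen] - field_sterile[gen]), 1) for gen in tested
}
```

The marker-based and field-based estimates differ by 0.8 and 2.6 percentage points for Generations 1 and 2, respectively.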

The transcriptomic approach (RNA-Seq) has been used to generate large-scale and high-density molecular markers in non-model plants, such as sunflower (Helianthus annuus L., [31]), Rhododendron species [30], turnip canola (Brassica napus L., [45]) and grapes (Vitis vinifera L., [46]), for genotyping arrays, genetic mapping and breeding studies. These commonly used SNPs have a relatively high level of cross-species transferability compared with indels, suggesting that they are ideal for constructing high-resolution genetic maps and analysing genetic diversity and population structure in Leucaena. However, the conversion of SNPs into robust KASP markers has been primarily restricted to major crops, such as maize [47], rice [48], and wheat [49], indicating that the KASP markers developed here are an important step forward in this non-model system for multipurpose agriculture across the tropics. It is likely that more reference genomes for each Leucaena species will be captured to promote robust marker development and PCR amplification efficiency.

4. Conclusions

Leucaena represents a complex non-model natural system of considerable interest for a variety of tropical agricultural practices [1,9]. High levels of interspecific crossability provide extensive possibilities for trait improvement using classical plant breeding methods [50]. However, the invasive nature of some Leucaena species creates an urgent need to combine classical breeding with modern molecular breeding practices to efficiently identify offspring of interest for both their potentially sterile hybrid nature (e.g., triploid) and useful parental characteristics. Here, relatively easily generated RNA-seq data were used to develop both SNP and indel markers to aid genetic studies and molecular breeding in Leucaena. The 39 KASP markers generated and tested here represent a major step forward for Leucaena breeding work.

These new tools have considerable potential to provide rapid advances in Leucaena genotyping, but those adopting these markers should pay attention to two potentially confounding factors that can complicate the use of such tools in many plant species: (1) the quality of the DNA extracted [39] and (2) the heterozygous nature of the parental taxa used in the specific crosses being investigated [10,15]. Clean DNA extractions, free of difficult Leucaena mucilage, are critical for reliable PCR results. The root-derived DNA and/or rapid extraction method noted herein are further advances along these lines. With respect to heterozygosity, it is critical that KASP markers are tested on the specific parent accessions each time new crosses are conducted, so that heterozygosity at a marker in a parent is not confused with a successful crossing between species.

Supplementary Materials

The following supporting information can be downloaded at: https://www.mdpi.com/article/10.3390/agronomy14071519/s1, Figure S1: Principal component analysis of 21 Leucaena taxa based on 75,001 variants evenly distributed in each 10 kb region within the L. trichandra reference; Figure S2: Leucaena species DNA extracted with different methods; Figure S3: Principal component analysis of 21 Leucaena taxa based on 48 indel variants filtered from a random 100 Mb interval in the L. trichandra reference with length over five base pairs; Figure S4: Genotyping a cross from L. pulverulenta (pollen receiver) and L. diversifolia (pollen donor); Table S1: 48 Indels and the variations in Western Australia’s representative crossing species; Table S2: 1533 SNP homozygous variants filtered out for fingerprinting; Table S3: SNPs selected for Western Australia’s interspecific crossings; Table S4: KASP and indel marker primers used for genotyping; Table S5: Counts of variants at different allele frequencies among 21 Leucaena taxa.

Author Contributions

Conceptualisation, D.R. and C.L.; methodology, Y.H., D.B. and C.L.; formal analysis, Y.H., A.A., M.N.-R., S.P.V. and D.B.; investigation, A.A., M.N.-R., S.P.V., R.B. and C.R.; resources, D.R., C.R., M.C. and D.B.; data curation, Y.H., A.A. and D.B.; writing—original draft preparation, Y.H.; writing—review and editing, J.v.d.Z., D.B., D.R. and C.L.; supervision, D.R.; funding acquisition, D.R., C.R. and C.L. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by Meat & Livestock Australia (MLA) and the WA Department of Primary Industries and Regional Development (DPIRD) for the ‘Sterile Leucaena project’. The Leucaena trichandra genome and broader transcriptome data were supported by the US National Science Foundation grant 1238731 (to D.B.).

Data Availability Statement

Transcriptome sequencing data and Leucaena voucher information have been deposited at NCBI (https://www.ncbi.nlm.nih.gov/sra) and are available by searching corresponding accession numbers listed in Table 1.

Acknowledgments

We thank Gaofeng Zhou and Hoang Viet Dang (Western Crop Genetics Alliance, Murdoch University) for instructions on KASP genotyping and PCA analysis, respectively, and Mengistu Yadete (DPIRD) for helping with the sample collection.

Conflicts of Interest

The authors declare no conflicts of interest.

References

1. Hughes, C.E. Leucaena: A Genetic Resources Handbook; Tropical Forestry Paper No 37; Oxford Forestry Institute: Oxford, UK, 1998.
2. Sithole, N.; Tsvuura, Z.; Kirkman, K.; Magadlela, A. Nitrogen source preference and growth carbon costs of Leucaena leucocephala (Lam.) de Wit saplings in South African grassland soils. Plants 2021, 10, 2242.
3. Khanna, N.K.; Shukla, O.P.; Gogate, M.G.; Narkhede, S.L. Leucaena for paper industry in Gujarat, India: Case study. Trop. Grassl.-Forrajes. Trop. 2019, 7, 200–209.
4. Alemán-Ramirez, J.; Okoye, P.U.; Torres-Arellano, S.; Mejía-Lopez, M.; Sebastian, P. A review on bioenergetic applications of Leucaena leucocephala. Ind. Crop. Prod. 2022, 182, e114847.
5. Jube, S.; Borthakur, D. Development of an Agrobacterium-mediated transformation protocol for the tree-legume Leucaena leucocephala using immature zygotic embryos. Plant Cell Tiss. Org. 2009, 96, 325–333.
6. Buck, S.; Rolfe, J.; Lemin, C.; English, B. Adoption, profitability and future of Leucaena feeding systems in Australia. Trop. Grassl.-Forrajes. Trop. 2019, 7, 303–314.
7. Campbell, S.; Vogler, W.; Brazier, D.; Vitelli, J.; Brooks, S. Weed Leucaena and its significance, implications and control. Trop. Grassl.-Forrajes. Trop. 2019, 7, 280–289.
8. Real, D.; Han, Y.; Bailey, C.D.; Vasan, S.; Li, C.; Castello, M.; Broughton, S.; Abair, A.; Crouch, S.; Revell, C. Strategies to breed sterile Leucaena for Western Australia. Trop. Grassl.-Forrajes. Trop. 2019, 7, 80–86.
9. Brewbaker, J.L. Breeding Leucaena: Tropical multipurpose leguminous tree. In Plant Breeding Reviews; Janick, J., Ed.; Wiley-Blackwell Inc.: Hoboken, NJ, USA, 2016; Volume 40, pp. 43–123.
10. Real, D.; Revell, C.; Han, Y.; Li, C.; Castello, M.; Bailey, C.D. Successful creation of seedless (sterile) Leucaena germplasm developed from interspecific hybridisation for use as forage. Crop Pasture Sci. 2022, 74, 783–796.
11. Lema, M. Marker-assisted selection in comparison to conventional plant breeding. Agric. Res. Technol. 2018, 14, e555914.
12. Hughes, C.E.; Bailey, C.D.; Harris, S.A. Divergent and reticulate species relationships in Leucaena (Fabaceae) inferred from multiple data sources: Insights into polyploid origins and nrDNA polymorphism. Am. J. Bot. 2002, 89, 1057–1073.
13. Govindarajulu, R.; Hughes, C.E.; Bailey, C.D. Phylogenetic and population genetic analyses of diploid Leucaena (Leguminosae; Mimosoideae) reveal cryptic species diversity and patterns of divergent allopatric speciation. Am. J. Bot. 2011, 98, 2049–2063.
14. Rajarajan, K.; Uthappa, A.R.; Handa, A.K.; Chavan, S.B.; Vishnu, R.; Shrivastava, A.; Handa, A.; Rana, M.; Sahu, S.; Humar, N.; et al. Genetic diversity and population structure of Leucaena leucocephala (Lam.) de Wit genotypes using molecular and morphological attributes. Genet. Resour. Crop. Evol. 2022, 69, 71–83.
15. Walton, C.S. Leucaena (Leucaena leucocephala) in Queensland; Queensland Department of Natural Resources and Mines: Brisbane, Australia, 2003. Available online: https://www.daf.qld.gov.au/__data/assets/pdf_file/0009/57294/IPA-Leucaena-PSA.pdf (accessed on 13 November 2023).
16. Russell, J.R.; Hedley, P.E.; Cardle, L.; Dancey, S.; Morris, J.; Booth, A.; Odee, D.; Mwaura, L.; Omondi, W.; Angaine, P.; et al. tropiTree: An NGS-based EST-SSR resource for 24 tropical tree species. PLoS ONE 2014, 9, e102502.
17. Tan, J.; Guo, J.J.; Yin, M.Y.; Wang, H.; Dong, W.P.; Zeng, J.; Zhou, S.L. Next generation sequencing-based molecular marker development: A case study in Betula alnoides. Molecules 2018, 23, 2963.
18. Mammadov, J.; Aggarwal, R.; Buyyarapu, R.; Kumpatla, S. SNP markers and their impact on plant breeding. Int. J. Plant. Genomics 2012, 2012, e728398.
19. Nagy, I.; Barth, S.; Mehenni-Ciz, J.; Abberton, M.T.; Milbourne, D. A hybrid next generation transcript sequencing-based approach to identify allelic and homeolog-specific single nucleotide polymorphisms in allotetraploid white clover. BMC Genomics 2013, 14, e100.
20. Bolger, A.M.; Lohse, M.; Usadel, B. Trimmomatic: A flexible trimmer for Illumina sequence data. Bioinformatics 2014, 30, 2114–2120.
21. Dobin, A.; Davis, C.A.; Schlesinger, F.; Drenkow, J.; Zaleski, C.; Jha, S.; Batut, P.; Chaisson, M.; Gingeras, T.R. STAR: Ultrafast universal RNA-seq aligner. Bioinformatics 2013, 29, 15–21.
22. Li, H.; Handsaker, B.; Wysoker, A.; Fennell, T.; Ruan, J.; Homer, N.; Marth, G.; Abecasis, G.; Durbin, R. The Sequence Alignment/Map format and SAMtools. Bioinformatics 2009, 25, 2078–2079.
23. Danecek, P.; Auton, A.; Abecasis, G.; Albers, C.A.; Banks, E.; DePristo, M.A.; Handsaker, R.E.; Lunter, G.; Marth, G.T.; Sherry, S.T.; et al. The variant call format and VCFtools. Bioinformatics 2011, 27, 2156–2158.
24. Wickham, H. ggplot2: Elegant Graphics for Data Analysis. 2016. Available online: https://ggplot2.tidyverse.org/ (accessed on 11 March 2024).
25. Ortiz, E.M. vcf2phylip v2.0: Convert a VCF Matrix into Several Matrix Formats for Phylogenetic Analysis. Available online: https://zenodo.org/records/2540861 (accessed on 9 September 2019).
26. Kozlov, A.M.; Darriba, D.; Flouri, T.; Morel, B.; Stamatakis, A. RAxML-NG: A fast, scalable and user-friendly tool for maximum likelihood phylogenetic inference. Bioinformatics 2019, 35, 4453–4455.
27. Murray, M.G.; Thompson, W.F. Rapid isolation of high molecular weight plant DNA. Nucleic Acids Res. 1980, 8, 4321–4325.
28. Conesa, A.; Madrigal, P.; Tarazona, S.; Gomez-Cabrero, D.; Cervera, A.; McPherson, A.; Szcześniak, M.W.; Gaffney, D.J.; Elo, L.L.; Zhang, X.; et al. A survey of best practices for RNA-seq data analysis. Genome Biol. 2016, 17, 13.
29. Pearce, S.; Vazquez-Gross, H.; Herin, S.Y.; Hane, D.; Wang, Y.; Gu, Y.Q.; Dubcovsky, J. WheatExp: An RNA-seq expression database for polyploid wheat. BMC Plant Biol. 2015, 15, 299.
30. Wang, S.; Li, Z.; Guo, X.; Fang, Y.; Xiang, J.; Jin, W. Comparative analysis of microsatellite, SNP, and InDel markers in four Rhododendron species based on RNA-seq. Breeding Sci. 2018, 68, 536–544.
31. Bachlava, E.; Taylor, C.A.; Tang, S.; Bowers, J.E.; Mandel, J.R.; Burke, J.M.; Knapp, S.J. SNP discovery and development of a high-density genotyping array for sunflower. PLoS ONE 2012, 7, e29814.
32. Lv, Y.; Liu, Y.; Zhao, H. mInDel: A high-throughput and efficient pipeline for genome-wide InDel marker development. BMC Genomics 2016, 17, e290.
33. Sehn, J.K. Insertions and deletions (Indels). In Clinical Genomics; Kulkarni, S., Pfiefer, J., Eds.; Elsevier Inc.: Amsterdam, The Netherlands, 2015; pp. 129–150.
34. Karunarathne, S.; Walker, E.; Sharma, D.; Li, C.; Han, Y. Genetic resources and precise gene editing for targeted improvement of barley abiotic stress tolerance. J. Zhejiang Univ. Sci. B 2023, 24, 1069–1092.
35. Govindarajulu, R.; Hughes, C.E.; Alexander, P.J.; Bailey, C.D. The complex evolutionary dynamics of ancient and recent polyploidy in Leucaena (Leguminosae; Mimosoideae). Am. J. Bot. 2011, 98, 2064–2076.
36. Abair, A.; Hughes, C.E.; Bailey, C.D. The evolutionary history of Leucaena: Recent research, new genomic resources and future directions. Trop. Grassl.-Forrajes. Trop. 2019, 7, 65–73.
37. von Maydell, D.; Beleites, C.; Stache, A.; Riewe, D.; Krähmer, A.; Marthe, F. Genetic variation of annual and biennial caraway (Carum carvi) germplasm offers diverse opportunities for breeding. Ind. Crop. Prod. 2024, 208, e117798.
38. Uffelmann, E.; Huang, Q.Q.; Munung, N.S.; De Vries, J.; Okada, Y.; Martin, A.R.; Martin, H.C.; Lappalainen, T.; Posthuma, D. Genome-wide association studies. Nat. Rev. Methods Primers 2021, 1, 59.
39. Porebski, S.; Bailey, L.G.; Baum, B.R. Modification of a CTAB DNA extraction protocol for plants containing high polysaccharide and polyphenol components. Plant Mol. Biol. Rep. 1997, 15, 8–15.
40. Nagy, L.G.; Kocsubé, S.; Csanádi, Z.; Kovács, G.M.; Petkovits, T.; Vágvölgyi, C.; Papp, T. Re-mind the gap! Insertion–Deletion data reveal neglected phylogenetic potential of the nuclear ribosomal internal transcribed spacer (ITS) of fungi. PLoS ONE 2012, 7, e49794.
41. Ashkenazy, H.; Cohen, O.; Pupko, T.; Huchon, D. Indel reliability in indel-based phylogenetic inference. Genome Biol. Evol. 2014, 6, 3199–3209.
42. McKenna, A.; Hanna, M.; Banks, E.; Sivachenko, A.; Cibulskis, K.; Kernytsky, A.; Garimella, K.; Altshuler, D.; Gabriel, S.; Daly, M.; et al. The Genome Analysis Toolkit: A MapReduce framework for analysing next-generation DNA sequencing data. Genome Res. 2010, 20, 1297–1303.
43. Liu, S.; An, Y.; Tong, W.; Qin, X.; Samarina, L.; Guo, R.; Xia, X.; Wei, C. Characterization of genome-wide genetic variations between two varieties of tea plant (Camellia sinensis) and development of InDel markers for genetic research. BMC Genomics 2019, 20, 935.
44. Piskol, R.; Ramaswami, G.; Li, J.B. Reliable identification of genomic variants from RNA-seq data. Am. J. Hum. Genet. 2013, 93, 641–651.
45. Huang, Z.; Peng, G.; Gossen, B.D.; Yu, F. Fine mapping of a clubroot resistance gene from turnip using SNP markers identified from bulked segregant RNA-Seq. Mol. Breed. 2019, 39, 131.
46. Muñoz-Espinoza, C.; Di Genova, A.; Sánchez, A.; Correa, J.; Espinoza, A.; Meneses, C.; Maass, A.; Orellana, A.; Hinrichsen, P. Identification of SNPs and InDels associated with berry size in table grapes integrating genetic and transcriptomic approaches. BMC Plant Biol. 2020, 20, e365.
47. Chen, Z.; Tang, D.; Ni, J.; Li, P.; Wang, L.; Zhou, W.; Li, C.; Lan, H.; Li, L.; Liu, J. Development of genic KASP SNP markers from RNA-Seq data for map-based cloning and marker-assisted selection in maize. BMC Plant. Biol. 2021, 21, 157.
48. Yang, G.; Chen, S.; Chen, L.; Sun, K.; Huang, C.; Zhou, D.; Huang, Y.; Wang, J.; Liu, Y.; Wang, H.; et al. Development of a core SNP arrays based on the KASP method for molecular breeding of rice. Rice 2019, 12, 21.
49. Kaur, B.; Mavi, G.S.; Gill, M.S.; Saini, D.K. Utilization of KASP technology for wheat improvement. Cereal Res. Commun. 2020, 48, 409–421.
50. Tao, D.; Kalendar, R.; Paterson, A.H. Editorial: Interspecific hybridisation in plant biology. Front. Plant Sci. 2022, 13, e1026492.

Figure 1. Representation of different types of SNPs and indels detected across 21 Leucaena taxa. (a) Counts and percentages of each substitution type; (b) percentages of various indel sizes among Leucaena species. Variant calling was completed after RNA sequencing and mapping against an L. trichandra reference genome.

Figure 2. Phylogenetic and principal component analysis using transcriptome-wide genetic variants in Leucaena following the clade nomenclature of Govindarajulu et al. [13]. (a) Maximum likelihood (ML) tree for species topology in Leucaena using all SNPs; (b) PCA plots for species clustering using all mapped variants. The hybrid tetraploid L. diversifolia is indicated by “*” and L. cuspidata by “×”. The proportion of variance explained by each principal component is shown on the axis labels. (c) PCA plots of 13 Clade 1 diploids based on 4,492,396 variants after trimming monomorphic variants with an allele frequency of 0.1 and missingness of 0.25.

Figure 3. Development of indel markers and a core SNP set for fingerprinting. (a) Testing indels with different L. diversifolia and L. pulverulenta accessions for reciprocal crosses. Lane 1, L. diversifolia 71.4; Lanes 2–5, L. pulverulenta 63.5, 84.3, 87.2 and 84.4; Lane 6, L. diversifolia 72.3; Lanes 7–8, L. pulverulenta 75.2 and 83.2; Lane 9, L. diversifolia 80.1; Lanes 10–15, L. pulverulenta 83.3, 86.2, 88.3, 77.2, 87.3 and 87.4. White boxes indicate distinct genotypes. (b) PAGE gel of indel marker 6 for a selection of parental lines and triploids generated. P10, L. pulverulenta 83.3; P9, L. diversifolia 80.1; T1 to T4, F1 individuals. (c) Grouping of 21 sequenced Leucaena taxa with 1533 homozygous SNPs (Table S2).

Figure 4. Development of KASP markers based on SNPs and genotyping Leucaena populations. (a) Genotyping a cross from L. leucocephala ssp. glabrata (pollen receiver) and L. pulverulenta (pollen donor). KASP marker 24 was used, and the DNA mix contained an equal amount of both parental accessions. Each sample had two technical repeats, as shown in the ovals. (b) A summary of genotyping results compared with phenotype observation two years after transplanting at both Kununurra and Carnarvon, WA. Sterile or partially sterile refers to plants that never flowered, flowered without pods, flowered with few pods or flowered with aborted seeds. The phenotype results were recalculated from Real et al. [10].

Table 1. SNP and indel counts in 20 diploid and one tetraploid Leucaena taxa mapping against an L. trichandra draft genome.

Taxon	nRefHom	nNonRefHom	nHets	nTransitions	nTransversions	nIndels	NCBI Accession Number
Leucaena collinsii Britton & Rose (2×)	8,428,728	5,228,368	2,806,318	4,440,773	3,593,913	450,406	SRX2719653
Leucaena cruziana Britton & Rose (2×)	8,284,936	5,330,027	2,839,134	4,515,216	3,653,945	459,723	SRX2719651
Leucaena cuspidata Standl. (2×)	7,554,107	5,839,938	3,033,407	4,891,943	3,981,402	486,368	SRX2719650
Leucaena diversifolia (Schltdl.) Benth. (4×)	7,686,304	4,977,243	3,791,795	4,830,917	3,938,121	458,478	SRX2719649
Leucaena esculenta (DC.) Benth. (2×)	7,024,119	6,160,890	3,218,074	5,159,900	4,219,064	510,737	SRX2719648
Leucaena greggii S. Watson (2×)	7,282,246	6,090,881	3,048,606	5,037,017	4,102,470	492,087	SRX2719647
Leucaena lanceolata S. Watson (2×)	8,312,577	5,360,102	2,782,799	4,505,372	3,637,529	458,342	SRX2719645
Leucaena lempirana C.E. Hughes (2×)	8,429,399	5,192,417	2,842,191	4,444,316	3,590,292	449,813	SRX2719644
Leucaena macrophylla subsp. istmensis C.E. Hughes (2×)	8,244,333	5,462,404	2,743,421	4,536,361	3,669,464	463,662	SRX2719641
Leucaena macrophylla subsp. macrophylla Benth. (2×)	8,208,257	5,382,254	2,857,991	4,554,355	3,685,890	465,318	SRX2719640
Leucaena magnifica (C.E. Hughes) C.E. Hughes (2×)	8,465,544	5,150,898	2,848,108	4,415,835	3,583,171	449,270	SRX2719639
Leucaena matudae (Zarate) C.E. Hughes (2×)	6,982,398	6,187,703	3,231,807	5,181,555	4,237,955	511,912	SRX2719638
Leucaena multicapitula Schery (2×)	8,372,304	5,299,403	2,788,487	4,474,205	3,613,685	453,626	SRX2719637
Leucaena pueblana Britton & Rose (2×)	5,793,382	8,948,739	1,578,880	5,790,542	4,737,077	592,819	SRX2719610
Leucaena pulverulenta (Schltdl.) Benth. (2×)	7,251,662	6,145,143	3,024,462	5,046,850	4,122,755	492,553	SRX2719635
Leucaena retusa Benth. (2×)	7,248,430	6,151,778	3,021,196	5,057,547	4,115,427	492,416	SRX2719634
Leucaena salvadorensis Standl. ex Britton & Rose (2×)	8,350,557	5,157,831	2,950,991	4,481,168	3,627,654	454,441	SRX2719633
Leucaena shannonii Donn.Sm. (2×)	8,391,792	5,170,550	2,899,176	4,459,365	3,610,361	452,302	SRX2719632
Leucaena trichandra (Zucc.) Urb. (2×)	8,773,909	4,785,556	2,925,521	4,257,895	3,453,182	428,834	SRX2719631
Leucaena trichodes (Jacq.) Benth. (2×)	8,285,059	5,390,552	2,777,911	4,516,372	3,652,091	460,298	SRX2719630
Leucaena zacapana (C.E. Hughes) R. Govind. & C.E. Hughes (2×)	8,419,838	5,192,514	2,851,655	4,448,285	3,595,884	449,813	SRX2719629
Total number of unique variants	SNPs 16,396,328			9,024,214	7,372,114	816,282

Abbreviations: nRefHom, number of homozygous alleles that are same as the reference; nNonRefHom, number of homozygous alleles that are different from the reference; nHets, number of heterozygous alleles; nTransitions, number of nucleotide transitions; nTransversions, number of nucleotide transversions; nIndels, number of insertion/deletion alleles. Transcriptome sequencing data and voucher information have been deposited at NCBI (https://www.ncbi.nlm.nih.gov/sra) and are available by searching corresponding accession numbers.
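The totals in the last row of Table 1 can be cross-checked with a few lines of Python. The sketch below confirms that transitions and transversions sum to the total number of unique SNPs and derives the transition/transversion (Ts/Tv) ratio. That ratio is not reported in this section and is computed here only as a reader's consistency check; the variable and function names are illustrative.

```python
# Totals from the last row of Table 1.
total_snps = 16_396_328
transitions = 9_024_214
transversions = 7_372_114


def ts_tv_ratio(n_transitions: int, n_transversions: int) -> float:
    """Ratio of transition to transversion counts."""
    return n_transitions / n_transversions


# Every SNP is either a transition or a transversion, so the two counts
# should add up to the total number of unique SNPs.
counts_agree = transitions + transversions == total_snps
print(f"Counts consistent: {counts_agree}")
print(f"Ts/Tv (all unique SNPs) = {ts_tv_ratio(transitions, transversions):.2f}")
```

The same function can be applied to the nTransitions and nTransversions columns of any single taxon row.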

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

© 2024 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
