**Genome-Wide Association Mapping for Stripe Rust Resistance in Pakistani Spring Wheat Genotypes**

#### **Madiha Habib 1, Faisal Saeed Awan 1,\*, Bushra Sadia <sup>1</sup> and Muhammad Anjum Zia <sup>2</sup>**


Received: 13 May 2020; Accepted: 15 June 2020; Published: 19 August 2020

**Abstract:** Stripe rust caused by the pathogen *Puccinia striiformis* f. sp. *tritici* (*Pst*) is a major threat for wheat, resulting in low yield and grain quality loss in many countries. Genetic resistance is a prevalent method to combat the disease. Mapping the resistant loci and their association with traits is highly exploited in this era. A panel of 465 Pakistani spring wheat genotypes were evaluated for their phenotypic response to stripe rust at the seedling and adult plant stages. A total of 765 single nucleotide polymorphism (SNP) markers were applied on 465 wheat genotypes to evaluate their stripe rust response against nine races during the seedling test and in three locations for the field test. Currently, twenty SNPs dispersed on twelve chromosomal regions (1A, 1B, 1D, 2A, 2B, 4A, 4B, 5B, 6A, 6B, 6D and 7B) have been identified that were associated with rust race-specific resistance at the seedling stage. Thirty SNPs dispersed on eighteen chromosomal regions (1A, 1B, 1D, 2A, 2B, 2D, 3A, 3B, 3D, 4B, 5A, 5B, 6A, 6B, 6D, 7A, 7B and 7D) are associated with adult plant resistance. SNP loci IWB3662 was linked with all three Pakistani races, and likewise IWA2344 and IWA4096 were found to be linked with three different USA races. The present research findings can be applied by wheat breeders to increase their resistant capability and yield potential of their cultivars, through marker-assisted selection.

**Keywords:** wheat; genome-wide association studies; association mapping; SNP; stripe rust

#### **1. Introduction**

*Triticum aestivum* L., commonly known as bread wheat, is the world's major food crop for about 2 billion people. It fulfills the nutritional requirements of almost 1/3 of the world's population (about 35%) by providing half of their protein and above half of their calorie requirement [1]. Wheat contains a high quantity of dietary nutrients which includes carbohydrates (70%), crude fibers (22%), proteins (12%), water (12%), fats (2%) and minerals (1.8%) [2].

Wheat can withstand a wide range of climate changes but it is also prone to different biotic and abiotic stresses. Among the biotic stresses (rust, smut, mildew, bunt etc.), rust has global economic and historic significance. Rust (leaf, stripe and stem) is the most devastating fungal disease that is threatening the overall world wheat production [3] and its cyclic rotation is considered responsible for famine in many parts of the world.

Wheat in its growing season, if faced with a cool environment, is mostly arrested by stripe (yellow) rust which is a major foliar disease caused by *Puccinia striiformis* f. sp. *tritici* (*Pst*) [4], but nowadays wheat growing in warmer regions is also prone to stripe rust epidemics [5,6]. Mostly, 0.1–5% wheat yield losses are observed due to stripe rust disease and rarely, losses expand up to 25% [4]. The yield loss due to stripe rust can exceed up to 100% for susceptible wheat landraces. Wheat infected with stripe rust causes yield loss by reducing the kernel quality, the kernel number per spike, the low-test weight and the plant height [7,8].

Pakistan is ranked among the top 10 wheat-producing countries. Globally, Pakistan ranked at seventh with 25.5 million tonnes annual wheat production [9]. In Pakistan, almost 70% of the wheat cultivation area (5.8 MH) is susceptible to stripe rust [10]. Northern and central-west areas of Pakistan are majorly threatened with stripe rust. A major yellow rust epidemic in Pakistan was observed in 1995, which caused a 20% loss in the affected areas [11]. Due to climate change, the risk factor of disease decreased but it is still at the doorstep of Pakistan. For the years 2009 and 2010, due to the high prevalence of disease in close neighbors like Uzbekistan, significant yield losses were observed. Stripe rust epidemic losses mounted to almost 360M\$ in the USA for the year 2004, 100M\$ in Pakistan for the year 2005 [12], 127M\$ in Australia for the year 2009 [13] and 30M\$ in Morocco for the year 2009 [14].

Wheat rust can be controlled by fungicides, cultural practices, and planting resistant varieties [15]. Fungicidal spray at the appropriate time can reduce the risk of loss caused by pathogens but it requires a healthy finance. Despite fungicidal spray, the areas with a high disease pressure are prevalent to substantial loss if susceptible cultivars are present [16]. In Pakistan, fungicidal control is also not a sensible choice for >8 MH area due to its high cost, spray machinery requirement, the alarming conditions of water resources, as well as air and soil pollution.

Genetic resistance is the most prevailing approach to protect the crop against the yield losses caused by diseases [17]. Resistance to stripe rust is characterized as seedling resistance (all stage resistance) and adult plant resistance (APR). All stage resistance is mostly controlled by a single resistant gene (race specific) and provides high resistance throughout the plant's development but is readily overcome due to changes in the virulence by the emergence of new rust pathogens [18]. Alternatively, wheat resistance against stripe rust can be improved by adult plant resistance (APR), or race-nonspecific resistance or partial resistance. This nonspecific resistance is efficiently controlled by minor and effective multiple loci which are quantitatively inherited [19] and mostly appears at the later growing stages of the plant's development [20]. Pyramiding four to five race-nonspecific resistance genes [21] makes the genotypes more durable at the later growing stages then the race-specific resistance. Gene pyramiding can be achieved by using a molecular marker closely linked with the APR and hence can be further used in breeding through marker-assisted selection (MAS).

Molecular mapping helps in the identification of phenotypic traits at the genomic level by using molecular markers closely linked with the required traits, and hence used in marker-assisted selection and gene pyramiding. Recently, single nucleotide polymorphisms (SNPs), due to their high-density form, shown by the iSelect array, have been proven as valuable mapping markers across the whole wheat genome. The usage of SNPs in genome-wide association studies, based on linkage disequilibrium (LD), increases the efficiency of finding linked loci to the desired traits in the diverse population. Association mapping (AM) or genome-wide association studies (GWAS) utilizes the LD to spot the association among genetic polymorphism and phenotypic variation [22,23]. LD is the principle behind AM which studies the nonrandom association of alleles at different loci [22]. AM has its importance to map QTL (quatitative trait loci) due to the availability of a faster, higher density and cheaper molecular marker. It also minimizes the time and cost by utilizing the diverse populations to determine the linkage disequilibrium between the alleles and to identify the marker–trait association, over the biparental population development [24].

The aim of the present research was to identify the genetic divergence pattern of stripe rust resistance resources in Pakistani spring wheat genotypes. The present study also focused on the identification and mapping of seedling and field resistance minor loci for wheat stripe rust resistance using single nucleotide polymorphism (SNP) markers.

#### **2. Results**

#### *2.1. Phenotypic Variation to Stripe Rust Response*

The phenotypic characterization of 465 spring wheat genotypes was done with nine *Pst* races (six races selected from the United State of America and three races from Pakistan). The seedling test was performed in a greenhouse (controlled condition) and the rust-infection score (infection type (IT)) collected is summarized in Figures 1 and 2. The frequency of resistance and susceptibility varied with all the stripe rust races. About 15, 56, 72, 106, 31 and 57 Pakistani accessions were resistant (IT score = 0–3) to the USA races PSTv-37, PSTv-198, PSTv-51, PSTv-40, PSTv-14 and PSTv-4, respectively. Most of the genotypes were susceptible (IT score = 7–9) as 384, 268, 224, 156, 200 and 220 to the USA races of stripe rust PSTv-37, PSTv-198, PSTv-51, PSTv-40, PSTv-14 and PSTv-4 respectively (Figure 1).

**Figure 1.** Infection distribution of the 465 spring wheat genotypes at the seedling stage based on their infection type (IT) score, tested with six stripe rust races from the USA.

**Figure 2.** Infection distribution of the 465 spring wheat genotypes at the seedling stage based on their infection type (IT) score, tested with three stripe rust races from Pakistan.

Wheat genotypes showed a slightly different behavior with the Pakistani stripe rust races such that the numbers of resistant genotypes were greater when compared to the USA stripe rust races (Figure 2). The phenotypic analysis revealed 72 resistant genotypes to stripe rust races PK07-4 and PK07-12, while 98 genotypes showed a resistance response against the PK08-2 stripe rust race. Most of the genotypes were susceptible to the USA races PSTv-37 and PSTv-198 and the Pakistani race PK07-12. Most of the genotypes showed resistance to the USA races PSTv-40 and PSTv-51 and to the Pakistani race PK08-2 (Figures 1 and 2).

In a field trial, the adult stage response of the wheat genotype against the stripe rust was slightly different as compared to the controlled conditions in the greenhouse. In a field, the resistance behavior against stripe rust infection of the genotypes was observed, namely that of PAK-University of Agriculture, Faisalabad (UAF)17 (IT = 421, SEV = 414), PAK-UAF16 (IT = 402, SEV = 407), USA-Mount Vernon (MTV)18 (IT = 292, SEV = 270) and USA-Palouse conservation field station (PCFS)18 (IT = 234, SEV = 317) with both their infection type (IT) scores and disease severity (SEV) scores (Figures 3 and 4). The genotypes susceptibility frequencies were higher in the USA locations than in Pakistan, hence the response varied with the environment. In Pakistan, due to the warm environment, the disease impact was slightly less but was not negligible as compared to the cold environment near Pullman, WA, USA. The estimation of the variance component showed a highly significant behavior (*p* < 0.0001) of the genotypes and the genotype x environment interaction across all the environments. Non-significant responses of environmental variance indicate the variable climate conditions and the influence of different races in the disease development. A high range of the coefficient of determination and broad sense heritability of (H2) 87% for the IT and 93% for the SEV indicate the reliability of the dataset for GWAS (Table 1).

**Figure 3.** Infection distribution of 465 spring wheat genotypes at the adult plant stage and the stripe rust infection type (IT) score across three environments. Scoring for the adult plant response performed in three different environments (two years in Pakistan considered as one location and two locations in the USA).

#### *2.2. Population Structure and Linkage Disequilibrium*

The population structure was performed on a total 465 entire wheat genotypes by the Bayesian clustering approach. An admixture-based model was used to cluster the genotype into three subpopulations based on the ΔK (Figure 5; Supplementary Table S4) value. Population structure reduces the false positive association among the markers and traits. Subpopulation one (Q1) contains 85 individuals. Similarly, subpopulation two (Q2) contains 179 and subpopulation three (Q3) contains 201 individuals. Each subpopulation is shown by a different color in the cluster analysis. The length of each color represents the estimated contribution of each sample to the subgroups.

After the filtration of minor allelic frequency, 765 SNP markers out of 1500 SNPs were used for the linkage disequilibrium analysis (LD). Linkage disequilibrium (LD) depends on many factors, including the population structure, genetic drift, chromosomal region and natural selection. LD decay relies on the value of *r*2, whose value is calculated for all chromosomes. Critical *r*<sup>2</sup> value 0.12 was identified for all 465 spring wheat genotypes by taking the 95th percentile of the coefficient square (showed by the red line in Figure 6). The mean *r*<sup>2</sup> value across the genome was found 0.03 with 50 cm distance. The highest number of pair-markers were found on the A genome (48%) followed by the B genome (46%) and the D genome (5%). Chromosome 1A had the highest number of pairs (2138) and 4D had the minimum number (nine) of pair markers. The LD decay was constructed using chromosomal distance and the critical *r*<sup>2</sup> value as the threshold to indicate the LD decay length, which attained 1.25 cm for the whole genome (Figure 6).

**Figure 4.** Infection distribution of 465 spring wheat genotypes at the adult plant stage and the stripe rust disease severity (SEV) score across three environments. Scoring for the adult plant response performed in three different environments (two years in Pakistan considered as one location and two locations in the USA).


**Table 1.** Mean, range and the variance components of *Puccinia striiformis* under different environments.

Asterisks \*\* significant at *p* < 0.0001 and \* significant at *p* < 0.005, **ns**: non-specific, **IT**: infection type, **SEV**: disease severity, **Var**: variance, **R2**: coefficient of determination, **SD**: standard deviation, **H2**: broad sense heritability.

#### *2.3. Marker–Trait Association at Seedling Stage*

Genome-wide association studies (GWAS) of SNP markers with IT scores of nine stripe rust races identify twenty SNPs associated with rust resistance at the seedling stage (Table 2). Manhattan and Q-Q plots, showing the marker–trait association of stripe rust-resistance response at the seedling stage, with all nine races IT scores are provided in the Supplementary Figures S1 and S2. Twelve chromosomal regions (1A, 1B, 1D, 2A, 2B, 4A, 4B, 5B, 6A, 6B, 6D and 7B) were found to be significantly associated (*p* < 0.0001) with the IT scores of nine rust races at the seedling stage. A total of fourteen SNP loci were identified to be linked with resistance against USA races and twelve loci were identified to be linked with stripe rust-resistance response to Pakistani races. One SNP present on chromosome 1A (IWB3662) was found to be associated with all three Pakistani rust races, namely PK07-4, PK07-12 and PK08-2. A SNP on the 1B chromosome (IWB12258) was linked with the IT score of PK08-2. At the chromosome 1D, two SNPs IWA7171 (linked to PK07-12) and IWA4344 (linked to PSTv-4) were identified. Two SNP markers, IWB50806 linked with PK07-12 and IWA7638 linked with PSTv-14, were identified on chromosome 2A. The targeted SNP IWA2344 was associated with three rust races PSTv-198, PSTv-51 and PSTv-40 at chromosome 2B. Similarly, for the three rust races, PSTv-198, PSTv-51 and PSTv-4, one SNP (IWA4097) marker on the chromosome 1B was identified. Two SNPs were identified at chromosome 2B associated with six stripe rust race. At chromosome 4A, one SNP (IWA3361) identified associated with PSTv-4. Three SNPs identified on the 4B chromosome, IWB12434, IWA2031 and IWA408, were associated with PK07-12, PSTv-14 and PK08-2, respectively. The SNP IWB5781 was associated with PSTv-4, identified at chromosome 5B. At 6A three SNPs identified IWA8022 linked with PSTv-40 and IWA3463, IWB25252 both linked with PK07-12. One SNP (IWB7615) on chromosome 6B was associated with PSTv-51. IWB12259 was linked with PK07-4 at the location 6D. Finally, two SNPs (IWA6857 and IWB10895) were associated with PK07-12 and PSTv-40 on chromosome 7B. Seven SNPs were identified at genome A, ten SNPs at genome B and three SNPs at genome D. The largest number of SNPs (six) was identified to be linked to PK07-12. The identified SNP markers associated with rust race-specific resistance can be linked to different Yr genes based on the virulence and avirulence formula of races (Supplementary Table S2 and S3).

**Figure 5.** Estimated population structure of the 465 spring wheat genotypes (K = 3) based on the Q matrix using the single nucleotide polymorphism (SNP) markers. Q: three (K = 3) different subpopulations as Q1, Q2 and Q3.



**Figure 6.** Scatter plot of the linkage disequilibrium (LD) decay with the critical *r*<sup>2</sup> value and the genetic chromosome distance (cm) for the whole genome. The red line shows the critical *r*<sup>2</sup> value i.e., 0.12.

#### *2.4. Marker–Trait Association at Adult Plant Stage*

Genome-wide association studies of 765 SNPs with the IT and SEV responses of 465 wheat genotypes against stripe rust at three different locations were performed. The association with the IT scores at different locations, namely PAK-UAF16, PAK-UAF17, USA-MTV18, USA-PCFS18 and with the BLUE (best linear unbiased estimator) value identifies twenty-two SNPs that were associated with stripe rust resistance at the adult plant stage. Significant SNPs was identified based on *p* < 0.001. A total of fifteen chromosomal regions (1A, 1B, 1D, 2A, 2B, 2D, 3A, 3B, 4B, 5A, 5B, 6A, 6D, 7A and 7B) were mapped with the identified SNPs. The highest number of SNPs with an IT score was identified in the USA environment, whereas with the SEV response, twenty-two SNPs were mapped in sixteen chromosomal regions (1A, 1B, 1D, 2A, 2B, 2D, 3A, 3B, 3D, 4B, 5A, 6A, 6B, 7A, 7B and 7D). Overall, thirty significant SNPs were identified with both IT and SEV response. The highest number of loci (eight) was mapped for the 2A chromosomes with the SNPs IWA11136, IWB12554 and IWA6798 at 9.41 cm, 143.6 cm and 150.1 cm respectively. The SNP name, chromosome, position and the probability of its association with IT and SEV, are discussed in detail in Table 3. Five chromosome regions (1D, 3D, 4B, 6D and 7A) were identified that were mapped with significant SNPs in the Pakistan environments with both the IT and SEV disease scores of the genotypes. Nine regions were identified with both the IT and SEV response in the environment USA-MTV18 and eight regions were mapped with significant SNP markers in the environment USA-PCFS18 (Table 3). In the present work, the SNP marker, IWB11136 (2A at 9.41 cm), was found to be linked with all the USA locations with both IT and SEV resistant scores of stripe rust. Manhattan and Q-Q plots showing the marker–trait association with the stripe rust resistance response IT and SEV scores, at all three locations and with the BLUE value, are provided in Supplementary Figures S3 and S4.


Chromosome number and the position of the associated SNP marker according to [25], MAF: Minor allelic frequency, *p*-value of marker–trait association with the IT scores of all locations, g *p*-value of the marker–trait association with SEV scores of all locations. PAK-UAF: University of Agriculture, Faisalabad, Pakistan; MTV: Mount Vernon, USA; PCFS: Palouse conservation field station, USA; BLUE: Best linear unbiased estimator.

#### **3. Discussion**

#### *3.1. Phenotypic Variation to Stripe Rust Response*

The present study was conducted to study the spectrum of genetic diversity, as well as to map the resistance loci in 465 Pakistani spring wheat genotypes in response to stripe rust (*Puccinia striiformis* f. sp. *tritici*) at seedling and adult plant stage. Profiling with high-density SNP markers helps in the identification of genomic diversity, population structure among genotypes and marker–trait association to identify well-worn and novel resistance sources in germplasm. In previous reports, variable numbers of stripe rust races were used to compute the response against the specific race [19,26–28]. In the current study, a total of nine *Pst* races, including six from the USA and three of Pakistani origin were used to evaluate the genotypes response at the seedling stage also three different locations, tested for adult plant response (two years in Pakistan considered as one location and two locations in the USA).

In the current study, almost 50% intermediate response was observed at the seedling stage. Moreover, a high level of resistance (IT score = 0–3) response appeared at the adult plant stage and a low level of resistance phenotype appeared at the seedling stage. Similarly, a more susceptible response appeared at the seedling stage than at the adult plant stage [16,27–29]. Wheat genotypes were more susceptible to USA stripe rust races compared to Pakistani stripe rust races, where the resistant phenotype response was dominant. Pakistani wheat genotypes have long history of cultivation [30] before the emergence of modern wheat. This suggests that they have more interaction with *Pst* races, prevailing themselves as a major source of resistance to *Pst* [31].

Population structure, based on the Bayesian model, divides the 465 genotypes into three subpopulations (K = 3). Based on the genotype diversity, a different number of subpopulations were reported, as two subpopulations [16], three subpopulations [29], six subpopulations [32], seven subpopulation [31] and eight subpopulations [27]. Critical *r*<sup>2</sup> was used to estimate the extent of the LD decay with the line intersecting the smooth curve [19]. In present study, the critical *r*<sup>2</sup> value for the whole genome was 0.12 (for 765 SNPs) and we found LD decay at a confidence interval of 1.25, previously reported as 1.6 cm [31].

#### *3.2. Genome-Wide Association Analysis with Stripe Rust Response*

The major factors that influenced the association mapping were the population size, the germplasm choice and the marker density over whole genome [32]. Several association mapping studies were reported for stripe rust using different molecular markers as diversity arrays technology (DArT) [26,33,34], simple sequence repeats (SSR) [35,36] and SNP [19,27,29,37,38].

Marker–trait association at the seedling stage with nine different races identified twenty QTLs covering twelve chromosomes (1A, 1B, 1D, 2A, 2B, 4A, 4B, 5B, 6A, 6B, 6D and 7B) with *p* < 0.0001. The SNP loci *QYr.uaf-1D.2* associated with stripe rust resistance at the seedling stage was also found to be associated with rust resistance at the adult plant stage.

SNP loci *QYr.uaf-1A.1*, mapped on the short arm of 1A positioned at 10.69 cm, was found to be associated with resistance to all three Pakistani races (PK07-4, PK07-12 and PK08-2). Likewise, earlier it was reported that the position of SNP IWB3662 lay within the confidence interval of *YrEDWL-1AS,* associated with resistance to PSTv-14 and PSTv-37 in durum wheat [19] and *QYrid.ui-1A\_Rio* Blanco [39]. Three SNP markers (IWB12795, IWB20633 and IWB56353) associated with the loci *QYr.tsw-1A* were positioned on the 1A chromosome had already been reported [16] in close proximity to currently identified loci that were linked with resistance to PSTv-4 in spring wheat germplasm. Stripe rust-resistant locus *QYr.uaf-1B.1* tagged with SNP IWB12258 in current research findings was found in close proximity of IWA1191 [28]. The Chromosomal position 1B reported with many *Yr*-associated genes as *Yr9, Yr10, Yr15, Yr34*/*Yr26, Yr29, Y64, Yr65, YrAlp* and *YrH52* [31]. Chromosome 1B locus *QYr.uaf-1B.1* was found in close proximity to *QYrcau-1BS\_AQ24788-53* [40] from Chinese winter wheat and *Yr9* resistant gene [31]. In Pakistan, *Yr9* was first reported in 1994, and after that many cultivated varieties were developed with this gene, due to its linkage with other genes (*Lr26, Sr31,*

*Pm8*) and pleiotropic effect [41]. The successful translocation of the *Yr9* gene from rye and alongside high-yield potential, made *Yr9* a highly dominated gene in Pakistani germplasm [35]. Wheat varieties carrying this gene were highly cultivated during 1990s, but resistance breakdown after a few years made the resistant cultivars susceptible [42]. A major yellow rust epidemic was observed in 1995 with a 20% loss in the affected area in Pakistan. At that time, Pak81, also known as Veery#5 carrying *Yr9* gene, predominated. Two major cultivars, Pak81/Pirasabak 85 became susceptible during the period 1994–1995 due to the ineffectiveness of the *Yr9* gene and Inquilab 91 in 2002 due to the virulence occurrence of the *Yr27* gene [11].

Single stripe rust resistant QTL, *QYr.tsw-1D,* against the USA race Pstv-14, as reported by [16], was found in close proximity to SNP IWA7171 and IWA4344, positioned at 90.30 cm on the 1D chromosome. QTL *QYr.uaf-1D.1,* was linked with resistance to PK07-12 and PSTv-4 in Pakistan's spring wheat population. SNP loci *QYr.uaf-2B.1* (IWA2344) was positioned on the 2B at 96.99 cm, was found to be linked to all-stage resistance with the three stripe rust races PSTv-198, PSTv-51, PSTv-40, and was found in the confidence interval of *QYr.ucw-2B\_UC1110* [43], *QYr.inra-2B.1\_Camp* Remy [44] and *QYr.cim-2BS*\_*Francolin* [45]. It was found that the second identified loci of chromosome 2B (*QYr.uaf-2B.2*) positioned at 113.86 cm was also associated with the three rust races PSTv-198, PSTv-51 and PSTv-4. The locus (*QYr.uaf-2B.2*) was found in the confidence interval of *QYraq.cau-2BL\_Aquileja* [46] and *Yr53* [47].

Three QTLs *QYr.uaf-4B.1, QYr.uaf-4B.2, QYr.uaf-4B.3* at chromosome 4B were found in close proximity to *QYr.jic-4B\_Alcedo* [48], *QYr.jic-4B\_Guardian* [49], *QYr.vt-4BL\_VA00W-38* [50] and *YrEDWL-4BL,* in novel durum wheat linked to the stripe rust race PSTv-14 and PSTv-51 [19]. One QTL *QYr.uaf-5B.1* on the 5B chromosome positioned at 56.6 cm, was found within the confidence interval of *QYr.ufs-5B\_Cappelle-Desprez* [51]. Two SNP markers (IWA3463, IWB25252) linked the QTL *QYr.uaf-6A* to stripe rust resistance, and PK07-12 was found in close proximity to the previously reported QTL *QYr.cim-6AL\_Francolin* [45]. *QYr.uaf-7B.1* and *QYr.uaf-7B.2* were found to be linked with resistance to the stripe rust races PK07-12 and PSTv-40, respectively, and were found to be linked to the *QYr.sun-7B\_Kukri* [52], *Yr39* gene [53] and near to the marker IWA312 (76.1 cm) that was linked with resistance to the *Pst* races Pstv-37 and Pstv-40 [28].

#### *3.3. Genome-Wide Association Analysis for Adult Plant Response*

Genome-wide association (GWA) analysis of stripe rust using high-density SNP markers was carried out across the three locations. In total, thirty loci were identified that were significantly (*p* < 0.001) associated with the infection type (IT) and disease severity (SEV) score of stripe rust in multi-environments. These thirty loci were present in eighteen genomic regions, namely 1A, 1B, 1D, 2A, 2B, 2D, 3A, 3B, 3D, 4B, 5A, 5B, 6A, 6B, 6D, 7A, 7B and 7D. In the current study, three loci were mapped on the chromosomes 1A, 2A, 3A and 3B. Two QTLs were identified on each chromosome 1D, 4B, 5A and 7B. One QTL was identified on each chromosome 1B, 2B, 2D, 3D, 5B, 6A, 6B, 6D, 7A and 7D.

In the present research work, the QTL, *QYr.uaf-2A.3* (linked SNP IWB11136), identified to be positioned at 9.41 cm on the 2A chromosome, was significantly associated with six locations including the IT (USA-MTV18, USA-PCFS18, BLUE-IT) and SEV response (USA-MTV18, USA-PCFS18 and BLUE-SEV). The same SNP marker IWB11136 (*QYr.tsw-2A.3*) that was identified previously was also found significantly associated with all the tested locations for the stripe rust response [16]. This locus was also found within the confidence interval of *QYr.tam-2AS* [54] from the hard winter wheat TAM111 and *Yr17* genes [55]. The *Yr17* gene was developed by 2NS–2AS locus translocation (25 to 38 cm) from *T. ventricosum* (2NS), a famous wild *Triticeae* species to 2AS of wheat chromosome [55,56]. The 2NS–AS translocation was first carried out in winter bread wheat VPM1 and afterwards in California and Washington, where many winter wheat cultivars were developed such as Madsen, Hyak and Expresso (spring wheat). Furthermore, the 2B QTL *QYr.uaf-2B.3* positioned at 130.6 cm was linked with both the IT and SEV score at the USA location MTV, and was linked with the *Yr* genes *Yr53* and *Yr43* [47]. At chromosome 3B, two SNP markers (IWB11270 and IWB36652) were associated with the QTLs

*QYr.uaf-3B.2* and *QTL QYr.uaf-3B.3*, and were aligned with both the IT and SEV scores of USA-MTV, USA-PCFS18 and the BLUE value, which was found in close proximity to *QYr.cim-3B\_Pastor* and *QYr.inra-3Bcentr\_Renan* [57,58].

Presently, two QTLs *QYr.uaf-3B.2* (IWB11270) were positioned at 67.67 cm on 3B and *QYr.uaf-5A.2* (IWA589) was positioned at 123.21 cm on chromosome 5A, which was significantly associated (*p* < 0.001) with stripe rust resistance at four different locations. Locus *QYr.uaf-3B.2* was found in the confidence interval of *QYr.cim-3B\_Pastor* [58], whereas the QTL locus *QYrEDWL-5AL.2* reported in Ethiopian durum wheat was in close proximity to *QYr.uaf-5A.2* [19]. *Qyr.wpg-6B.1* (IWA7257) in winter wheat and *QYr.cim-6BL\_Pastor* was found in close proximity *to QYr.uaf-6B.2* (IWB26626) positioned at 113.67 cm on the chromosome 6B [37,57]. Five SNPs loci identified in the current study were linked with resistance to either the IT or SEV response of the stripe rust at three different locations. These included *QYr.uaf-1B.2*, *QYr.uaf-1D.2*, *QYr.uaf-2B.3*, *QYr.uaf-4B.2* and *QYr.uaf-5A.1*. The identified resistance source can be utilized as the breeding line for enhancing wheat resistance against disease.

#### **4. Materials and Methods**

#### *4.1. Collection of Genetic Material*

Four hundred and sixty-five (465) genotypes of bread wheat (*Triticum aestivum* L.) were collected from the Wheat Research Institute, Ayub Agricultural Research Institute (AARI), Pakistan, Faisalabad.

#### *4.2. Field Based Resistance Screening to Puccinia striiformis (Pst)*

The panel of 465 spring wheat genotypes was tested for stripe rust response under field conditions. All genotypes were grown at the experimental area of the Centre of Agricultural Biochemistry and Biotechnology (CABB), University of Agriculture, Faisalabad (31◦26 N, 73◦6 E) for two consecutive years 2015–2016 and 2016–2017. Each entry was planted ina1m long row by keeping a row-to-row distance of 36 cm and the sowing was done by planting two seeds per hole and maintaining 8 cm plant-to-plant distance. The highly susceptible variety of stripe rust i.e., Morocco was used as the susceptible check in the field experiments in Pakistan to increase the disease pressure. All 465 Pakistani wheat genotypes were also sown at two different locations in the USA at the Palouse conservation field station (PCFS), Pullman, WA (46◦43 59" N; 117◦10 19" W) and Mount Vernon (MTV), WA (48◦25 16" N; 122A◦20 2" W) in the year 2018 for the evaluation of stripe rust resistance under natural disease conditions. Spring wheat Avocet susceptible (AvS) was used as the susceptible check to increase the disease pressure at both locations planted after every twenty lines in the USA. Each line of the genotype was grown up in a 0.5 m long and a 0.3 m wide row. All standard agronomic practices were followed for the crop production.

The evaluation of *Pst* at the adult plant stage was done in a field with the natural conditions of disease epidemics at all three locations (two years in Pakistan considered as one location and two locations in USA). In the field, the data scoring was done by visualizing the impact of the disease on the flag leaf of the susceptible check that the IT score, which ranged from 7–9 and the disease severity score which ranged from 70–100% [19]. One location was in Pakistan at the University of Agriculture, Faisalabad (UAF) for two consecutive years 2015–2016 (PAK-UAF16) and 2016–2017 (PAK-UAF17). The other two locations data was recorded at the USA for year 2018 at Pullman, WA (USA-PCFS18) and at Mount Vernon, WA (USA-MTV18). Infection type (IT) and disease severity (SEV) were two disease phenotype scores that were recorded for the *Puccinia striiformis* (*Pst*) infection in the field. The IT score based on the 0–9 scale is explained in Supplementary Table S1 whereas the SEV recorded as the % age area of the flag leaf covered with disease and was scored from 0 to 100%.

#### *4.3. Greenhouse-Based Resistance Screening to Puccinia striiformis (Pst)*

The seedling response of 465 wheat genotypes was evaluated with isolates of six *Pst* USA races including PSTv-37, PSTv-198, PSTv-51, PSTv-40, PSTv-14, PSTv-4 [25] and three isolates of Pakistani races PK07-4, PK07-12, PK08-2 [59] under controlled greenhouse conditions. The virulence and avirulence formulae of the stripe rust race isolates are provided in the Supplementary Tables S2 and S3. All the stripe rust races were collected from the USDA, Wheat Health, Genetics and Quality Research Unit, Pullman, WA. Four to five seeds of each genotype were planted per well in a 96 wells tray filled with the number 1 sunshine mix growing medium (Sungro Horticulture, Bellevue, WA, USA). Trays were regularly watered and kept in a rust free greenhouse at 20 ◦C with 50% relative humidity (RH). The inoculation was done when the seedling reached the 2-leaf stage after approximately 9–10 days of sowing. The inoculation of each race was done by mixing the rust urediniospores with talcum powder. Inoculated plants were incubated in a dark dew chamber for 24 h at 10 ◦C and 100% relative humidity and then moved to the greenhouse having the 8 ◦C day, 16 ◦C night temperature and 16 h photoperiod. The reaction to *Puccinia striiformis* f. sp. *tritici* was scored after 18 to 20 days of inoculation, using the 0–9 scale for the infection type (IT) [60,61]. Based on the infection type, the genotypes were grouped as 0–3 = resistant; 4–6 = intermediate; 7–9 = susceptible. The scale of the infection type (IT) disease score is discussed in Supplementary Table S1.

#### *4.4. Statistical Analysis*

The range, mean, standard deviation, coefficient of determination (R2) were scored within and across the environments using JMP Genomics 15.1.0 (SAS Institute Inc., Cary, NC, USA, 2007). Broad sense heritability (H2) was calculated by the variance component obtained from REML (random effects model) computed using JMP software. The BLUE (best linear unbiased estimator) value for the IT and SEV scores of all environments was calculated using the PROC MIXED procedure in SAS v9.3 (SAS Institute Inc., Cary, NC, USA, 2007) considering the genotype as a fixed effect [16].

#### *4.5. DNA Extraction, SNP Genotyping*

Wheat genotypes sown in greenhouse and young leaves used for DNA extraction using a robotic system of oKtopureTM at the Western regional small grain genotyping laboratory (WRSGGL) (Washington State University, Pullman, WA, USA) [62] for SNP genotyping against stripe rust. Targeted amplicon sequencing (TAS) for stripe rust resistant genes was done using NextSeq® 500 (Illumina, Inc., Pullman, WA, USA). Genotypic calling and removing monomorphic as well as low quality SNPs, was carried out using GenomeStudio Software v2011.1. (Illumina, Inc., Pullman, WA, USA) to call bi-allelic SNPs AA, AB and BB, for this default clustering algorithm, was used. A total 1500 SNPs were yielded and subjected to TASSEL (trait analysis by association, evolution and linkage) software v.5.2.61 [63] to remove the SNPs with minor allelic frequency MAF < 0.05 and to made kinship matrix. A total 765 high-quality SNPs were selected and projected on to a consensus map of hexaploid wheat to order them based on the chromosome position [25]: these 765 SNPs were used for association analysis.

#### *4.6. Population Structure and Linkage Disequilibrium (LD)*

Major genetic structure of the selected genotypes was determined using 150 SNP markers with inter marker distances >5 cm from each other using the Bayesian model-based clustering algorithm in STRUCTURE v2.3.4 [64]. The number of subpopulations (K) was estimated by the running simulation from burn-in 10,000 iteration to 100,000 Monte Carlo Markov Chain (MCMC) replicates. K was run from 1–10 times and 10 independent runs were set for each run. The STRUCTURE results were visualized to determine the value of K (subpopulation) based on the ad hoc criterion by using the STRUCTURE HARVESTER [65,66].

The measurement of the linkage disequilibrium between the pairs of the SNP marker was estimated using the program TASSEL (v5.2.61). The LD parameters D' and *r*<sup>2</sup> among the loci and comparison-wise significance was computed by 1000 permutations. The critical *r*<sup>2</sup> value was determined by taking the 95th percentile of the unlinked markers [67]. The scatter plot among the *r*<sup>2</sup> and distance on chromosome, of all significant (*p* < 0.001) pairwise combinations, were used to fit the locally weighted polynomial regression curve (LOESS) to estimate the extent of the LD decay in the R environment [16] using the critical *r*<sup>2</sup> value.

#### *4.7. Genome-Wide Association (GWA) Analysis*

Integrated mixed-model (MLM) method for association mapping, which accounts for multiple levels of relatedness, was used to narrate the genetic polymorphism to important phenotypic variation in specific traits [68]. An association test was performed using both (1) the Genome Association and Prediction Integrated Tool (GAPIT) [69] and (2) fixed and random model circulating probability unification (FarmCPU) package [70] implemented in R software v.3.6.1 (https://www.r-project.org/). The MLM model utilized trait data, genotype data K (kinship) and PCA (principle component analysis) to find the marker–trait association. The model comparison was done to select the best model for the marker–trait association (MTA) with each trait as K + P (kinship and principal component) [44] and K + Q (kinship and population structure) [68]. The final results were analyzed by FarmCPU, selecting the model (K + Q) based on their respective Q-Q plots. Significant MTA was described based on the *p*-value. Markers with *p* < 0.0001 were considered significant for the seedling test and *p* < 0.001 for the field experiment. Marker–trait association was performed with all nine rust races data, scored at the seedling stage and with the field data of all the environments separately for the IT score, the disease severity score and with the best linear unbiased estimator (BLUE) value using the FarmCPU package implemented in R software v.3.6.1.

#### **5. Conclusions**

GWAS provides a good outline of the distribution and frequency of resistance genes over the whole world subpopulation. This spring wheat Pakistani germplasm was proved an efficient source of phenotypic diversification to combat stripe rust infection for both seedling and field experiments and to determine the yield QTLs related to the yield components. The genotypes possessing a higher fraction of resistance loci of stripe rust divulged themselves as a parental breeding line and hence can increase the breeding efficiency for stripe rust resistance. The present research findings can be exploited by wheat breeders to increase the resistant capability and yield potential of their cultivars by marker-assisted selection breeding.

**Supplementary Materials:** The following are available online at http://www.mdpi.com/2223-7747/9/9/1056/s1. Table S1: Stripe rust infection type (IT) recording scale. Table S2: Virulence and avirulence formula for the USA stripe rust races. Table S3: Virulence and avirulence formula for the Pakistan stripe rust. Table S4: ΔK value based on LnP(K) for 465 wheat genotypes. Figures S1 and S3: Manhattan representing the number of chromosome and their associated SNPs. Figures S2 and S4: Q-Q plot representing the number of chromosome and their associated SNPs.

**Author Contributions:** M.H. and F.S.A. were involved in planning and executing and manuscript writing; B.S., M.A.Z. contributed to the field data analysis. All authors have read and agreed to the published version of the manuscript.

**Funding:** This research was funded by Higher Education Commision, Pakistan 4616.

**Acknowledgments:** We pay our gratitude to Xianming Chen from WSU (Washington State University) for his assistance in genotyping and supporting greenhouse experiments and Deven R. See from WSU for providing computational support in analyzing genotypic data. We are also thankful to AARI, FSD for providing wheat seeds. This work was supported by the Higher Education Commission (HEC), Govt. of Pakistan.

**Conflicts of Interest:** The authors declare that they have no conflict of interest in the publication.

#### **Abbreviations**


#### **References**


© 2020 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).

*Article*

### **Assessment of Genetic Diversity for Drought, Heat and Combined Drought and Heat Stress Tolerance in Early Maturing Maize Landraces**

#### **Charles Nelimor 1,2,3, Ba**ff**our Badu-Apraku 2,\*, Antonia Y. Tetteh <sup>4</sup> and Assanvo S. P. N'guetta <sup>3</sup>**


Received: 27 September 2019; Accepted: 15 November 2019; Published: 17 November 2019

**Abstract:** Climate change is expected to aggravate the effects of drought, heat and combined drought and heat stresses. An important step in developing 'climate smart' maize varieties is to identify germplasm with good levels of tolerance to the abiotic stresses. The primary objective of this study was to identify landraces with combined high yield potential and desirable secondary traits under drought, heat and combined drought and heat stresses. Thirty-three landraces from Burkina Faso (6), Ghana (6) and Togo (21), and three drought-tolerant populations/varieties from the Maize Improvement Program at the International Institute of Tropical Agriculture were evaluated under three conditions, namely managed drought stress, heat stress and combined drought and heat stress, with optimal growing conditions as control, for two years. The phenotypic and genetic correlations between grain yield of the different treatments were very weak, suggesting the presence of independent genetic control of yield to these stresses. However, grain yield under heat and combined drought and heat stresses were highly and positively correlated, indicating that heat-tolerant genotypes would most likely tolerate combined drought and stress. Yield reduction averaged 46% under managed drought stress, 55% under heat stress, and 66% under combined drought and heat stress, which reflected hypo-additive effect of drought and heat stress on grain yield of the maize accessions. Accession GH-3505 was highly tolerant to drought, while GH-4859 and TZm-1353 were tolerant to the three stresses. These landrace accessions can be invaluable sources of genes/alleles for breeding for adaptation of maize to climate change.

**Keywords:** climate change; combined drought and heat stress; drought; heat; landraces; maize

#### **1. Introduction**

Climate change is predicted to increase global temperatures and reduce rainfall patterns, with adverse effects, particularly, on the critical stages of plant growth and development, resulting in yield losses. Productivity of maize, the major staple cereal crop in sub-Saharan Africa (SSA) is thus threatened [1,2]. Rainfall under future climate change scenarios in SSA will either occur late or stop earlier than usual [1], while temperatures in large areas have already exceeded the threshold for maize growth and productivity [2,3]. Maize is highly vulnerable to drought stress (DS) and heat stress (HS) during the reproductive stages [3,4]. Drought stress at flowering and grain-filling stages of maize causes delayed silking, increased anthesis-silking interval, and reduced kernel set [5], resulting in grain yield (GY) losses above 50% [6,7]. Under extreme conditions, DS at a few days before anthesis to the beginning of grain-filling period reduced GY of maize by as much as 90% [8]. High temperatures occurring at two weeks before flowering resulted in leaf firing, increased rate of leaf senescence [9] and premature lodging [7]. At the on-set of flowering, high temperature stress caused tassel blasting, leading to reduced pollen production and viability, reduced pollination rate, shortened grain-filling period, and reduced kernel and grain weight [9,10]. Together, yield losses due to these altered morpho-physiological traits was estimated at 45% or more [11,12].

Under HS, plants open their stomata to cool their leaves by transpiration, but when plants have to keep their stomata closed to reduce water loss during combined drought and heat stress (DSHS) conditions, the leaf temperature remain high, resulting in increased yield losses compared to the separate effects of the two stresses. For example, each degree rise in temperature above 30 ◦C resulted in 1% reduction in maize GY under optimal growing conditions (OGC), 1.7% under DS and up to 40% or more under DSHS [1]. Meseka et al. [7] found that while DS reduced GY by 58%, DSHS reduced GY by 77%. It was demonstrated that under extreme conditions, DSHS forced farmers to abandon their farmlands [13]. To cope with these stresses, farmers will need 'climate smart' maize varieties with low risk of failure. To develop such maize varieties, there is the need for continuous access to a wide range of alleles that are scattered in germplasm resources, particularly, the landraces.

Landraces of maize have been selected for several years for adaptation to local conditions, harbor abundant genetic diversity for important agronomic characteristics such as, phenology, growing season, resistance to diseases and insects and tolerance to abiotic stresses, including drought and/or high temperatures [14]. The availability of such genetic diversity is critical for genetic enhancement, allowing not only increased genetic gains and improved nutritional quality, but also for broadening the genetic base of elite varieties. Maize breeding efforts over the last few decades achieved remarkable success in terms of DS tolerance [3]. However, to ensure continued gains in genetic improvement, new DS tolerant source populations are needed [3]. Moreover, in contrast to DS, research on HS and DSHS has only recently began in tropical and subtropical maize [3,7,15,16]. Identifying new sources of variation for tolerance to DS, HS and DSHS will greatly complement these breeding efforts.

The primary objective of this study was to identify landraces that combine desirable secondary traits with good levels of tolerance to DS, HS and DSHS for use in maize breeding programs as sources of variation for breeding for climate change resilience.

#### **2. Results**

#### *2.1. Variation in Weather Conditions during the Trial Periods*

In 2018, the average night and day temperatures during the period when the HS and DSHS trials were carried out ranged from 17 to 25 ◦C and 36 to 40 ◦C, respectively (Table 1). During the evaluation period in 2019, average temperatures varied from 18 to 26 ◦C at night and from 31 to 39 ◦C at daytime. There were incidences of rains after grain-filling stages in May and June, with minor effects on trials. In each year, the lowest day and night temperatures were observed at the time of planting in February, and the highest in April (Table 1). In both years, the peak temperature occurred in April (Figure 1), which coincided with the flowering and grain filling stages. During this period, night temperatures varied from 18 to 27 ◦C in 2018 and 23 to 29 ◦C in 2019, while daytime temperatures ranged from 36 to 41 ◦C in both years.


**Table 1.** Monthly average temperature and rainfall recorded at Kadawa, Nigeria during the trial periods in 2018 and 2019.

**Figure 1.** Average day and night temperatures recorded at Kadawa during the flowering period in April.

#### *2.2. Variance and Heritability Estimates*

Broad-sense heritability estimates of GY of individual trials ranged from 0.37 to 0.87 under OGC, 0.66 to 0.83 under managed drought stress (MDS), 0.72 to 0.76 under HS and 0.51 to 0.61 under DSHS (Supplementary Table S1). The combined ANOVA of the 36 early maturing maize accessions evaluated under each of the stress conditions showed highly significant (*p* < 0.001) mean squares for GY and other measured traits for year, genotypes, and genotype × year interactions. Under OGC, year, and genotypes had significant effects (*p* < 0.05 or *p* < 0.001) on all measured traits except anthesis-silking interval (ASI) and husk cover (HC) (Table 2). Results of combined analysis across two years under MDS revealed that the effects of environments and genotypes were significant (*p* < 0.05 or *p* < 0.001) for all traits except environmental effects on GY and plant aspect (PASP) (Table 2). Similarly, under HS and DSHS, the combined ANOVA showed that year and genotypes and their interaction were significant (p < 0.005 or 0.001) for GY and most other traits (Table 3). Under each evaluation condition, the genotypes contributed the largest percentage to total sum of squares when compared with other sources of variation. Consequently, repeatability values of majority of the traits were high (>0.60).


#### *Plants* **2019**, *8*, 518

characteristic;

 RL: Root lodging; SL: Stalk lodging; TB: Tassel blast; LF: Leaf firing.


Anthesis-silking interval; PLHT: Plant height; EHT: Ear height; HC: Husk cover; EPP: Ears per plant; PASP: Plant aspect; EASP: EAROT: Ear rot; EASP:characteristic; RL: Root lodging; SL: Stalk lodging; TB: Tassel blast; LF: Leaf firing.

#### *Plants* **2019** , *8*, 518

#### *2.3. Phenotypic and Genetic Correlations between Treatments, and Trait Associations*

The genetic correlations between GY under HS and DSHS was very high and positive (0.94) whereas that between OGC and, HS and MDS were moderate and positive, 0.61 and 0.65, respectively. Genetic correlations observed between GY of the other treatments were very weak, ranging from −0.01 between MDS and DSHS to 0.29 for OGC and DSHS (Table 4). A similar trend was observed for the phenotypic correlations between GY of the treatments (Table 4). The phenotypic correlation between flowering traits [days to anthesis (AD), days to silking (SD) and ASI] of the different treatments were positive and ranged from very low (0.07) between ASI under MDS and HS to very high (0.84) between AD under HS and DSHS (Supplementary Table S2). The association between majority of the other traits including GY were highly significant (*p* < 0.001) and positive for HS vs. DSHS.

**Table 4.** Genetic correlations (down-diagonal) and phenotypic correlations (top-diagonal) between grain yield of 36 maize accessions under optimal growing conditions, managed drought stress, heat stress and combined drought and heat stress conditions.


\*\*, \*\*\* Significant at 0.01, and 0.001, respectively.

#### *2.4. Step-Wise Multiple Regression and Path Coe*ffi*cient Analyses*

Under MDS, the step-wise multiple regression analysis identified ear aspect (EASP) and PASP as traits with significant direct effects on GY, accounting for approximately 91% of the total variation in GY (Figure 2). of these two traits, EASP had the highest direct effect (−0.62) on yield, while PASP had the lowest direct effect on GY (−0.36). There were also two traits in the second order, including ears per plant (EPP) and ASI, each contributed indirectly to GY through EASP and PASP. Traits classified in the third order included stay green characteristic (SG), HC, SD, and plant height (PLHT). The indirect contribution of the remaining four third-order traits to yield through the second-order traits are clearly shown in Figure 2. Four traits, root lodging (RL), ear rot (EAROT), stalk lodging (SL) and ear height (EHT) were identified as the fourth-order traits with significant indirect effect on GY. While EAROT and SL contributed indirectly to GY through only PLHT and RL, EHT had indirect effects through two or more traits.

Across the HS trials, three traits (EASP, PASP and RL) were identified by the step-wise multiple regression in the first order, which explained about 86% of the total variation in GY (Figure 3). EASP had the highest direct effect on GY (−0.56), while RL was the least direct contributor to GY (−0.15). Four traits (EPP, HC, PLHT and AD) were categorized into second- order traits, each contributing indirectly to GY through one or two first order traits. The traits grouped in the third order included SD, ASI, EAROT, SG, tassel blast (TB) and EHT. Of these, ASI and EAROT had significant negative indirect effects on GY through EPP and AD. There were no fourth-order traits.

Under DSHS, only two traits, EASP and EPP were categorized as first order traits accounting for about 89% of the total variation in GY (Figure 4). While the contribution of EASP to variation in GY was negative (−0.52), that of EPP was positive (0.46). Two traits, namely SD and HC, were classified as second order traits, with each contributing indirectly to GY through the two first order traits. The traits classified into third order were AD, ASI, PASP and TB, all of which had positive indirect effects on GY. Five traits, EHT, SG, RL, SL and leaf firing (LF) were classified as fourth-order traits, each contributing to GY through two or more of the third order traits. Plant height was the only trait classified into the fifth order.

**Figure 2.** Path analysis diagram depicting the causal relationship of measured traits of the 36 maize accessions under managed drought stressed conditions. Note: Value written in bold is the error effects; the direct path coefficients are values in parenthesis and other values are correlation coefficients. R1 is error effects, R2 = coefficient of determination.

**Figure 3.** Path analysis diagram depicting the causal relationship of measured traits of the 36 maize accessions under heat stressed conditions. Note: Value written in bold is the error effects; the direct path coefficients are values in parenthesis and other values are correlation coefficients. R1 is error effects, R2 = coefficient of determination.

#### *2.5. Abiotic Stress Strongly A*ff*ected Traits and Reduced Grain Yield Levels*

Under OGC, GY averaged 3205 kg/ha (Supplementary Table S3). Grain yield was reduced to 1744 kg/ha under MDS, 1443 kg/ha under HS and to 1088 kg/ha under DSHS. Anthesis was, on average, reached at 55 days under OGC and MDS and about 60 days under HS (67) and DSHS (65). As a result

of reduced water availability, ASI averaged five days under MDS. Mean ASI under HS and DSHS, was three days, similar to that recorded under OGC (2 days). Plant height was reduced by approximately 25% under MDS, 14% under HS and by 7% under DSHS. Husk cover was less affected by stress conditions as indicated by the average rating value of 4 under the different evaluation conditions. A similar trend was observed for SG and PASP. Ear aspect averaged five under OGC and MDS, and six under HS and DSHS (Supplementary Table S3). While the number of ears per plant averaged 0.81, 0.57, and 0.53 under OGC, MDS, and HS, respectively, more barren plants, reflected as reduced EPP (0.40) were observed under DSHS. As a direct consequence, average yield reduction was higher under DSHS (66.04%) compared to MDS (45.60%) and HS (54.98%).

**Figure 4.** Path analysis diagram depicting the causal relationship of measured traits of the 36 maize accessions under combined drought and heat stressed conditions. Note: Value written in bold is the error effects; the direct path coefficients are values in parenthesis and other values are correlation coefficients. R1 is error effects, R2 = coefficient of determination.

#### *2.6. Germplasm Tolerant to Abiotic Stresses*

The means of GY of top 11 accessions (top 10 landraces and best check) and worst 5 landraces identified using the base index under DS, HS and DSHS are presented in Table 5. Under MDS, the index values varied from −13.66 for TZm-1441 to 14.98 for GH-3505. Top 10 landraces with positive index values yielded above 2000 kg/ha under both MDS and OGC, except TZm-1473 that yielded less (1952 kg/ha) under MDS. Nine of the top 10 performing landraces under MDS (GH-3505, TZm-1317, TZm-1307, GH-4859, GH-5756, TZm-1273, TZm-1353, TZm-1312 and TZm-1311) had a yield advantage of between 4% to 42% over the best check (Aburohemaa). Under HS, the index values ranged from −8.04 for TZm-1319 to 13.53 for Check 1 (2011 TZE-W DT STR Synthetic), whereas under DSHS, it varied from −10.85 for TZm-1277 to 13.66 for Check 1 (2011 TZE-W DT STR Synthetic). Based on index selection, the outstanding landraces under HS included GH-4859, TZm-1353, TZm-1488, TZm-1441, TZm-1466, TZm-1473, TZm-1309, TZm-1325 and TZm-1317). Of these, TZm-1353 out-yielded the best check (2011 TZE-W DT STR Synthetic) by approximately 20% while GH-4859 produced yield comparable to 2011 TZE-W DT STR Synthetic. Under DSHS, the top 10 landraces identified by the base index included GH-4859, TZm-1473, TZm-1325, TZm-1441, TZm-1466, TZm-1273, TZm-1551, TZm-1452 and TZm-1353. Two landrace accessions (GH-4859 and TZm-1353) combined high yield potential with desirable secondary traits (reduced ASI, TB and LF, and increased PHLT, SG, and EPP as well as good PASP and EASP).


*Plants* **2019** , *8*, 518


**Table 5.** *Cont*.

characteristics;

 EASP: Ear aspect; YR: Yield reduction; BI: base index. Check 1: 2011 TZE-W DT STR Synthetic; Check 2: Aburohemaa.

#### *2.7. Grouping of Accessions under Abiotic Stresses*

Phylogenetic constellation plots generated from the standardized data of grain yield, plant and ear aspect scores, anthesis-silking interval, ears per plant and stay green characteristics under MDS, HS and DSHS are presented in Figure 5. Under MDS, the accessions were classified into five major groups. The number of accessions in the clusters ranged from one in cluster I to 11 in clusters II and III. The accessions of clusters I, II and III were characterized by increased ears per plant, delayed senescence, and desirable plant and ear aspects that resulted in positive base index values (Supplementary Table S4). Under HS, the accessions were grouped into two clusters, each consisting of four sub-clusters (Figure 5). The first major cluster consisted of 15 accessions, which included the three checks. Majority of the accessions in this group had high grain yields, increased ears per plant, shorter ASI and good ear aspect (Supplementary Table S4). Consequently, the base index was positive for this cluster. The second major cluster consisted of 21 landraces that were largely barren with poor ear aspect scores and low grain yield, resulting in negative base index (BI) values averaging −3.79. Similarly, under DSHS, the 36 maize accessions were clustered into three major groups (Figure 5). Cluster I consisted of 12 accessions, which included two checks whereas clusters II and III were represented by 14 and 10 accessions, respectively. Most of the accessions in cluster I were characterized by increased ears per plant, good ear aspect, high grain yield and positive base index values. In contrast, the accessions of clusters II and III were generally barren with poor ear aspect scores, low yield resulting in negative base index values (Supplementary Table S4).

**Figure 5.** Phylogenetic constellation plots displaying the relationships between 33 maize landraces and three improved populations/varieties evaluated under managed drought stress (**left**), heat stress (**middle**) and combined drought and heat stress (**right**). Cluster I, II and III (**left**), and I (**middle** and **right**) are represented by drought, heat and combined drought and heat-tolerant accessions, respectively while the remaining clusters consisted of susceptible accessions.

#### **3. Discussion**

Under the prevailing and future conditions of climate change, DS, HS and DSHS stresses constitute the most important environmental factors constraining maize production in SSA [2,3]. Results of climate simulation studies showed that these stresses will most likely coincide with the flowering and grain filling stages of maize growth and development in large areas of SSA [3,12] and will result in huge yield losses on farmer's field. The value of landraces as potential donors of beneficial alleles for breeding for climate change resilience is well-recognized [14]. Our study aimed at identifying landraces with good levels of tolerance to DS, HS and DSHS for use in maize breeding programs as potential sources of beneficial alleles for developing cultivars with enhanced resilience to climate change as well as to identify key stress-adaptive secondary traits that could lead to genetic enhancement for grain yield under DS, HS and DSHS stressed environments.

The sites selected for this study were previously used for screening maize genotypes for high levels of tolerance to DS and/or HS for climate change adaptation [4,6,7]. As shown in Table 1, the HS trials were exposed to elevated temperatures, while the DSHS trials were subjected to prolonged DS at elevated temperatures. In particular, temperatures during the reproductive stages highly exceeded the optimal threshold value of 34 ◦C for lowland tropical maize (Figure 1), indicating that the environments used for this study were appropriate for screening the maize germplasm for tolerance to HS and DSHS.

The significant mean squares observed for GY and most other measured traits under each evaluation condition indicated that substantial genetic variation existed among the accessions, which should facilitate selection for DS, HS and DSHS tolerance and key secondary traits conferring tolerance under the research conditions. These observations corroborated the results of Gouesnard et al. [17] who suggested the presence of significant genetic variability for tolerance to abiotic stresses in tropical maize landraces. Moreover, the presence of significant means squares of genotype by environment interaction for most of the traits indicated that the accessions varied in their responses to the stresses of the different years. These findings are consistent with the results of Meseka et al. [7].

Although broad-sense heritability estimates of GY of single trials under the stresses were moderately high, these results were consistent with previous studies of maize under abiotic stress [3,6,18]. However, as indicated by Cairns et al. [3], broad-sense heritability values of single trials can be inflated because genetic variance and genotype × trial interaction variance are confounded. The high repeatability values (60%) observed for most measured traits including GY under the stressed and non-stressed environments suggested consistency in the expression of the traits under the research conditions. These results largely provided a good indication of the performance of the accessions for breeding purposes. Similar observations were reported in maize under multiple stresses [3,19].

Grain yield observed under OGC was to some extent predictive of GY under both MDS and HS conditions as indicated by the moderately positive genetic and phenotypic correlations. These observations were in agreement with the results of previous studies on abiotic stress in maize [3,7,19,20]. The GY under OGC and DSHS had weak positive genetic and phenotypic correlations (0.29 and 0.23, respectively), suggesting the presence of independent genetic factors controlling yield potential in the two conditions. These results are consistent with the findings of earlier studies by Cairns et al. [3]. The lack of both genetic and phenotypic correlations between GY under MDS and HS as well as between MDS and HS indicated that tolerance to these stresses were modulated by different genetic mechanisms. These results are in agreement with the findings of earlier studies [3,7,18], who found that tolerance to DS was genetically distinct from tolerance to DSHS. In contrast to the results of Cairns et al. [3] however, GY observed under HS and DSHS was positive and moderately high at both the phenotypic and genetic levels. This result was in agreement with the findings of Tandzi et al. [21] who demonstrated that HS tolerant maize genotypes were most likely to be tolerant to DSHS conditions. Furthermore, the significant phenotypic correlations between the same traits under the different research conditions indicated the existence of common genetic elements governing the expression of the measured traits under the different research conditions.

The stresses applied in this study had significant negative impacts on GY and other relevant traits. This result is in agreement with the reports of Cairns et al. [3] and Trachsel et al. [18]. In particular, while the days to anthesis was on the average unaffected under MDS, silking was significantly affected, resulting in delayed ASI that affected asynchrony between male and female flowers and eventually led to reduced ears per plant. Days to anthesis was on average delayed by 10 days under DSHS and by 12 days under HS whereas, silking was delayed by 13 and 11 days under HS and DSHS, respectively. These observations could be attributed to delayed emergence owing to the severe cold due to harmattan at the time of planting. The moderate reductions in plant heights observed under

HS and DSHS indicated that the plants were only affected by these stresses towards the end of the vegetative phase. Similarly, the lower average plant height observed under MDS could be attributed to the incidence of the drought stress at an early stage of plant growth and development. Indeed, the MDS was imposed during the early growth stages (25 DAP) compared to the HS and DSHS, which were imposed at 32 DAP. Traits such as EPP and EASP were most negatively affected under HS and DSHS compared to MDS. Consequently, average yield reduction under MDS was lower than that observed under HS and DSHS. Comparison of the yield loss under MDS and HS to DSHS revealed that the latter had hypo-additive effects (i.e., the effect of combined stress was higher than the individual effects but lower than their sum). These results were most probably due to the interaction of HS and DS on stomatal movements. Plants have to either close their stomata under DSHS at elevated temperature to prevent water loss or keep stomata opened to cool the leaves through transpiration [22]. Alterations in these morpho-physiological mechanisms under DSHS might have caused osmotic imbalances, resulting in the huge yield losses. These results are in agreement with the findings of previous workers who reported higher yield losses from the combined effects of DS and HS than DS and HS applied alone [7,13].

Selection based on grain yield alone under DS, HS and DSHS often limits selection gains because of its complex nature and the low heritability of the trait under stress [23]. Consequently, secondary traits that are highly heritable and are associated with GY have been widely used as selection criteria for improved yield potential under abiotic stresses. In maize for instance, significant genetic gains were reported under abiotic stresses such as low nitrogen and DS by complementing GY with key secondary traits [24]. In particular, ASI, senescence, tassel blast, ears per plant, kernel per row, tassel sterility, pollen viability, and stigma receptivity and other morpho-physiological traits, such as leaf firing were identified as key secondary traits associated with GY under DS, HS and DSHS [15,16]. In the present study, sequential multiple regression analyses revealed both ear and plant aspects, and to some extent, ears per plant and root lodging as the major determinants of yield potential, accounting for more than 85% of the observable differences in grain yield levels under the stresses. The repeatability values of these traits were moderately high. Therefore, both ear and plant aspects, ears per plant and root lodging have the potential to improve the selection efficiency for GY under the target stresses. These results partly corroborated the findings of Meseka et al. [7] who reported ear and plant aspects as well as ears per plant as the principal yield determinant traits under DS and DSHS.

An important objective of the present study was to identify landraces with outstanding performance under each of the stresses for use in maize breeding programs as sources of tolerance to the stress factors. The outstanding landraces identified by the base index under each treatment, could be invaluable sources of beneficial genes/alleles for improving DS and/or HS tolerance in elite maize germplasm. In particular, GH-3505 yielded approximately 4 tons ha−<sup>1</sup> under MDS, making it interesting for use in drought-prone areas. Moreover, the accessions that yielded above the best-improved checks under each research condition should be prioritized for introgression into breeding pipelines. Of special interest for breeders will be accessions GH-4859 and TZm-1353, which combined desirable secondary traits with good levels of tolerance (positive base index values) to all the applied stresses. The fact that only two accessions were tolerant to all the target stresses was most likely due to the fact that different genetic mechanisms underlie tolerance to the three stresses applied in the present study. Furthermore, cluster analysis based on phylogenetic constellation plots largely separated tolerant accessions from their susceptible counterparts indicating that the two classes of accessions were genetically dissimilar. Such clustering patterns of maize genotypes were previously reported under DS and DSHS [7] as well as DS, HS and DSHS [21].

#### **4. Materials and Methods**

#### *4.1. Genetic Materials*

One hundred and ninety-six (196) maize landraces, representing gene pools from Burkina Faso, Ghana and Togo, were sourced from germplasm banks at International Institute of Tropical Agriculture (IITA), Nigeria and the Plant Genetics Resources Institute of Ghana. The germplasm was evaluated for phenotypic variation during the main growing season in 2017 and 2018 [25]. Thirty-three landraces, comprising six each from Burkina Faso and Ghana, and 21 from Togo, were used for this study. The landraces were selected based on the expression of the adaptive traits such as shortness and early flowering under OGC [25]. Three improved populations/varieties with enhanced tolerance to DS and/or HS, which served as checks were obtained from the Maize Improvement Program at IITA (IITA-MIP), Ibadan, Nigeria. The list of the genetic materials evaluated in this study is presented in Table 6.

**Table 6.** List of the 36 maize accessions that were evaluated for tolerance to drought, heat and combined drought and heat stress between 2017 and 2019 at Ikenne and Kadawa, Nigeria.


IITA-MIP: Maize Improvement Program at the International Institute of Tropical Agriculture.

#### *4.2. Trial Establishment and Agronomic Management*

The 36 maize accessions (33 landraces plus 3 DS and/or HS tolerant populations/varieties) were evaluated for two years under OGC, MDS, HS and DSHS. The genetic materials, experimental design, and replications were the same for all evaluation conditions. The trials were arranged in 6 by 6-alpha lattice design (incomplete design) with two replications. A plot consisted of one row, 3 m long. Rows were spaced 0.75 m apart and the distance between hills was 0.40 m. Three seeds were planted in a hill and thinned to two per stand at two weeks after planting (WAP), resulting in a final plant population density of 66,666 plants ha<sup>−</sup>1. Pre-emergence weeds were controlled by applying gramoxone and atrazine at the rates of 1.5 L gramoxone and 2.5 L atrazine in 200 L of water ha-1. Subsequently, manual weeding was done to keep trials free from weeds.

In the first experiment, the accessions were evaluated at Ikenne (6◦53 N, 3◦42 E, 60 m altitude, 1200 mm annual rainfall), Nigeria, under MDS during the dry seasons of 2017/2018 and 2018/2019. The soil type at Ikenne is Eutric nitrisol [26]. The MDS at Ikenne was achieved by using a sprinkler irrigation system that provided 17 mm of water per week up to 25 days after planting (DAP). Thereafter,

the irrigation was withheld until maturity, so that the maize plants depended on stored water in the soil for growth and development.

In the second experiment, the accessions were evaluated for tolerance to HS and DSHS at Kadawa (11◦39 N, 8◦27 E, 500 m altitude), Nigeria, where drought stress at elevated temperature occur between February and June. The soil type at Kadawa is characterized as Regosols, with mainly sandy to clay loam texture [27]. At Kadawa, air temperature during the dry season often range from 33 to 45 ◦C [7]. This enabled establishment of trials under HS and DSHS, in which water supply was carefully controlled by a furrow irrigation system. The trials under HS and DSHS were laid in adjacent blocks, separated by 15 m to prevent spill-over of irrigation water. The trials were planted on the same day in mid-February each year, and flowering and grain-filling stages occurred in April when rainfall incidence was negligible and temperatures ranged from 36 and 41 ◦C, allowing for exposure of the accessions to HS and DSHS (Figure 1). A furrow irrigation system was used to supply water to the crop every four days during the first 32 DAP. Thereafter, irrigation water was withdrawn from the DSHS block while the HS block continued to receive irrigation water until physiological maturity. Irrigation was resumed one week after grain filling and was applied only once per week to avoid complete loss of DSHS trials. Meteorological data were recorded during the trial period with the aid of an automated weather station.

The same set of genetic materials were initially grown under OGC during the main growing seasons of 2017 and 2018 at Ikenne and Mokwa (9◦18 N, 5◦185 4 E, altitude 457 m, 1100 mm annual rainfall), Nigeria. The soil at Mokwa is luvisol with 0.27, 0.035 and 0.48% organic C, organic N and P content [26].

#### *4.3. Traits Measured*

Data were recorded on several traits at flowering including the number of days from planting to when 50% of the plants in a plot were shedding pollen and had emerged silks under DS, HS, DSHS and OGC, respectively. Anthesis-silking interval (ASI) was computed as the difference between days to 50% silking and anthesis. In addition, the number of plants showing leaf firing and tassel blasting were counted on plot basis at the flowering and grain filling stages on HS and DSHS trials and converted to percentages. Plant and ear heights (PLHT and EHT) were measured as the distance from the base of the plant to the height of the first tassel branch and the node bearing the upper ear, respectively. Root lodging (SL) (percentage of plants leaning more than 30◦ from the vertical), and stalk lodging (SL) (percentage broken at or below the highest ear node) were recorded as taking a few days to harvest. At 70 DAP, Plant Aspect (PASP) was scored based on the general architectural appeal of plants in a plot (standability, vigour, plant and ear height, uniformity of plants, ear placement and size, as well as disease damage and lodging) using a scale of 1 to 9, where 1 = excellent overall phenotypic appeal; 2 = very good overall phenotypic appeal; 3 = good overall phenotypic appeal; 4 = satisfactory overall phenotypic appeal; 5 = acceptable phenotypic appeal; 6 = undesirable phenotypic appeal, 7 = poor overall phenotypic appeal, 8 = very poor phenotypic appeal and 9 = completely undesirable phenotypic appeal. Similarly, husk cover (HC) was rated on a scale of 1 to 9, where 1 = husks tightly arranged and extended beyond the ear tip and 9 = exposed ears. In addition, stay green characteristic (SG) was recorded for the MDS, HS and DSHS plots at 70 DAP on a scale of 1 to 9, where 1 = 10% dead leaf area; 2 = 20% dead leaf area; 3 = 30% dead leaf area, 4 = 40% dead leaf area; 5 = 50% dead leaf area; 6 = 60% dead leaf area; 7 = 70% dead leaf area; 8 = 80% dead leaf area and 9 represented 90%–100% dead leaf area. At harvest, number of ears per plant (EPP) was obtained by dividing the total number of ears per plot by the total number of plants harvested. Ear aspect (EASP) was recorded based on general appeal of the ears without the husks (ear size and number; uniformity of size, colour and texture; extent of grain filling, insect and disease damage) using a scale of 1 to 9, where 1 = excellent (clean, uniform, large, and fully filled ears with no disease/insect damage); 2 = very good ears with no disease/insect damage and fully filled grains, one or two irregularity in cob size; 3 = good with no disease/insect damage and fully filled grains, one or two irregularity in cob size; 4= mild insect

damage, no disease, fully filled grains, one or two irregularity in cob size poor; 5= mild disease/insect damage and fully filled grains, one or two irregularity in cob size, 6 = severe disease/insect damage and fully filled grains, smaller cobs, non-uniform cob size; 7 = severe disease/insect damage, scanty grain filling, few ears, non-uniformity of cobs; 8= severe disease/insect damage, scanty grain filling, very few ears and 9 = only one or no ears. In the stressed trials, harvested ears from each plot were shelled to determine the percentage grain moisture. Grain yield in kg ha–1 was computed from the shelled grain weight and adjusted to 15% moisture content, whereas in the experiments under OGC, harvested ears of each plot were weighed and GY was estimated based on 80% shelling percentage and adjusted to 15% moisture content.

#### *4.4. Data Analysis*

Data from each evaluation condition (OGC, MDS, HS and DSHS) was analysed separately. Variance components were estimated using the mixed model procedure in SAS version 9.4 [28] based on the standard linear mixed model described by Vargas et al. [29]. All effects except for genotypes were considered random and the Best Linear Unbiased Predictor (BLUP) was estimated for all measured traits following the procedure of Robinson [30].

Broad-sense heritability (H2) of GY of each trial was estimated using META-SAS v4 [29]. Within treatment repeatability of the traits [31] were calculated on accession-mean basis using the following formula:

$$\mathcal{R} = \frac{\sigma\_{\mathcal{S}}^2}{\sigma\_{\mathcal{S}}^2 + \frac{\sigma\_{\mathcal{S}^c}^2}{c} + \frac{\sigma\_{\mathcal{S}}}{\pi}}$$

where σ<sup>2</sup> *<sup>g</sup>* is the genotypic variance, σ<sup>2</sup> *ge* is the variance of genotype × environment, σ*<sup>e</sup>* is the residual variance; *e* is the number of environments, and *r* is the number of replicates per environment.

The phenotypic and genetic correlations between GY of the different treatments were computed following the procedure of Cooper et al. [32] as:

$$\rho\_{\S} = \frac{\overline{\sigma\_{\S(jj')}}}{\overline{\sigma\_{\S(j)} \sigma\_{\S(j')}}}$$

where ρ*<sup>g</sup>* is the genetic correlation, σ*g*(*jj*) is the arithmetic mean of all pairwise genotypic covariances between environment *j* (say, optimum) and *j* (say, drought), and σ*g*(*j*)σ*g*(*<sup>j</sup>* ) is the arithmetic average of all pairwise geometric means between the genotypic variance components of the environment.

The sequential path coefficient analyses [33] was performed using the Statistical Package for Social Sciences, SPSS version 17.0 [34] to determine the secondary traits with significant contributions to GY under each research condition. A step-wise regression analysis was used to categorize the predictor traits into first, second and third order based on their individual contributions to total variation in GY with minimized multicollinearity [35]. The first step involved the regression of all the traits on GY and those with significant contributions to GY at *p* < 0.05 were identified as first order traits. Subsequently, traits that were not identified as first order traits were regressed on each of the first order traits to identify those with significant contributions to GY through each of the first order traits and were categorized as second-order traits. The procedure was repeated to identify traits in subsequent orders. The path coefficients were represented by the standardized b-values obtained from the regression analysis [33,35,36].

A base index (BI) that integrated superior grain yield, EPP, ASI, PASP, EASP, and SG was used to select the best and worst performing genotypes under each treatment [37]. Each trait was first standardized with standard deviation of 1 and a mean of zero to minimize the effect of the different scales prior to integrating into the BI. The BI was computed using the equation:

$$BI = \lfloor (2 \times \text{YLD}\_S) + EPP - ASI - PASP - EASP - SG \rfloor$$

where *YLDS* is GY under stress, *PASP* is plant aspect, *EASP* is ear aspect, *EPP* is ears per plant, *ASI* is anthesis-silking interval and *SG* is the stay-green characteristic. A positive *BI* value indicated tolerance to the applied stress while negative values indicated susceptibility [37].

Subsequently, all traits included in the *BI* were used as an input file to generate a phylogenetic constellation plot, which classified the accessions into genetic groups. The phylogenetic constellation plot was performed using JMP pro 14.10 [28] based on Ward's algorithm.

#### **5. Conclusions**

In conclusion, highly significant differences were observed among the accessions under each of the evaluation conditions thus, enabling identification of donors with good levels of tolerance to DS, HS and DSHS. Introgression of these outstanding donors into breeding materials will help maximize genetic gains under the stress conditions. In particular, accession GH-3505 was highly tolerant to DS while GH-4859 and TZm-1353 combined desirable secondary traits with good levels of tolerance to all the stresses. These accessions can immediately be used in tropical maize breeding programs to develop cultivars with enhanced tolerance to abiotic stress to sustain production and thus, food security in the face of climate change. The lack of correlation between the stress treatments and the poor overlap of genotypes performing well across all the applied treatments indicated that tolerance to these stresses were under independent genetic control, thus emphasizing the need to screen germplasm under each abiotic stress separately. However, the results also, showed that heat-tolerant accessions were most likely to tolerate combined drought and heat stress conditions. Identifying the genomic regions potentially underlying tolerance and the gene action controlling the stress-adaptive traits might further facilitate the introgression of these valuable landraces into breeding pipelines.

**Supplementary Materials:** The following are available online at http://www.mdpi.com/2223-7747/8/11/518/s1, Table S1: Genotypic and residual variance, and broad-sense heritability estimates of grain yield (kg/ha) of the nine individual trials. Table S2: Association of traits measured under optimal growing conditions with same traits under managed drought stress, heat stress and combined drought and heat stress, and those measured under managed drought stress with same traits under heat stress and combined drought and heat stress as well as with those measured under heat stress with the same traits under combined drought and heat stress. Table S3: Mean grain yield and other traits of 36 maize accessions evaluated under optimal growing conditions, managed drought stress, heat stress and combined drought and heat stress between 2017 and 2019, in Nigeria. Table S4: Cluster means (base index values and other secondary traits) of 36 maize accessions evaluated under optimal growing conditions, managed drought stress, heat stress and combined drought and heat stress between 2017 and 2019, in Nigeria.

**Author Contributions:** Conceptualization: A.S.P.N. and A.Y.T.; Methodology: C.N. and B.B.-A.; Supervision: B.B.-A., A.S.P.N. and A.Y.T.; Data analysis and manuscript draft: C.N.; Manuscript review and editing, B.B.-A., A.Y.T. and A.S.P.N.

**Funding:** This research was funded by the German Federal Ministry of Education, through the West African Science Service Centre on Climate Change and Adapted Land-use (WASCAL). This work was also supported by the Bill & Melinda Gates Foundation [OPP1134248] through the funding support to the Stress Tolerant Maize for Africa (STMA) Project.

**Acknowledgments:** We are grateful to the Genetic Resource Centre in IITA and the Plant Genetics Resources Institute of Ghana for providing maize accessions used in this study.

**Conflicts of Interest:** The authors declare no conflicts of interest.

#### **References**


© 2019 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).

*Article*

### **Genetic Diversity, Nitrogen Fixation, and Water Use E**ffi**ciency in a Panel of Honduran Common Bean (***Phaseolus vulgaris* **L.) Landraces and Modern Genotypes**

**Jennifer Wilker 1, Sally Humphries 2, Juan Carlos Rosas-Sotomayor 3, Marvin Gómez Cerna 4, Davoud Torkamaneh 1, Michelle Edwards 1, Alireza Navabi 1,**† **and K. Peter Pauls 1,\***


Received: 31 July 2020; Accepted: 11 September 2020; Published: 19 September 2020

**Abstract:** Common bean (*Phaseolus vulgaris* L.) provides critical nutrition and a livelihood for millions of smallholder farmers worldwide. Beans engage in symbiotic nitrogen fixation (SNF) with Rhizobia. Honduran hillside farmers farm marginal land and utilize few production inputs; therefore, bean varieties with high SNF capacity and environmental resiliency would be of benefit to them. We explored the diversity for SNF, agronomic traits, and water use efficiency (WUE) among 70 Honduran landrace, participatory bred (PPB), and conventionally bred bean varieties (HON panel) and 6 North American check varieties in 3 low-N field trials in Ontario, Canada and Honduras. Genetic diversity was measured with a 6K single nucleotide polymorphism (SNP) array, and phenotyping for agronomic, SNF, and WUE traits was carried out. STRUCTURE analysis revealed two subpopulations with admixture between the subpopulations. Nucleotide diversity was greater in the landraces than the PPB varieties across the genome, and multiple genomic regions were identified where population genetic differentiation between the landraces and PPB varieties was evident. Significant differences were found between varieties and breeding categories for agronomic traits, SNF, and WUE. Landraces had above average SNF capacity, conventional varieties showed higher yields, and PPB varieties performed well for WUE. Varieties with the best SNF capacity could be used in further participatory breeding efforts.

**Keywords:** nitrogen fixation; symbiosis; bean; landrace; PPB; participatory breeding; climate resilient; Honduras

#### **1. Introduction**

The common bean (*Phaseolus vulgaris* L.) is the most important food legume grown and consumed worldwide. High in protein, fiber, and essential nutrients, the nutritional profile and affordability of beans relative to other protein sources make beans a dietary staple in developing economies.

A member of the Fabaceae family, common bean is a predominantly self-pollinating species with a genome size of 587 Mbp and ploidy of 2n = 2x = 22 [1]. The center of origin for common bean is in

present-day Central Mexico [2]. As a result of geographical dispersion, the ancestral bean diverged and evolved into two domesticated gene pools. Larger-seeded market classes evolved and were domesticated in South America and belong to the Andean gene pool, while smaller-seeded market classes evolved and were domesticated in Central America and belong to the Middle American gene pool [2–7].

Beans are traditionally grown as monocrops or with maize in either a relay cropping or an intercropping system in Honduras. There are two growing seasons—the rainy *Primera* (May–September) and the traditionally dry *Postrera* (October–January)—although climate change causes more fluctuation in precipitation levels and the duration of these seasons. Food insecurity is an issue in hillside communities and is a particular problem during the summer months before the *Primera* harvest. In some locations, this hungry period is termed *los junios* after the month during which food becomes scarce. More than 50% of bean production in Honduras takes place on steep hillside slopes (15–30◦ and greater) [8]. In addition, the country's infrastructure is poor, and less than 30% of the bean growing area is located within an hour's travel to a road, thus restricting market access [8].

Bean production in Honduras is affected by various biotic and abiotic stresses, and productivity is low and averaged 785 kg ha−<sup>1</sup> in 2018 compared to yields in Canada, which averaged 2888 kg ha−<sup>1</sup> in 2018 [9]. Bean diseases and insect pests comprise the primary biotic stresses of Honduran bean production. The most impactful diseases are Bean Golden Mosaic Virus (BGMV), rust (caused by *Uromyces phaseoli*), web blight (caused by *Rhizoctonia solani*), anthracnose (caused by *Colletotrichum lindemuthianum*), and angular leaf spot (ALS; caused by *Pseudoscercospora griseola*) [8]. The whitefly (*Bemisia tabaci*) is the vector for BGMV and is the most important insect pest of bean in Honduras. Weevils (*Acanthoscelides obtectus* and *Zabrotes subfasciathus*) are serious pests of stored beans, reducing marketability and damaging seed for planting. Climate change is expected to affect the impact of these biotic stresses in bean production and may lead to a shift in the complex of pests and diseases involved [10].

Extreme weather, such as high temperatures and flooding, including from hurricanes, reduces bean production. Climate models for Honduras predict higher temperatures and reduced overall rainfall but more extreme weather events increasing floods in the coming decades [11]. Another abiotic stress impacting bean production is soil health. Soils across Central America are deficient in available phosphorus (P), nitrogen (N), calcium (Ca), and organic matter, and aluminum (Al) and manganese (Mn) toxicity are exacerbated by low soil pH levels. Bean productivity is limited by soil nutrient availability, particularly N and P [12]. Nitrogen deficiency reduces grain yield because N is a structural component of various essential molecules, including chlorophyll, amino acids, nucleic acids, and lipids, required for the production of storage carbohydrates and proteins. Soils can be supplemented with nitrogen fertilizer throughout the growing season to avert yield losses; however, synthetic amendments are expensive, difficult to access, and generally not used by bean growers in Honduras. Instead, bean growers rely on organic forms of N, including that derived through symbiotic nitrogen fixation (SNF), as a nutrient source for their crop.

Beans are capable of generating their own organic nitrogen through SNF where nitrogen-fixing bacteria infect root nodules and reduce atmospheric nitrogen into forms useable by the host plant in exchange for carbohydrates [13]. SNF is a complex biological process and its efficiency is impacted by abiotic, biotic, and genetic factors, including soil nutrient levels, environmental conditions, the presence of efficient *Rhizobium* strains, and genetic constitution of the crop grown [12]. Recent studies have confirmed that SNF capacity in beans has a wide range and can reach high, yield-sustaining levels under optimal conditions [14–22]. For example, Kamfwa et al. [14] reported a range of 3.6 to 98.2% nitrogen derived from the atmosphere, and Aserse et al. [20] found that inoculated beans had comparable yields to those grown with nitrogen fertilizer.

Smallholder hillside farmers (0.5–5 ha) comprise approximately 70% of Honduran bean growers, and the remainder of production occurs in foothill and valley regions by larger-scale producers. Hillside farmers cultivate marginal land with steep slopes and low soil fertility, they tend towards

subsistence production, and produce primarily for household consumption following traditional practices and planting traditional crop varieties. These smallholder farmers have limited access to markets, which has a two-fold impact, reducing the influence of market demands on what growers produce and limiting access to modern bean varieties and production inputs. These constraints notwithstanding, hillside farmers market approximately 50% of their bean harvest.

Landraces, known locally as *criollos*, comprise the majority of varieties traditionally grown by hillside farmers in Honduras [23]. Ninety-five percent of beans produced across the country are small, light red beans, which are preferred by Hondurans [8], and are also exported to El Salvador and to the United States to meet the needs of Central Americans who have emigrated there. Some black beans are grown in Honduras and are primarily exported to neighboring Guatemala where that market class is favored [8]. Landraces have local genetic adaptation, high genetic diversity, and lack formal genetic improvement [24]. The genetic heterogeneity of bean landraces lends resilience and makes them able to adapt to the changeable growing conditions of mountain hillsides and other marginal areas where they are grown. Among the preferred traits of the landraces included in this study are adaptation to cultivation at a range of altitudes, more marketable seed coat color and appealing kinesthetic properties, and yield stability in a changeable climate. In addition to traditional landraces, hillside bean farmers also grow conventionally bred and participatory bred varieties.

Conventional bean breeding in Honduras has been primarily the responsibility of the *Programa de Investigaciones en Frijol (PIF)* in the Department of Agronomy at the *Escuela Agricola Pan-Americana* (Zamorano) since the late 1980s when government funding to the agricultural research department (*Dirección de Ciencia y Tecnología Agropecuaria, DICTA*) was reduced restricting agricultural research and extension services [25]. The International Center for Tropical Agriculture (CIAT), the Bean/Cowpea Collaborative Research Support Program (CRSP), and the Regional Cooperative Bean Program (*Programa Cooperativo Regional de Frijol*-PROFRIJOL) have also been involved in variety development and/or providing funding for bean research in Honduras. Zamorano's early breeding focus was on developing conventional varieties with BGMV resistance and improved heat and drought tolerance for the lowland and valley production regions of Honduras [25]. By the late 1990s, Zamorano took leadership in bean breeding for the region, developing small red varieties for Honduras, Guatemala, El Salvador, Nicaragua, and Costa Rica, as well as black bean varieties for Guatemala and Haiti [26]. Conventionally bred commercial varieties are adapted for cultivation across a wide geographic region and have disease resistance and agronomic traits, which can bolster their yields. Adoption of commercial varieties among hillside growers is limited for a number of reasons, including darker seed coat color and other culinary traits, which reduce their marketability, as well as poor yield performance compared to landraces [23]. In the mid 1990s, Zamorano embarked on collaborative research with social scientists from CIAT to explore the social and economic factors that impact adoption of conventional varieties [23]. As a result, participatory research has become an important method used in the Zamorano bean breeding program.

The term participatory plant breeding encompasses two main methods of plant variety development, 'participatory varietal selection' and 'participatory plant breeding'. Participatory varietal selection (PVS) involves farmers locally testing varieties or advanced breeding lines provided by a formal plant breeder and making selections based on their needs [27]. Participatory plant breeding (PPB) involves farmers locally testing early stage (F2-F3) breeding material and can further involve the farmers actively participating in choosing parents and driving variety development by selecting progeny that meet local needs and preferences. As in other reports on the subject, the term PPB will be used to refer to both PPB and PVS in this study [28,29].

Participatory bean breeding at Zamorano has been facilitated through collaboration with CIAT-initiated *comités de investigación agrícola local* (CIALs), which are village-level farmer research teams that create a space where applied agricultural research can be carried out. For this study, we collaborated with the *Fundación para la Investigación Participativa con Agricultores de Honduras* (FIPAH) and *Programa de Reconstrucción Rural* (PRR). FIPAH supports over 100 CIALs, backstopped by regional offices across the country (https://fipah-hn.org/). PRR is an NGO that works with smallholder farmers in Santa Barbara and Lempira and supports approximately 60 CIALs (https: //www.iaf.gov/grants/honduras/2017-prr/) [30]. CIAL members are trained in the scientific method, and most CIALs focus their research on obtaining higher-yielding and climate-resilient corn and bean varieties. The relationship between the CIALs and the bean breeding program at Zamorano is collegial and formal, responding to the needs of the farmers while the research is carried out with scientific rigor [23]. Traits of interest to the farmers are emphasized, and trials are performed using statistically valid designs and research methods. Landraces, conventional varieties, and germplasm from across the region are used in PPB efforts. PPB generates varieties that combine the local adaptation of landraces with improved traits from conventional genotypes such as disease resistance and higher yields. Other traits that factor into selection by farmer-researchers include seed color, appearance and size, pod length, plant architecture, even ripening, early maturity, and various culinary qualities [23]. Zamorano has developed some PPB varieties using landraces as parents in Honduras, Costa Rica, and Nicaragua [31]. Between 1994 and 2015, 24 PPB varieties were developed by Zamorano in collaboration with CIAL groups using participatory research methods, and one of these varieties, 'PM2-Don Rey', has been supported for national registration [29]. Adoption of the PPBs among CIAL members is above 60%, and PPBs are gaining wide acceptance among other farmers in communities where participatory research is carried out [29]. Extensive discussion of the development of 'Macuzalito', 'Cedron', 'Amilcar', 'Esperanceño', 'Chepe', and 'PM2-Don Rey' (representing both PPV and PPB methods of variety development) can be found in Humphries et al. [29].

Due to limited production resources and the threat of climate change, farmers in remote hillside communities would benefit from growing high-yielding common bean varieties that are climate resilient and have high nitrogen fixing capacity. To examine Honduran bean germplasm for these traits of interest, we curated a panel of Honduran bean genotypes representative of the traditional landraces and the participatory bred varieties grown by hillside bean farmers, as well as Honduran conventional and North American checks. The current study tests the hypothesis that bean landraces are a good source of germplasm with a high capacity for nitrogen fixation. The objectives of this study were to determine the genotypic and phenotypic diversity of the Honduran panel and to identify germplasm sources for breeding improved varieties suited to hillside production in Honduras.

#### **2. Results**

#### *2.1. Analysis of Genetic Relatedness*

Landrace and PPB plant material for the Honduran panel were sourced from six municipalities across west–central Honduras. The majority of genotypes came from Yoro (26) and Francisco Morazán (17), with less than 20 genotypes coming from Intibucá, Santa Bárbara, Comayagua, and Lempira combined (Figure 1). Descriptions of the genotypes can be found in Materials and Methods Section 4.1.

**Figure 1.** Map of west–central Honduras, outlining the six departments from which landrace and participatory bred (PPB) bean genotypes were sourced for the Honduran Panel. The chart at the right describes the number of landraces and PPB genotypes obtained from each department and the market classes to which those genotypes belong. The location of Yorito where the Honduran field trial was carried out, Zamorano where the *Escuela Agrícola Panamericana* is located, as well as the capital of Honduras, Tegucigalpa, are shown.

The genetic structure of the Honduran panel was explored to determine the evolutionary relatedness of the genotypes in the panel and the genetic composition of the genotypes. It is apparent from the topology of the phylogenetic tree (Figure 2A) that the landrace genotypes ('CRI') generally group into clusters of connected branches in the tree structure that are positioned in the left half of the figure, denoted as groupings I, II, and III (Figure 2A). The PPB genotypes ('PPB') grouped into separate clusters that are positioned in the right half of the figure, denoted as grouping IV (Figure 2A). Grouping I at the left of the tree, is comprised of the 'Milpero' genotypes, two landraces (HON70 and HON43), and Merlot (HON62). The Milpero landraces belong to diverse market classes, including black, small red, white, and carioca, and they included genotypes that did not flower at Elora in 2014. The remaining landrace genotype clusters were generally delineated by market class membership, with black genotypes comprising grouping II (including HON07, HON45, HON41, HON42, HON65, HON40, and HON43) and small red genotypes comprising grouping III (including HON08, HON09, HON11, HON27, HON68, HON67, HON66, HON51, HON34, HON48, HON38, HON46, HON50, and HON49). The landraces 'Concha Rosada' (HON02) and 'Rosado' (HON22) are displaced and found among the PPB branches of the tree. The North American check genotypes ('CHK'; including HON64, HON61, and HON59) formed a separate cluster that branched off between the Milpero landraces and the black landraces in grouping II. 'OAC Rosito' (HON63), clustered with the Honduran PPBs. All Honduran conventional genotypes ('CNV'; including black HON54, and small red HON77, HON52, HON80, and HON55) grouped with the PPBs (grouping IV), except 'Dorado' (HON56), which is found among the landraces (grouping III). Six PPB genotypes (including HON10, HON05, HON12, HON25, HON33, and HON72) were found within the landrace clusters of the tree (groupings II and III).

**Figure 2.** Analysis of genetic structure and relatedness of 72 genotypes of the Honduran panel. (**A**) Dendrogram of evolutionary genetic relatedness. Abbreviations are: participatory bred (PPB), Landrace (CRI), North American check (CHK), and Honduran conventional (CNV) genotypes. Market classes are denoted by representative beans. (**B**) Genetic structure plot using two genetic groupings (ΔK = 2). **C**. Principal component analysis indicating two genetic groupings in the panel. Genotype descriptions are found in Section 4.1. (Note: In section A, grouping names I–IV are assigned to natural subsections of the tree for descriptive purposes and do not correspond to the genetic groups presented in section B.).

The genetic similarity of genotypes in the panel is depicted in a STRUCTURE plot using two subpopulations (*K* = 2) (Figure 2B). Fourteen of the landraces (including all of the Milpero types) belong to one genetic subgroup at the left of the plot and the PPB varieties belong to the other subgroup at the right of the plot, with an intermediary admixed group (Figure 2B). The Honduran conventional genotypes, except 'Dorado' (HON56), group with the PPBs. The North American check genotypes are found among the admixed genotypes, along with some PPBs and landraces. The principle component analysis of the panel also indicates the relatedness of the genotypes using two genetic groupings (Figure 2C). PC1 divides the genotypes into PPB (green triangles) and landrace (red circles) categories. Along the PC2 axis, the landraces show wide dispersion, with the Milpero group forming a small cluster near the axis at the top of the plot, and the North American check genotypes are scattered among the landraces. In contrast, PC2 generates a denser cluster of PPB genotypes, and the Honduran Conventional genotypes are at the right edge of the plot.

#### *2.2. Nucleotide Diversity and Population Di*ff*erentiation: Landrace and PPB Categories*

Nucleotide diversity was measured in the two largest groupings within the Honduran panel, the landraces and the PPBs, to ascertain the genetic diversity of these groups. According to the π statistic, nucleotide diversity for the landrace category overall (<sup>π</sup> = 3.20 <sup>×</sup> 10−4) was significantly greater (*<sup>P</sup>* <sup>=</sup> 0.04, Welch two-sample *<sup>t</sup>*-test) than that found in the PPB category overall (<sup>π</sup> <sup>=</sup> 2.89 <sup>×</sup> <sup>10</sup><sup>−</sup>4). Additionally, according to the *D* statistic, the overall nucleotide diversity for the landrace category (*D* = 0.669) was significantly greater (*P* = 0.02, Welch two-sample *t*-test) than that found in the PPB category (*D* = 0.476). The positive Tajima's *D* value indicates that both landraces and PPBs are under balancing selection and implies that both categories are probably experiencing different selective pressure. Fifty-six subregions (>100 Mbp long) across the genome were identified where landrace π values exceeded PPB π values by more than 3 times (Table 1, Figure 3A). These regions, identified on all 11 chromosomes, may contain loci related to traits favored by selection associated with formal plant breeding (Figure 3A).

**Table 1.** Regions of the *P. vulgaris* (2.0) genome where high nucleotide diversity (π) was discovered in landrace genotypes compared to PPB genotypes. A literature search was performed to identify candidate genes within these regions. See Table S1 for candidate gene annotation. (Chr—Chromosome).



**Table 1.** *Cont*.

**Figure 3.** Nucleotide diversity and population differentiation. (**A**) Patterns of nucleotide diversity (π) across the genome between *P. vulgaris* landrace (green) and PPB (red) genotypes. Blue vertical bars show the strongly differentiated (3×) regions. (**B**) Weighted *F*ST plot of genetic variance differentiation among landrace and PPB categories. Significant SNPs are red. Significance threshold *F*ST > 0.5. SNP—single nucleotide polymorphism; Chr—Chromosome.

Calculation of population genetic differentiation (*F*ST) between landrace and PPB beans enabled identification of loci under selection between landrace and PPB genotypes. Twenty-six single nucleotide polymorphisms (SNPs) with significant weighted *F*ST values (>0.5) were found on Pv02, Pv07, Pv09, and Pv11 (Table 2, Figure 3B). These SNPs do not fall within the regions of high nucleotide diversity identified in the π comparison above.

**Table 2.** Regions of the *P. vulgaris* (2.0) genome where SNPs with significantly high weighted *F*ST values (>0.5) were found. JBrowse was used to search for candidate genes within 100 Kb of significant SNPs. Candidate gene descriptions are listed in Table S2. (https://legumeinfo.org/genomes/jbrowse/). (Chr—chromosome).


#### *2.3. Identification of Candidate Genes*

Two approaches were used to identify candidate genes associated with regions of significantly high (3×) nucleotide diversity (π) in landraces and high population differentiation (*F*ST) between landraces and PPBs, including: exploring the recent bean literature for reported quantitative trait loci (QTL) and searching the bean genome using JBrowse.

QTL associated with various traits have been reported in the literature, including those related to agronomic traits [5,32] and nitrogen fixation [15,33–36]. Our literature search revealed 10 QTL that fall within regions of significantly high landrace π values, 8 of which are associated with agronomic traits and 2 of which are associated with SNF-related traits. In a GWAS study of agronomic traits in the Middle American Diversity panel (MDP), Wilker et al. (unpublished) found QTL for days to flowering on Pv01 (23.2 Mbp), Pv02 (48.6 Mbp), and Pv06 (13.9 Mbp); QTL for days to maturity on Pv07 (35.6 Mbp) and Pv11 (40.3 Mbp); and QTL for hundred seed weight on Pv01 (23.2 Mbp), Pv05 (32.5 Mbp), and Pv11 (53.5 Mbp). Various candidate genes were found underlying these agronomic QTL and more detailed information is available in Table S1. In a separate GWAS study of agronomic traits in the MDP, Moghaddam et al. [32] found a QTL on Pv01 (42.9 Mbp) associated with growth habit, which contained an RNA polymerase-associated protein RTF1 homolog (Phvul.001G167200). For SNF-related traits, reported QTL that fall within regions of high landrace π values are associated with seed %N content and plant biomass. In a GWAS study of SNF related traits in the MDP, Wilker et al. (unpublished) found a QTL associated with seed %N content at 22.8 Mbp on Pv02. The QTL contains seven putative candidate genes, including a Ras homologous (Rho)/Rho of plants (Rop) family GTPase

(Phvul.002G106600). These genes play a role in the symbiotic interaction between the host plant and rhizobia [37]. Two separate studies investigating SNF and related traits in the Andean and the Middle American gene pools identified a QTL associated with shoot biomass at 45.1 Mbp on Pv11 [33,35]. Shoot biomass supports root symbiosis through carbohydrates generated through photosynthesis as well as serving as a sink for N generated through SNF, which is a source of N ultimately stored in the seed [35]. Beyond searching the recent literature for QTL associated with SNF and agronomic traits, we also examined the study by Schmutz et al. [5] which identified >1800 domestication candidate genes in the Middle American gene pool. Over 140 of the domestication genes identified by Schmutz et al. [5] fall within regions of high nucleotide diversity discovered in our study (see Table S1). Two of these genes have a role in symbiosis: Phvul.008G217100 is a short open reading frame (sORF) small protein of the glycerin rich protein family and is expressed during nodule ontology [38]; and Phvul.010G102300 belongs to the plant nuclear factor Y (NF Y) gene family, whose members are involved in nodule ontology, epidermal infection, and rhizobia discrimination [39].

The bean genome was explored using JBrowse in 100 kb segments centered on SNPs with significant genetic differentiation (*F*ST) to identify putative candidate genes. All genes found within these regions are listed in Table S2. A diverse range of gene types and functions were seen, including plant defense and stress response genes, enzymes, and transcription factors. The PubMed Central database of NCBI (https://www.ncbi.nlm.nih.gov/pmc/) was used to search for published research on putative candidate genes, and some of those findings are discussed here. The region flanking the significant *F*ST SNP on Pv02 (48.9 Mb) contains two leucine rich repeat disease resistance proteins, Phvul.002G323708 and Phvul.002G323712. This region was identified by Oladzad et al. [40] as a major QTL peak in their GWAS study of *Rhizoctonia solani* resistance in Andean beans. A second region on Pv02 (49.0 Mb) contains a disease resistance gene and one associated with nodulation. Tock et al. [41] found that the pentatricopeptide repeat superfamily protein (Phvul.002G326200) at 49.0 Mb was associated with halo blight damage, while Nova-Franco et al. [42] found that a 1-aminocyclopropane-1-carboxylate oxidase gene (Phvul.002G326600) in this region was associated with nodule senescence. A third region on Pv02 (49.2 Mb) contains a protein kinase superfamily protein (Phvul.002G328300) that Zuiderveen et al. [43] found to be significantly associated with Anthracnose resistance in a GWAS of Andean beans. The region centered at 38.9 Mb on Pv07 contains a protein kinase superfamily protein (Phvul.007G268200), which was downregulated in a slow darkening pinto bean study [44]. On Pv09, the region located at 7.8 Mb contains a GATA transcription factor (Phvul.009G035400), which belongs to a family of transcription factors that have been studied in soybean under nitrogen stress and may play a role in nitrogen metabolism [45]. In the region centered on 13.5 Mb on Pv09, a cytokinin oxidase/dehydrogenase 1 gene (Phvul.009G081800) is located that was found to be upregulated in bean root cortical cells inoculated with arbuscular mycorrhizal fungi under drought stress, compared to noninoculated roots [46].

#### *2.4. Diversity for Symbiotic Nitrogen Fixation*

The influences of genotype, environment, and the genotype by environment interaction were significant for the combined locations ANOVA for %Ndfa (Table S3); therefore, each environment was analyzed separately for this trait. At each location, significant differences were found between genotypes for %Ndfa (Tables S4–S6). At Elora 2014 (*N* = 49), the average nitrogen fixation capacity was 49.3% and ranged from 21.0% to 64.4%, a difference of 43.4% between the least and most effective genotypes. At Elora 2015 (*N* = 62), the average nitrogen fixation capacity was higher at 55.8%, yet the range for this trait was narrower with a low of 40.5% and a high of 67.3%, a difference of 26.8% between the least and most effective genotypes. At Yorito (*N* = 53), the average nitrogen fixation capacity was 49.0% with a range of 14.0% to 66.4%, a difference of 52.4% between the least and most effective genotypes, which was the greatest range in performance of all locations.

Further, in a separate ANOVA for each location, the genotypes were divided into breeding history categories and their means were compared. In these analyses, significant differences were found among breeding categories at two of the three trial locations. At Elora 2014, the landrace genotypes (*N* = 20, *M* = 52.5 %Ndfa) performed better than all other breeding categories, although the difference between categories was not significant (Table 3). It is evident from the Elora 2014 %Ndfa histogram (Figure S1A) that many landrace genotypes had above average nitrogen fixation performances. At Elora 2015, the Honduran conventional genotypes (*N* = 7, *M* = 59.0 %Ndfa) and the landraces (*N* = 26, *M* = 58.3 %Ndfa) exhibited the best nitrogen fixing capacities, but they were not significantly different from each other (Table 3). The average nitrogen fixing capacities of the North American check genotypes (*N* = 5, *M* = 50.0 %Ndfa) and the PPBs (*N* = 24, *M* = 53.2 %Ndfa) were similar at Elora 2015 and significantly lower than the values for the Honduran conventional and landrace genotypes (Table 4). Of particular note, Merlot (HON62) fixed the most N at Yorito (66.4%), almost 6% more than the next best genotype. This genotype, bred for Northern US production, also performed well at Elora in 2014 (64.3 %Ndfa), but had the worst performance among conventional genotypes at Elora 2015 (43.5 %Ndfa). Merlot has very dark green leaves, an indicator of plant N status, and consistently had high leaf chlorophyll content when measured with the SPAD meter in the Elora trials. (SPAD was not measured at Yorito.) As with the Elora 2014 trial, many landrace genotypes had above average nitrogen fixation performances at Elora 2015 (Figure S1B). At Yorito, the landrace genotypes (*N* = 22, *M* = 46.4 %Ndfa) showed significantly higher nitrogen fixing capacities than the PPBs (*N* = 22, *M* = 40.1 %Ndfa), whereas, the check and Honduran conventional genotypes had intermediary SNF means and did not have significantly different nitrogen fixing performance values when compared to each other nor the other breeding categories (Table 5). As we found at the other trial locations, the landraces showed above average nitrogen fixing performance at Yorito (Figure S1C).

The top five landraces with the highest SNF performance at Yorito were Vaina Rosada (60.6 %Ndfa; HON34), Cincuenteño (59.5 %Ndfa; HON48), Negro Cuarenteño (57.0 %Ndfa; HON42), Olanchano Negro (56.4 %Ndfa; HON65), and Paraísito (53.6 %Ndfa; HON49). These landraces represent both small red and black market classes and are from three different departments (Yoro, Francisco Morazán, and Intibucá). Vaina Rosada, Cincuenteño, and Paraísito are already used in participatory breeding efforts between Zamorano and FIPAH, and a number of the resulting PPB varieties were included in our panel (including HON05, HON23, HON25, HON26, HON28, HON31, HON32, and HON33). The PPB progeny of these landraces ranged in SNF capacity from 26.6 to 53.3 %N at Yorito, which is noteworthy considering SNF was not a breeding objective. Amilcar (53.3 %Ndfa; HON05) is among the top five SNF performing PPB varieties at Yorito, which also included Conan 33 (55.7 %Ndfa; HON24), Campechano (54.5 %Ndfa; HON57), San Jose (51.1 %Ndfa; HON35), and Arbolito Negro (50.8 %Ndfa; HON72). Both small red and black beans are represented in this list, and they come from three departments (Yoro, Francisco Morazán, and Santa Barbara).


**Table 3.** F-test of fixed effect breeding category in the GLIMMIX analysis, and the breeding category LSmeans comparisons of genotypes in the HON panel grown at Elora, 2014.

\* Means labeled with different letters within trait categories are significantly different according to ANOVA, *p* = 0.05.


**Table 4.** F-test of fixed effect breeding category in the GLIMMIX analysis, and the breeding category LSmeans comparisons of genotypes in the HON panel grown at Elora, 2015.

\* Means labeled with different letters within trait categories are significantly different according to ANOVA, *p* = 0.05.

**Table 5.** F-test of fixed effect breeding category in the GLIMMIX analysis, and the breeding category LSmeans comparisons of genotypes in the HON panel grown at Yorito, 2014–2015.


\* Means labeled with different letters within trait categories are significantly different according to ANOVA, *p* = 0.05.

Leaf chlorophyll content was measured at Elora in 2014 and 2015, and these values were analyzed in separate ANOVAs because of the repeated-measure nature of trait data collection. The combined ANOVA indicated significant differences for the fixed effects of genotype, environment, and the genotype by environment interaction at both the early vegetative and reproductive stages (Table S7). When this trait was analyzed by location, significant differences were found between genotypes at both locations and for both growth stages (Table S7).

#### *2.5. Diversity for Agronomic Traits*

A series of agronomic traits were measured for this study, including carbon discrimination (Δ13C) as an indicator of water use efficiency, plant growth stages (days to flowering and maturity), yield (kg ha−1), and hundred seed weight. Significant differences were found for the fixed effects of genotype, environment, and the genotype by environment interaction for the agronomic traits carbon discrimination (Δ13C), days to flowering (DTF) and days to maturity (DTM), yield (kg ha−1), and hundred seed weight (HSW) in a combined ANOVA (Table S3). These traits were therefore analyzed further within locations.

Significant differences were found between genotypes at all locations (Tables S4–S6) for carbon discrimination (in Δ13C units) calculated according to the method of Farquhar et al. [47] from seed carbon discrimination (δ13C) values obtained from isotope analysis of seed samples. At Elora 2014 (*N* = 48), Δ13C values ranged from 18.4‰ to 21.4‰ (Figure S2A), and at Elora 2015 (*N* = 62), the range was similar, from 17.4‰ to 21.1‰ (Figure S2B). At Yorito (*N* = 53), the Δ13C values were lower, ranging from 16.4‰ to 20.0‰ (Figure S2C). When genotypes were divided into breeding categories and compared, significant differences were only found at Yorito (Table 5), where the average PPB Δ13C value (*N* = 22, *M* = 17.7‰) was significantly lower than the average landrace Δ13C value (*N* = 22, *M* = 18.2‰) (Table 5).

Significant differences were found between genotypes in DTF measured at Elora in 2014 and 2015 (Tables S4–S6). At Elora 2014 (*N* = 58), the average was 50 DTF with a range of 42–62 DTF (Figure S3A). Some Honduran genotypes (including HON9, HON13, HON18, HON28, HON32, HON33, HON36, HON39, HON44, and HON47) did not flower at that first trial location and were replaced by different genotypes at the subsequent locations. At Elora 2015 (*N* = 57), the average was 50 DTF with a range of 42–55 DTF (Figure S3B). When genotypes were divided into breeding categories and compared, significant differences were found at Elora 2014 only (Table 2). Overall, the landrace genotypes flowered the earliest (*N* = 22, *M* = 48 days) and they were significantly earlier than the PPB genotypes (*N* = 26, *M* = 52 days) (Table 2).

Significant differences were found among genotypes for DTM measured at Elora in 2014 and 2015 (Tables S4–S6). At Elora 2014 (*N* = 35), the average was 112 DTM with a range of 97–120 DTM (Figure S4A). At Elora 2015 (*N* = 56), the average was 111 DTM with a range of 94–115 DTM (Figure S4B). Significant differences in DTM were found only at Elora 2015 (Table 3), when genotypes were grouped by breeding history and contrasted. At Elora 2015, landraces (*N* = 23, *M* = 109 days) matured significantly earlier than PPBs (*N* = 22, *M* = 112 days).

Significant differences were found among the yields of genotypes in the Elora trials only (Tables S4 and S5). At Elora 2014 (*N* = 35), the average yield was 828.4 kg ha−<sup>1</sup> with a range from 325.2–1124.2 kg ha−<sup>1</sup> (Figure S5A). At Elora 2015 (*N* = 62), the average yield was 1558.6 kg ha−<sup>1</sup> with a range from 600.5–2263.4 kg ha−<sup>1</sup> (Figure S5B). At Yorito (*N* = 58), the average yield was 791.1 kg ha−<sup>1</sup> with a range from 299–1471 kg ha−<sup>1</sup> (Figure S5C). Significant differences were found only in the Elora trials among breeding categories (Tables 2 and 3). At Elora 2014, the North American check (*N* = 5, *M* = 933.7 kg ha<sup>−</sup>1) and the Honduran landrace (*N* = 14, *M* = 912.8 kg ha<sup>−</sup>1) genotypes yielded significantly more than the other categories (Table 2), and at Elora 2015, the PPBs (*N* = 23, *M* = 1686.2 kg ha<sup>−</sup>1) yielded significantly more than the landraces (*N* = 27, *M* = 1396.5 kg ha<sup>−</sup>1; Table 3). At Yorito, the Honduran conventional genotypes (*N* = 6, *M* = 956.5 kg ha<sup>−</sup>1) returned the highest yields, followed by the PPBs (*N* = 24, *M* = 823.2 kg ha<sup>−</sup>1), while the landraces (*N* = 24, *M* = 745.2 kg ha<sup>−</sup>1) and the North American checks (*N* = 4, *M* = 669.9 kg ha<sup>−</sup>1) had lower yields (Table 4).

Significant differences were found among hundred seed weight (HSW) calculated for samples from the Elora trials (Tables S4 and S5). At Elora in 2014 (*N* = 49), the average HSW was 20.2 g with a range from 13.3–29.4 g. At Elora in 2015 (*N* = 62), the average HSW was 21.4 g with a range from 14.5–31.9 g. No significant differences were found among genotypes grouped by breeding history (data not shown).

Plant height was measured only at the Yorito location (*N* = 60), and significant differences were found between genotypes for this trait (Table S6). The average height was 35.3 cm with a range of 4 cm to 47 cm. Significant differences were not found when breeding history categories were contrasted (data not shown).

#### *2.6. Trait Correlation and Genotype by Trait Biplot Analyses*

Pearson correlation analyses were performed on LSmeans for each trial environment to determine trait interactions (Table S8). In addition, trait correlations and genotype performance were visualized using genotype × trait biplots for each location (Figure 4). In the biplots, positive correlations between traits are evidenced by vectors forming acute angles, for example between SPAD and HSW at Elora in 2014 and 2015 (Figure 4B), whereas negative correlations between traits are evidenced by obtuse angles formed between vectors, such as that formed by DTF and yield at Elora in 2014 (Figure 4A). A right-angle formed between trait vectors indicates a weak or lack of association between those traits. The results of our correlation and biplot analyses support each other.

**Figure 4.** Biplot analysis of traits for genotypes of the Honduran panel in three location years. (**A**) Elora 2014; (**B**) Elora 2015; and (**C**) Yorito 2014–15. DTF, days to flowering; Yield, yield (kg ha<sup>−</sup>1); HSW, 100 seed weight (g); Δ C, carbon discrimination; %Ndfa, percent nitrogen derived from the atmosphere; and SPAD, leaf chlorophyll content at 100% flowering.

#### 2.6.1. %Ndfa

At Elora in 2014, %Ndfa was negatively correlated with DTF (*r* = −0.31) but was positively correlated with Δ13C (*r* = 0.45) and with yield (*r* = 0.38) (Table S8). In the Elora 2014 biplot (Figure 4A), the landrace genotypes are clustered towards the yield and the %Ndfa vectors. The Honduran conventional genotypes are more closely associated with the DTF vector, as are the majority of the Honduran conventional genotypes (Figure 4A).

At Elora in 2015, %Ndfa was not significantly associated with any other trait (Table S8). The biplot analysis showed that DTF and Δ13C are not associated with %Ndfa, and leaf chlorophyll content at flowering (SPAD) and HSW have a negative relationship with %Ndfa (Figure 4B). As in the Elora 2014 biplot, the landrace genotypes cluster towards the %Ndfa vector (Figure 4B).

At Yorito, %Ndfa was not significantly correlated with other traits (Table S8). The biplot analysis for Yorito shows landrace genotypes cluster more towards the %Ndfa and Δ13C vectors, whereas the PPB genotypes cluster away from the %Ndfa vector and are more closely associated with the yield vector (Figure 4C).

#### 2.6.2. Agronomic Traits

Leaf chlorophyll content at the early reproductive stage (SPAD) was positively correlated with hundred seed weight (HSW) at both Elora 2014 (*r* = 0.36) and Elora 2015 (*r* = 0.44) (Table S8). DTF was negatively associated with yield at Elora 2014 (*r* = −0.48), but no association was found between these traits at Elora 2015. DTF was negatively associated with <sup>Δ</sup>13C at Elora 2015 (*<sup>r</sup>* <sup>=</sup> <sup>−</sup>0.37), but no association was found between these traits at Elora 2014 (Table S8). Yield was positively associated with HSW (*<sup>r</sup>* <sup>=</sup> 0.49) and negatively associated with <sup>Δ</sup>13C (*r* = <sup>−</sup>0.33) at Elora 2015; however, these associations were not repeated at the other trial locations (Table S8).

#### 2.6.3. High-Yielding and High-Fixing Genotypes

The aim of any breeding program is to generate high-yielding genotypes, and in this study an additional goal was to identify genotypes that were also high-N fixing. It is particularly useful to examine genotype performance at Yorito, where growing conditions are representative of the marginal production regions in Honduras. At Yorito, 14 genotypes had above-average yields coupled with above-average SNF performance (Figure 5). These included four Honduran conventional varieties (HON56, HON53, HON55, and HON77), four landraces (HON2, HON49, HON65, and HON66) and six PPB varieties (HON5, HON26, HON35, HON57, HON72, and HON78). Of the six PPB varieties, five were developed through participatory varietal selection, and one was developed through participatory plant breeding. These genotypes are dispersed throughout the phylogenetic tree (Figure 2A), suggesting a lack of close genetic relatedness; however, three of the high-yielding high-fixing PPB varieties share common genotypes in their pedigrees: HON5 and HON26 have a

common landrace parent, Cincuenteño (HON48); HON26 shares a conventional parent, Tio Canela 75 (HON55), with HON78; and, HON5 has Amadeus 77 as a parent, which is a daughter of Tio Canela 75. There was no apparent relationship between release date and higher yields, nor was there a temporal trend for SNF performance.

**Figure 5.** Genotype performance for nitrogen fixation (x axis) and yield (y axis) at Yorito. For this trial, average yield was 799 kg ha−<sup>1</sup> and average nitrogen fixation was 43.3 %Ndfa, and these values are shown on the plot in dashed lines, dividing the plot into quadrants. Genotypes in the upper right quadrant are higher yielding and higher nitrogen-fixing.

#### **3. Discussion**

#### *3.1. Genotype Origins and Pedigree Explain Honduran Panel Structure*

The patterns observed in the phylogenetic tree and STRUCTURE diagrams derived from the SNP compositions of the lines in the Honduran panel largely agree with what is known about their geographic origins and their pedigrees, but there are also a few exceptions. The two large groupings in the dendrogram based on SNP similarities (groups I–III and IV) show a clear separation (with some exceptions) between landraces (CRI) and material that has been conventionally bred or is the product of participatory plant breeding (PPB). Among the landraces, the small red beans that came from Yoro, Francisco Morazán, and Intibucá, were randomly interspersed throughout group III of the phylogenetic tree with no particular pattern, based on genotype origin. In contrast, clustering of genotypes by region of origin is found among the black bean landraces (groups I and II). Group I consists of the black bean landraces, which came from Intibucá, Yoro, Francisco Morazán, and Lempira, and contains the Milpero landraces (HON36, HON47, HON44, and HON39) and 'Negro Opalaca' (HON70), which are all from Intibucá, and 'Negro' (HON43), which is found alone on the next branch is from the neighboring department, Lempira ['Merlot' (HON62) is also found in this region of the tree and is discussed below]. These landraces are the most distantly related genotypes with respect to the rest of the panel. The Milpero landraces were daylight sensitive when grown at Elora in 2014, and their photoperiod sensitivity was likely inherited from a common ancestor. All photoperiod sensitive varieties in the panel may carry the dominant *ppd* gene responsible for control of this trait [48]. The remaining black landraces

from Yoro, Francisco Morazán, and Intibucá are found without any particular pattern throughout the next branch of the tree, and two are found among the small red landrace branches (HON06 and HON30). The STRUCTURE analysis shows that most of the black genotypes are admixed, suggesting a closer genetic relationship to the conventional and PPB germplasm. No small red landraces were found within the black landrace branches; however, two small red landraces, 'Concha Rosada' (HON02) and 'Rosado' (HON22), are displaced and found in group IV among the PPBs. The STRUCTURE analysis shows that Concha Rosada contains only ~10% genetic material from the landrace subpopulation, and Rosado is almost equally admixed between the landrace and PPB subpopulations. In the case of Concha Rosada, this may be explained by the fact that this landrace is widely grown and has been used as a parent in participatory breeding efforts and thus shares ancestry with many PPB varieties. Alternatively, Concha Rosada may not be a traditional landrace but instead a creolized variety derived from a formal-sector variety introduced to the Yorito region in the early 1980s [23]. The reason that Rosado is found among the PPB varieties in the tree is less apparent. Rosado has not been used as a parent for any of the PPB varieties in the panel according to the pedigree information available. Rosado recently arrived in the Yorito region, and a survey of bean farmers revealed that its origin is unknown [29]. According to M. Gomez (FIPAH), the initial population of Rosado showed phenotypic heterogeneity, and some selection has been made to create a uniform line for PPB breeding in collaboration with Zamorano. At Elora 2014, Rosado had uneven maturity, which may indicate that the seed planted that year, and the seed grown for DNA extraction, was not a fixed homogeneous line, and this heterogeneity may have resulted in misplacement of this genotype in the phylogenetic tree.

The organization of the PPB branches of the phylogenetic tree can be explained in part by common ancestry. The Honduran conventional varieties, 'Tio Canela 75' (HON55) and 'Amadeus-77' (HON52), and the landraces, 'Estica' (HON11) and 'Vaina Rosada' (HON34), have been used frequently in generating the PPB genotypes in this panel. For example, five of the seven genotypes in the left-most PPB branch (HON19, HON18, HON15, HON16, and HON14) are derived from crosses with Tio Canela 75 or Estica, or both. The STRUCTURE plot indicates that these genotypes contain a greater proportion of conventional/PPB than landrace genetic material, and therefore, they are found within the PPB branches of the tree. However, six PPB varieties (HON10, HON05, HON12, HON25, HON33, and HON72) are found among the landrace branches of the tree, which indicates that they share greater genetic similarity to their landrace parent than the other genotypes in their pedigrees. For example, FPV 921-65 (HON33) has the landrace Vaina Rosada (HON34) and Amadeus 77 in its pedigree. FPV 921-65 is found in the same branch as Vaina Rosada among the landraces in the tree, and the STRUCTURE analysis shows that FPV 921-65 has more similarity to the landrace genetic subgroup. Marcelino (HON10) and Amilcar (HON05) also contain >50% landrace genetic material (Figure 2B) and are found among the landrace branches. Fourteen PPB genotypes have between 5% and 50% admixture with the landrace subgroup. Alleles favoring agronomic characteristics for local adaptation and culinary traits contributed by landrace parents were likely prioritized under selection among PPB progeny from crosses between landrace and conventional genotypes, resulting in PPB varieties consisting of a large proportion of landrace genetic material.

Common parentage may explain the clustering of the five Honduran conventional genotypes (Amadeus-77, CENTA San Andrés, Tio Canela 75, Aifi Wuriti, and Carrizalito) in the panel. CENTA San Andrés, Amadeus-77, and Carrizalito have common ancestry. Tio Canela 75 is a parent of breeding line EAP 9510-77, which was released in the early 2000s in countries across Central America, including in El Salvador, as CENTA San Andrés, and in Honduras, as Amadeus-77. Tio Canela 75 is also a parent of EAP 9510-1, a sister line of EAP 9510-77, which was released in Honduras as Carrizalito in 2003 and as Telire in Costa Rica in 2004 [49]. Accordingly, these genotypes are closely associated in the phylogenetic tree, although some genetic differentiation has occurred between CENTA San Andres and Amadeus-77. The line 'MD 30-75' (released as Tio Canela 75) was used as a parent in generating four of the five conventional genotypes' mentioned above. This is likely a result of the effort to introduce BGMV resistance into Central American germplasm, as MD 30-75 is a highly resistant line, which carries the

*bgm-1* resistance gene [50]. DEORHO (HON53) was not genotyped in our study; however, it also has MD 30-75 in its pedigree, and it is reasonable to suppose that it would also appear in this region of the dendrogram. Aifi Wuriti was the only black conventional genotype included in the panel. Interestingly, it is most closely related to three small red genotypes (Carrizalito, Conan 33, and Cedrón) rather than the small black genotypes, although Aifi Wuriti does not appear in the pedigrees of any of the small black PPB genotypes in the panel. The final Honduran conventional genotype, 'Dorado', is unusual because it is found among the landrace genotypes in the phylogenetic tree and the STRUCTURE analysis. This may be explained by the lack of a common genotype in its pedigree compared to the other conventional genotypes.

The placement of the North American check varieties (except OAC Rosito) as a separate group within the dendrogram is consistent with the unique alleles that they would be expected to have relative to the genotypes in this panel. The location of the check varieties in the landrace portion of the tree (groups I and II) may reflect the genetic diversity and wide-ranging geographic origins of the germplasm used in the University of Guelph/AAFC and Michigan State University/USDA-ARS bean breeding programs, which are aimed at introducing resistance to abiotic and biotic stresses and improving various agronomic traits. The locations of the North American checks interspersed throughout the admixed portion of the STRUCTURE plot also indicate that genetic diversity has been retained in these modern North American genotypes. OAC Rosito is a special case, since it was derived from El Salvadoran germplasm, and it is found among the PPBs in the phylogenetic analysis where it is most closely related to two PPB genotypes, Campechano (HON57) and Quebradeño (HON26). According to the STRUCTURE analysis, OAC Rosito is more genetically similar to the Honduran Conventional/PPB sub-group than the landraces. This suggests either that El Salvadoran landraces are not similar to Honduran landraces, or more likely that the landrace population from which OAC Rosito was developed was actually a creolized conventional variety. Germplasm provided by Zamorano has been used in El Salvadoran variety development since the early 2000s [26], and this could explain the genetic similarity of OAC Rosito to the PPB varieties in our panel, which were developed in collaboration with Zamorano.

#### *3.2. Optimizing Use of Genetic Diversity of Honduran Landraces and PPB Varieties*

The larger nucleotide diversity among the landraces (<sup>π</sup> = 3.20 <sup>×</sup> 10−4) than observed in the PPBs (<sup>π</sup> = 2.89 <sup>×</sup> 10−4) in the current study is consistent with general observations that landraces are more diverse than materials that are products of selection [51–54]. However, other studies that compared diversity in wild to domesticated bean accessions found wider diversification between those groups than we found between landraces and PPBs in our panel. Nanni et al. [55] reported that within the Mesoamerican gene pool, nucleotide diversity was 3.2 times higher among wild genotypes (<sup>π</sup> <sup>=</sup> 17.34 <sup>×</sup> <sup>10</sup><sup>−</sup>3) than domesticated genotypes (<sup>π</sup> = 5.43 <sup>×</sup> 10−3). The difference between landraces and PPBs in the current study was only 1.1 times, probably because of the small population size and because these genotype groups do not represent extremes of the diversity continuum that was sampled in the previous study. Both landraces and PPBs are selections from wild accessions, and the PPBs have probably not been as strongly selected as conventionally derived varieties.

The high level of diversity in Honduran landraces suggests they could be a source of novel alleles that could be used in breeding to improve various traits. Landraces are adapted to the environmental conditions of the locations where they were maintained, in some cases, over thousands of years. Landraces that were grown in fluctuating environments and in low-input agricultural systems may be enriched for rare alleles enabling phenotypic plasticity and inherent responsiveness to diverse abiotic and biotic stresses [56]. Landraces, in regions where they are still grown, have often been pushed to marginal production environments where their performance often exceeds that of modern cultivars [29,57,58].

Dry bean landrace germplasm across Mesoamerica is genetically diverse [59]. Soil conditions across this region are poor, and the terrain ranges from low to high altitude with steep slopes, leading to certain trait adaptations in the landraces. For example, 'Common Red Mexican', a red-seeded landrace from Mexico, has been found to be drought resistant [60], while 'Puebla 152', a black-seeded landrace also from Mexico, has superior SNF capacity [61]. Originating in the Andean region, G19833, a 'Chaucha Chuga' landrace from Perú, has tolerance to high levels of soil aluminum and low levels of phosphorus [56] and resistance to a number of bean diseases [62,63]. Our survey of the literature found genes in high landrace π regions associated with abiotic stress tolerance, phosphorus use efficiency, and nitrogen fixation (see Table S1). Conservation of landraces and mobilization of the unique genetic diversity they contain through plant breeding can help address the future need for higher yielding and climate resilient varieties.

#### *3.3. Regions of High Genetic Di*ff*erentiation Indicate Regions Impacted by Selection*

Genetic divergence between the PPB and landrace subpopulations in the panel is indicated by regions of high genetic differentiation (*F*ST), and these genomic regions may contain loci that have been subjected to selection pressure. We identified several regions on chromosomes Pv02, Pv07, Pv09, and Pv11 where *F*ST values exceeded 0.5. Similarly, in a study of genetic diversity of Italian bean landraces, Lioi et al. [64] reported that genomic regions related to domestication were concentrated on Pv02, Pv07, and Pv09 for Mesoamerican types. For comparisons between wild and domesticated bean landraces, Papa et al. [65] also reported significantly larger levels of *F*ST differentiation around genomic regions associated with domestication. While the genetic distance between the Honduran landraces and PPB genotypes included in our study is not likely as wide as that between the wild and landrace genotypes investigated by Papa et al. [65], similar trends towards genetic differentiation between landrace and PPB genotypes developed with modern breeding objectives and germplasm could be expected.

In particular, the genomic regions with large *F*ST differences may be associated with traits that were a focus of selection in PPB breeding. However, an extensive search of the recent bean literature did not reveal any known QTLs associated with agronomic, SNF, or WUE traits that are located within the regions of large *F*ST differentiation identified in this study. This may be because this is the first genomic survey study of Honduran material, and the distinguishing traits between landraces and PPB materials are specific to materials from that region or expressed in that location. In particular, because we are comparing two domesticated groups of genotypes, namely farmer traditional landraces and PPB varieties, the genes underlying the regions of large differentiation found in our study could be those responsible for local adaptation, culinary qualities, and favorable plant traits, rather than traits associated with domestication [5]. Additionally, the conventional germplasm used to generate PPB genotypes, either through crosses with landraces or through varietal selection, is largely limited to material in the Zamorano breeding program, which may have a specific genetic architecture.

In spite of the lack of previous QTL evidence for selection for domestication in the high *FST* regions, several genes in those regions that have been studied for various reasons may be associated with domestication. For example, disease resistance genes, such as those found on Pv02 for Rhizoctonia resistance (Phvul.002G323708, Phvul.002G323712; [40]), Halo blight resistance (Phvul002G326200; [41]), and Anthracnose resistance (Phvul.002G328300; [43]), and on Pv08 for Anthracnose resistance (Phvul.008G019600; [66]), and for the bean-rust interaction (Phvul.008G270500; [67]), have been associated with domestication in several crops [68,69]. Genes controlling agronomic traits have been identified in domestication studies in other crops [70–72]. Additionally, genes that affect survival in diverse growing conditions may have also been favored over the course of domestication. Two such genes are located in a region of high genetic diversity on Pv08; the ethylene-responsive transcription factor (Phvul.008G019600; [73]), an ortholog of an Arabidopsis gene known to be involved in flooding tolerance [74], and the transcription factor IIIA (Phvul.008G270400), which is upregulated in phosphorus-restricted conditions [75]. Soltani et al. [73] suggest that further studies are needed to understand the process of local adaptation and allelic selection using bean landraces and wild

populations. Insight to develop climate-resilient crops can be drawn from the study of crop adaptation under natural selection and domestication [76,77].

#### *3.4. Landraces are Superior Nitrogen Fixers*

Although some genotype- and environment-influenced variability was seen in our study, our examination of symbiotic nitrogen fixation in the Honduran panel revealed a wide range of capacity for this trait. The superiority of the landraces for SNF capacity at all trial locations may be the consequence of the continual selections of these materials under conditions of low soil fertility endemic to Central America. Even today, these materials continue to be grown by small scale farmers who do not have access to fertilizer inputs. Under these conditions, bean genotypes which have developed efficient associations with nitrogen-fixing bacteria would have a larger source of nitrogen for metabolic processes and better phenotypic fitness compared to poor nitrogen-fixing genotypes. Strong nitrogen-fixers would have a competitive advantage in the low input environments and would likely be preferentially selected over time. There may be parallels between the selection pressures during landrace evolution and the selection of heirloom bean varieties, which have also been shown to have strong SNF capacity [18].

There are few studies that have investigated SNF capacity of bean landrace genotypes. Heilig et al. [33] used 'Puebla 152', a black-seeded Mexican landrace known for its nitrogen fixing capacity [78], in a cross with conventional genotype 'Zorro' to create a RIL population to study SNF. The authors found that Puebla 152 fixed between 13.0 to 45.5% of the nitrogen in samples (seed + biomass), which was slightly more than Zorro, which fixed between 5.4 to 44.4% [33]. Many landraces in our study fixed more N than Puebla 152, indicating that Honduran landraces may be a useful source of SNF capacity. The SNF performance of Zorro ranged from 47.7 to 53.0%Ndfa in our study, a mid-range performance among our check genotypes, and overall better than its performance in Heilig's study [33].

The SNF performance of the progeny of the cross between the conventional genotype Zorro and the landrace Puebla 152 may be predictive of the performance of PPB varieties that are crosses between Honduran conventional varieties and landraces. The SNF performance of Honduran PPB varieties ranged between 20.5 to 55.7%Ndfa at Yorito. Although the focus of the participatory breeding program between FIPAH and Zamorano has been to generate higher-yielding genotypes, rather than on improving SNF performance, the SNF capacity seen for the PPB varieties falls within the upper range of that found for the RILs in Heilig's study [33].

For the Honduran panel, the insights gleaned from Yorito are of particular interest because this location has growing conditions representative of small-scale growers across the region, and as much as possible, local growing practices were employed in the trial. At Yorito, there was a range in SNF performance in the landraces, and overall genotypes belonging to this breeding history group performed better than the others studied. In addition, PPB genotypes derived from crosses with the best SNF-performing landraces had strong SNF capacity. Zamorano used the methods of participatory varietal selection to develop these PPB varieties with CIALs, enabling local growers to evaluate genotype performance on their farms. For example, Amilcar and San Jose were tested by various CIALs through the regional adaptation nursery (VIDAC, Vivero de Adaptación Centroamericano) in the mid-2000s. Amilcar has the high-SNF performing landrace Cincuenteño in its pedigree. Generally, native Rhizobia inhabit tropical soils and farmers do not use Rhizobia inoculants, although Zamorano disseminates SNF knowledge through the CIALs, including effective Rhizobia inoculants and protocols for use. In addition, the breeding program has the capacity to test SNF performance in '*bancales*' where soil N levels are low and SNF-related traits, such as nodulation, can be observed. PPB breeding for enhanced SNF capacity could be expanded if grower demand and the threat of climate change and resulting raising input costs warrant it.

In addition, the range in SNF performance among the conventionally bred North American checks and Honduran conventional varieties was wide. The superior SNF capacity Merlot exhibited at Yorito would suggest it has value as a breeding parent for this trait in Honduras; however, it has a larger seed size and a dark red seed coat; traits that are less preferred by Honduran consumers and could be challenging to select against in a breeding program. Of the Honduran conventional genotypes, DEORHO (HON53) fixed the highest amount of N (56.8%), but it performed poorly in Elora fixing 34.4% (2014) and 52.6% (2015) of its N. It is not found in the pedigrees of any PPB genotype included in our panel, but DEORHO has been a popular variety in commercial growing regions of the country. It has disease resistance, high yield, and the preferred light red seed coat color, making it a good candidate for future PPB breeding efforts.

#### *3.5. PPB Genotypes Have Superior Water Use E*ffi*ciency Values*

Plants that have higher water use efficiency (WUE) are more drought tolerant, and WUE can be estimated using carbon differentiation (Δ13C) values measured from plant biomass. During photosynthesis, plants discriminate against the incorporation of the heavy C isotope (13C), depleting 13C in plant biomass and driving lower Δ13C values [79]. Plants with comparatively low biomass Δ13C values can be considered more drought tolerant. WUE has been studied in beans, including landraces. A study by Munoz-Perea et al. [80] of the WUE of 16 dry bean genotypes in drought-stressed and nonstressed environments found that the two landraces included differed in their responses to drought stress, but Common Red Mexican was among the best performers under drought stress conditions. In contrast, in our study, the significantly lower Δ13C values measured in Yorito for the PPB genotypes than the landraces indicates that the PPB varieties in our panel may be more resilient to drought conditions than the landraces.

The drought resistant characteristics of the PPBs were likely contributed by the conventional parents. For example, PM2-Don Rey (HON23) was the most WUE PPB genotype at Yorito, with a Δ13C value of 16.36‰. PM2-Don Rey was developed through PPB methods from a cross between the landrace, Paraísito, and the Honduran conventional variety, Carrizalito. It was released as a drought-resistant variety in 2016 [81], and it is the first variety from the EAP-Zamorano-CIAL PPB collaborations to be released at the national level. A second PPB genotype, Marcelino (HON10), was developed through participatory varietal selection, and it had similar WUE (16.43‰) to PM2-Don Rey. The PPBs FPY-724-43 (HON16; 17.03‰), Cedrón (HON03; 17.41‰), and Amilcar (HON05; 17.42‰) have the next best WUEs.

Three landraces had WUE values below 17.5‰, including Concha Rosada (HON02; 16.85‰), Chapin Rojo (HON27; 16.86‰), and Chirineño (HON67; 17.30‰). Concha Rosada is of particular note because it is favored by poor farmers for its early maturity, which allows it to escape drought conditions late in the growing season [29,30]. Our study indicates that Concha Rosada not only has drought resistance through 'drought escape' but also has WUE characteristics that enable it to survive drought.

Among the conventionally bred genotypes, Carrizalito (HON77) and OAC Rosito (HON63) were the most water use efficient with low Δ13C values of 17.47‰ and 17.54‰ at Yorito. Carrizalito is a commercial Honduran variety, and it was used as a parent contributing disease resistance, agronomic, and likely WUE traits, to PM2-Don Rey. Among the check genotypes, OAC Rosito, recently developed at the University of Guelph from an El Salvadoran plant introduction, had the best WUE performance. This characteristic is likely retained from its domestication in Central America, and this enabled it to outperform the other check genotypes that have been developed for production in the Great Lakes region of North America.

In the coming decades, the effects of climate change are predicted to bring drier conditions to Honduras, and drought-resistant crops will help protect yields through periods of minimal rainfall. It has been proposed that WUE can be improved through selection and breeding. In alfalfa, evaluating genotypes for Δ13C and selecting for lower Δ13C values has been used to improve WUE in this important forage species [79]. The current results indicate that there is variation among the Honduran PPB and conventional bean germplasm in WUE traits, and selecting for lower Δ13C values could be applied to beans in Honduras to generate improved varieties that are more resilient to drought conditions.

#### *3.6. Conventional Genotypes Have Superior Yields*

Releasing varieties with higher yields is the objective of modern breeding programs, and our trial results suggest that improvements impacting yield have been made along the breeding history continuum from landraces to PPBs to conventional varieties. Considering the Honduran germplasm, the landraces were the lowest-yielding group at Yorito, followed by PPBs and the conventional varieties. This result is counter to the findings of early experiments performed by CIALs, where landraces out-yielded conventional materials [23]. However, our trial was conducted at the FIPAH office in Yorito, where soil fertility is less restrictive and the plot is flat, whereas the early CIAL trials were conducted in farmers' fields, which have low-fertility soils and sloped land; conditions for which conventional materials were not developed. The superior performance of the conventional materials in our trial at Yorito is consistent with the aim of modern breeding practices in generating higher yielding varieties. The North American checks were the highest yielding at Elora 2014, as they were bred for performance in this region, whereas the Honduran conventional and PPBs performed poorly at Elora 2014. The landraces also performed well at Elora 2014, and this may be attributed to phenotypic plasticity resulting from retention of useful nucleotide diversity enabling them to perform well in a new environment.

#### *3.7. Utility of Panel Genotypes for Breeding*

The different breeding and/or selection histories for the materials contributing to the phenotypic diversity present in the Honduran panel may provide opportunities for improving different traits in beans in the same way that a number of studies with different crops have found unique benefits from the use of landraces. In wheat, for example, cultivation of landraces in low-input systems has led to the conservation of traits that increase the duration of photosynthesis, which can lead to an increase in grain yield [82]. In a study comparing barley landrace and modern cultivar performances under stress conditions, the landraces were higher yielding and were less likely to fail outright [58]. The advantage of using landraces as parents in breeding programs has also been explored. In a study examining barley yields under drought conditions, progeny from crosses using landrace genotypes were found to be higher yielding than progeny from crosses without landraces in their pedigrees [56]. The authors concluded that breeding crops for vulnerable environments could be enhanced by identifying landrace alleles associated with yield performance and abiotic stress adaptation and employing these in breeding programs [56].

In the current study, landraces, which had superior symbiotic nitrogen fixation characteristics could be excellent sources of novel alleles for this trait. Similarly, PPB materials, which had superior WUE, and cultivated varieties, which had superior yields within their target environments, might be exploited, respectively, for these purposes. In general, all the germplasm types that were tested represent useful resources for breeding for important traits in the face of climate change and increasing production costs/demands.

The diversity for SNF capacity inherent in Honduran bean landraces, and their unique adaptation to the microclimates where they are grown, leads us to conclude that the inclusion of landrace germplasm in breeding for enhanced SNF would produce high fixing genotypes with growth and culinary characteristics already accepted by small-scale bean growers.

#### **4. Materials and Methods**

#### *4.1. Plant Material*

The Honduran Panel was assembled in 2014 at the University of Guelph in collaboration with agronomists at FIPAH. The initial panel contained 27 landraces, 26 PPB varieties, and 5 Honduran conventional checks provided by FIPAH, as well as 6 North American checks sourced from the University of Guelph bean breeding program.

The landraces consisted of traditional inbred varieties unimproved by modern plant breeding, which are grown by subsistence farmers in hillside communities. The PPB varieties were generated either through participatory varietal selection (PVS) or participatory plant breeding (PPB) through a collaboration between the bean breeding program at Zamorano and CIALs associated with FIPAH. The landraces and PPB varieties were sourced by M. Gomez (FIPAH) from six departments in west–central Honduras. The landraces were from Yoro, Intibucá, Francisco Morazán, and Lempira, and the PPB varieties were from Yoro, Francisco Morazán, Santa Bárbara, and Comayagua (Figure 1). Seed was either collected directly from farmers in their communities or sourced from central seed banks maintained by FIPAH and PRR. The five Honduran conventional checks were developed for production in lower to mid-altitude, hillside and valley commercial-production regions of the country. The six North American varieties consisted of Merlot and OAC Rosito as small red market class checks, Zorro as a black market class check, and three navy beans: OAC Mist, a high-nitrogen-fixing genotype, R99, a nonfixing mutant, and its parent line OAC Rico. All genotypes in the panel belong to race Mesoamerica [83].

In the first trial location (Elora 2014), 10 Honduran genotypes were found to be daylight sensitive and were not grown at the subsequent locations. Additionally, seed of 16 genotypes that exhibited uneven maturity in Elora 2014 were sent to Puerto Rico for seed increase over the winter of 2015. For the second trial location (Yorito, 2014–2015), 12 new genotypes (6 landraces, 5 improved, and 1 Honduran conventional check) were added to the panel. For the third trial location (Elora 2015), seed harvested from Elora 2014, from the Puerto Rican seed increase, and from Yorito were used, as available. An additional Honduran conventional variety was grown that year to fill in the experimental design. Overall, a total of 77 genotypes were tested in the Honduran panel, 50 genotypes of which were grown at all 3 locations. A summary of the genotypes included in the Honduran Panel, including trial year, market class, seed source, and pedigree information is provided in Tables 6–8 according to breeding history.


*Plants* **2020**, *9*, 1238

**Table 6.**

Genotypes of the North American check and Honduran

conventional

 breeding categories that were included in the HON panel tested at three field locations,


region








not available.


*Plants* **2020** , *9*, 1238


**Table 8.** *Cont*.

#### *Plants* **2020** , *9*, 1238


**Table 8.** *Cont*.

breeding lines are selected through generations of testing by CIALs; classified by FIPAH. NA

Information not available.

#### *Plants* **2020**, *9*, 1238

#### *4.2. Field Experimental Design and Maintenance*

Field trials were carried out at the University of Guelph Elora research station (ERS) in summer 2014 and 2015 and at Yorito, Honduras in the *Postrera* season (planted December, 2014).

#### 4.2.1. Elora

The Elora fields were selected based on low soil nitrogen levels as measured by preplant soil tests and by site crop rotation histories that indicated that no dry bean crops had been produced in those fields for the previous decade at a minimum. In 2014, nitrates (NO3 −) were low at 7.1 ppm, and ammonium (NH<sup>+</sup>4) levels were 3.2 ppm. In 2015, nitrates (NO3 −) were very low at 4.8 ppm, and ammonium (NH<sup>+</sup>4) levels were 4.5 ppm.

A square lattice design (8 × 8) with two replications was used for each trial. At the ERS in 2014, 135 seeds of each genotype were grown in 4-row plots (150 cm × 190 cm, 37.5 cm between rows) with approximately 6 cm between plants. At the ERS in 2015, 60 seeds of each genotype were grown in 4-row plots (150 cm × 90 cm, 37.5 cm between rows) with approximately 5 cm between plants within rows.

Clean seed of each genotype was coated with commercially available Nodulator (Becker-Underwood) *Rhizobium leguminosarum* bv *phaseoli* peat-based inoculant prior to planting. The day before planting, inoculant powder (1/4 teaspoon, approximately 0.4 g in 2014; 1/8 teaspoon, approximately 0.2 g in 2015) was added to each seed envelope and the contents were shaken to coat the seeds. Inoculated seed was stored at the ERS at 4 ◦C until planting to maintain inoculant viability. The entire contents of each envelope (coated seed + loose inoculant powder) was planted. Successful inoculation was confirmed each year by observing pink (active) nodules on a few plants chosen at random throughout the trial at flowering time.

At the ERS, plots were maintained with standard practices throughout the growing season, except no-nitrogen fertilizer was used. Preplant fertilizer (0-20-20) at a rate of 200 kg ha−<sup>1</sup> was applied approximately one week prior to planting. Preplant herbicide was applied to control broadleaf weeds, and pesticides were applied as needed throughout the growing season at standard rates to control leafhoppers, anthracnose, and root rot (see details in Wilker et al. [18]). Plots were manually weeded once before canopy closure each year.

At Elora 2014, the harvest was staggered according to maturity. The plots were pulled by hand at maturity and threshed at the side of the field using a Wintersteiger plot combine (Wintersteiger AG, Upper Austria, Austria) with a Classic Seed-Gauge weighing system by Harvest-Master (Juniper Systems Inc., Logan, UT, USA), and plot seed weight and moisture content were recorded. Plots that did not reach reproductive/physiological maturity were not harvested. In 2015, plot harvest took place after all plots reached maturity with an SPC20 Almaco plot combine (ALMACO, Nevada, IO, USA), which automatically recorded moisture and weight (kg ha<sup>−</sup>1) for each plot at 13% moisture.

#### 4.2.2. Yorito

The Yorito trial site was chosen based on field uniformity, access to irrigation, and proximity to the FIPAH regional office. Soil NO3 − levels were 18 ppm ("moderate" to "high") at Yorito, and the field had been used for bean and maize production in previous seasons. The FIPAH agronomist, M. Gomez, indicated that the trial site conditions were representative of the bean production areas around Yorito.

Seed of each North American check variety were provided by the University of Guelph, and seed of all Honduran bean genotypes in the trial were sourced in Honduras. A square lattice design (8 × 8) with two replications was used for the trial. At Yorito, 100 seeds of each genotype were grown in 2-row plots (100 cm × 500 cm, 50 cm between rows) with approximately 30 cm between plants and 3 seeds sown per hole, as per the traditional planting system.

Inoculant for the trial, a mixture of *Rhizobium etli* (CIAT 632) and *R. tropici* (CIAT 899) strains, was provided by J.C. Rosas (EAP-Zamorano) and was applied according to a protocol provided by *PIF* at a rate of 500 g ha−<sup>1</sup> [93]. Briefly, the peat-based inoculant powder was applied to slightly moistened seed to ensure it adhered well to the seeds, and once sown, the seeds were covered with soil to protect the inoculant from the sun. CIAT 899 has high symbiotic stability and efficient N-fixation characteristics [15].

Plots were maintained with standard practices throughout the growing season. A preplant fertilizer of 12-24-12 NPK was used at a rate of 64.81 kg ha−<sup>1</sup> as a formulation without N was not available in Honduras. Carbendazim (DEROSAL, Bayer) at a rate of 400 mL ha−<sup>1</sup> was used to protect the trial from angular leaf spot (*Pseudoscercospora griseola*), common bacterial blight (*Xanthomonas axonopodis* pv. *phaseoli*), and rust (*Uromyces appendiculatus*).

At Yorito, plots were harvested by hand over a number of days as each plot matured. Yield (kg ha<sup>−</sup>1) was measured at harvest, the two plots of each genotype were bulked, and a subsample of seed was sent to Guelph for further analyses.

#### *4.3. Phenotyping*

#### 4.3.1. Elora

As plots initiated the reproductive phase, DTF was recorded as the date when 50% of the plants in a plot had one flower open. Relative leaf chlorophyll content was measured twice during the growing season (when the mean number of plots had reached (1) the second trifoliate stage and (2) at 100% flowering) using an SPAD 502 Plus Chlorophyll Meter (Konica-Minolta). The meter was calibrated according to manufacturers' instructions each time the unit was powered-on (https://www.specmeters. com/assets/1/22/2900P\_SPAD\_502.pdf). The middle leaflet in the top-most, fully expanded trifoliate leaf was used for the measurements, and three plants were sampled at random per plot.

As plots reached physiological maturity, DTM was recorded as the date when plots were ready to harvest. Three plants were randomly sampled from mature plots, placed in large paper bags, and dried in a repurposed tobacco kiln (De Cloet Bulk Curing Systems, model TPG-360, Tillsonburg, ON, Canada) at 33 ◦C at the ERS for 24–48 h. Roots were cut from each plant and the above-ground biomass was weighed. Plants were then threshed using an indoor belt thresher (Agriculex SPT-1A, Guelph, ON, Canada), their seed collected, weighed, and counted. Hundred seed weights (HSW) were calculated.

#### 4.3.2. Yorito

As plots initiated the reproductive phase, DTF was recorded as the date when 50% of the plants in a plot had one flower open. Disease ratings and agronomic scores were collected throughout the growing season; however, statistical analyses revealed no significant differences between genotypes, and these traits are not further reported here. DTM was recorded as the date when 90% of the pods in a plot had changed color.

#### *4.4. Isotope Analysis*

Seed from each plot was processed and analyzed as detailed in Wilker et al. [18]. Briefly, seed was finely ground, precisely measured, and the isotope analyses were carried out using mass spectrometry at the Agriculture and Agri-food Canada (AAFC) gas chromatography mass spectrometry facility in Lethbridge, Alberta. Samples were analyzed for δ15N (‰) and δ13C (‰).

To calculate the percent nitrogen derived from the atmosphere (%Ndfa), the natural abundance method was used on seed samples in this study [94]. Seed N represents the total N accumulated by a plant over the course of the growing season, and seed N values are representative of whole-plant N values [21]. Additionally, processing seed samples is more efficient than shoot tissue.

The natural abundance method uses the following equation,

$$\%Ndfa = \frac{\delta^{15}N\_{refrecuce\ plant} - \delta^{15}N\_{fixing\ plant}}{\delta^{15}N\_{refrecuce\ plant} - B}$$

where δ<sup>15</sup>*Nref. plant* is the rate of δ15*N* in the reference genotype (R99), δ<sup>15</sup>*Nfixing plant* is the δ15*N* of the N-fixing bean genotype, and B is the average δ15*N* of beans grown in an environment where its entire N source is from fixation [95]. The B-value was obtained for this experiment as described by Farid (2015) [36]. Briefly, δ15*N* was measured and averaged for 20 bean genotypes from both the Andean and Middle American gene pools, which were grown in a growth room in N-free media. Normalized δ15*N* values were used for all genotypes, and an average of δ15*N* values for R99 were used in %Ndfa calculations.

To estimate water use efficiency, δ13C values obtained from GCMS seed analysis were used following the methods proposed by Farquhar et al. [47]. Because the current WUE discussion utilizes Δ13C values, the raw δ13C values were converted to Δ13C using the following equation:

$$
\Delta \mathcal{C} = \frac{\delta^{13} \mathcal{C}\_{air} - \delta^{13} \mathcal{C}\_{p\text{hr}\text{rt}}}{1 + \delta^{13} \mathcal{C}\_{p\text{laut}}}
$$

where <sup>δ</sup><sup>13</sup>*Cair* is the current free atmospheric level of approximately <sup>−</sup>8‰ and <sup>δ</sup><sup>13</sup>*Cplant* is calculated per plant seed sample using appropriate C isotope standards. For example, a plant with a δ13C value of <sup>−</sup>28.2‰ yields <sup>Δ</sup>13C <sup>=</sup> (−0.008 <sup>+</sup> 0.0282)/(1 <sup>−</sup> 0.0282) <sup>=</sup> 0.0202/0.9718 <sup>=</sup> 20.7‰.

#### *4.5. Genotyping*

To enable discovery of the genetic structure of the HON panel, 73 genotypes (see Table 1) were genotyped for single nucleotide polymorphisms (SNPs). DNA was extracted following the same methods outlined in Wilker et al. [18]. Briefly, plants were grown in a controlled environment (16 h photoperiod, 22 ◦C) at the University of Guelph, and young-leaf tissue samples were harvested, freeze dried, and the DNA extracted according to manufacturer's instructions using the NucleoSpin Plant II kit (Macherey-Nagel, Dueren, Germany). DNA of adequate quality was sent to the Genome Quebec Innovation Centre (McGill University, Montreal) for SNP genotyping using the Illumina Infinium iSelect Custom Genotyping BeadChip (BARCBEAN6K\_3) containing 5398 SNPs [96]. TASSEL was used to filter the SNP data (MAF > 0.01) to a set representing 72 individuals and containing 4314 polymorphic SNPs [97]. Missing data comprised less than 3% of the data, and these were subsequently imputed using Beagle v4.1 [98] as described by Torkamaneh and Belzile [99].

#### *4.6. Population Structure*

The population structure of the Honduran panel was determined using a number of methods, as follows. First, the population structure was estimated using variational Bayesian inference implemented in fastSTRUCTURE [100]. Five runs were performed for each number of populations (K) set from 1 to 10 using the 4.3K genome-wide SNPs identified in this study. A ChooseK analysis was conducted to determine the number of subpopulations that maximize the marginal likelihood. Then, a principal component analysis (PCA) was conducted in TASSEL and plotted using PCAshiny in R [101]. Finally, the evolutionary relationships among the genotypes of the panel were inferred using the Neighbor-Joining method with the genome-wide SNP data [102]. The taxa were clustered together using the bootstrap test (1000 replicates) [103]. The tree was drawn to scale, with branch lengths (next to the branches) in the same units as those of the evolutionary distances used to infer the phylogenetic tree. The evolutionary distances were computed using the Maximum Composite Likelihood method and the units correspond to the number of base substitutions per site. Evolutionary analyses were conducted in MEGA7 [104,105].

#### *4.7. Genetic Diversity*

The levels of genetic diversity in the landrace and PPB breeding history categories of the HON panel were assessed using the 4.3K SNP dataset and VCFtools [106]. The π statistic provides an indication of polymorphism within a population as measured by nucleotide diversity, and Tajima's D (*D*) provides an indication of selection pressure [107,108]. Both π and *D* were measured in sliding windows of 1 Mb across the genome using the—window-pi and—TajimaD options in VCFtools [106], which resulted in an average of 6 SNPs per window. The pairwise π and *D* values were also calculated among different subpopulations. Genome-wide averages of π and *D* for each breeding history category were generated by taking the average across all windowed calculations. Landrace and PPB π values were compared across the genome, and regions where landrace π exceeded PPB π by more than 3 times were considered highly differentiated, and the regions that were at least 25,000 bp in length were considered significant. To investigate the level of differentiation between the landrace and PPB genotypes, the *F*ST statistic was computed. *F*ST was calculated using the—weir-fst-pop option in VCFtools in sliding windows of 100 bp across the genome [106]. Weighted *F*ST values range from 0 with no genetic differentiation to 1 where fixation of alleles has occurred. *F*ST values exceeding 0.5 were considered significant in our analysis.

#### *4.8. Candidate Gene Investigation*

A literature search was carried out to identify previously reported QTL and genes that colocalize with regions where π values were highly differentiated between the Landrace and PPB categories. JBrowse (https://legumeinfo.org/genomes/jbrowse/) was used to explore the bean genome around SNPs with significant weighted *F*ST values in order to identify candidate genes. A 100 kb region centered on each significant marker was searched.

#### *4.9. Statistical Analysis*

Analysis of variance (ANOVA) tests were performed on the phenotypic data collected from each environment and all environments combined, using the GLIMMIX procedure in SAS (version 9.4, SAS Institute, Cary, NC, USA, 2012). In the combined analysis, genotype, environment, and the genotype-by-environment interaction were considered fixed effects, while all other effects and their interactions were considered random. In the separate environment analyses, genotypes were considered fixed effects, while all other effects and the interaction effects were considered random. The Shapiro–Wilk test was performed on the residuals in the UNIVARIATE procedure to test their normality [109]. Random and independent distributions of the residuals were visually examined by plotting the studentized residuals against the predicted values. Data that generated outlier residuals were removed from the data set. Further, single degree of freedom contrasts were conducted in GLIMMIX between breeding history categories—landraces, PPBs, and conventional and check genotypes—contrasting each category to each of the others. Repeated measures of leaf chlorophyll content (SPAD) were taken, and separate ANOVA tests were used to compare SPAD values at each time point in a combined analysis and by environment. The least squared means (LSmeans) for each trait were computed using the LSMEANS statement in the GLIMMIX procedure for each genotype.

Using the LSmeans calculated above, the pair-wise Pearson's coefficients of correlation were computed for all traits in the CORR procedure in SAS. The PRINCOMP and PRINQUAL procedures were used in SAS to generate the principal component (PC) values, to estimate the proportion of variance accounted for by each PC, and to plot PC1 against PC2 to generate genotype x trait (GT) biplots to determine genotype and trait interactions in each environment [110].

#### **5. Conclusions**

The aim of our study was to evaluate a large set of Honduran landraces and varieties generated through participatory plant breeding, as well as check conventional genotypes, to ascertain their value in future breeding efforts. We used simple genomics and phenotyping to characterize the panel. Our genetic analyses found that the panel is divided into predominantly-landrace and predominantly-PPB groupings, with Honduran conventional genotypes sharing most similarity to the PPBs. Breeding history and pedigrees account for this division. The genetic diversity analysis revealed that landraces have retained a higher level of nucleotide diversity than PPB genotypes, which we attribute to selection pressure imposed by breeding for different production environments/objectives, and the use of a small number of conventional/elite parents in breeding efforts. The nucleotide diversity inherent in landraces can be used to increase the frequency of rare alleles in breeding programs. Beyond genetic characterization, it is important to classify germplasm for trait phenotypic diversity, which could be employed in breeding, because landraces that have evolved in adverse environments contribute adaptive traits to variety development.

Two traits that contribute to climate resiliency, nitrogen fixation capacity, and water use efficiency were evaluated in our study. Genotypes with good nitrogen fixation capacity are an asset for remote hillside growers who have limited funds and limited opportunity to purchase inputs because of poor market access. Landraces were shown to have superior SNF capacity and are already favored by hillside producers. Genotypes with enhanced water use efficiency will also be an asset to hillside growers in a future with drier and more changeable growing conditions. The PPB and conventional varieties in our study show promising characteristics for drought resilience. Further evaluation of PPB varieties under drought conditions is warranted.

Honduran bean production will continue to be carried out predominantly by small-scale hillside producers. The widely grown farmer landraces are locally adapted and accepted by consumers, and future breeding efforts should deliver varieties that maintain the inherent SNF capacity of landraces, while enhancing drought resilience and producing high yields. PPB methods employed in the breeding efforts between EAP-Zamorano and FIPAH- and PRR-supported CIALs have succeeded in generating promising PPB varieties. One variety that combines these characteristics is Amilcar (HON05). It is a small red PPB variety selected among germplasm provided by Zamorano for improvement by a CIAL in Yoro. Ultimately, Amilcar has been widely accepted because of its commercial value, culinary qualities, and disease resistance, but its climate resiliency traits and yield potential are of critical importance. Landrace characterization, conservation, and employment in breeding programs will bring continued benefits.

**Supplementary Materials:** The following are available online at http://www.mdpi.com/2223-7747/9/9/1238/s1, **Table S1**: Candidate genes identified in regions of the *P. vulgaris* (2.0) genome where high nucleotide diversity (π) was discovered in landrace genotypes compared to PPB genotypes [5,32,33,35,37–39,46,63,67,111–115]. **Table S2**: Candidate genes identified in regions of the *P. vulgaris* (2.0) genome where SNPs with significantly high weighted FST values (>0.5) were found [40,42–44,46,116,117]. **Table S3**: F-test of fixed and Pearson's χ<sup>2</sup> test of random effects in the combined GLIMMIX mixed model analysis of the Honduran panel genotypes tested in multiple field locations in Ontario, Canada and Yorito, Honduras, 2014–2015. **Table S4**: F-test of fixed effect of genotype overall and by breeding history category, and the Pearson's χ<sup>2</sup> test of random effects in the GLIMMIX analysis of 63 genotypes tested at Elora, Ontario, Canada, 2014. **Table S5**: F-test of fixed effect of genotype overall and by breeding history category, and the Pearson's χ<sup>2</sup> test of random effects in the GLIMMIX analysis of 63 genotypes tested at Elora, Ontario, Canada, 2015. **Table S6**: F-test of fixed effect of genotype overall and by breeding history category, and the Pearson's χ<sup>2</sup> test of random effects in the GLIMMIX analysis of 63 genotypes tested at Yorito, Honduras, 2014–2015. **Table S7**: F-test of fixed effects and Pearson's χ<sup>2</sup> test of random effects in the GLIMMIX mixed-model analysis of leaf chlorophyll content (SPAD) at early vegetative (SPAD1) and reproductive (SPAD2) stages in the Honduran panel overall (Combined) and at Elora in 2014 and 2015. **Table S8**: Phenotypic (rp) correlations among %Ndfa and other traits estimated in the Honduran panel grown in three locations in 2014–2015. Values shown for traits within environments are significantly correlated at the 5% level. The number of genotypes analyzed in each correlation is presented in brackets for each significant correlation. **Figure S1**: Histograms of Nitrogen derived from the atmosphere (%) values of genotypes comprising the HON panel tested at three field locations from 2014–2015. A. Elora 2014, B. Elora 2015, and C. Yorito. Breeding history category averages with standard errors are presented, followed by individual genotype LSmeans with standard errors. North American check genotypes (CK; purple), Honduran conventional genotypes (CV; blue), landraces (LR; green), and PPB varieties (PB; red). Genotype numbers correspond to those listed in Tables 6–8. **Figure S2**: Histograms of carbon discrimination (Δ) values of genotypes comprising the HON panel tested at three field locations from 2014–2015. A. Elora 2014, B. Elora 2015, and C. Yorito. Breeding history category averages with standard errors are presented, followed by individual genotype LSmeans with standard errors. North American check genotypes (CK; purple), Honduran conventional genotypes (CV; blue), landraces (LR; green), and PPB varieties (PB; red). Genotype numbers correspond to those listed in Tables 6–8. **Figure S3**: Histograms of days to flowering of genotypes comprising the HON panel tested at two field locations from 2014–2015. A. Elora 2014, B. Elora 2015. Breeding history category averages with standard errors are presented, followed by individual genotype LSmeans with standard errors. North American check genotypes (CK; purple), Honduran conventional genotypes (CV; blue), landraces (LR; green), and PPB varieties (PB; red). Genotype numbers correspond to those listed in Tables 6–8. **Figure S4**: Histograms of days to maturity of genotypes comprising the HON panel tested at two field locations from 2014–2015. A. Elora 2014 and B. Elora 2015. Breeding history category averages with standard errors are presented, followed by individual genotype LSmeans with standard errors. North American check genotypes (CK; purple), Honduran conventional genotypes (CV; blue), landraces (LR; green), and PPB varieties (PB; red). Genotype numbers correspond to those listed in Tables 6–8. **Figure S5**: Histograms of yield (kg ha<sup>−</sup>1) values of genotypes comprising the HON panel tested at three field locations from 2014–2015. A. Elora 2014, B. Elora 2015, and C. Yorito. Breeding history category averages with standard errors are presented, followed by individual genotype LSmeans with standard errors. North American check genotypes (CK; purple), Honduran conventional genotypes (CV; blue), landraces (LR; green), and PPB varieties (PB; red). Genotype numbers correspond to those listed in Tables 6–8. **Data Repository files**: Phenotypic, SNP, and genetic diversity datasets. Available at https://doi.org/10.5683/SP2/DJB7VY

**Author Contributions:** Conceptualization, J.W., A.N., S.H. and M.G.C.; methodology, J.W., K.P.P. and A.N.; software, D.T. and M.E.; validation, J.W. and M.E.; formal analysis, J.W., A.N., M.E. and D.T.; investigation, J.W.; resources, J.W., K.P.P., A.N., M.G.C. and J.C.R.-S.; data curation, J.W. and D.T.; writing—original draft preparation, J.W.; writing—review and editing, K.P.P.; visualization, J.W. and D.T.; supervision, K.P.P.; project administration, J.W.; funding acquisition, K.P.P., and S.H. All authors have read and agreed to the published version of the manuscript.

**Funding:** This research was funded by the Ontario Bean Growers, Agriculture Agri-Food Canada through their Cluster Program, the Ontario Ministry for Food and Rural Affairs, the Ontario Ministry for Research and Innovation, and the Food from Thought Program at the University of Guelph funded by the Canada First Research Excellence Fund. SeedChange/USC Canada, Global Affairs Canada, International Development Research Centre, World Accord, Development Fund, Norway, USAID.

**Acknowledgments:** GCMS analyses were performed by Brett Hill at AAFC, Lethbridge and arranged by Frederic Marsolais, AAFC, London.

**Conflicts of Interest:** The authors declare no conflict of interest. The funders had no role in the design of the study; in the collection, analyses, or interpretation of data; in the writing of the manuscript, or in the decision to publish the results.

#### **References**


© 2020 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).

#### *Article*

## **DNA Fingerprinting and Species Identification Uncovers the Genetic Diversity of Katsouni Pea in the Greek Islands Amorgos and Schinoussa**

**Evangelia Stavridou 1, Georgios Lagiotis 1, Lefkothea Karapetsi 1, Maslin Osathanunkul 2,3 and Panagiotis Madesis 1,\***


Received: 21 March 2020; Accepted: 5 April 2020; Published: 9 April 2020

**Abstract:** Pea (*P. sativum* L.), one of the most important legume crops worldwide, has been traditionally cultivated in Lesser Cyclades since ancient times. The commonly known traditional pea cultivar, 'Katsouni', is endemic to the islands of Amorgos and Schinoussa and is of great local economic importance. Despite the widespread cultivation of 'Katsouni' in both islands, it is still unknown whether the current Schinoussa and Amorgos pea populations are distinct landraces, and if they have common evolutionary origin. To assist conservation and breeding of the pea crop, the genetic diversity and phylogenetic relationships of 39 pea samples from Amorgos and 86 from Schinoussa were studied using DNA barcoding and ISSR marker analyses. The results indicate that both populations are different landraces with distinct geographical distribution and are more closely related to *P. sativum* subsp. *elatius* than the *P. abyssinicum* and *P. fulvum* species. Further characterization of the 'Katsouni' landraces for functional polymorphisms regarding pathogen resistance, revealed susceptibility to the powdery mildew (*Erysiphe pisi* DC.). This work represents the first investigation on the genetic diversity and population structure of the 'Katsouni' cultivar. Exploiting the local genetic diversity of traditional landraces is fundamental for conservation practices and crop improvement through breeding strategies.

**Keywords:** pea landraces; Amorgos; Schinoussa; DNA Barcoding; ISSR genotyping; HRM analysis; powdery mildew

#### **1. Introduction**

Species in the economically important *Fabaceae* family have been a staple human food for millennia and their use is closely related to human evolution [1]. Legumes, such as Spanish vetchling (*Lathyrus clymenum* L.), lentils (*Lens culinaris* M.) and beans (*Phaseolus vulgaris* L.) are an important plant-based protein source, rich in mineral nutrients, complex starch and fibers, and contain health-promoting antioxidants, such as carotenoids and phenolic compounds [2–4]. The usage of leguminous crops in traditional crop rotation systems, reduces the need for synthetic nitrogen-based fertilizers by forming symbiotic relationships with nitrogen (N)-fixing soil bacteria [5]. Such management practices are of great ecological importance and have high potential for conservation agriculture, considering legumes are functional either as growing crop or as crop residue [6].

Pea (*P. sativum* L.) is among the most important legume crops, such as chickpea (*Cicer arietinum* L.), lentil and faba bean (*Vicia faba* L.), in temperate climates and has a wide geographical distribution,

with field pea being specifically adapted to a wide range of climates and altitudes. The *Pisum* species are of high commercial importance and are cultivated worldwide for dry and fresh consumption. According to the International Legume Database (ILDIS) and to the classification of Maxted and Ambrose (2001) [7], the *Pisum* genus includes three species: i) *Pisum abyssinicum*, ii) *Pisum fulvum* and iii) *Pisum sativum* L., which further includes the wild pea, *Pisum sativum* subsp. *elatius* (M. Bieb. Asch. & Graebn) and the domesticated pea, *Pisum sativum* subsp. *sativum*.

Phylogenetic analyses of various pea taxa with molecular markers indicate that hybridization between wild peas is not an extensive phenomenon [8]. The recently annotated pea genome sequence and the resequencing of data from 42 wild, landrace and cultivar *Pisum* genotypes, provided further insights into legume genome evolution [9]. It has been suggested that the common ancestor of the *Pisum* species was probably cytogenetically similar to *P. sativum* subsp. *elatius*, which evolved across the Mediterranean and Middle East [8] and gave rise to *P. sativum* subsp. *sativum* and *P. fulvum* in the northern Middle East. Regarding *P. abyssinicum*, two main hypotheses exist; it is considered the result of a domestication event from a southern *P. sativum* subsp. *elatius* ancestor [10] followed by a migration to Abyssinia, possibly through ancient human trading routes [11], indicating at least two domestication events independent of *P. sativum* subsp. *sativum* [9,12]. The alternative hypothesis about the origins of *P. abyssinicum* suggests that it derived from a hybridization event between *P. fulvum* and *P. sativum* subsp. *elatius*, which occurred in the western half region of the Fertile Crescent [13] and then a small sample was introduced to north-eastern Africa, where it evolved into the modern *P. abyssinicum* [14]. The very low genetic diversity in the Abyssinian pea suggests that the taxon has recently experienced a severe bottleneck or is a relatively young taxon [10] and the hybridization event has most likely occurred about 4000 years bp [15].

The *Pisum* genus is very diverse, showing the gamut of relatedness that reflect taxonomic identifiers, eco-geography and breeding gene pools [8,12,16,17]. Several phenotypic classification studies on pea germplasm are based on agronomical characteristics and morphological descriptors [18–21], which are unreliable for the evaluation of pea genetic resources and the identification of different cultivars in the Fabaceae family [22], especially considering the environmental effects on the expression of the genotype. Several different molecular methods have been previously employed to assess the genetic diversity in the *Pisum* genus, such as Random Amplification of Polymorphic DNA (RAPD) [23–25], Inter-Single Sequence Repeats (ISSRs) [23,24,26], Single Sequence Repeats (SSR) [27,28], Retrotransposon-Based Insertion Polymorphism (RBIP) markers [13,17,29–31] and Expressed Sequence Tags (EST)-derived genomic markers [32]. Additionally, high-throughput parallel genotyping via genome-wide next generation sequencing techniques have also been used to study the diversity of wild pea [8,33,34].

An alternative method for the simple and accurate authentication of plant species is DNA Barcoding. The *rbcL* and *matK* regions have been recommended as core DNA barcodes for plant identification [35]. In *Fabaceae* species, four coding chloroplast regions (*rpoB*, *rpoC1*, *rbcL*, and *matK*) and two non-coding nuclear regions (*ITS1* and *ITS2*) have been used as barcodes [22,36–38]. DNA barcoding has also been used to reconstruct the phylogenetic relationship of the main Mediterranean leguminous crops [39]. Furthermore, the combination of DNA barcoding with high resolution melting analysis (Bar-HRM) has, thus far, been proved an effective approach for the identification of diverse plant species, their Protected Designation of Origin (PDO) products and quantification of adulterants [40–43].

In Greece, pulses have been traditionally cultivated since the ancient times and is a staple food in the local culinary culture. A popular Greek dish (namely 'Fava') is typically prepared using different legume species, such as yellow- split peas (*P. sativum* L.) or faba beans (*Vicia faba*). However, in the island of Santorini, the authentic PDO 'Fava Santorinis', is exclusively prepared from a local grass pea variety of *L. clymenum*. In other Cycladic islands, especially in Amorgos and Schinoussa, 'Fava' is prepared from the dried peeled and split seeds of an endemic *Pisum* cultivar also known as 'Katsouni', named after the convex shape of a small sickle's lobe used for mowing the crop. 'Katsouni' is a traditional product and a crop of great economic importance for the Lesser Cyclades. Currently, is in the process to be appointed as a PDO EU mark, offering a significant income to the local farmers. 'Katsouni' is fully adapted to the local climatic conditions of the Cyclades, with dry and hot summers and mild winters. It is rich in proteins (over 22%) and can be stored after drying the seeds throughout the year.

Historic records indicate that the 'Katsouni' landrace is the result of long-term selection and evolution from prehistoric times that occurred in Amorgos. Since the mid-19th century 'Katsouni' was transferred to the deserted Schinoussa island by residents of neighbouring Amorgos, who moved to settle there, and has since been cultivated uninterruptedly. However, Schinoussa was not always an abandoned island. Archaeological excavations have revealed findings indicating great activity from the prehistoric times to the Classical and Hellenistic period [44]. In the Byzantine times, trade and commerce were essential components of the island's prosperity [45], which heralded an age of advancement, especially during the Venetian rule (13th–16th century). However, during the Ottoman rule (16th–19th century) the island was deserted as indicated by the famous botanist and traveler de Tournefort (2003) [46]. In the late Middle Ages, in 1537, Cyclades along with Schinoussa were plundered by the Ottoman pirate Haiderin Barbarossa [47] and piracy continued up to the 19th century [48]. With Schinoussa becoming a pirates' den, we hypothesized that either: (i) the current Schinoussa pea population is an independent landrace, which is possibly the result of introgression of *P*. *abyssinicum* (transferred by pirates) into the *P. sativum* subsp. *elatius* germplasm; (ii) the Amorgos and Schinoussa populations belong to the same widely distributed landrace, or (iii) the two landraces emerged from the split of an ancestral population.

The knowledge of genetic relationships and diversity among individual landraces is fundamental for conservation practices and the selection of appropriate parents in breeding programs. Hence, in the present study, we evaluated the application of ISSR marker analysis and DNA barcoding for the molecular characterization of local Amorgos and Schinousssa pea populations. Furthermore, considering that powdery mildew (*E. pisi* DC.) severely affects pea crops worldwide [49], and in the frame of targeting functional polymorphisms, we aimed at characterizing the two landraces for the presence of the powdery mildew resistance gene (*er1-7)* with HRM analysis. The diversity assessment of local landraces may not only provide insights in understanding pea phylogenetics and population genetics, but also broaden pea breeding strategies.

#### **2. Results**

#### *2.1. DNA Barcoding, Sequencing and Tree Analysis*

To identify potential inter- and intra-specific variation between the two pea populations of Amorgos and Schinoussa, 24 pea samples were analyzed in the present study using the *ITS2*, *trnL* and *rpoC*, and 21 for the *psbA*-*trnH* and *matK* barcoding regions. The selection of *matK* and *psbA*-*trnH* was based on the unique SNPs observed in the aligned sequences among the three species and especially between *P. abyssinicum* and *P. sativum* (Figure S1). The amplification of *trnL* and *rpoC* regions was 100% successful, whilst for the *psbA-trnH*, *ITS2* and *matK* sequences the rate was 95.24%, 95.8% and 90.47%, respectively.

The positive amplicons were sequenced, and based on the BLAST results, most of the markers were able to identify the samples at the genus level, but not at the species level, when blasted against the NCBI database. The BLAST entries matched with all the three *Pisum* species (*P. sativum* subsp. *sativum*, *P. sativum* subsp. *elatius*, *P. fulvum* and *P. abyssinicum*), with over 96% similarity in identity and coverage. However, the *ITS2* marker identified all target sequences as *P. sativum* subsp. *elatius* with 100% similarity to the available sequences.

The sequences of each barcode gene were aligned and compared against indicative BLAST entries. In *psbA-trnH* we observed an intel polymorphism among our samples and the NCBI database entries. Most of our samples showed a gap in the barcoding region similar to that observed in the *elatius* subspecies sequences, but not in the *P. abyssinicum* and *P. fulvum* sequences (Figure S1). Other observed variations include the *P. abyssinicum*-specific polymorphism of a thymine (T) in place of a cytocine (C), and an unspecific SNP (G/T) present exclusively in the samples of both islands and the *P. sativum*

subsp. *elatius* NCBI entries (Figure S1). Regarding the *ITS2* barcode, only the brown colored peas from both islands shared a common nucleotide variation of a guanine (G) in place of an adenine (A) (Figure S2). The *matK* barcoding showed a *P. fulvum*-specific polymorphism (C/A) and other unspecific SNPs (Figure S2). However, the observed variations in *trnL* and *rpoC* regions were not consistent among either seed coat color and/or landrace (Figure S2). Furthermore, both *matK* and *psbA-trnH* regions showed higher diversity and pairwise distance values compared to the other markers (Table S1); yet, the observed low genetic variation values across all barcoding regions (Table S1) indicate that the populations are closely related with a recent common ancestor.

To provide a basic illustration of the phylogenetic associations between the landraces, the DNA barcoding data was used to calculate the genetic distances and generate Neighbour-joining dendrograms (Figure 1 and Figure S3). The *psbA-trnH* dendrogram illustrates the clustering of the *P. abyssinicum* sequences supported by 76% bootstrap value, which are distinct from the rest of the samples at 64%, emphasizing that our samples probably do not belong to *P. abyssinicum* (Figure 1). In contrast, both the main structures of the *trnL* and *ITS2* dendrograms (Figure S3) separated the two landraces in two distinct clusters, which also corresponded to the represented geographical regions, although supported by a low bootstrap value (30–60%). The main structure of the *rpoC* dendrogram (Figure S3) presented also two clusters separating the pea taxa from Schinoussa to those of Amorgos island, supported by a higher bootstrap value (59–66%). However, the *matK* dendrogram (Figure S3) did not show any significant patterns. Taking into consideration the results from the barcoding analysis, the two populations probably belong to the *P. sativum* subsp. *elatius*.

**Figure 1.** The *psbA-trnH* dendrogram illustrating the phylogenetic relationships between Amorgos (red circle) and Schinoussa (blue triangle) pea populations. The corresponding NCBI sequences of the *P. sativum* subsp. *elatius* (purple square), *P. sativum* subsp. *sativum* (white square), *P. abyssinicum* (yellow square) and *P. fulvum* (green square) were used as reference taxa.

#### *2.2. ISSR Genotyping*

To further investigate the genetic differences between the two populations, we used fifteen ISSR markers, out of which six were found to be polymorphic. ISSR analysis of the pea populations using the six polymorphic markers yielded 66 bands in total; five unique bands were identified for the Schinousasa population, whereas the remaining 61 bands were shared between the two populations (Table 1). The Schinoussa population presented a significantly greater polymorphism compared to Amorgos by displaying a greater number of different alleles (Na), whilst the number of effective alleles (Ne) did not show significant differences (Table 2).

**Table 1.** Band patterns of the Amorgos and Schinoussa populations resulted from the ISSR analysis.


**Table 2.** Mean value and standard error over loci for Amorgos and Schinoussa populations.


Mean = Mean value, SE = Standard error, N = Number of alleles, Na = Number of different alleles, Ne = Number of effective alleles = 1/(pˆ2 + qˆ2), I = Shannon's Information Index = −1\* (p \* Ln (p) + q \* Ln(q)), h = Diversity = 1 − (pˆ2 + qˆ2), uh = Unbiased diversity = (N/(N − 1)) \* h.

The genetic differentiation between landraces (PhiPT distances) was significantly different (PhiPT = 0.188; P ≤ 0.001), indicating that the two landraces are geographically distinct. The Analysis of Molecular Variance (AMOVA) based on the PhiPT values indicated that most of the genetic diversity occurred within landraces (80%), while the variability among landraces contributed to the 20% of the observed genetic diversity (Table 3).


**Table 3.** AMOVA analysis of the Amorgos and Schinoussa pea populations.

Df = Degrees of freedom, SS = Sum of Squares, MS = Mean Square, Est. Var. = Estimated Variance.

The Principal Coordinates Analysis (PCoA) generated two major clusters, in which samples from Amorgos and Schinoussa were clearly separated (Figure 2A). Additionally, the clustering based on seed coat color supports that the two landraces present genetic differences (Figure 2B), given Amorgos' Black, Brown and Green peas do not overlap with the corresponding colors of Schinoussa peas, supporting the hypothesis that the Amorgos and Schinoussa pea populations are two distinct landraces. The UPGMA dendrogram illustrates that the pea samples originating from the same geographic location are closely clustered (Figure 3).

**Figure 2.** Principal Coordinates Analysis (PCoA) of the Amorgos and Schinoussa pea populations, clustering for: (**A**) population, and (**B**) seed coat color from each landrace. The PCoA analysis shows the separation of the two populations as distinct landraces based on their region of origin.

**Figure 3.** Dendrogram of the Amorgos and Schinousa pea populations based on the UPGMA analysis of the ISSR polymorphisms. Individuals are shape- and color-coded based on their region of origin (Red circle: Amorgos, Blue triangle: Schinoussa).

#### *2.3. Molecular Screening for Powdery Mildew Using HRM Analysis*

HRM analysis using a specific functional marker was employed to characterize the two landraces for the presence of the powdery mildew resistance gene. Regarding the specific InDel marker, all the pea samples tested were found negative to the resistant allele (*er1-7*), across both landraces (Figure 4A). This was also confirmed by sequencing of selected samples, where the resistant allele was absent and therefore the 10-bp sequence TCATGTTATT was present (Figure 4B).

**Figure 4.** HRM analysis coupled with a co-dominant functional marker specific for *er1-7.* (**A**) Normalized fluorescence graph of selected pea samples per seed coat color and landrace. In the x axis deg. indicates temperature in ◦C. (**B**) Sequence alignment of the *er1-7* region from selected pea samples and the corresponding reference sequences obtained from Sun et al. (2016) [50].

#### **3. Discussion**

Exploiting genetic diversity from local traditional landraces is fundamental for conservation practices and breeding programs, especially under the pressure posed for adaptation to climate change worldwide. To promote the local pea landraces of the Cyclades we aimed at identifying the species and understanding the genetic relationship of the local Amorgos and Schinousssa pea populations, using DNA barcoding and ISSR marker analysis. Over the centuries, from the prehistoric times and the Bronze Age (3000–1100 B.C in Greece) to the mid-19th century, the Aegean was a field of pirate action [51,52]. Oral traditions and place names, throughout the history of Schinoussa, suggest that the island was used as a ground and shelter for pirates. One of our hypotheses was that the landrace of Schinoussa belongs to *P. abyssinicum*, however, our results strongly suggest otherwise. Although both landraces showed a geographical clustering according to the PCoA and the UPGMA analysis of the ISSR polymorphisms, both were identified based on DNA Barcoding as *P. sativum* subsp. *elatius*.

Herein, the inability of most of the barcoding markers to discriminate samples at the species level, being not variable enough to resolve phylogeny of the genus, is probably due to the conserved chloroplast sequences and the low mutation rate [53]. The inability to discriminate among the *P. abyssinicum*, *P. sativum* subsp. *elatius* and *P. fulvum* is in agreement with the view that *P. abyssinicum* is an ancient hybrid of the two species [14]. Nevertheless, despite the inability of the *psbA-trnH* spacer on identifying species due to the frequently observed intraspecific inversions [54], in this study it was shown as the most effective marker for separating the two pea landraces from *P. abyssinicum*. Additionally, although the *matK* barcode showed the lowest resolution capacity in our study, it was sensitive enough to discriminate the two pea landraces from *P. fulvum*. Thus, the most informative

barcoding markers to draw conclusions concerning the species identification in the two populations is the combination of *psbA-trnH* and *matK*. Taking into consideration the phylogenetic trees and barcoding results from all the studied markers, the two landraces are more closely related to *P. sativum* subsp. *elatius*, than to the *P. abyssinicum* and *P. fulvum* species.

There is a limited number of studies on *Pisum* germplasm that mainly are focusing on SSR analysis [55]. Herein, the ISSR analysis showed that the populations were distinguished into geographical regions, as separated clusters, indicating the adaptation of these traditional landraces to relatively different agro/climatic conditions (Figures 2 and 3). This geographical isolation could have potentially resulted to the genetic drift of the two landraces. This result is in accordance with the larger molecular differences detected between pea landraces collected in Maritime areas of Spain [26]. Pea is known as self-pollinating with occasional cross-pollination which allows spontaneous hybridization [56]. As a self-pollinated crop, higher genetic diversity is expected among cultivars than within cultivars. However, the AMOVA analysis showed larger genetic diversity within cultivars (80%), which is in accordance with similar findings in other legumes such as chickpea [57]. This may be attributed to the natural interspecific crosses that can occur between *Pisum* species, serving as a source of additional genetic diversity for the selection of common pea [15]. This indicates that despite autogamy, the analysis of genetic diversity on some plants per landrace might be useful in breeding programs [28]. The large genetic diversity might be due to the long-term adaptation of the landraces to the local environment and the diverse agro-ecological systems [55], in combination with putative migration events among the regions, followed by introgression with pre-existing germplasms [28].

Aiming to identify unique traits in the two pea traditional landraces we screened for the powdery mildew resistance gene (*er1*). The er1 is the loss-of-function mutation in the disease susceptibility-related *PsMLO1* gene [58] and the most widespread across resistant pea germplasm, conferring penetration resistance [50,59]. HRM coupled with a functional marker has been previously reported to be highly efficient and cost-effective for routine large-scale screening of pea germplasm for resistance to powdery mildew [60]. This approach allowed for the accurate genotyping of both the homozygous and heterozygous resistant peas [58,60]. In this study, using the *er1-7* functional marker coupled with HRM, we identified that the two landraces of the Cycladic islands are susceptible to powdery mildew. These results indicate the need for further crop improvement of the traditional pea landraces by introducing resistance through molecular-assisted breeding strategies.

#### **4. Materials and Methods**

#### *4.1. Plant Material and DNA Isolation*

The plant material used in this work includes samples from two pea populations from the islands of Amorgos and Schinoussa (Table 4 and Figure 5). The pea samples were grouped based on the geographical region of origin and the seed coat color; groups A-C from Amorgos and groups D-G from Schinoussa (Table 4). Color grouping was based on the three main seed coat colors (black, brown and green) observed in both populations, except for Schinoussa, which had an additional brown-green hue (Table 4). Following, a subsample of seeds from each group was planted in pots containing 2:1 peat:perlite in order to obtain fresh leaf material for DNA extraction. Total genomic DNA was isolated from approximately 0.1 g of fresh leaf material for each sample following the modified CTAB protocol as described by Doyle and Doyle (1987) [61]. After extraction, the DNA samples were re-diluted in 1X TE buffer (10 mM Tris-Cl pH 8.0, 1 mM EDTA) and stored at −20 ◦C. DNA quantity and quality were assessed by regular spectrophotometric procedures using the UV-Vis Spectrophotometer Q5000 (Quawell Technology Inc., U.S.A.) and gel electrophoresis in 1% agarose gel.


**Table 4.** Samples from Amorgos and Schinoussa pea populations used in this work. The samples were grouped in sub-groups (A–G) according to region of origin and seed coat color.

**Figure 5.** Map of the sampling sites in Amorgos and Schinoussa islands.

#### *4.2. DNA Barcoding and Sequencing Analysis*

For the identification of the two landraces, we performed DNA barcoding analysis using the *ITS2* [62], *trnL* [63], *rpoC*, *matK* [35], and *psbA-trnH* [64] barcoding markers. PCR amplification was performed on a Rotor-Gene 6000 real-time 5-Plex HRM PCR Thermocycler (Corbett Research, Sydney, Australia), using the Rotor-Gene Q software version 2.0.2 (Corbett Life Science, Cambridge, UK). PCR reaction mixtures with a total volume of 20 μL consisted of approximately 20 ng genomic DNA, 1× PCR buffer, 0.5 μM forward and reverse primers, 0.2 mM dNTPs, 1.5 mM SYTO™ 9 Green Fluorescent Nucleic Acid Stain (Invitrogen, Eugene, Oregon, USA), and 1 U Kapa Taq DNA polymerase (Kapa Biosystems, USA). The universal regions were amplified using the following protocol: initial denaturation at 95 ◦C for 4 min, followed by 35 cycles of 95 ◦C for 30 sec, corresponding annealing temperature (Ta) ◦C for 60 sec, and 72 ◦C for 60 sec with a final extension phase at 72 ◦C for 3 min.

After sequencing, the sequences of the five candidate regions were aligned with the MUSCLE algorithm and genetic distances were calculated using Molecular Evolutionary Genetics Analysis X (MEGA X; Version 10.05) based on the K2P-distance model [65] to evaluate divergence between the two populations. The Neighbour-joining clustering method was used to demonstrate the represented differences as an unrooted dendrogram using MEGA X [65]. Statistical support for each constructed tree was provided by two statistical data analysis as bootstrapping (1000 replications) and pairwise distance model.

Species identification based on the sequence similarity approach was performed with the National Center for Biotechnology Information (NCBI) database [66] by basic local alignment search tool (BLAST; setting: blastn, megablast) [67] and all regions of the three various *Pisum* species were used as query sequences. Correct identification was concluded when the best BLAST hit of the query sequence had over 96% query coverage and identity.

#### *4.3. ISSR Genotyping and Data Analysis*

For the distance-based analysis of the two populations we used 15 ISSR markers of which six (UBC811, UBC818, UBC827, UBC841, UBC873, UBC880) were selected for further analysis based on their discrimination efficiency. The total volume of PCR reaction was 25 μL containing 1X PCR buffer, 0.2 mM dNTPs, 10 μM primer, 1 U/μL Taq DNA Polymerase (Kapa Biosystems Ltd.) and 20 ng template DNA. The profile of the PCR reaction program was an initial denaturation for 4 min at 94 ◦C, followed by 35 cycles at 94 ◦C for 30 s, 40 sec annealing at the corresponding Ta ◦C, and 40 sec extension at 72 ◦C, ending with final extension phase at 72 ◦C for 7 min. The amplified PCR products were run on 1.5% agarose gel with 1X TAE buffer at 100 V and visualized with the UV Minibis Pro (DNR Bio-Imaging Systems, Jerusalem, Israel) instrument. Band scoring was performed using the Logger Pro 3.15 software.

DNA fragment profiles were scored in a binary fission with '0 indicating the absence and '1 indicating presence of a band. Using the binary haploid data, a pairwise individual-by-individual genetic distance matrix was constructed using the Jaccard coefficient. The percentage of polymorphic loci (P), Number of alleles (N), Number of different alleles (Na), Number of effective alleles (Ne), gene diversity (expected heterozygosity, He), Shannon's diversity index (I), Diversity (h) and unbiased genetic distances (uh) according to [68] were subsequently calculated. The hierarchical distribution of genetic diversity among and within populations was also characterized by Analysis of Molecular Variance (AMOVA) and Principal Co-ordinate Analysis (PCoA). All of the above analyses were performed using the GenAlex 6.5 software package [69]. The Unweighted Pair Group Method based on Arithmetic Averages (UPGMA) clustering analysis for analyzing the similarity estimates was performed using MEGA X [65] and expressed as a dendrogram.

#### *4.4. Molecular Screening for Powdery Mildew Resistance Using HRM Analysis*

We used a co-dominant functional marker specific for *er1-7*, the InDel111–120, associated with pea resistance to powdery mildew [50,59,60]. Representative samples were selected from each color group of both populations. The total volume of PCR reaction was 20 μL containing 1X PCR buffer, 0.2 mM dNTPs, 10 μM of the er1-7 primer, 1 U/μL Taq DNA Polymerase (Kapa Biosystems Ltd.) and 20 ng template DNA. PCR conditions where: preheating for 5 min and initial denaturation at 95 ◦C, followed by 35 cycles at 94 ◦C for 30 s transition, 30 sec annealing at 56 ◦C. The fluorescence was measured at the end of each extension step during the PCR cycles. The HRM was performed by an initial pre-melt conditioning of the PCR products at 95 ◦C for 5 sec and 50 ◦C for 30 sec, followed by a melt at range of 75–85 ◦C in increments of 0.1 ◦C every 2 sec. Fluorescence was measured at the end of each increment.

#### **5. Conclusions**

This work represents the first investigation focused on the molecular characterization of pea traditional accessions collected from the semi-arid region of Lesser Cyclades with the main goal to evaluate their genetic diversity and their population structure. Based on the DNA Barcoding and the ISSR marker analysis: i) both landraces have been identified as *P. sativum* subsp. *elatius* germplasm and ii) the landraces show a tendency for differentiation, which is in accordance with the geographical distribution of the genetic structure as an underlying evolutionary process. However, further research and phylogenetic analysis is required with larger population sizes for better understanding of the evolutionary processes that led to these differences, as well as for the preservation of existing diversity in *ex-situ* collections.

**Supplementary Materials:** The following are available online at http://www.mdpi.com/2223-7747/9/4/479/s1, Table S1: Pairwise distance and diversity analysis within and between populations of the five barcodes, Figure S1: Sequence alignments of the *psbA-trnH* region, Figure S2: Sequence alignments of the *ITS2*, *trnL*, *rpoC* and *matK* regions from selected pea samples and the corresponding Pisum reference sequences available in the NCBI, Figure S3: The *ITS2*, *trnL*, *rpoC* and *matK* dendrograms illustrating the phylogenetic relationships between Amorgos (red circle) and Schinoussa (blue triangle) pea populations.

*Plants* **2020**, *9*, 479

**Author Contributions:** Conceptualization, P.M.; methodology, E.S., G.L., L.K., M.O.; data curation, E.S., G.L.; writing—original draft preparation, E.S.; writing—review and editing, E.S., G.L., M.O.; visualization, G.L., E.S.; supervision, P.M.; funding acquisition, P.M. All authors have read and agreed to the published version of the manuscript.

**Funding:** This research has been co-financed by the European Regional Development Fund of the European Union and Greek national funds through the Operational Program Competitiveness, Entrepreneurship and Innovation, under the call RESEARCH-CREATE-INNOVATE (project code: T1EDK-04448).

**Acknowledgments:** We would like to thank Dimitris Roukas for providing the pea landrace seeds and vital information regarding the origin of the seeds and the history behind its cultivation in the two islands. We would also like to thank the University of Chiang Mai for financial support.

**Conflicts of Interest:** The authors declare no conflict of interest.

#### **References**


© 2020 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).

### *Article* **Evaluation of Wild Potato Germplasm for Tuber Starch Content and Nitrogen Utilization E**ffi**ciency**

#### **Silvia Bachmann-Pfabe \* and Klaus J. Dehmer**

Leibniz Institute of Plant Genetics and Crop Plant Research (IPK), Genebank Department, Satellite Collections North, Gross Luesewitz Potato Collections, Parkweg 3a, 18190 Gross Luesewitz, Germany; dehmer@ipk-gatersleben.de

**\*** Correspondence: pfabe@ipk-gatersleben.de; Tel.: +49-38209-82-314

Received: 10 June 2020; Accepted: 1 July 2020; Published: 2 July 2020

**Abstract:** Potato wild relatives provide a considerable source of variation for important traits in cultivated potato (*Solanum tuberosum* L.) breeding. This study evaluates the variation of tuber starch content and nitrogen utilization efficiency (NutE) in wild potato germplasm. For the experiments regarding starch content, 28 accessions of ten different tuber-bearing wild *Solanum*-species were chosen, and in vitro plantlets were raised from seeds. Twenty plantlets (= genotypes) per accession were then cultivated in the greenhouse until natural senescence and tuber starch content was determined. The average tuber starch content across all genotypes tested was 21.7% of fresh mass. Contents above 28% of fresh mass were found in 50 genotypes, belonging to the species *S. chacoense*, *S. commersonii*, *S. jamesii*, and *S. pinnatisectum*. Subsequently, 22 wild genotypes revealing high tuber starch contents and four modern varieties of cultivated potato were studied as in vitro plantlets under optimal and low N supply (30 and 7.5 mmol L−<sup>1</sup> N). Low N supply lead to a genotype-dependent reduction of shoot dry mass between 13 and 46%. The majority of the wild types also reduced root dry mass by 26 to 62%, while others maintained root growth and even exceeded the NutE of the varieties under low N supply. Thus, wild potato germplasm appears superior to cultivars in terms of tuber starch contents and N utilization efficiency, which should be investigated in further studies.

**Keywords:** genetic resources; *Solanum chacoense*; stress tolerance

#### **1. Introduction**

Besides being one of the most important food crops worldwide, potato (*Solanum tuberosum* L.) plays an important role in industry due to its starchy tubers. Starch is used in bakery products, thickening products, soups and noodles but also for the production of paper, textiles, building materials, pharmaceutical products, chemicals and biodegradable packaging materials [1]. Compared to other starches, potato starch has superior characteristics because it is easily isolated, of high purity and of large granule size, needs low temperatures for gelatinization and produces gels with highest viscosity [2]. Depending on genotype and growing environment, a fresh potato tuber contains about 20% of dry mass, 60–80% of which is starch [3]. Dedicated starch varieties may even reach a starch content of up to 23% of fresh mass [4]. The nitrogen (N) fertilizer regime influences starch yield by positively affecting canopy development and photosynthesis efficiency, dry matter partitioning to the tubers, tuber bulking and tuber yield formation [5,6]. Furthermore, field N availability influences starch quality parameters, such as granule size, viscosity and breakdown [7]. Consistently, Maltas et al. [8] reported a highly significant effect of different N fertilizer rates on total tuber yield, the percentage of large tubers and starch concentration under field conditions in cv. Bintje and Laura.

Depending on environmental conditions and genotype, the potato crop has been found to remove 90 to 190 kg N ha−<sup>1</sup> [9]. However, the shallow, less branched and less dense root system of potato does not allow the exploration of a large soil volume or to retrieve nitrogen from deeper soil layers, and hence, potato demands a high level of readily available soil N at the right period of growth [5,9–12]. In a review, Iwama [13] reported that most of the potato roots are present in the upper 30 cm of the soil and only a small fraction extends to 100 cm. In combination with the fact that potatoes are often cultivated on coarse and sandy soils and under irrigation, these areas face an increasing potential for nitrate leaching and contamination of groundwater [14].

Thus, improving the N use efficiency in potato production is not only of economic, but also of environmental concern, and different measures such as split application of N or foliar application of urea are being discussed in order to reduce N leaching [5]. Improving the N uptake and use efficiency of the potato crop itself is also an important approach, and many studies evaluated the N use efficiency in cultivated potatoes [8,15–21]. In contrast, only few studies evaluated the N use efficiency and/or tuber starch contents of native Andean cultivars or wild potato germplasm [22–25]. The wild relatives of the cultivated potato could be an important source of variation for root length and morphology, tuber starch content and N use efficiency. The secondary and tertiary genepool of potato has intensively been studied as a source of disease resistance [26–28], and was used, amongst others, to improve foliar late blight and nematode resistance of *S. tuberosum* [29,30]. In terms of N use efficiency, Errebhi et al. [22] compared 39 wild potato accessions of 23 species with three cultivated varieties under high and zero N in the field. They found some wild potato genotypes which were able to take up significantly more applied N than their cultivated relatives. Genotypes of *S. microdontum* and *S. chacoense* were identified as the ones with the highest N uptake efficiency (NupE) [22]. Selected native Andean cultivars indicated a similar nitrogen use efficiency to commercial cultivars, but showed, despite different environmental conditions, a highly consistent performance across a two-year field study [24]. To our knowledge, the most comprehensive study of tuber starch content and quality in exotic germplasm, was provided by Jansen et al. [25]. Accessions of 46 wild and cultivated potato species showed a high variation in starch contents ranging between 3.8 and 39.6% of fresh mass. Highest starch contents were predominately found in genotypes of species *S. pinnatisectum* and *S. chacoense* [25].

Based on the above-mentioned findings, our study aimed to (I) update and assess the variability of the tuber starch contents in wild potato germplasm and to (II) study the nitrogen use efficiency of genotypes with high tuber starch contents in relation to modern cultivars.

#### **2. Results**

#### *2.1. Variation of Tuber Starch Contents in Wild Potato Germplasm after Greenhouse Cultivation*

In 2013, altogether 28 different wild potato accessions (= populations) representing ten different species were cultivated in the greenhouse to evaluate their tuber starch contents. For each accession 20 different genotypes were cultivated as in vitro plantlets, however, the results only include genotypes which produced sufficient tubers for starch analysis (in total 506 genotypes, Table 1). On average, of all the 506 genotypes analyzed, the starch content in the tubers amounted to 21.7% of fresh mass (FM). The lowest average starch content with 14.2% of FM was measured for accession Gross Luesewitz Potato Collections (GLKS) 31559 (*S. stenotomum*), while accession GLKS 30211 (*S. commersonii*) showed the highest starch content with on average 30.0% of FM. Interestingly, all genotypes of accession GLKS 30211 showed high tuber starch contents ranging between 26.4 and 33.3% of FM, indicating a rather low variation within this population (Table 1). From *S. chacoense,* all the 15 accessions had an average starch content of 22.5% of FM and showed a rather low variation between the populations (CV = 8.71%), and a higher within population variation (CV: 11.5–23.9%, Table 1). Ten accessions of *S. chaoense* encompassed genotypes with a tuber starch content higher than 28% of FM, a target value which was considered as selection criterion for high-starch genotypes in this study. Regarding *S. pinnatisectum*, the tubers of the five accessions studied, had an average starch content of 22.2% of FM, but showed starch contents up to 36.6% of FM. Altogether 11 genotypes of three accessions produced starch contents higher than 28% (Table 1).


**Table 1.** Wild potato accessions of the Gross Luesewitz Potato Collections (GLKS) used in the study ranked according to their tuber starch content. Given are the number of genotypes tested per accession, the mean starch content (%) in the tubers, its minimum and maximum values and the coefficient of variation (CV %).

For subsequent N efficiency experiments, genotypes with starch contents higher than 28% of FM were to be used. In *S. chacoense*, altogether 23 genotypes were identified revealing a tuber starch content above 28% of FM. Overall, 15 genotypes, belonging to accessions GLKS 30135, GLKS 30154, GLKS 30156, GLKS 30159, GLKS 30160, GLKS 30177, GLKS 30181, GLKS 30916, and GLKS 30995 were selected for the N experiments and re-cultivated in 2014 to validate their starch contents (Table 2). In *S. pinnatisectum*, 11 genotypes showed tuber starch contents above 28% of FM, and genotype GLKS 31600\_10 with a starch content of 36.6% was selected for up-coming experiments and re-tested in 2014. Additionally, two genotypes with the highest starch contents of *S. microdontum* (GLKS 30688\_04, GLKS 30688\_12) and *S. stenotomum* (GLKS 31559\_11, GLKS 31559\_14) as well as one genotype of *S. tuberosum* subsp. *andigena* (GLKS 34995\_18) were selected for the N efficiency experiments in order to cover a broader spectrum of *Solanum* species, even if they had starch contents below 28% of FM (Table 2). The tuber starch contents determined in 2013 and after re-testing of selected genotypes in 2014 correlated well (r = 0.72, *p* ≤ 0.01), confirming the high-starch properties of the majority of the selected accessions.


**Table 2.** Selected accessions and genotypes of the Gross Luesewitz Potato Collections used for the N efficiency experiments as well as their respective tuber starch contents (%) in 2013 and 2014.

#### *2.2. Dry Yield of Shoots, Roots and Root-DM:Shoot-DM Ratio in the N Experiments*

Shoot and root DM as well as the root-DM:shoot-DM ratio were predominately affected by the genotype and, to a lesser extent, by the factor treatment. For these traits, the factor genotype explained up to 68% of the variation in the data, while the factor treatment explained between 6 and 22%. The genotype × treatment interaction explained 11 and 15% of the total variation for root DM and root-DM:shoot-DM ratio, respectively, but played only a minor role for shoot DM (3.58%, Table 3).

In the high N (30 mmol L<sup>−</sup>1) treatment, the shoot DM of the genotypes varied between 214 and 682 mg vessel−<sup>1</sup> (Table 4). The lowest shoot DM was observed for the genotypes GLKS 31600\_10, GLKS 30177\_17 and cv. Kiebitz, while the highest biomass was achieved by the genotypes GLKS 30177\_20, GLKS 30181\_06 and GLKS 30160\_15. These genotypes even exceeded the performance of cv. Tomba, which showed the highest shoot DM amongst the standard varieties. Under low N supply (7.5 mmol L<sup>−</sup>1), shoot DM ranged between 156 and 549 mg vessel−1. Shoot DM decreased under low N supply on average by 115 mg vessel−<sup>1</sup> (23%) and the shoot biomass reduction was significant for all the genotypes tested, except for cv. Kiebitz and cv. Eurobravo (Table 4). The strongest reduction (>30%) was observed for genotypes GLKS 30135\_19, GLKS 30995\_18, GLKS 30177\_02 and GLKS 31559\_11. A moderate shoot DM reduction (15 to 20%) at simultaneously high yields in the control was observed for genotypes GLKS 30135\_05, GLKS 30160\_13 and GLKS 30177\_20. Again, cv. Tomba produced the highest shoot DM amongst the standard varieties under low N supply. However, several wild potato genotypes performed as well or even exceeded the shoot DM of cv. Tomba under reduced N conditions (Table 4).


#### *Plants* **2020** , *9*, 833

30688\_12 30916\_08 30995\_18

*S. microdontum*

*S. chacoense* *S. chacoense*

497 a 384 a \*\*\* 139 b 116 a ns 24.2 a 6.58 b \*\*\* 4.07 a 2.05 a \*\*\* 28.2 a 8.64 b \*\*\*

536 a 403 a \*\*\* 160 b 104 b \*\* 22.8 a 5.80 b \*\*\* 3.65 a 1.83 a \*\*\* 26.4 a 7.64 b \*\*\*

458 b 306 b \*\*\* 103 b 51.5 b \*\* 22.5 a 5.84 b \*\*\* 2.87 b 1.04 b \*\*\* 25.4 b 6.90 b \*\*\*


**Table 4.** *Cont.* The letters within one column indicate whether there is a significant difference to the best cultivar Tomba ("b") or not ("a", Dunnett's test *p* ≤ 0.05), underlined are mean significantly higher than that of cv. Tomba; asterisks indicate a significant difference between the treatments within one genotype (pairwise comparisons, Tukey adjustment; \*\*\* *p* ≤ 0.001,\*\**p* ≤ 0.01, \* *p* ≤ 0.05, ns = not significant).

The root DM varied between 30 and 337 mg vessel−<sup>1</sup> in the control treatment. The lowest root growth (<80 mg vessel<sup>−</sup>1) was observed for genotypes GLKS 31600\_10, cv. Kiebitz, GLKS 30181\_18 and GLKS 31559\_14, while the genotypes GLKS 30154\_09, GLKS 30160\_13 and GLKS 30177\_20 produced more than 300 mg vessel−1. Under low N conditions, the root DM varied between 41 and 219 mg vessel<sup>−</sup>1. The different genotypes either maintained or decreased root DM due to N deficit. The decrease was highest (about 60%) for the genotypes GLKS 30135\_05, GLKS 30177\_02, GLKS 30181\_06, while the genotypes GLKS 30177\_15, GLKS 30181\_18, GLKS 30688\_04, GLKS 30688\_12, GLKS 31559\_14, GLKS 31600\_10 and cv. Kiebitz maintained root mass. Most interestingly, genotype GLKS 30177\_15 produced a high root mass in the control and maintained it under reduced N conditions. Amongst the cultivars, cv. Tomba produced the highest root mass in both treatments. However, several wild potato genotypes produced a significantly higher root biomass under control (GLKS 30160\_13, GLKS 30177\_20) or low N conditions (GLKS 30160\_15, GLKS 30177\_15, GLKS 30177\_20) compared to cv. Tomba (Table 4).

Relating the root to the shoot biomass improves the understanding of genotype-specific reactions to low N supply. The root-DM:shoot-DM ratio varied strongly between the wild types, ranging from 0.14 to 0.61 in the control and 0.10 to 0.48 in the low N treatment, but was generally comparable to that of the standard varieties (Table 5). High values above 0.50 were calculated for genotypes GLKS 30154\_09, GLKS 30177\_01, GLKS 30160\_13 and GLKS 30177\_17, indicating a strong root growth in relation to the shoot. Genotypes reacted differently to low N supply, either by maintaining (13), reducing (11) or increasing (2) root-DM:shoot-DM ratio. The genotypes GLKS\_31559\_14 and GLKS\_31600\_10 showed the lowest root-DM:shoot-DM ratio in the control, but increased root growth at the expense of the shoot under low N.


**Table 5.** Mean root-DM:shoot-DM ratio as well as partitioning of N taken up by the different wild potato genotypes (root-N:shoot-N ratio) cultivated under optimal and reduced N conditions in a climate chamber for 21 days. Results of ANOVA and post hoc comparison of means.

The letters within one column indicate whether there is a significant difference to the best cultivar Tomba ("b") or not ("a", Dunnett's test *p* ≤ 0.05), underlined are mean values significantly higher than that of cv. Tomba; asterisks indicate a significant difference between the treatments within one genotype (pairwise comparisons, Tukey adjustment; \*\*\* *p* ≤ 0.001, \*\* *p* ≤ 0.01, \* *p* ≤ 0.05, ns = not significant).

#### *2.3. N Uptake, N Partitioning and N E*ffi*ciency Parameters in the N Experiments*

Interestingly, the genotypic variation for the shoot and total N uptake was low and explained only about 4% of the total variation. In contrast, the factor genotype explained between 20 and 60% of the total variation for the traits root N uptake, NutE and N partitioning (root-N:shoot-N ratio). Additionally, a clear genotype × treatment interaction was observed for the root-N:shoot-N ratio (Table 3). In the control treatment, the average shoot N uptake was 22.0 mg N vessel−<sup>1</sup> and varied only by <sup>±</sup> 3.18 mg vessel−<sup>1</sup> (CV <sup>=</sup> 14.5%). In the low N treatment, shoot N uptake was on average reduced by 15.3 mg N vessel−<sup>1</sup> (69%), and the reduction was significant for all the genotypes tested (Table 4). Amongst the standard varieties, cv. Tomba had the highest N uptake in the control, and many wild potato genotypes achieved shoot N uptakes as high as cv. Tomba. However, under low N supply, the wild potato genotypes generally lag behind cv. Tomba, except for genotype GLKS 30135\_05.

In both treatments, root N uptake accounted for approximately one quarter of the shoot N uptake but showed a stronger variation between the genotypes. In the control, root N uptake was on average 4.59 mg vessel−<sup>1</sup> and varied by <sup>±</sup> 1.36 mg vessel−<sup>1</sup> (CV <sup>=</sup> 29.6%). Under low N supply, root N uptake decreased on average by 59% to 1.88 mg vessel<sup>−</sup>1. The reduction was significant for all the genotypes tested, except for GLKS 31600\_10. Again, cv. Tomba was the best cultivar in terms of root N uptake in both treatments, but, in contrast to shoot N uptake, root N uptake of many wild potato genotypes even exceed that of cv. Tomba under control as well as under reduced N conditions (Table 4).

Total N uptake (shoot + root N uptake) generally ranged from 22.1 to 31.4 mg vessel−<sup>1</sup> (except GLKS 31600\_10 with only 12.8) in the control, and from 6.90 to 11.3 mg vessel−<sup>1</sup> in the reduced N treatment. Relating the total N uptake of the plant to the total amount of N supplied revealed an average NupE of 102% in the control and 132% in the low N treatment. This indicates that the plants took up all the N provided via the nutrient solution and that an additional amount of N was introduced into the system via the ten shoot tips. The NutE, expressed as the amount of total biomass (shoot + root dry mass) produced per unit N taken up, increased, on average, from 25 units in the control to 57 units in the low N treatment. NutE differed moderately between the genotypes in the control treatment where it ranged from 16 to 35 units. In the reduced N treatment, NutE varied strongly from 28 to 78 units. Although all genotypes significantly increased their NutE under low N conditions, many wild potato genotypes exceeded the NutE of the standard varieties. For example, in the low N treatment 13 of the wild potato genotypes achieved a similar and eight genotypes a significantly better NutE than the best cultivar Tomba (Figure 1). The N partitioning, as the ratio of N taken up by the root and N taken up by the shoots (root-N:shoot-N ratio), gives insight into the distribution of N within the plant. In the control, most of the wild types showed values similar to that of the standard varieties, while nine wild types showed significantly higher values than cv. Tomba. Under reduced N conditions, the majority of the wild types increased the root-N:shoot-N ratio, while it remained constant for all the standard varieties. This indicates a stronger partitioning of N to the root for wild potato genotypes, in particular under N deficit.

To evaluate the stress performance of the different genotypes, two commonly used indices (stress susceptibility index SSI, stress tolerance index STI) were applied based on total plant dry mass (shoot DM + root DM). In Figure 2 the accessions and genotypes are ranked according to their SSI and their STI from the best to the weakest genotype. The SSI, as a measure of the yield stability under stress conditions, identified cv. Kiebitz, GLKS 31559\_14 and GLKS 30181\_18 as the three most stable genotypes across both environments, while the reaction to N stress was most pronounced for GLKS 30177\_17, GLKS 30177\_02 and GLKS 31559\_11. On the other hand, the STI identified GLKS 30177\_20, GLKS 30160\_15 and GLKS 30160\_13 as the most promising genotypes, because they produced high total yields under control as well as under stress conditions, whereas cv. Kiebitz and GLKS 31600\_10 ranked least, due to relatively low yields in both treatments (Figure 2).

**Figure 1.** Mean nitrogen utilization efficiency (NutE) of 22 different wild potato genotypes and the four standard varieties under optimal (**a**) and reduced N conditions (**b**) after 21 days of cultivation in a climate chamber. Error bars indicate standard deviation, the dotted line indicates the mean NutE of the best cv. Tomba. The different letters at the bottom of the bars indicate whether there is a significant difference to the best cultivar Tomba ("b") or not ("a", Dunnett's test *p* ≤ 0.05).

**Figure 2.** Stress susceptibility index (SSI) and stress tolerance index (STI) of different wild potato genotypes and four cultivars based on their total dry mass production (DM) under optimal and reduced N supply during 21 days of cultivation in a climate chamber. Genotypes are ranked according to their performance, with the best being on top of the graph.

#### **3. Discussion**

We assessed the tuber starch content in wild potato accessions and studied the reaction of selected wild types to N deficiency. After greenhouse cultivation, the average starch content of all tested wild potato genotypes amounted to 21.7% of FM, but ranged from minimal 7.1 to maximal 36% of FM. This indicates considerable variation for this trait in the wild relatives of cultivated potato. Furthermore, in 50 out of the 506 genotypes, tuber starch content was higher than 28%, which clearly exceeds the starch contents of modern cultivars. Based on our study, germplasm with high tuber starch content is predominately found in *S. chacoense*, *S. commersonii*, *S. jamesii* and *S. pinnatisectum*. Our results are in line with Jansen et al. [25], where tuber starch contents ranged between 3.8 and 39.6% of FM in wild species cultivated in the greenhouse. Here, genotypes of the species *S. chacoense* and *S. pinnatisectum* revealed the highest starch contents. In comparison, 14 modern starch potato varieties cultivated in pot experiments revealed starch contents between 13.9 and 21.9% of FM [20]. Under field conditions, tuber starch contents ranged from 7 to 20% of FM in a set of 300 potato cultivars, breeding clones, landraces and diploid clones [31]. This points out the potential of wild potato germplasm to increase tuber starch contents in cultivars. However, next to the tuber starch content, also the starch yield, as a result of starch content multiplied by tuber yield, plays an important role for industrial starch production. Whether the high starch contents in the wild species will be maintained when tuber size increases due to breeding, has still to be examined. Studies of Schönhals [31] showed that tuber yield is negatively correlated with tuber starch content, while no significant correlation between tuber yield and starch content was found by Bombik et al. [32].

An increase in tuber starch content and starch yield should be linked with a high resource use efficiency. This holds particularly true for the element N, because potato cultivation bears high risks of N leaching due to its high demand of readily available N in soil and the small root system of the crop [5,6,14]. Based on our results mentioned above, 22 wild potato genotypes with high tuber starch contents and four commercial cultivars were studied for their N use efficiency by cultivating them as in vitro plantlets in 500 mL vessels filled with a nutrient solution containing 30 or 7.5 mmol L−<sup>1</sup> N for 21 days in a climate chamber. This system allowed us to screen a high number of plants under low space requirement and highly controlled conditions. Several other reports underline the potential of in vitro cultures for the evaluation of potato germplasm with respect to abiotic and biotic stress or rooting characteristics, because it provides conditions independent from weather conditions, pathogens, N leaching or immobilization events [16,33–35]. Our results revealed a high variation in shoot and root DM development between the wild potato genotypes. N deficiency significantly reduced shoot DM for all wild types and the cultivars (except for cv. Kiebitz and cv. Eurobravo), with the extent of shoot DM reduction being genotype dependent. Most interestingly, genotypes GLKS 30135\_05, GLKS\_30160\_13 and GLKS 30177\_20 of *S. chacoense* showed a moderate shoot DM reduction due to low N supply whilst producing a comparably high shoot biomass under high N conditions. The root DM varied considerably and the wild types GLKS 30160\_13, GLKS 30160\_15, GLKS 30177\_15 and GLKS 30177\_20 clearly exceeded the root growth of the best cultivar Tomba under high and/or low N supply, indicating that wild potato germplasm could considerably contribute to enhance root growth of *S. tuberosum* cultivars. Under N deficit, the genotypes followed different strategies in terms of root development. A significant reduction of root DM was observed for the majority of the genotypes, while seven genotypes maintained root biomass. This was also reflected in the root-DM:shoot-DM ratio which was either maintained (13 genotypes), reduced (11 genotypes) or increased (two genotypes). To sustain or even increase root biomass at the expense of the shoots is a well-known reaction of plants to nutrient deficiency and helps to maintain the nutrient uptake from soil or nutrient solution by exploring a larger (soil) volume [36,37]. In contrast, other wild types seem to preferably invest into shoot growth, probably in order to maintain photosynthetic activity. Different strategies to cope with low N as observed in our experiment are also known from cultivated potatoes. A higher root:total mass ratio under N deficiency was reported for the majority of 17 modern starch and table potato varieties during the course of 18 days of in vitro cultivation [16]. On the other hand, the authors also identified some genotypes which missed the ability to stimulate root growth at the expense of the shoots under N deficiency. This is in accordance with previous studies under climate chamber conditions where some cultivars reduced root FM with increasing N stress, while others showed an increased root

development upon N reduction and even maintained root growth at very low N levels [17]. From their studies, Schum et al. [16,17] observed that genotypes with high biomass production and fast nitrogen uptake under high N supply did not enhance root growth under low N and clearly reduced biomass production. On the other hand, genotypes with comparatively slow growth under high N supply increased root mass under low N supply [16,17]. Besides, it should be considered that other well-known responses of plants to nutrient deficiency stress such as changes in the root architecture, increase of root length, root surface area, root volume or number of root hairs [33], is not necessarily reflected in changes of the total root biomass as measured in our experiment. In relation to the total N applied, the plants took up almost all the N available in both treatments (except for GLKS 31600\_10). This explains the low genotypic variation for total N uptake (Table 3). In some cases, N uptake of the genotypes even exceeded the amount of N provided via the nutrient solution, probably due to the additional N introduced into the system via the transferred shoot tips. Therefore, it is difficult to evaluate the NupE of the different accessions. Nevertheless, on the basis of the uniform N uptakes of the genotypes, the results give insight into genotype-dependent N partitioning and provide a clear picture in terms of NutE.

Most of the N taken up by the different genotypes was partitioned to the shoots under high N supply in our study. Under low N supply, a clear shift towards the roots was observed for many genotypes (Table 5). This confirms previous results, where generally a higher percentage of N was translocated to the roots or tubers under nutrient deficiency [6,36]. The increase in root-N:shoot-N ratio became especially evident for many wild potato genotypes, while this was less pronounced for the standard varieties.

NutE was on average 25.3 units in the control and increased to 57.3 units under N deficiency and the genotypic variation in NutE was particularly high in the low N treatment (Figure 1). This is in line with several studies [16,17,20,24]. In our experiment, many wild potato genotypes exceeded the NutE of the standard varieties under low N supply. Outstanding genotypes were GLKS 30154\_09, GLKS 30916\_08, GLKS 30181\_06, GLKS 30156\_16, GLKS 30177\_15, GLKS 30160\_13, GLKS 30160\_15 and GLKS 30177\_20 of *S. chacoense*. This indicates that these genotypes need considerably less N to produce the same amount of biomass. A high NutE is often related to a good translocation of N from the root to the shoot and/or a reallocation from older leaves to the younger leaves in order to maintain the photosynthetic activity and eventually to the reproductive organs [8,36]. The superior performance of *S. chacoense* genotypes was also found in field studies where, amongst 39 wild potato accessions, genotypes of *S. microdontum* and *S. chacoense* revealed the highest total biomass (tubers + roots + shoots + fruits), a high NupE and N recovery from soil, even exceeding the performance of the control varieties cv. Russet Norkotah and cv. Red Norland [22]. The authors attributed the higher N recovery by the wild species to the deeper penetrating, denser, and more branched root system that is advantageous for nutrient uptake. However, it generally has to be considered that wild potato species form only small tubers, in some cases, produce stolons rather than tubers under long day conditions. Whether the high N recovery rate of the wild type will be maintained after crossing to cultivars needs detailed examination. Hybrids of *S. chacoense* and a haploid *S. tuberosum* line (USW551) were studied in the field with high and zero N supply by Errebhi et al. [23]. Here, hybrids showed highest N use efficiency and produced a total biomass (tubers + roots/stolons + shoots + fruits) higher or similar than that of commercial varieties, but tuber yield was low [23].

For a final assessment of the overall performance of the different genotypes under N deficiency, we studied the two stress indices SSI and STI based on the total DM. Zhao et al. [38] studied different indices to evaluate low N tolerance in maize, and advised to use several indices and not to rely on only one. The SSI for example, proposed by Fischer and Maurer [39] for evaluating the yield stability under stressed and non-stressed environments, does not consider the yield of a respective genotype in relation to the other genotypes tested under control conditions [38]. Cv. Kiebitz and GLKS 31559\_14, for example, exhibited the lowest SSI and could thus be considered as the ones with the lowest N stress susceptibility. That is confirmed by no significant changes in shoot or root mass under low N as

compared to high N. However, these genotypes produced a low total biomass during the three-week in vitro culture compared to the other genotypes tested even in the high N treatment. This might indicate that genotypes with a slow biomass development, and by this a probably rather low internal N demand, react less sensitive to a reduction in N supply than fast growing types with a strong biomass development. By calculating the STI, these genotype-specific growth rates were considered, and here genotypes with a high biomass development under both control and stress conditions rank best. Under this premise, cv. Kiebitz and GLKS 31559\_14 were rather intolerant to N stress, while GLKS 30177\_20, GLKS\_30160\_15, GLKS\_30160\_13 and GLKS 30181\_06, GLKS\_30177\_15 and cv. Tomba were more tolerant. Finally, genotypes being among the best under both indices will be very interesting candidates for further research and pre-breeding. Here, we consider GLKS 30177\_20, GLKS 30177\_15 and GLKS 30160\_15 of *S. chacoense* as the most relevant genotypes, because they combine high shoot and root biomasses in both treatments with a moderate reduction in shoot and root biomass under low N supply. Furthermore, the best performers revealed the highest share of root biomass in relation to total biomass, maintained root-DM:shoot-DM ratio under low N, but partitioned more N to the roots than other genotypes and revealed a high internal N utilization efficiency. *S. chacoense* is a well-known source of pest and disease resistance, resistance to cold-induced sweetening and abiotic stresses such as drought tolerance, but its tubers contain high levels of toxic steroidal glycoalkaloids [40,41]. As a diploid species (2n = 2x = 24, EBN 2), hybridization with tetraploid *S. tuberosum* (2n = 4x = 48, EBN 4) is possible after a ploidy reduction in the *S. tuberosum* parent to the diploid level, followed by backcrossing [42], but it is also an interesting future candidate for diploid breeding programs [43].

Apart from a sole comparison of different genotypes, our study highlights the variation of N use efficiency between genotypes within one population. The five genotypes of GLKS 30177 reacted differently to N stress. While GLKS 30177\_15 and GLKS 30177\_20 belong to the best performing genotypes, GLKS 30177\_02 and GLKS 30177\_17 exhibited a medium to low shoot DM, a strong reduction in root DM and root-DM:shoot-DM ratio under low N, low NutE and a low stability under stress (SSI). Instead, GLKS 30177\_01 can be considered as an intermediate type. These results underline the high diversity of the different genotypes within a wild potato accession which is maintained as a population in gene banks. Furthermore, it highlights the importance to study, describe and maintain individual wild potato genotypes in order to promote the use of wild potato germplasm in breeding and research [44]. Therefore, the tested genotypes in this study are maintained clonally via in vitro propagation at the Gross Luesewitz Potato Collections.

Genotypes GLKS 34995\_18, GLKS 31559\_14 and GLKS 31559\_11 of *S. tuberosum* subsp. *andigena* and *S. stenotomum*, belonging to the cultivated part of series Tuberosa and being most related to *S. tuberosum* according to the taxonomy of Hawkes [45], showed no outstanding performance in the respective experiment. Tuber starch contents as well as shoot and root biomass or N uptakes and efficiencies were on an intermediate to low level. Although revealing the highest tuber starch content, the genotype of *S. pinnatisectum* (GLKS 31600\_10) showed a rather weak performance under the given experimental conditions. This was indicated by the lowest shoot and root biomass as well as the lowest N uptakes compared to the other wild potato genotypes. Since this was the sole genotype of this species, we can only speculate whether this is a generally low-yielding species or if the experimental conditions were unfavorable. However, its reactions to N deficit clearly distinguished it from the other genotypes. Although not statistically significant, absolute root mass increased by about 35% under N deficit and the proportion of root biomass in relation to total biomass increased most at the expense of the shoot mass. In relation to its exceptionally high tuber starch contents, it is worth studying the root parameters of this species in further experiments.

#### **4. Materials and Methods**

#### *4.1. Plant Material*

In total, a set of 28 accessions was selected from the Gross Luesewitz Potato Collections (GLKS, Gross Luesewitz, Germany) of the Leibniz Institute of Plant Genetics and Crop Plant Research (Table 6). The set comprised 15 accessions of the species *S. chacoense* Bitter, five accessions of *S. pinnatisectum* Dunal, and one each of *S. tuberosum* subsp. *andigena* Hawkes, *S. commersonii* Dunal, *S. hondelmannii* Hawkes and Hjerting, *S. jamesii* Torrey, *S. microdontum* Bitter, *S. sparsipilum* (Bitt.) Juz. and Bukasov, *S. stenotomum* Juz. and Bukasov, and *S. tarijense* Hawkes (Table 6). These species were selected because they are known for other interesting traits such as disease resistance and/or because they originate from regions with high temperature and low rainfall, and by this may additionally provide tolerance to heat and drought. The latter is especially expected from *S. chacoense*, which originates from the Chaco-Region, a hot and semi-arid region in southern America. Detailed passport data of the wild potato accessions maintained at the IPK Potato Collections in Gross Luesewitz are also available via the genebank information system (GBIS). For in vitro establishment, 50 seeds of the respective accession were pretreated in gibberellic acid (500 ppm) for 24 h at room temperature to improve germination. After that, seeds were treated with 5% NaClO solution to sterilize them and placed in a test tube (one seed per tube) containing about 6.0 mL of a solid culture medium under sterile conditions. The solid culture medium was composed as described by Murashige and Skoog [46]. The seeds were then placed in a climate chamber at 20 ◦C and 12 h of light (150–250 μmol m−<sup>2</sup> s−1). After approximately four weeks, 20 well developed genotypes per accession were chosen for further experiments and multiplied. For multiplication, the plantlet of a respective genotype was cut in up to four nodal sections which were then transferred to new tubes with solid culture medium and grown in a climate chamber as described above. Prior to their cultivation in the respective experiments, genotypes were tested for quarantine diseases like virus (X, Y, L, S, M, A), potato spindle tuber viroid (PSTVd), bacteria (*Clavibacter michiganensis* ssp. *sepedonicus*, *Ralstonia solanacearum*) and Andean viruses (Andean Potato Latent Virus (APLV-Col, APLV-Col 2, APLV-Hu), Andean Potato Mottle Virus (APMoV-B, APMoV-H), Potato Black Ringspot Virus (PBRSV), Aracacha Virus B, Oca strain (AVB-O), Potato Virus T (PVT), Potato Virus V (PVV), Potato Yellowing Virus (PYV)).

#### *4.2. Evaluation of Tuber Starch Contents*

In 2013, 20 genotypes of each accession and three plantlets per genotype were transferred into pots (16 × 16 cm, 16 cm deep) filled with a turf-based planting substrate (95% white turf, 5% sand, 1.5 kg NPK (14% N, 16% P2O5, 18% K2O, micro nutrients), Einheitserde GmbH, Uetersen, Germany) and cultivated in the greenhouse. One accession was finally represented by 60 pots (20 genotypes and three plants per genotype). Plants were irrigated daily with rain- or tap water according to their needs. After natural senescence (three to four months after planting), irrigation was stopped, the aboveground plant biomass was removed, and the tubers were harvested separately for each genotype. Due to limited greenhouse capacities, starch content evaluations of accessions GLKS 30211, GLKS 30475, GLKS 31583, GLKS 31595 and GLKS 31559 were performed in 2014 in the same way as described above. Accordingly, accessions GLKS 31559, GLKS 32852 and GLKS 34995 were repeated in 2014 because too many genotypes were lost during greenhouse cultivation or too few tubers were produced. Furthermore, tubers of genotypes with high starch contents were re-cultivated in 2014 in order to validate the results.


**Table 6.** Overview of the accessions analyzed for starch content and nitrogen use efficiency as well as passport data of the accessions and taxonomic classification (according to Hawkes [45]).

ˆ EBN = Endosperm Balance Number; \* COM = Commersonia, PIN = Pinnatisecta, TUBc = Tuberosa cultivated, TUBw = Tuberosa wild, YNG = Yungasensa; ◦ ARG = Argentina, BOL = Bolivia, MEX = Mexico, URY = Uruguay, USA = United States of America, UNK = unknown.

#### *4.3. Plant Material and Experimental Setup of N Experiments*

For the N efficiency studies, genotypes were selected which had a tuber starch content higher than 28% of FM in 2013 and grew reliably in vitro and in the greenhouse. These comprised 16 genotypes from nine different accessions of *S. chacoense* and one genotype of *S. pinnatisectum*. Additionally, to cover a broader spectrum of *Solanum* species, two genotypes of *S. microdontum* and *S. stenotomum* as well as one genotype of *S. tuberosum* subsp. *andigena* were added (see Table 2). The selected genotypes were multiplied in vitro as described above. Finally, after having produced 40 plantlets per selected genotype, shoots tips of approximately 1.5 to 2.0 cm length were transferred to the testing system.

An in vitro method for early evaluation of nitrogen use efficiency traits as described in Schum et al. [17] was applied. In brief, 500 mL glass cultivation vessels were filled with 62 mL of a nutrient solution based on Murashige and Skoog [46] (Table 7). For the N experiments, two N levels were applied, containing 0.420 g L−<sup>1</sup> N (control) and 0.105 g L−<sup>1</sup> N (reduced N), respectively, being equivalent to 30 mmol L−<sup>1</sup> and 7.5 mmol L−<sup>1</sup> N. Ten shoot tips of one genotype were cultivated in one vessel for 21 days, and fixed via a perforated stainless steel plate. The transfer of the shoots to the experimental system was carried out under sterile conditions and the vessels were closed with a cellulose ring to enable gas exchange plus a glass lid to prevent contamination. They were placed in a climate chamber with 12 h of light and a constant temperature of 20 ◦C in a complete randomized design. All treatment × genotype combinations were repeated four times. Due to the high number of accessions and genotypes to be multiplied and tested, combined with the unequal growth rate of the different accessions, several consecutive experiments were conducted. For comparison, four modern varieties (cultivars, cv.) were used, kindly provided by the breeders; cv. Kiebitz (Norika, Germany), cv. Maxi (Bayerische Pflanzenzuchtgesellschaft, Germany), cv. Eurobravo and Tomba (Europlant Pflanzenzucht, Germany).


**Table 7.** Composition of the nutrient solution used in the control and reduced N treatment of the N efficiency experiments.

#### *4.4. Laboratory Analyses and Calculations*

#### 4.4.1. Tuber Starch Contents

Ten tubers of similar size (approx. 2 cm long) were selected per genotype, washed and analyzed immediately for their starch content via the underwater weight method using a balance (KERN PES 6200-2M, Kern & Sohn GmbH, Balingen, Germany) equipped with a cage to sink the tubers in water. The specific gravity (SG) was calculated based on the weight in air divided by difference of weight in air and weight under water. The starch content was calculated based on studies by Lunden [47] and as described in Meise et al. [20]:

Starch (% of FM) = −211.89 + 209.06 ∗ SG

#### 4.4.2. Yield, Nitrogen Uptake and Stress Indices

Harvested shoots and roots formed per in vitro vessel were weighed to determine fresh mass (FM). Dry mass (DM) was determined after drying the shoot and root biomass in an oven at 60 ◦C and weighing. The dry plant material was then ground in a mixer mill (Retsch, Tissue Lyser, Quiagen GmbH, Duesseldorf, Germany) for one minute at a frequency of 30 s−<sup>1</sup> using 3 mm steel beads. After that, the dried and ground plant material was analyzed for its N content using an elementar analyzer (Eurovector EA 3000 b, HEKAtech GmbH, Wegberg, Germany). Shoot and root N uptake were calculated by multiplying the measured N content with the shoot or root dry mass. Total N uptake was calculated by summing up shoot and root N uptake. The root-DM:shoot-DM ratio was calculated by dividing the root DM by the shoot DM. Similarly, the N uptake into the root was divided by the shoot N uptake to reflect the partitioning of N in the plant, and was denoted as root-N:shoot-N ratio. The N uptake efficiency (NupE, %) was calculated by dividing the total N uptake by the amount of N supplied in the respective treatment:

$$\text{NupE}\left(\%\right) = \frac{\text{total N uptake (mg per vessel)}}{\text{total N supplied (mg per vessel)}}$$

*Plants* **2020**, *9*, 833

The relation of the total N taken up by the plant to the total dry mass produced is denoted as N utilization efficiency (NutE) and was calculated as follows:

$$\text{NutE (arb. u.)} = \frac{\text{total dry mass (mg per vessel)}}{\text{total N uptake (mg per vessel)}}$$

Furthermore, the stress susceptibility index (SSI) was calculated based on total dry mass (SSIDM) as introduced by Fischer and Maurer [39]:

$$\text{SSI} = \frac{(1 - \text{Ps}/\text{Pc})}{(1 - \text{meanPs}/\text{meanPc})}$$

where Ps is the parameter (DM, N uptake) determined under stress conditions and Pc is the parameter determined under control conditions. This is related to the mean of all genotypes tested under stress conditions (mean Ps) divided by the mean of all genotypes under control conditions (mean Pc). Additionally, the stress tolerance index was calculated as proposed by Fernandez [48]:

$$\text{STI} = \frac{\text{Pc} \bullet \text{Ps}}{(\text{mean} \text{Pc})^2}$$

#### *4.5. Statistical Analyses*

All statistical analyses were performed using the software R (version 3.3.2, R Foundation for Statistical Computing, Vienna, Austria) [49]. For tuber starch content, one-way analysis of variance (ANOVA) was applied to test for within- and between-accession variations. Finally, mean, minimum and maximum values as well as coefficient of variation (CV) were calculated for each accession. Pearson correlation coefficient was calculated using the RcmdrMisc package [50].

For dry mass and N uptake traits, two-way ANOVA was used to test the effect of genotype and treatment as well as their interaction on the respective trait. Because the climate chamber allows the cultivation of the plantlets under highly controlled and standardized conditions, all consecutive experiments were analyzed together in one model. A linear model using the "lmer" procedure of the package "lme4" [51] was applied, with genotype, treatment and genotype × treatment as fixed effects. Assumptions such as normality of residuals and homogeneity of variances were tested prior to ANOVA using q-q-plots, the Shapiro–Wilk normality test and the Levene's test (package "cars" [52]). If assumptions were not met, data were log transformed. If significant factor effects were identified (*p* ≤ 0.05), post hoc comparison of means was performed using the Tukey test of the "lsmeans" package [53] to compare all means. In the results, significant differences between the control and reduced N are indicated by asterisks. The Dunnett's test against the control of the package "multcomp" [54] was used to identify means differing significantly from the best cultivar. In the results, means significantly different (higher or lower) from cv. Tomba are indicated by the lowercase letter "b". Additionally, means significantly higher than cv. Tomba are underlined.

The original data of the experiments are provided at the e!DAL repository [55] under the DOI 10.5447/ipk/2020/19.

#### **5. Conclusions**

We assessed the tuber starch content of 506 wild potato genotypes under greenhouse conditions. Of them, 50 revealed tuber starch contents above 28% of FM, clearly exceeding the starch contents of commercial cultivars. Amongst the wild types with high starch content, three were superior in terms of N utilization efficiency (NutE) as indicated by the in vitro screening in a climate chamber under high and low N levels for 21 days. GLKS 30177\_15, GLKS 30177\_20 and GLKS 30160\_15 of species *S. chacoense* produced the highest shoot and root biomass under N stress and showed only a moderate reduction of the total biomass under low N compared to the high N treatment. NutE of these

genotypes was high and exceeded that of most other wild types and the standard varieties. Combining two common stress tolerance indices (SSI and STI) proved to be a helpful tool for the identification of genotypes with a high and stable biomass production under stress compared to non-stress conditions. Based on our study, the identified genotypes of *S. chacoense* are a promising source for further research projects aiming to improve starch contents and N use efficiency in cultivated potato. Most wild potato genotypes of this study are maintained in vitro and are available at the IPK Gross Luesewitz Potato Collections.

**Author Contributions:** Data analysis, manuscript conceptualization, writing and editing: S.B.-P.; project administration, methodology, review of the manuscript: K.J.D. All authors have read and agreed to the published version of the manuscript.

**Funding:** This project was funded by the FNR Agency for Renewable Resources (22023411) on behalf of the German Federal Ministry of Food and Agriculture.

**Acknowledgments:** We thank K. Löschner for excellent technical assistance during the greenhouse and climate chamber studies, and C. Brandt and U. Behrendt for fruitful discussions and proof reading of the manuscript. We also thank our partners from the Federal Research Centre for Cultivated Plants (JKI) for their advice and support in applying the in vitro testing system as well as the breeders for kindly providing modern cultivars for the studies. The publication of this article was funded by the Open Access Fund of the Leibniz Association.

**Conflicts of Interest:** The authors declare no conflict of interest. The funders had no role in the design of the study; in the collection, analyses, or interpretation of data; in the writing of the manuscript, or in the decision to publish the results.

#### **References**


© 2020 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).

### **Genetic Diversity, Population Structure and Marker-Trait Association for 100-Seed Weight in International Sa**ffl**ower Panel Using SilicoDArT Marker Information**

**Fawad Ali 1,2, Muhammad Azhar Nadeem 3, Muza**ff**er Barut 2,4, Ephrem Habyarimana 5, Hassan Javed Chaudhary 1, Iftikhar Hussain Khalil 6, Ahmad Alsaleh 7, Rü¸stü Hatipo ˘glu 4, Tolga Karaköy 3, Cemal Kurt 4, Muhammad Aasim 3, Muhammad Sameeullah 2, Ndiko Ludidi 8, Seung Hwan Yang 9, Gyuhwa Chung 9,\* and Faheem Shehzad Baloch 2,\***


Received: 8 April 2020; Accepted: 19 May 2020; Published: 21 May 2020

**Abstract:** Safflower is an important oilseed crop mainly grown in the arid and semi-arid regions of the world. The aim of this study was to explore phenotypic and genetic diversity, population structure, and marker-trait association for 100-seed weight in 94 safflower accessions originating from 26 countries using silicoDArT markers. Analysis of variance revealed statistically significant genotypic effects (*p* < 0.01), while Turkey samples resulted in higher 100-seed weight compared to Pakistan samples. A Constellation plot divided the studied germplasm into two populations on the basis of their 100-seed weight. Various mean genetic diversity parameters including observed number of alleles (1.99), effective number of alleles (1.54), Shannon's information index (0.48), expected heterozygosity (0.32), and unbiased expected heterozygosity (0.32) for the entire population exhibited sufficient genetic diversity using 12232 silicoDArT markers. Analysis of molecular variance (AMOVA) revealed that most of the variations (91%) in world safflower panel are due to differences within country groups. A model-based structure grouped the 94 safflower accessions into populations A, B, C and an admixture population upon membership coefficient. Neighbor joining analysis grouped the safflower accessions into two populations (A and B). Principal coordinate analysis (PCoA) also clustered the safflower accessions on the basis of geographical origin. Three accessions; Egypt-5, Egypt-2, and India-2 revealed the highest genetic distance and hence might be recommended as candidate parental lines for safflower breeding programs. The mixed linear model i.e., the Q + K model, demonstrated that two DArTseq markers (DArT-45483051 and DArT-15672391) had

significant association (*p* < 0.01) for 100-seed weight. We envisage that identified DArTseq markers associated with 100-seed weight will be helpful to develop high-yielding cultivars of safflower through marker-assisted breeding in the near future.

**Keywords:** *Carthamus tinctorius*; genotyping by sequencing; germplasm characterization; GWAS; oilseed crop

#### **1. Introduction**

Safflower (*Carthamus tinctorius* L.) belongs to the family *Compositae* and it is a self-pollinated crop with a genome size of about 1.4 GB [1]. Safflower is cultivated in 20 different countries of the world on a total cultivated area of 1,140,002 hectares that produces approximately 948,516 tons [2]. It is grown as an important industrial crop for different purposes, which include extraction of edible oil, production of dyes, and several uses in the pharmaceutical industry [3–5]. Safflower has better adaptation to stress conditions such as salinity and drought, although it produces oil in lower quantity than other oilseed crops [6]. Safflower also gained importance because it has the capability of biofuel production [7]. *Carthamus* species have been utilized since the pre-historic period as its archeological remains were found at sites in Syria since 7500 BC. Safflower was distributed from its center of domestication (i.e., Syria) to linked regions comprising Egypt, the Aegean region and southern Europe [8].

Safflower is considered an underutilized crop in comparison to other oilseed crops such as soybean, rapeseed and sunflower [3]. Key factors contributing to its underutilized status include lower oil content and seed yield, insect pest susceptibility, and lower resistance to diseases, which decrease safflower productivity and quality [9]. Narrow genetic diversity of local and traditional varieties necessitate exploration of the genetic diversity of the available germplasm by collecting accessions from different geographies worldwide. Such collections will provide information that will aid safflower conservation and utilization in the future [10]. Domestication of safflower has bottlenecked its genetic diversity, which greatly reduced its adaptive potential against different environmental stresses [10,11]. Exploration of genetic diversity is regarded as an important tool yielding a good source of trait variations [12,13]. Genetic diversity present in the germplasm contributes to execution of successful breeding programs [14]. Germplasm possessing diverse traits may be helpful in breeding programs for the development of elite cultivars [15,16]. Characterization of crop germplasm also helps in food security through the identification of novel genetic variations [11–14]. The Food and Agriculture Organization described the decrease of crop genetic diversity as one of the important factors that negatively impact food security and the environment [17]. The availability of limited information regarding genetic diversity and lack of the efficient genomic tools are considered as hampering factors to the improvement of safflower agronomic traits in breeding programs. Great emphasis on the molecular characterization and development of efficient molecular markers in safflower is required to accelerate safflower breeding activities [18–20]. Safflower genetic diversity using different molecular markers has been estimated [2,10,18,21–27].

Next generation sequencing technologies, such as genotyping by sequencing (GBS) and multiplex sequencing, aid in the generation of massive genetic data for various applications [28]. Application of the current polymerase chain reaction (PCR)-based marker technologies—aiming at whole genome analysis for association studies, construction of genetic maps, assessment of the collected germplasm for large scale molecular evaluation, and genome wide selection of the desirable alleles—are not attainable due to consumable and labor costs [28]. The application of DNA hybridization-based technologies like some SNP technologies and Diversity Arrays Technology are more suitable for such purposes. Hassani et al. [29] implemented DArTseq technology to assess genetic diversity in 89 safflower accessions originating from different countries of the world. They applied 1136 silicoDArT markers along with 2295 SNPs in their investigation.

Linkage analysis, also known as QTL mapping, helps in the identification of genomic regions controlling complex plant traits. QTL mapping is a time-consuming technique that needs mapping populations to be developed from bi-parents. QTL mapping captures less allelic variation utilizing bi-parental populations due to the very low rate of occurrence of recombination events and low mapping resolution [30]. Association mapping is a more efficient and faster technique, which provides higher resolution of complex plant traits in comparison to QTL mapping. Association mapping emerged as a promising technique to avoid limitations present in QTL mapping [31,32]. Relationships between plant traits and genetic polymorphisms observed in a heterogeneous assembly of distinct individuals, utilizing naturally occurring recombination events, aid in fine scale mapping of traits. Ambreen et al. [21] and Ebrahimi et al. [33] identified marker-trait associations in safflower, utilizing SSR and AFLP marker systems, respectively. Yield in any crop is the most important trait that is polygenic and difficult to measure and improve as it is highly influenced by other contributing traits. Thus, indirect selection for yield is highly preferred in comparison to direct selection focusing on the yield-contributing traits [34]. This study aimed at the establishment of the usefulness of silicoDArT markers to investigate phenotypic and genetic diversity, population structure and marker-trait associations for 100-seed weight. To attain these aims, we implemented a total of 12232 silicoDArT markers detected by a DArTseq approach of genotyping by sequencing in a safflower panel collected from 26 countries.

#### **2. Results**

#### *2.1. Phenotypic Data Evaluation*

During this study, 100-seed weight at both locations (Pakistan and Turkey) was recorded with the help of an electronic seed counter and weighing balance by taking undamaged and fully matured seeds. Analysis of variance (ANOVA) for both locations revealed highly significant differences among the studied safflower accessions for 100-seed weight (Table 1). Minimum, maximum, and mean values for 100-seed weight reflected sufficient phenotypic variation in the studied safflower panel (Table 2). Overall 100-seed weight ranged from 2.17 to 5.32 g with an average of 3.33 g. This reflects the presence of genetic variability and suggests that the safflower accessions studied here are suitable for association analysis. Mean 100-seed weight in Pakistan was comparatively lower than 100-seed weight obtained from Turkey (Table 2). Safflower accessions Egypt-5, China-3 and China-5 recorded superior 100-seed weight at both locations (Supplementary Table S1). Frequency distribution revealed a normal distribution of 100-seed weight at both locations (Pakistan and Turkey) and better mean performance of the safflower accessions for 100-seed weight in Turkey compared to Pakistan (Figure 1).



\*\* Significant at *p* < 0.01.


**Table 2.** Minimum, maximum, mean, and standard deviation (StD) of the 100-seed weight in international safflower panel.

**Figure 1.** Frequency distribution chart for 100-seed weight for both studied locations (Pakistan and Turkey).

The implemented constellation plot clearly divided international safflower panel into two populations on the basis of their 100-seed weight. A total of 46 and 48 accessions clustered in population A and population B, respectively (Figure 2).

**Figure 2.** Constellation plot for 100-seed weight in international safflower panel.

#### *2.2. DArTseq Profiling by GBS*

DArTseq profiling of 94 safflower accessions resulted in a total of 29,048 silicoDArT markers. This dataset was filtered by accounting markers having less than 5% missing data, polymorphism information content (PIC) value of 0.10 to 0.50, call rate greater than 0.81, and 100% reproducibility, to retain 12,232 high quality markers for further analysis. Figure 3 shows the distribution of PIC values of the filtered silicoDArT marker dataset. The whole safflower panel revealed maximum and minimum PIC values of 0.50 and 0.10 respectively, with an average of 0.31. Highest and lowest call rate values of 1.00% and 0.81%, with an average of 0.93%, were obtained through the whole safflower panel.

**Figure 3.** Frequency distribution of PIC values of 12232 silicoDArT markers.

#### *2.3. Genetic Diversity and Population Structure Analysis in Sa*ffl*ower Panel*

Various diversity parameters such as observed number of alleles (1.99), effective number of alleles (1.54), Shannon's information index (0.48), expected heterozygosity (0.32), and unbiased expected heterozygosity (0.32) reflected a good level of genetic variation in the studied germplasm (Table 3, Supplementary Table S2). Maximum genetic distance (0.76) was found between Egypt-2 and India-2 accessions, while mean genetic distance for the entire safflower population was 0.50 (Supplementary Table S3). Diversity indices were investigated on country basis, and Pakistan and Turkey revealed the existence of maximum percentage of polymorphic loci and high diversity parameters from the rest of countries (Table 3).


**Table 3.** Diversity indices calculated to investigate genetic diversity for whole safflower panel and accessions grouped according to country of origin panel with silicoDArT markers.

Na: Observed number of alleles, Ne: Number of effective alleles, I: Shannon's information index, He: Expected heterozygosity, uHe: Unbiased expected heterozygosity, GD: Jaccard Genetic distance.

Pairwise kinship coefficients ranged from −1.45 to 1.24 for the entire safflower panel. A total of 51.17% kinship values ranged from −0.40 to 0.4. 99% of the kinship coefficient values, which ranged from 0.60 to 1.00; however, 0.21% of the kinship coefficients ranged from 1.10 to 1.30, respectively (Figure 4). Analysis of molecular variance (AMOVA) revealed the division of the total variation into two stratum; i.e., among countries (9%) and within country group (91%) (Table 4). The ΔK peak at K = 3 in the structure analysis revealed that the genetic structure of the 94 safflower accessions is divided into three groups (Figure 5).


**Table 4.** Analysis of molecular variance among countries and within country groups of safflower germplasm.


**Figure 5.** Delta K for the entire safflower population indicating the presence of three subpopulations at K = 3.

The Bayesian clustering model grouped the international safflower panel into three main populations implemented in STRUCTURE software on the basis of membership coefficient: 17 accessions in population A, 6 accessions in population B, and 21 accessions in population C. The remaining 50 accessions were clustered as admixture population (Figure 6). Clustering of the safflower accessions within the same population revealed membership coefficients of either 80% or greater than 80%. The Neighbor Joining analysis divided the 94 safflower accessions into two populations (A and B), each containing 47 accessions (Figure 7). PCoA was also performed and results showed that collection countries played some role in the clustering and an admixture of accessions was found as well, which had a similar structure clustering (Figure 8).

**Figure 6.** Structure-based clustering of the 94 safflower accessions using silicoDArT molecular markers.

**Figure 8.** Principal coordinate analysis (PCoA) of the 94 safflower accessions using silicoDArT molecular markers.

#### *2.4. Marker-Trait Associations and Putative Candiadate Genes Identification for 100-Seed Weight*

The MLM (Q + K) model was performed to assess marker-trait associations for 100-seed weight in the international safflower panel. DArT-45483051 and DArT-15672391 showed marker-trait association for 100-seed weight by revealing statistically significant association (*p*-value; 1.17E-04 and 1.15E-04), respectively (Figure 9). DArT-45483051 and DArT-15672391 markers were present on supposedly chromosome 2 and 3, respectively. DArT-45483051 and DArT-15672391 markers contributed a total of 17.40% and 18.60% variation for 100-seed weight, respectively (Table 5). BLAST search against NCBI for DArT-45483051 resulted in AT1G01040 as a putative candidate gene. A search using the DArT-15672391 marker resulted in the retrieval of AT5G58040 as a putative gene for this marker.

**Figure 9.** Pseudo manhattan plot for 100-seed weight in world safflower panel. DArT-45483051 and DArT-15672391 were considered statistically (FDR thresholds *p* = 0.01) associated with this trait.


DArT-15672391 1.15E-04 18.60% *AT5G58040*

**Table 5.** Marker-trait associations of the 100-seed weight with its associated markers in international safflower panel of 94 accessions.

#### **3. Discussion**

The studied safflower panel revealed statistically significant differences for 100-seed weight. Analysis of variance (ANOVA) confirmed the statistically significant genotypic effects in both locations (Table 1). These results were found to be in line with El-Lattief et al. [35] as they also found statistically significant genotypic effects for various agronomic traits of safflower, including 1000-seed weight. Frequency distribution for 100-seed weight at both locations (Pakistan and Turkey) was also calculated for a better understanding of the distribution of data. There was a normal frequency distribution of 100-seed weight at both locations. Frequency distribution revealed that only one safflower accession (Egypt-5) resulted in average 100-seed weight of > 5.00 g at both locations (Figure 1). Overall mean (3.29 g) and range of 100-seed weight in this study were higher than that reported in a previous study [36]. Mean and range of 100-seed weight obtained from Turkey were slightly higher than those obtained from Pakistan. Köse et al. [37] ascribe variations in seed weight to be a result of the reaction of genotypes to different environments. Our ANOVA also confirmed that there is a statistically significant genotypic effect on 100-seed weight. Constellation plot clustered the whole germplasm into two main populations, i.e., 46 and 48 accessions present in population A and B based on their 100-seed weight (Figure 2). Further division of accessions into subpopulations occurred. i.e., A1, A2, A3, B1, B2, and B3. Subpopulation A1 clustered only those 12 safflower accessions having 100-seed weight >4 g. Subpopulation A2 clustered safflower accessions having 3.70 to 4.00 g 100-seed weight. Accessions having 100-seed weight range 3.24 to 3.59 g clustered together, making subpopulation A3. Subpopulation B1 clustered safflower accessions having 100-seed weight ranging from 3.13 to 3.29 g. Accessions having 100-seed weight in a range of 2.64 to 2.85 g clustered together to make subpopulation B2. All accessions having 100-seed weight <2.60 g were present in subpopulation B3. As it is obvious from the above discussed results, 12 accessions present in the A1 subpopulation have higher 100-seed weight than the rest of the accessions. Therefore, these accessions can be used for breeding activities of safflower for high yield.

DArTseq technology gained the attention of scientists globally due to low cost and high throughput nature. DArTseq technology has been used to explore the genetic diversity and population structure of different crops with a large number of entries and complex genomes [38,39]. Hassani et al. [29] used DArTseq technology to explore genetic variations in a world panel of 89 safflower genotypes of diverse origin. The safflower panel utilized in their investigation is different from our panel except one accession, i.e., Afghanistan-1. During this study we also aimed to explore the phenotypic and genetic diversity, population structure and marker-trait association in an international safflower panel using silicoDArT markers. Hassani et al. [29] used 1136 silicoDArT and 2295 SNP markers, while we used a higher number of markers (12232) for the molecular characterization. Moreover, Hassani et al. [29] used germplasm from 12 countries, while we included germplasm from 26 countries to explore population structure more extensively.

Polymorphism information content (PIC) value is a measure of polymorphism which provides information regarding the genetic diversity or DNA segment in a studied population, and indicates the allele's evolutionary pressure and mutations that occurred at a locus over a time period. The range of the PIC value (0.10 to 0.50) obtained in this study suggests the existence of a high level of genetic variation that might be derived utilizing a large number of good quality markers in a diversified safflower panel. An average PIC value of 0.31 across all the silicoDArT markers was obtained during this study. PIC values were distributed asymmetrically and were skewed towards the lower values. More than 50% of the implemented silicoDArT markers revealed a PIC value of more than 0.30, which indicates the informativeness and usefulness of these markers for genetic diversity, population structure, and marker-trait association in safflower.

Diversity parameters including observed number of alleles (Na), effective number of alleles (Ne), Shannon's information index (I), expected heterozygosity (He), and unbiased expected heterozygosity (uHe) for the entire population of 94 safflower accessions, which were 1.99, 1.54, 0.48, 0.32, and 0.32, respectively. Previous use of different gel-based marker systems obtained lower diversity metrics values than our current results from the silicoDArT marker system [2,18,40,41]. The most prominent reason for getting good diversity results is likely due to higher capability of the silicoDArT marker system in comparison with other gel-based marker systems. Differences in the experimental materials might also be another reason of revealing higher polymorphism in this study. Furthermore, results of diversity indices on the basis of collection countries revealed the highest polymorphism and genetic diversity for safflower genotypes from Pakistan and Turkey, while the lowest polymorphism and genetic diversity was obtained for safflower accessions originating from Iraq and Morocco. In a similar way, the highest mean genetic distance was observed for accessions originating from Turkey, and followed by India, Austria and Uzbekistan. The lowest mean genetic distance was observed for accessions originating from Jordan, followed by Spain (Table 3).

The Jaccard coefficients of genetic distance resulted in a mean value of 0.50 for the entire population of 94 safflower accessions. A maximum genetic distance was proposed between safflower accessions Egypt-2 and India-2, followed by Egypt-5 and India-2 with genetic distance values of 0.76 and 0.76, respectively. The highest genetic similarity was recorded between safflower accessions Spain-1 and Spain-2, with a genetic distance value of 0.14. The presence of higher genetic similarity between safflower accessions is possibly because of their origin from common parents (Supplementary Table S3). Safflower accessions containing desirable plant traits can be integrated in different breeding programs to aid in the development of superior cultivars [2]. The most diverse safflower accessions identified (Egypt-2, India-2, and Egypt-5) during the current evaluation can be recommended as candidate parental lines for future safflower breeding activities. The inferences obtained from kinship coefficient estimations with silicoDArT markers are robust to population structure. Negative kinship coefficients were also observed, suggesting an unrelated relationship between the safflower accessions. The close relatives can be inferred fairly reliably based on the estimated kinship coefficients. Thus, it is suggested that most of the safflower accessions were less related, having kinship coefficients of either 0 or below 0 (Figure 4). Analysis of molecular variance (AMOVA) revealed the division of the total variation into two strata, i.e., among countries and within country. A total of 91% of the genetic diversity was present within country group (Table 4). This is supported by Hassani et al. [29], where the majority of genetic variation among accessions within populations was obtained. The presence of a higher level of genetic variation within populations can be attributed to gene flow, which depends on the informal seed exchanges between farmers of different ecological zones [42].

#### *3.1. Genetic Structure and Diversity in Sa*ffl*ower Panel*

The three clustering algorithms important to genetic diversity and population structure analysis (model-based structure, Neighbor Joining, and PCoA) were implemented and revealed that the safflower accessions were successfully grouped by the silicoDArT markers based on geographical regions. Among the three clustering algorithms, more preference was given to the model-based structure algorithms. The reason for giving such a high preference to the structure is that this algorithm revealed more robustness in the previous works [43,44]. Structure algorithm divided the whole germplasm panel into four genetic populations: population A, B, C, and an admixture population. These different populations will aid in the selection of the parental accessions, which can used to design and conduct various crossing combinations for safflower genetic improvement (Figure 6).

Grouping of the safflower accessions obtained from structure analysis was based on the geographical origins of germplasm. Population A clustered safflower accessions originating from Israel, Jordan, Syria, Egypt, Turkey, Hungary, Afghanistan, and Pakistan. Out of 17 of these accessions, 13 belong to the geographical locations situated in the Mediterranean region. Safflower accessions from the Mediterranean region clustered together and revealed their genetic similarity, and possibly shared a similar parentage. Clustering of the safflower accessions from Mediterranean countries proposes this region as a center of safflower domestication, with Syria having a predominant role [8]. Population B clustered safflower accessions from the South Asian countries, i.e., Afghanistan, Iran and Bangladesh, but not Turkey. Population C revealed safflower accessions from Iran, Pakistan, India, Turkey, Bangladesh, Egypt, Morocco, Uzbekistan, Spain, and Argentina. All four populations clustered accessions from Turkey. Our results are strongly supported by Hassani et al. [29] as they observed that safflower accessions from the Asian continent like Pakistan, Iran, Turkey, and India were found genetically similar and grouped into the same cluster. Clustering of the safflower accessions originating from the Mediterranean region to other localities suggested the distribution of safflower accessions from the Mediterranean region to other geographies. Turkey signifies a high level of biodiversity and differentiation center for safflower among the continents, thus reflecting a key role in the connection of different continents (Asia and Europe) with each other [15], and might possibly have played a role in the distribution of safflower from its domestication center. Our current findings are also supported by previously conducted archeological and molecular characterization studies using safflower accessions with its wild progenitors. These studies concluded the domestication of safflower in Fertile Crescent and its distribution to other parts of the world, i.e., Europe, Africa, the Middle and Far East [45,46]. All safflower accessions with a membership coefficient below 80% were classified as admixture populations. Accessions from countries such as Afghanistan, Australia, Austria, Bangladesh, China, Egypt, France, India, Iran, Israel, Iraq, Jordan, Kazakhstan, Libya, Morocco, Pakistan, Portugal, Romania, Russia, Syria, Thailand, Turkey, and Uzbekistan were clustered in the admixture population. The representation of safflower accessions from such a wide range of countries in the admixture population might be due to their use in international breeding programs. Other important factors like mutation, migration and selection by humans might also be responsible for the occurrence of admixture populations in safflower accessions of different origin [21,27].

Neighbor joining analysis divided the studied germplasm into two populations based on their geographical origin (Figure 7). Structure-based clustering of the 94 safflower accessions was also greatly supported by the principal coordinate analysis (PCoA) with silicoDArT markers information. PCoA resulted in clustering of safflower accessions on the basis of their geographical origins (Figure 8). Clustering results by PCoA were also supported by the structure, as admixture of accessions was also observed in the PCoA clustering as well.

#### *3.2. Marker-Trait Associations for 100-Seed Weight*

Seed yield is a complex trait with many underlying factors contributing to it. Comprehensive understanding of the relationship between seed yield and other contributing traits is crucial to the process of selection and ultimately to crop improvement [47]. In safflower, 100-seed weight is regarded as an important yield trait. Seed weight is proposed as an important trait to the process of selection in safflower breeding programs [48–50]. Furthermore, Chaudhary [48] observed the positive effects of 1000-seed weight on seed yield in safflower. Identification of loci influencing important plant morpho-agronomic traits is a prerequisite to marker-assisted breeding for enhancement of crop productivity. Our current investigation involved the identification of two silicoDArT markers (DArT-45483051 and DArT-15672391) associated with 100-seed weight (Table 5). Earlier studies reported different loci/markers linked with 100-seed weight. Ambreen et al. [21] reported two loci (NGSaf\_306 and NGSaf\_309) associated with 100-seed weight utilizing SSR markers. Mirzahashemi et al. [51] identified one QTL (qThsw5) associated with 100-seed weight.

The BLAST search against DArT-45483051 marker resulted in AT1G01040, a putative candidate gene. AT1G01040 is highly associated with seed embryo health and embryo shape at seed maturity and ovule development [52]. Therefore, the marker DArT-45483051 associated with 100-seed weight

can be suggested for marker-assisted breeding of safflower for this trait. The BLAST search against DArT-15672391 retrieved a gene (AT5G58040) that encodes the pre-mRNA polyadenylation factor FIP1. Recently, Téllez-Robledo et al. [53] explored the role of polyadenylation factor FIP1 for plant development and root response to abiotic stresses. Polyadenylation factor FIP1 plays an important role in the plant embryo cotyledonary stage of development. Paez-Garcia et al. [54] established that better root growth ultimately contributes to higher crop yield. Based on the role of polyadenylation factor FIP1 during the plant embryo cotyledonary stage, AT5G58040 should be considered as having a potentially important role in seed weight. The studied germplasm reflected a wide range of phenotypic variations for 100-seed weight. Moreover, various genetic diversity indices also confirmed the existence of higher polymorphism in the evaluated germplasm. Characterization of germplasm provides us with an opportunity to unlock the novel genetic variations that can be utilized for breeding purposes [55–57]. This is a pioneer study concerning the investigation of marker-trait association for 100-seed weight for safflower using GBS analysis. We believe that these identified markers can be utilized in safflower marker-assisted breeding in order to develop cultivars with improved yield.

#### **4. Materials and Methods**

#### *4.1. Plant Materials and Phenotypic Evaluation*

A total of 94 safflower accessions originating from 26 countries were used as plant materials in this study. Seeds of the evaluated germplasm were provided by the United States Department of Agriculture (USDA) (Supplementary Table S4). The experimental materials were sown at two diverse locations, i.e., Pakistan and Turkey. The dirst experiment was conducted at the National Agricultural Research Center (Pakistan), whereas the second experiment was conducted at the research and experimental area of Bolu Abant Izzet Baysal University (Turkey) during 2016–2017 and 2018, respectively. Field experiments were performed by implementing an augmented block design. Seeds of each safflower accession were planted in elementary plots with a row length, inter-row and intra-row spacing of 3 m, 50 cm, and 10 cm respectively. A total of 10 plants for each accession were maintained for the phenotypic characterization. Thori-78 and Dinçer were included as check cultivars. Di-ammonium phosphate (DAP) and ammonium sulfate were applied as a source of fertilizer, while standard cultural practices were performed at both locations. Safflower accessions were harvested at their proper maturity at both locations. Measurement of 100-seed weight was done with the help of an electronic seed counter by taking undamaged and fully matured seeds randomly in triplicate.

#### *4.2. Genomic DNA Isolation*

To extract the genomic DNA from each accession, fresh, healthy and young leaves were harvested and kept frozen in the laboratory at −80 ◦C. DNA isolation of each safflower accession was performed utilizing the bulk of leaves from 10 individuals. The individuals used for the purpose of DNA isolation were from plants from the original seeds from the gene bank. DNA isolation was performed according to CTAB protocol [58] and a specific protocol suggested by Diversity Arrays Technology [59]. DNA concentration was estimated with agarose gel (0.80%) and was then confirmed with NanoDrop (DeNovix DS-11 FX, USA). For the genotyping by sequencing (GBS) analysis, DNA was diluted and a 50 ng.μl <sup>−</sup><sup>1</sup> DNA concentration was maintained. The prepared DNA samples were sent to Diversity Array Technology Pty, Ltd., Bruce, Australia, for DArTseq analyses of GBS [60].

#### *4.3. DArTseq-Generated SilicoDArT Marker Analysis*

DArTseq technology is a complexity reduction method and next generation sequencing platform [61]. DArTseq facilitated the selection of the genome fractions containing active genes associated with agronomically important plant traits [62]. Digestion/ligation reactions were used for the processing of DNA samples following the method described by Kilian et al. [63]. Mixed fragments (PstI–MseI) were amplified by performing 30 rounds of PCR cycles. Details of silicoDArT markers analysis can be found in earlier studies [63,64].

#### *4.4. Statistical Analysis*

#### 4.4.1. Phenotypic Data Analysis

Online software developed by Rathore et al. [65] for statistical inferences of augmented block design was used. Analysis of variance (ANOVA) for both locations was calculated through SAS 9.3 version [66]. Data recorded on 100-seed weight of both field experiments was averaged and used to calculate parameters like minimum, maximum, mean, standard deviation, and frequency distribution utilizing statistical software XLSTAT (Addinsoft, 2018) [67]. The cluster constellation plot for 94 safflower accessions was constructed through JMP 14.1.0 statistical software (2018, SAS Institute Inc., Cary, NC, USA).

#### 4.4.2. DArTseq Markers Analysis

All images were analyzed from the DArTseq platform using DArTsoft v.7.4.7 (DArT P/L, Canberra, Australia). SilicoDArT are dominant markers that were detected through DArTseq and scored using the binary fashion, where 1 or 0 represent presence or absence of the restriction fragment in the genomic representation of each sample, respectively [12,68]. Screening of the markers was done with various parameters including call rate, polymorphism information content (PIC) and reproducibility being considered. Markers with PIC, reproducibility and call rate lower than 0.10, 100% and 0.80% were ignored during bioinformatics analyses to avoid false inferences.

#### 4.4.3. Genetic Diversity Analyses

The proportion of shared alleles that were obtained from silicoDArT markers were used to compute the genetic distances among the safflower accessions using Jaccard's coefficients of genetic distance. Important diversity metrics such as observed number of alleles (Na), effective number of alleles (Ne), Shannon's Information Index (I), expected heterozygosity (He), and unbiased expected heterozygosity (uHe) were estimated for the entire population using GenAlEx 6.5 [69]. The kinship coefficients between safflower accessions were calculated with hierfstat R package to investigate the pairwise relationships of the 94 safflower accessions. Analysis of molecular variance (AMOVA) was computed with GenAlEx 6.5 [69] considering total variation into two strata, i.e., among countries and within countries.

Population structure of the studied safflower accessions was evaluated with STRUCTURE software (version 2.3.4; [70]). The most suitable number of clusters (K subpopulations) ranging from 1 to 10 was determined applying STRUCTURE software following the protocol of Evanno et al. [71]. For each K value and for each run, 10 independent runs were set. The initial burn-in period was set to 500 with 500,000 MCMC (Markov Chain Monte Carlo) iterations with no prior information on the origin of individuals. The most probable number of subpopulations was investigated by following the methodology suggested by Evanno et al. [71]. Each accession was assigned to a specific population on the basis of a membership coefficient. The PCoA was performed with GenAlEx 6.5 [69], while the Neighbor Joining tree was constructed with hierfstat R package in R statistical software. The populations obtained from the Neighbor Joining and PCoA were named and colored with the same clusters pattern identified with model-based Structure algorithm for coherence purposes.

#### 4.4.4. Functional Analysis for Putative Candidate Gene Identification

To investigate the putative candidate genes, sequences of identified silicoDArT markers were used to perform BLAST searches against the National Center for Biotechnology Information (NCBI) [72], and the Phytozome V.12.1 [73] database. Moreover, detailed information about putative candidate genes was obtained from the TAIR database [74].

#### 4.4.5. Genome-Wide Association Mapping for 100-Seed Weight

A Mixed linear model (MLM, Q + K) approach was applied to inspect marker-trait associations (MTAs) via TASSEL 5.0.5 [75]. The population and family structure were corrected utilizing Q-metrics (Q) and kinship (K) during association analysis, as suggested by Nadeem et al. [76,77]. Scaled identity was utilized to detect kinship matrix by the descent method applied in TASSEL 5.0.5 [75]. In the results of association analysis, the *p* value signifies the relatedness of a marker with the associated trait, and R2 reflects the proportion of phenotypic variation resulting from a significant marker [78]. SilicoDArT markers with Bonferroni and FDR thresholds *p* = 0.01 were taken as significantly associated with the 100-seed weight. A Pseudo-Manhattan plot was developed using the qq-man R Package in the R 4.0.0 statistical software [79].

#### **5. Conclusions**

The current evaluation revealed a good level of phenotypic and genetic diversity in the studied safflower panel from the silicoDArT markers information. Analysis of variance (ANOVA) revealed the highly significant genotypic effect for 100-seed weight. Frequency distribution resulted in a normal distribution for 100-seed weight across the two locations (Pakistan and Turkey). The mean and range of 100-seed weight obtained from Turkey was higher than that from Pakistan. Analysis of molecular variance (AMOVA) revealed the division of total variations into two stratum i.e., among countries and within country. A total of 91% of the genetic variation was present within country and low variation (9%) was observed among the countries. Findings of AMOVA were also supported by the results of 100-seed weight and genetic distance. A good range of variations in 100-seed weight and genetic distance calculated at countries basis confirmed that mostly variations resulted in this study are because of diverse individuals within countries. Safflower accessions Egypt-5, Egypt-2, and India-2 showed the highest genetic distance among the studied panel and hence might be recommended as candidate parental lines for safflower breeding programs. Moreover, Egypt-5 is the only accession among the studied international safflower panel that reflected 100-seed weight of > 5.00 g at both locations and the highest genetic distance (0.76 with silicoDArT markers). Model-based structure analysis, Neighbor joining analysis and Principal coordinate analysis (PCoA) clustered the safflower accessions on the basis of their geographical origin. This is a pioneer study uncovering the marker-trait association for 100-seed weight in safflower. A total of two silicoDArT markers (DArT-45483051 and DArT-15672391) showed statistically significant association for 100-seed weight and these markers can be used in marker-assisted breeding to develop safflower cultivars with improved yield.

**Supplementary Materials:** The following are available online at http://www.mdpi.com/2223-7747/9/5/652/s1, Supplementary Table S1: Mean data for 100-seed weight across two locations (Pakistan and Turkey), Supplementary Table S2: Calculated diversity parameters for the whole safflower germplasm, Supplementary Table S3: Genetic distance between the 94 safflower accessions, Supplementary Table S4: List of 94 safflower accessions evaluated to explore genetic diversity and population structure with silicoDArT molecular markers.

**Author Contributions:** Conceptualization, F.S.B., H.J.C.; methodology, F.S.B., M.A.N.; software, M.A.N., E.H.; validation, F.A., M.A.N.; formal analysis, E.H., M.A.N., M.S.; investigation, F.A., M.A.N., M.B., A.A.; resources, H.J.C., G.C., S.H.Y.; data curation, F.A., M.A.N., M.B.; writing—original draft preparation, F.A., M.A.N.; writing—review and editing, M.A.N., R.H., S.H.Y., C.K., A.A., N.L., M.A., T.K., I.H.K.; supervision, F.S.B. All authors have read and agreed to the published version of the manuscript.

**Funding:** This research received no external funding

**Acknowledgments:** The authors are very thankful to TUB˙ ITAK (The Scientific and Technological Research Council of Turkey) for their support to Fawad Ali under the TUBITAK-2216 fellowship program.

**Conflicts of Interest:** The authors declare no conflict of interest.

#### **References**


© 2020 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).
