**3. Discussion**

*Medicago truncatula* is a representative of species adapted to the typical Mediterranean climate characterized by strong seasonality with hot and dry summers often with large diurnal temperature oscillations followed by main rainfall in the autumn. In particular, the eastern and southern zones of the Mediterranean-desert transitions are associated with increased aridity [38]. Summer drought limits growth and is the major cause of seedling mortality, while winter cold limits vegetative growth. High oscillating temperatures are hypothesized [8] to be one of the main triggers of PY dormancy release and it was confirmed experimentally in several legume species, including *Lupinus, Trifolium, Pisum* [10,26–28] and, in this study, *Medicago*. However, these studies tested only the effect of temperature variation, while more factors vary in nature [39]. In particular, soil moisture oscillation is very difficult, if not impossible, to mimic in laboratory conditions. The effect of water potential on seed germination was tested only as a static component [40,41]. As a result, our experimental setup could reveal only temperature-related dormancy release.

#### *3.1. Association between Seed Dormancy Traits and the Environment*

Variation in germination strategy is particularly relevant for plants inhabiting unpredictable environments and is consistent with seed function, and securing the next generation in time and space [39]. In our study, seed dormancy release varied among accessions and years, and this could potentially act as a mechanism that favors the persistence of the seed in the soil and helps to distribute genetic diversity through time [24,42]. However, when taking average estimates of various traits characterizing absolute dormancy release (i.e., FPYD, AUC, and AUC35-25), no clear relationships were found between synthetic environmental (macroclimatic) clines (expressed as PCA axes) and mean dormancy status per accession (Figure 2). Our observations thus seem to be in agreemen<sup>t</sup> with study of Mediterranean wild lupines [29] or perennial woody legume (*Vachellia aroma*) along a precipitation gradient that did not find any relationship between climate and dormancy release [43]. In contrast, *Arabidopsis thaliana* [44,45] and other winter-annual species, such as *Betta vulgaris* subsp. *maritima, Biscutella didyma, Bromus fasciculatus*, and *Pisum sativum* subsp. *Elatius*, showed a cline in dormancy [10,43,46,47]. In particular, more dormant genotypes of mentioned taxa occurred in lower latitudes or more arid habitats with seasonally unpredictable precipitation and less dormant in higher latitudes or more humid habitats, suggesting dormancy is an adaptation securing population survival in less predictable conditions. However, in arid and unpredictable environments, there are also species called "risk-taking" with lower dormancy and rapid germination in response to lower rainfall events. The high and reliable seed production determines that the consequences of failure to establish in these species are less dire [17].

Does the absence of a relationship between average estimates of dormancy release in *Medicago* and macroclimatic clines within our dataset sugges<sup>t</sup> that selection does not operate on dormancy traits? Taking into account plasticity of dormancy release between temperature treatments revealed that physical dormancy plasticity (PIPY) increases with increasing aridity (2D, Figures 3 and 4; Supplementary data Figure S3). It follows, therefore, that more plastic behavior can potentially distribute germination across the year and act as a within-season bet-hedging strategy [48], suggesting that in more unpredictable environments genetic components of phenotypic variance may be lower and thus a reduced evolutionary response to selection would be possible [24]. The bet-hedging strategy is thus positively associated with more arid habitats as found in pea [10], and plastic responses provide potential to cope with high levels of environmental heterogeneity [49]. On the other hand, the responsiveness of the accessions of the macro-environmental clusters K1, K3 (except in 2017), and K4 to high temperatures (35/15 ◦C) in relation to cluster K2 (Figures 2–4) could be one of the main triggers of PY dormancy release under an arid and unpredictable environment. Occasional precipitation and hot temperatures in summer would accelerate the PY dormancy release and germination after overcoming the PD dormancy. Earlier emergence can profit from a long growing season, providing a competitive and reproductive advantage for limited resources. Recently, ten Brink et al. [50] showed

that more unpredictable natural environments can select earlier within-season germination phenology. In addition, we found an increase, although slight, in PY dormancy for accessions originating from sites with higher among-season temperature variations expressed by IV indexes. This could be an among-season bet-hedging strategy, which increases PY dormancy under more unstable summer temperatures between years, and might thus contribute to avoidance of increased germination under adverse temperatures condition or "false breaks" (seed germination outside the optimal growing season) [51].

#### *3.2. Potential Shortcomings of the Study*

Although dormancy is genetically determined, it also depends on the environmental conditions experienced by the mother plant and the subsequent status of the seed [24,39]. This was shown in several taxa, including *Trifolium* and *Medicago* [24,27,28,52,53]. There are several possible sources of variability, from the effect of the maternal plant status (drought, photoperiod, nutrition) to natural variation within a population or even the same individual [54]. Distinguishing between maternal and environmental effects is difficult. The information on the maternal environmental can be mediated via the nutrition, phytohormones, or gene expression levels and variability in the seed sensitivity [39]. All this might contribute to *M. truncatula* seed stock generated in our study. To minimize environmental maternal effects, we grew the accessions under common garden conditions (glasshouse), but these were to some extent variable between years. In 2016, the accessions were sown February to April and harvested in a hot July with a day temperature over 35 ◦C, while in 2017 and 2018, they were sown in September and grew over winter, flowered in February, and maturated in April, with day temperatures in the glasshouse around 28 ◦C. The higher temperature in 2016 during the seed filling period resulted in more dormant seeds in some accessions in relation to 2017 and 2018 (Supplementary data Figure S8). In addition to this, seeds from different accessions differ up to 3 weeks in maturation due to differences in flowering time. Moreover, the individual seed stock from a given accession was harvested during a period of about 3 weeks, which could be possibly synchronized by different sowing. This needs to be considered in follow up studies.

Our analysis was also likely impacted by several factors inherent to the available *Medicago* set. At first, there is substantial imprecision in the GPS localization of the origin of some accessions [55] leading consequently to incorrectly extracted environmental factors [56–58]. Secondly, there is geographical bias towards the western part of the Mediterranean with underrepresented parts of the native species range, such as Italy, Adriatic Sea coast, Turkey, Lebanon, and Israel. In addition, the characteristics of WorldClim data (averages in terms of time and also space) mask the micro-ecological pattern and, in geographically complex regions, environmental conditions change considerably over short spatial scales, such that neighboring populations can be subject to different selective pressures, as found in the study of seed dormancy of Swedish *A. thaliana* accessions [59].

#### *3.3. Genetic Basis of Seed Dormancy Release in Legumes*

In contrast to the seed development, the genetic basis of *Medicago* seed germination was studied only by Dias et al. [60]. These authors, however, focused on true germination, e.g., radicle emergence, removing the seed coat prior to testing and thus assessing physiological dormancy, while we were interested in PY dormancy executed by seed coat permeability. Since the dormancy is indirectly evaluated as the release from dormancy by imbibition and germination, the detected candidate genes might be related to the changes in the seed coat mediated imbibition process rather than dormancy per se. There was no overlap in associated loci between the tested seed dormancy traits, despite the detection of a similar set of candidate genes (Supplementary data Table S7). This is similar to other tested quantitative and complex traits, such as drought and biomass, where different traits had different candidate genes [61]. We have detected four genes active in flavonoid and phenylpropanoid biosynthesis pathways leading either to flavonoids or via polymerization to lignins impregnating the seed coat [12,62]. Notably, homologue genes were detected when comparing dormant and non-dormant pea seed coat expression [63]. Furthermore, hydrolytic enzymes such as xyloglucan 6-xylosyltransferase, xylogalacturonan β-1,3-xylosyltransferase involved in plant cell wall modification were identified. Notably, one of the Quantitative Trait Loci (QTL) identified in biparental mapping of *Medicago* seed germination also encodes xyloglucan endotransglucosylase [60]. The β-1,3-Glucanase (EC 3.2.1.39) plays roles in the regulation of seed germination, dormancy, and in the defense against pathogens. The β-1,3-glucan layer is in the seed coat of cucurbitaceous species and confers seed semi-permeability [64]. In tobacco seeds, β-1,3-Glucanase was shown to be at the micropylar part of the endosperm prior to radicle protrusion, and seems to play a role in cell wall loosening [65]. Pectinesterases were isolated from germinating seeds of various species and are assumed to play an important role in loosening cell walls [66]. Polygalacturonases (EC 3.2.1.15) are another cell wall degrading enzyme. These were shown to play an essential role in pollen maturation and in pectin metabolism during fruit softening and weakening of the endosperm cell walls [67]. Exo-(1→4)-β-galactanases (EC 3.2.1.23) play various roles in physiological events, including cell wall expansion and degradation during soft fruit ripening, and were found to be involved in the mobilization of polysaccharides from the cotyledon cell walls of *Lupinus angustifolius* following germination [68]. Nitric oxide (NO) was recently shown to be involved in plant development including seed germination [69]. NO-dependent protein post-translational modifications are proposed as a key mechanism underlying NO signaling during early seed germination. Our GWAS analysis identified seven putative peroxidase and thio-/peroxiredoxin genes. Peroxiredoxins (EC 1.11.1.15) catalyze the reduction of hydroperoxides, conferring resistance to oxidative stress. Recent studies have demonstrated that Reactive Oxygen Species (ROS) have key roles in the release of seed dormancy, as well as in protection from pathogens [70–72]. Thioredoxins were identified to promote seed germination [73]. Peroxidases (EC 1.11.1.7) are also implicated in lignin/suberin formation during the polymerization of monolignols synthetized in the final steps of the phenylpropanoid pathway [74]. Phytohormones and especially gibberellins are known to play important roles in seed development and germination (reviewed in [75]), and since *Medicago* seeds have both physical and physiological dormancy, it is not surprising to find gibberellin 20-oxidase and two ent-kaurenoic acid oxidase genes to be associated with dormancy release (AUC25) or environmental factors (BIO12), respectively. The genomic signature of *M. truncatula* adaptation to the climate was studied by Yoder et al. [76] using essentially the same set of lines. They analyzed the relationship to BIO1, BIO3, and BIO16; thus, there is only overlap in BIO1 (annual mean temperature) with our study. The different candidate genes were detected. This could be attributed to differences in accessions and analytical methods, as well as *Medicago* genome versions. However, despite these differences, some similar genes were detected, such as 1, 3-glucanase or kinases. Several kinases and disease resistance (TIR-NBS-LRR class) genes are associated with dormancy release traits. These have been implicated in pathogen sensing and host resistance, which might reflect the sensing of cell wall modulating enzyme activities, similar to pathogen attack [77]. As the result of bet-hedging, seeds in the soil form long-term seed banks where they need to be protected from microbial decay by presence of secondary metabolites, as well as seed defense enzymes [72,77]. Therefore, one of the possible future directions of seed dormancy release studies should be the study of seed–soil–microbiome relationships and seed coat enzymatic activities.

#### **4. Materials and Methods**

## *4.1. Plant Material*

Seeds of *Medicago truncatula* inbred lines were selected from HapMap collection [36,76] based on accuracy of coordinates and were obtained from INRA, Montpellier, France and University of Minnesota, USA. Plants were grown in glasshouse conditions at the Department of Botany, Palacký University, Olomouc, Czechia, from March to July (2016) and from September to May (2017, 2018). Plants were cultivated in 3 L pots with sand peat substrate (1:9) mixture (Florcom Profi, BB Com Ltd., Letohrad, Czechia), watered as required, and fertilized weekly (Kristalon Plod a Kvˇet, Agro, Czechia). Temperature varied according to weather from a minimum of 15 ◦C during winter to a maximum of 40 ◦C in late spring. Supplementary light was provided (Sylvania Grolux 600 W, Hortilux Schreder, Holland) to extend the photoperiod to 8 h during September–February and to 14 h from February to stimulate flowering. Mature pods were collected, packed in paper bags, and dried at 20 ◦C and 60–63% relative humidity to allow post ripening for a period of 4 to 6 weeks prior to testing. Sufficient seed stock was obtained from 178 accessions using equipment made in-house.

#### *4.2. Seed Dormancy Release Experiments*

Release of seed dormancy was tested as imbibition (e.g., uptake of water) and terminated when the radicle protruded the seed coat [10]. As our study was aimed to study PY release and not the germination, the values we used in all subsequent analysis correspond to imbibed seeds. To mimic natural conditions, two temperature treatments (alternating temperatures of 35/15 ◦C and 25/15 ◦C at 14/10 h (day/night) regime) were applied to intact seed batches (50 seeds, in 2 to 3 replicas per treatment). Seeds were placed onto water saturated filter papers (Whatman Grade 1, Sigma, CZ) in 60 mm Petri dishes (P-Lab, CZ) in temperature-controlled chambers (Laboratory Incubator ST4, BioTech, CZ). In order to prevent fungal growth (as seed sterilization would alter seed coat properties), fungicide (Maxim XL 035 FS; containing metalaxyl 10 g and fludioxonil 25 g) was applied. Seeds were monitored at 24 h intervals for total of 88 days. After each scoring, the plates were randomly relocated within chambers. Germinated seeds (e.g., when the radicle protruded from the testa) were removed. At the end of the testing, remaining seeds were scarified and let germinate to verify their viability. This typically was over 98%. Although we selected macroscopically intact seeds for experiments, we cannot exclude some microscopic cracks on seeds resulting from mechanical damage during threshing. We observed that in the course of the first hours of seed germination experiments, a certain proportion of the seeds imbibe. Therefore, we subtracted the first day imbibition value from the analysis. In 2016, 2017, and 2018, a total of 147, 74, and 130 accessions were tested (Supplementary data Figure S8). In total, seeds of 178 accessions were included in the experiments (Supplementary data Table S1). Forty-seven accessions were tested in all three years, and 129 accessions in at least two years.

#### *4.3. Evaluation of Dormancy and Germination Traits*

Several statistics (traits) characterizing dynamics and final state of dormancy release of seeds of each accession for each treatment (i.e., 25/15 ◦C and 35/15 ◦C) were calculated as follows: *(i)* Final PY dormancy (%; FPYD25, FPYD35): represents percentage of dormant seeds at the end of experiment after excluding seeds germinating during the first day of each experiment (i.e., 100 – final germination percentage after 88 days + germination percentage after first day), calculated separately for two germination treatments. *(ii)* Germination pattern (AUC25, AUC35): this trait represents the area under curve (AUC) coefficient that takes into account both dynamics of germination as well as final germination percentage. Original germination data (daily counts of imbibed seeds) were considered as discrete realizations of an asymptotically continuous process, approximated by spline functions [78,79]. The resulting smoothing spline, called the absolute germination distribution function (AGDF; as applied in pea [10]), was used in analysis. Accordingly, the area under curve (AUC) of the spline function takes into account both the course of the germination as well as the final score of germinated seeds, which captures the dynamics of seed germination better than existing germination coefficients [80,81]. High AUC values mean rapid and early germination of the majority of seeds. *(iii)* FPYDM and AUCM: these two coefficients represent means of respective coefficients estimated for the two temperature treatments (e.g., FPYDM = (FPYD25 + FPYD35)/2)). *(iv)* Germination response (AUC35-25): this is the difference between two AUC (i.e., AUC35 − AUC25) of each accession calculated for two germination treatments (i.e., 35/15 ◦C and 25/15 ◦C). Higher absolute values of germination response mean larger differences in germination pattern of the same accession between germination treatments, while the sign of the difference suggests which of the treatment shows the larger AUC. *(v)* Phenotypic plasticity index of the germination pattern (PIAUC) and *(vi)* Phenotypic plasticity index of final PY dormancy (PIPY): these

traits were calculated for each accession as (traitmax − traitmin)/traitmax, where traitmax and traitmin were, respectively, the maximal and minimal value of the trait measured on the same accession across the two temperature treatments (25/15 ◦C and 35/15 ◦C). These estimates characterize the maximal plastic capacity of an individual in variable environments without taking into account the direction of the plastic response or the change in intensity with environmental variation. Phenotypic plasticity index ranges from 0 (no plasticity) to 1 (maximal plasticity) [2]. For the purpose of multivariate analyses, we calculated the average of each dormancy trait for each accession over the years. Consequently, the matrix of averages of each dormancy trait for each accession was used in multivariate analyses. Multicollinearity among variables was assessed by the variance inflation factor (VIF) for quantitative traits using the library usdm in R [82]. Only variables whose VIF was lower than 15 were retained in the analyses. Except for FPYD M and AUC M, none of the above-mentioned traits had a collinearity problem.

#### *4.4. Extraction of Environmental Variables and Spatial Accuracy*

Due to di fferent spatial accuracies of accessions and in order to minimize the spatial error caused by imprecise coordinates, we developed a geoprocessing model in the ArcGIS PRO environment [83]. The model automated the calculation of mean values of selected variables from within a 5 km bu ffer around each collection site in order to smooth the uncertainty caused by imprecise localization. The WorldClim database version 2.0 [84] was used to extract climatic data (period 1970–2000) from GeoTIFF rasters in theWGS-84 coordinate system (EPSG: 4326) with a spatial resolution of 30 arc-seconds (~1 km). Bioclimatic variables (BIO1–BIO19) were derived from the monthly temperature and rainfall values [85], and represent annual trends (e.g., mean annual temperature BIO1, annual precipitation BIO12), seasonality (e.g., annual range in temperature and precipitation BIO4 and BIO15) and extreme or limiting environmental factors (e.g., the temperature of the coldest and warmest month BIO5 and BIO6, and amount of precipitation in the wet and dry quarters BIO16 and BIO17). In order to determine the inter-annual variability in selected bioclimatic variables (BIO1, BIO5, BIO10, and BIO12) during the period 1981–2010, the index of variability (IV) was calculated following the percentile-analysis method [86]. To obtain yearly mean values for years 1981–2010, we used 2 m air temperature (Kelvin degrees) and 2 m specific humidity (kg of water/kg of air) hourly data from the Modern Era Retrospective Analysis for Research and Applications Reanalysis (MERRA) 2D Incremental Analysis Update atmospheric single-level diagnostics product (MAT1NXSLV), provided by the NASA Global Modelling and Assimilation O ffice [87]. Data were interpolated in spatial resolution of 2.5 arc-minutes. Temperature data were converted to degrees of Celsius (BIO1, 5, and 10). The resulting BIO12 (1981–2010) describes the annual mean of specific humidity instead of cumulative annual rainfall.

Twelve month means (BIO1, 5, 10, and 12) for each site were calculated as follows:

$$\text{IV} = \text{[(90th percentile -- 10th percentile)/50th percentile]} \ast 10 \tag{1}$$

The di fferent IV classes are low (IV <0.50), low–moderate (0.50–0.75), moderate (0.75–1.00), moderate–high (1.00–1.25), high (1.25–1.50), very high (1.50–2.00), and extreme (IV >2.00).

Soil data were extracted from the SoilGrids database [88]. SoilGrids prediction models are fitted using over 230,000 soil profile observations from the World Soil Information Service (WoSIS)database and a series of environmental covariates. Covariates were selected from environmental layers from Earth observation-derived products and other environmental information, including climate, land cover and terrain morphology [89].

#### *4.5. Climate and Soil Characteristics of Localities of Studied Accessions*

The set of accessions originates from rather contrasting climatic conditions (Supplementary data Table S2). Mean annual temperature ranges from ca 9 to 22 ◦C and annual precipitation ranges from 154 to 1028 mm; consequently, temperature annual range (min–max) is from 13 to 35 ◦C. Some accessions originate from sites with minimal winter temperatures below zero while maximal temperature of warmest months is rather similar among accessions. However, sites di ffer considerably in precipitations of driest and warmest periods. The basic descriptive statistics of the index of variability (IV) for BIO1, BIO5, BIO10, and BIO 12 are present in Table S2. IV BIO1 ranged from 0.33 to 1.60 with mean 0.77 (SD = 0.20), IV BIO5 from 0.57 to 2.26 with mean 1.01 (SD = 0.32), IV BIO10 from 0.63 to 1.76 with mean 0.98 (SD = 0.20), and IV BIO12 from 0.46 to 2.03 with mean 0.96 (SD = 0.28). The most variable IV was IV BIO5 (range 1.69) followed by IV BIO12 (range 1.57). Concerning soil variables, considerable variability among sites was found in volumetric percentage of coarse fragments (CRFVOL) and soil organic carbon content (ORCDRC), while other soil variables were more consistent among sites (Supplementary data Table S2).

#### *4.6. Testing of Relationships Among Dormancy Traits, Geography and Environmental Variables*

The matrix of environmental variables was checked for the presence of the multicollinearity using VIF. The reduced set of environmental variables (with VIF <15), including 14 bioclimatic variables and eight soil variables, was used in all further analyses. For each pair of variables, bivariate scatter plots together with fitted locally weighted smoothing were displayed and Pearson's Correlation Coe fficient was calculated using the library Performance Analytics in R [90].

The matrix of the reduced set of environmental variables was analyzed by principal component analysis (PCA; [91]) using Canoco 5.10 [92] to find the main environmental gradients within the dataset. Several precipitation variables were log(x+1) transformed and subsequently each variable was standardized to zero mean and unit variance before PCA. A set of germination traits and geographic coordinates (latitude, longitude) were used as supplementary variables and correlated with the first two principal components. To control for possible spatial autocorrelation between each germination trait and principal components representing environmental gradients, a modified version of the *t*-test [93] was performed in SAM 4.0 (Rangel et al., 2010) [94]. To assess whether there is spatial autocorrelation present in the PCA scores along the first two axes and dormancy traits, Moran's I spatial correlation statistic [88] was calculated for each variable using PASSaGE v. 2.0 [95]. Ten distance classes with equal widths were created and Moran's I and its 95% CI were calculated for each distance class.

#### *4.7. Phenotypic Plasticity by Macro-Environmental Clusters*

We used accessions that had been tested over 3 years (Supplementary data Table S1) to calculate a norm of reaction for final PY dormancy (FPYD25 and FPYD35). First, selected *Medicago* accessions were grouped into four macro-environmental clusters (Supplementary data Table S3) based on Euclidean distance of environmental variables used for calculations of PCA. Agglomeration was performed using Ward's minimum-variance linkage algorithm. Before clustering, all variables were standardized to zero mean and unit variance. Second, a norm of reaction for each macro-environmental cluster was estimated as follows: each line represents the data for a di fferent cluster and the e ffect of "environment" (treatments, 25/15 ◦C and 35/15 ◦C), separately for each experimental year [2]. To focus on the change of the trait in response to two temperature treatments we analyzed the FPYD means within each cluster per each year by ANOVA using the InfoStat software [96].

#### *4.8. Genome-Wide Association Analysis*

Genome-wide association analysis was performed on seven seed dormancy traits (FPYD25, FPYD35, AUC25, AUC35, AUC35-25, PIPY, PIAUC) and three bioclimatic variables (BIO1, BIO9, BIO12) on 178 accessions. Prior to GWA analyses, normal distribution of each trait was checked using the Shapiro–Wilk test. Two contrasted algorithms were used to test markers–traits associations: E fficient Mixed Model Association (EMMA), a classical mixed linear model (MLM) for single locus GWAS [97], and FarmCPU, a multi-locus method combining the fixed e ffect model and random e ffect model iteratively in order to improve the statistical power of MLM methods [98]. Both algorithms were implemented in the R package rMVP using default parameters (P-value threshold 0.01) and run using a Single Nucleotide Polymorphism (SNP) dataset containing 5.85 million SNPs remapped in the *Medicago*

genome v.5 [99]. Population structure, calculated using STRUCTURE by Bonhomme et al. [100], was used as a covariable. Normal distribution, QQ plots, and single/multiple Manhattan plots were performed using R package rMVP. *Medicago* genome version 5.0 of A17 genotype [101] was used to search for the encoded genes within the region of 10 kb from detected SNP. Transposable elements were excluded from the search. To link identified QTNs with putative causal gene by considering the linkage disequilibrium (LD), we selected all SNPs correlated (r2 > 0.7) with the top identified QTNs within a 15kb genomic range, corresponding to the average LD block size present in the *Medicago* Hapmap population [100,102], and we listed gene IDs closely related to these SNPs. Seed expression pattern of the candidate genes was assessed using published *Medicago* seeds or seed coat expression studies [103,104] and web-based expression atlas [105]
