*4.3. Differentiation and Stratification*

Nucleotide diversity (*π*) within each of the three bi-parental populations and genetic differentiation (*Fst*) between populations were estimated using the R package "PopGenome" [73]. The genetic structure of the three separate inbreeding populations and of the combined population was assessed using both PCA and DAPC [74]. DAPC comprised three steps: (1) PCA was conducted on the imputed SNPs. The cumulative variance was plotted against the number of principal components (PCs), and the number of PCs retained was the point beyond which the cumulative variance showed no obvious further increase; (2) *k*-means clustering was performed on the retained PCs. To identify the optimal number of clusters, *k*-means was run sequentially with increasing values of *k*, and the Bayesian information criterion (BIC) was calculated for each *k*. The optimal *k* was the one with the lowest BIC; (3) discriminant analysis was conducted on the chosen clusters, and individuals were reassigned to clusters accordingly. The posterior probability of cluster membership was calculated from the retained discriminant functions and used as the Q matrix for GWAS and heritability estimation. PCA was performed using the function implemented in TASSEL, while DAPC was conducted using the R package "adegenet" 2.0 [75].
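The three DAPC steps can be illustrated with a minimal NumPy sketch. It is not the TASSEL or "adegenet" implementation used in this study, and several parts are assumptions made for illustration: the simulated genotype matrix, the fixed cumulative-variance threshold (standing in for visual inspection of the curve), a BIC of the form n·log(WSS/n) + k·log(n), and a linear discriminant analysis with pooled covariance. All function names are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy genotypes (assumption): 3 groups x 30 individuals, 200 SNPs coded 0/1/2.
freqs = rng.uniform(0.1, 0.9, size=(3, 200))
X = np.vstack([rng.binomial(2, f, size=(30, 200)) for f in freqs]).astype(float)

def pca_scores(X, var_threshold=0.8):
    """Step 1: PCA via SVD; keep the fewest PCs whose cumulative variance
    reaches var_threshold (a stand-in for inspecting the curve by eye)."""
    Xc = X - X.mean(axis=0)
    U, s, _ = np.linalg.svd(Xc, full_matrices=False)
    cumvar = np.cumsum(s**2) / np.sum(s**2)
    n_pcs = min(int(np.searchsorted(cumvar, var_threshold)) + 1, len(s))
    return U[:, :n_pcs] * s[:n_pcs], cumvar

def sq_dist(Z, centers):
    """Squared Euclidean distance from every individual to every centre."""
    return ((Z[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)

def kmeans(Z, k, rng, n_init=10, n_iter=100):
    """Step 2: plain Lloyd k-means with random restarts.
    Returns the labels and within-cluster sum of squares of the best run."""
    best_labels, best_wss = None, np.inf
    for _ in range(n_init):
        centers = Z[rng.choice(len(Z), size=k, replace=False)]
        for _ in range(n_iter):
            labels = sq_dist(Z, centers).argmin(axis=1)
            new = np.array([Z[labels == j].mean(axis=0) if np.any(labels == j)
                            else centers[j] for j in range(k)])
            if np.allclose(new, centers):
                break
            centers = new
        d = sq_dist(Z, centers)
        labels = d.argmin(axis=1)
        wss = d[np.arange(len(Z)), labels].sum()
        if wss < best_wss:
            best_labels, best_wss = labels, wss
    return best_labels, best_wss

def bic(wss, n, k):
    """BIC used to compare k values (assumed form; lower is better)."""
    return n * np.log(wss / n) + k * np.log(n)

def lda_posterior(Z, labels):
    """Step 3: linear discriminant analysis with pooled covariance;
    returns posterior membership probabilities (one row per individual)."""
    classes = np.unique(labels)
    n = len(Z)
    means = np.array([Z[labels == c].mean(axis=0) for c in classes])
    priors = np.array([(labels == c).mean() for c in classes])
    resid = Z - means[np.searchsorted(classes, labels)]
    cov_inv = np.linalg.pinv(resid.T @ resid / (n - len(classes)))
    scores = (Z @ cov_inv @ means.T
              - 0.5 * np.sum((means @ cov_inv) * means, axis=1)
              + np.log(priors))
    scores -= scores.max(axis=1, keepdims=True)  # numerical stability
    post = np.exp(scores)
    return post / post.sum(axis=1, keepdims=True)

# Run the three steps and pick the k with the lowest BIC.
Z, cumvar = pca_scores(X)
fits = {k: kmeans(Z, k, rng) for k in range(1, 7)}
bics = {k: bic(wss, len(Z), k) for k, (_, wss) in fits.items()}
best_k = min(bics, key=bics.get)
Q = lda_posterior(Z, fits[best_k][0])  # rows sum to 1, analogous to a Q matrix
print(best_k, Q.shape)
```

Each row of `Q` holds one individual's posterior membership probabilities across the inferred clusters, which is the quantity described above as the Q matrix. In practice the number of retained PCs and the chosen *k* should be checked against the cumulative-variance and BIC curves rather than taken from fixed thresholds.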
