**2. Results**

### *2.1. Genetic Diversity at Different Loci among Populations*

The genetic diversity analysis was performed on 480 individuals from 16 natural *P. koraiensis* populations using 15 EST-SSR markers (Table 1). Allele sizes ranged from 151 bp at locus NEPK-65 to 301 bp at loci NEPK-168 and NEPK-184. In total, 155 alleles were detected across the 15 loci in the sampled individuals; the number of alleles per locus ranged from 4 (NEPK-67) to 21 (NEPK-145), with a mean value of 10.33. There were 58 private alleles, accounting for 37.42% of the alleles. The number of effective alleles (Ne) ranged from 1.170 at locus NEPK-40 to 6.605 at locus NEPK-145, with an average of 2.514 per locus. The observed (Ho) and expected (He) heterozygosity ranged from 0.008 to 0.984 and from 0.145 to 0.849, with mean values of 0.374 and 0.521, respectively. The polymorphic information content (PIC) varied from 0.142 (NEPK-40) to 0.833 (NEPK-145), with a mean value of 0.461. Four loci exhibited high polymorphism (PIC > 0.5) and 8 loci exhibited moderate polymorphism (0.2 < PIC < 0.5). In addition, across the 480 samples, all of the loci conformed to Hardy–Weinberg equilibrium. F-statistics, calculated to detect genetic subdivision, revealed moderate inbreeding; the mean value of Fst was 0.347, indicating moderate genetic variation. Regarding gene flow, the number of effective migrants (Nm) ranged from 0.080 to 17.691 among populations, with an average of 2.667.

**Table 1.** Characteristics of the 15 polymorphic EST-SSR markers used in this study.


Note: Na, number of different alleles; Ne, number of effective alleles; I, Shannon's diversity index; Ho, observed heterozygosity; He, expected heterozygosity; PIC, polymorphic information content; HWE, deviation from Hardy–Weinberg equilibrium (\*\*\* *p* < 0.001); NRA, number of private alleles; Fis, inbreeding coefficient; Fit, overall inbreeding coefficient; Fst, genetic differentiation index; Nm, number of effective migrants.
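The per-locus statistics defined in the note above can be derived from allele frequencies. The following is a minimal sketch under stated assumptions: He = 1 − Σp², Ne = 1/Σp², PIC as defined by Botstein et al. (1980), and the island-model approximation Nm ≈ (1 − Fst)/(4Fst). The source does not state which formulas or software produced Table 1, and the allele frequencies and the Fst value used here are hypothetical.

```python
from itertools import combinations


def expected_heterozygosity(freqs):
    """He = 1 - sum(p_i^2) for the allele frequencies at one locus."""
    return 1.0 - sum(p * p for p in freqs)


def effective_alleles(freqs):
    """Ne = 1 / sum(p_i^2): number of equally frequent alleles giving the same He."""
    return 1.0 / sum(p * p for p in freqs)


def pic(freqs):
    """PIC = 1 - sum(p_i^2) - sum_{i<j} 2 * p_i^2 * p_j^2 (Botstein et al. 1980)."""
    homozygosity = sum(p * p for p in freqs)
    cross_term = sum(2.0 * (p * p) * (q * q) for p, q in combinations(freqs, 2))
    return 1.0 - homozygosity - cross_term


def nm_from_fst(fst):
    """Nm ~ (1 - Fst) / (4 * Fst); an assumed island-model approximation."""
    if not 0.0 < fst <= 1.0:
        raise ValueError("Fst must lie in (0, 1]")
    return (1.0 - fst) / (4.0 * fst)


# Hypothetical locus with three alleles (frequencies sum to 1).
example_freqs = [0.5, 0.3, 0.2]
print(f"He  = {expected_heterozygosity(example_freqs):.3f}")
print(f"Ne  = {effective_alleles(example_freqs):.3f}")
print(f"PIC = {pic(example_freqs):.3f}")
print(f"Nm  = {nm_from_fst(0.2):.3f}")  # hypothetical Fst of 0.2
```

A mean Nm averaged over loci, as presumably reported in Table 1, need not equal the Nm obtained by applying this approximation to the mean Fst.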

### *2.2. Genetic Diversity within Pinus koraiensis Populations*

The levels of genetic diversity in the 16 populations are shown in Table 2. Across the sampled populations, the number of different alleles (Na) varied from 2.667 (HL) to 4.467 (TL), with a mean value of 3.271, and the number of effective alleles (Ne) ranged from 1.586 (Jiaohe) to 2.257 (Linjiang), with a mean value of 1.870. The populations with the highest levels of genetic diversity were Heihe (Ne = 1.939, Ho = 0.340 and He = 0.439), Zhanhe (Ne = 2.009, Ho = 0.356 and He = 0.413), Liangshui (Ne = 1.914, Ho = 0.470 and He = 0.370) and Tieli (Ne = 2.222, Ho = 0.373 and He = 0.414), whereas those with the lowest levels were Jiaohe (Ne = 1.586, Ho = 0.293 and He = 0.275) and Helong (Ne = 1.663, Ho = 0.390 and He = 0.310). The Zhanhe population had not only the highest genetic diversity but also the largest number of private alleles, identifying it as a unique natural *P. koraiensis* population. The F value ranged from -0.235 to 0.325 among the populations, with a mean value of 0.02, indicating a slight heterozygote deficiency in the natural *P. koraiensis* populations.

**Table 2.** Genetic diversity estimates for the 16 *P. koraiensis* populations based on 15 EST-SSR markers.


Note: Na, number of different alleles; Ne, number of effective alleles; I, Shannon's diversity index; Ho, observed heterozygosity; He, expected heterozygosity; uHe, unbiased expected heterozygosity; F (null), null allele frequencies; NRA, number of private alleles.
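The link between the F value and heterozygote deficiency discussed in this section can be illustrated with a short sketch. It assumes that F denotes the fixation index F = 1 − Ho/He, which the source does not state explicitly. The Ho and He inputs are the population values quoted in the text for Heihe and Jiaohe; because Table 2 presumably averages F across loci, a ratio of mean Ho to mean He will generally not reproduce the tabulated values.

```python
def fixation_index(ho, he):
    """Return F = 1 - Ho/He (assumed definition).

    F > 0 means fewer heterozygotes than expected (deficiency);
    F < 0 means more heterozygotes than expected (excess).
    """
    if he <= 0:
        raise ValueError("He must be positive")
    return 1.0 - ho / he


# (Ho, He) pairs quoted in the text for two populations.
reported = {"Heihe": (0.340, 0.439), "Jiaohe": (0.293, 0.275)}

for name, (ho, he) in reported.items():
    f_value = fixation_index(ho, he)
    status = "deficiency" if f_value > 0 else "excess"
    print(f"{name}: F = {f_value:.3f} (heterozygote {status})")
```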

### *2.3. Genetic Variation among Pinus koraiensis Populations*

To evaluate the genetic variation among the collected samples, AMOVA was performed, and Fst values among natural populations, genetic clusters and geographical regions were calculated; the results are shown in Table 3. The AMOVA results indicate that 67% of the total genetic variation existed within populations, indicating high genetic diversity within populations. AMOVA of the two genetic clusters identified by the STRUCTURE analysis indicated that 63.79% of the total variation was attributable to differences within populations, and the overall Fst was 0.362 (Fst > 0.25), indicating high genetic differentiation between the two clusters. In addition, the AMOVA of the two groups classified according to geographical location indicated low genetic variation among populations within each group (2.77%). All of these results indicated high genetic differentiation within populations and groups.


**Table 3.** Analysis of molecular variance (AMOVA) results for 16 populations of *Pinus koraiensis* in China.

Note: a The analysis included all collected populations as one hierarchical group. b The analysis included two geographical groups (G1 and G2). c The analysis included two genetic clusters (Cluster 1 and Cluster 2).

Nei's genetic distance and pairwise Fst values are shown in Table 4. Fst was considered the main genetic parameter for evaluating genetic differentiation among populations. In this study, the pairwise Fst values ranged from 0.014 to 0.348, and most of the *P. koraiensis* population pairs exhibited high values (Fst > 0.15), indicating high levels of genetic differentiation. The greatest differentiation was observed between Helong and Liangshui, and the lowest between Jiaohe and Hulin. The highest genetic distance was also observed between Helong and Liangshui (0.813), consistent with the pairwise Fst values and indicating pronounced differentiation between these two populations.

The relative migration network among the 16 *P. koraiensis* populations was constructed from relative migration rates using the divMigrate function in R. Analysis of gene flow between populations suggested a biased geographic distribution, and gene flow was not uniform among all populations (Figure 1). A high degree of gene flow was observed among three populations located near one another (Muling, Maoershan and Fangzheng), consistent with the principal coordinate analysis and dendrogram analysis. In addition, one genetically isolated population (Boli) displayed high levels of gene flow with the three nearby populations Muling, Maoershan and Fangzheng. Moreover, a moderate level of gene flow was found among three admixed populations, and two genetically distinct populations (Zhanhe and Wangqing) exhibited distant segregation from the other populations.

**Table 4.** Pairwise genetic differentiation index values (below the diagonal) and Nei's genetic distance values (above the diagonal). \*\*\*\* indicates the diagonal division of the pairwise genetic differentiation index values and Nei's genetic distance values.




**Figure 1.** The relative migration network of 16 populations generated via divMigrate. The width of the line and the number shown next to the arrows indicate the migration rate.

### *2.4. Population Structure*

The population structure analysis of the 16 natural *P. koraiensis* populations was performed based on a Bayesian approach using STRUCTURE software. Values of K from 1 to 10 were evaluated, with 10 repetitions for each value. In the structure plot (Figure 2), the maximum delta K value appeared at K = 2, with an obvious peak at this value; K = 2 was therefore considered the optimal number of genetic clusters for all EST-SSR markers (Figure 2B,C). The 480 sampled individuals of *P. koraiensis* were divided into two genetic groups (Group 1 and Group 2) at K = 2: Group 1 comprised 149 individuals from 5 populations (Heihe, Liangshui, Zhanhe, Tieli and Hegang), and Group 2 comprised a higher number of individuals (331) from 11 populations (Liangzihe, Helong, Lushuihe, Linjiang, Jiaohe, Hulin, Boli, Muling, Maoershan, Fangzheng and Wangqing). Group 1 comprised almost all of the *P. koraiensis* plant materials from the Xiaoxinganling Mountains, whereas Group 2 comprised almost all of the individuals from the Changbaishan Mountains, suggesting a relationship between genetic structure and geographical distribution of the populations.

**Figure 2.** Population structure determined from 480 *Pinus koraiensis* individuals based on microsatellite data. (**A**) Estimation of population structure using mean LnP(D) with ten repetitions for K ranging from 1 to 10. (**B**) Estimation of population structure using delta K (ΔK) with the number of clusters (K) ranging from 1 to 10. (**C**) Estimation of population structure of 16 populations based on STRUCTURE analysis.
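The choice of K from the ΔK peak can be sketched as follows. This is an illustrative implementation of the Evanno-style statistic, ΔK = |L(K+1) − 2L(K) + L(K−1)| / sd[L(K)], computed here from replicate means of LnP(D). The source does not specify the tool used to obtain ΔK, and the LnP(D) values below are hypothetical, not the study's data.

```python
from statistics import mean, stdev


def delta_k(lnp_by_k):
    """Compute delta K for every interior K.

    lnp_by_k maps each K (consecutive integers) to a list of LnP(D)
    values, one per replicate run. The first and last K have no
    second-order difference and are therefore omitted.
    """
    ks = sorted(lnp_by_k)
    means = {k: mean(lnp_by_k[k]) for k in ks}
    result = {}
    for prev_k, k, next_k in zip(ks, ks[1:], ks[2:]):
        sd = stdev(lnp_by_k[k])
        if sd == 0:
            continue  # delta K is undefined when replicates are identical
        second_diff = abs(means[next_k] - 2 * means[k] + means[prev_k])
        result[k] = second_diff / sd
    return result


# Hypothetical LnP(D) replicates (three per K) for K = 1..4.
hypothetical_runs = {
    1: [-12000, -12002, -11998],
    2: [-10500, -10503, -10497],
    3: [-10400, -10420, -10380],
    4: [-10350, -10390, -10310],
}

dk = delta_k(hypothetical_runs)
best_k = max(dk, key=dk.get)
print({k: round(v, 2) for k, v in dk.items()}, "best K:", best_k)
```

With these made-up values the likelihood gain flattens after K = 2 while the replicate variance grows, so ΔK peaks at K = 2, which is the pattern described for Figure 2B.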

To further analyze cluster patterns, principal coordinate analysis (PCoA) based on the pairwise genetic distance matrix of the 15 EST-SSRs was performed; the results are shown in Figure 3. The 480 individuals from the 16 populations were roughly divided into two clusters according to the first two axes in the PCoA plot. Principal axes 1 and 2 accounted for 22.99% and 12.46%, respectively, of the total genetic variation among the individuals, together accounting for 35.45% of the total genetic variation (Figure 3A). Five populations (Heihe, Liangshui, Zhanhe, Tieli and Hegang) were grouped into cluster 1, and the remaining populations (Liangzihe, Helong, Lushuihe, Linjiang, Jiaohe, Hulin, Boli, Muling, Maoershan, Fangzheng and Wangqing) were grouped into cluster 2. The same clustering was obtained in the STRUCTURE analysis using the same dataset, indicating marked genetic differentiation. Furthermore, the neighbor-joining (NJ) dendrogram based on Nei's genetic distance clustered the 480 *P. koraiensis* individuals from the 16 populations into 2 clusters, consistent with the above results (Figures 3B and 4).

**Figure 3.** Genetic variation and relationships among the 16 sampled natural *Pinus koraiensis* populations in northeast China. (**A**) Principal coordinate analysis (PCoA) based on pairwise genetic distance. (**B**) NJ dendrogram of 480 individuals based on Nei's (1983) genetic distance.

**Figure 4.** NJ dendrogram of 16 populations based on Nei's (1983) genetic distance and a heatmap of expected heterozygosity (He) of 16 populations.
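The distance underlying the NJ dendrograms can be illustrated with a small sketch. It assumes that "Nei's (1983) genetic distance" refers to the D_A distance, D_A = 1 − (1/L) Σ_loci Σ_alleles √(x·y), where x and y are the frequencies of the same allele in two populations and L is the number of loci. The source does not give the formula, and the allele sizes and frequencies below are hypothetical.

```python
from math import sqrt


def nei_da(pop_x, pop_y):
    """Nei's D_A distance between two populations (assumed definition).

    pop_x and pop_y are lists with one dict per locus, each mapping an
    allele label to its frequency in that population.
    """
    if not pop_x or len(pop_x) != len(pop_y):
        raise ValueError("both populations need the same, non-zero number of loci")
    total = 0.0
    for freqs_x, freqs_y in zip(pop_x, pop_y):
        # Alleles missing from either population contribute sqrt(0) = 0.
        shared = set(freqs_x) & set(freqs_y)
        total += sum(sqrt(freqs_x[a] * freqs_y[a]) for a in shared)
    return 1.0 - total / len(pop_x)


# Hypothetical two-locus data; keys are allele sizes in bp.
pop_a = [{151: 0.6, 155: 0.4}, {200: 1.0}]
pop_b = [{151: 0.3, 155: 0.7}, {200: 0.5, 204: 0.5}]

print(f"D_A(A, B) = {nei_da(pop_a, pop_b):.3f}")
```

D_A is 0 for identical allele frequencies and 1 when two populations share no alleles at any locus; a matrix of such pairwise values is the usual input to a neighbor-joining routine.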

### *2.5. Correlations between Genetic Distance and Geographic Distance*

The genetic distance estimated from molecular markers may be related to the distribution of the species under study and the geographic distance between individuals or populations. In this study, the geographic distance and genetic distance values ranged from 37.72 km to 825.45 km and from 0.02 to 0.83, respectively. To investigate the correlation between genetic distance and geographic distance, a Mantel test was carried out. Genetic distance was not significantly correlated with geographic distance among the *P. koraiensis* populations (*p* = 0.26, R<sup>2</sup> = 0.01), indicating a lack of association between geographical distance and the genetic differentiation of *P. koraiensis* (Figure 5). For example, the Liangzihe and Hegang populations were separated by the shortest geographic distance yet were not grouped in the same cluster. Therefore, there was no obvious isolation by distance among the sampled populations.

**Figure 5.** Correlation between genetic distance and geographic distance in *P. koraiensis* populations determined using the Mantel test.
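The Mantel test used above can be sketched as a permutation test on two distance matrices. The version below is a minimal, one-sided illustration that uses Pearson's r and 999 permutations; the source does not report the software, the number of permutations or the sidedness of the test, and the matrices are hypothetical rather than the study's data.

```python
import random
from math import sqrt


def _upper_triangle(matrix, order):
    """Flatten the upper triangle after reordering rows and columns by `order`."""
    n = len(order)
    return [matrix[order[i]][order[j]] for i in range(n) for j in range(i + 1, n)]


def _pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def mantel(gen, geo, n_perm=999, seed=0):
    """Return (r, p): observed correlation and one-sided permutation p-value."""
    identity = list(range(len(gen)))
    geo_vec = _upper_triangle(geo, identity)
    r_obs = _pearson(_upper_triangle(gen, identity), geo_vec)
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_perm):
        order = identity[:]
        rng.shuffle(order)  # permute population labels of one matrix only
        if _pearson(_upper_triangle(gen, order), geo_vec) >= r_obs - 1e-12:
            hits += 1
    return r_obs, (hits + 1) / (n_perm + 1)


# Hypothetical example: five sites on a line, genetic distance proportional
# to geographic distance (perfect isolation by distance).
positions_km = [0, 1, 3, 7, 15]
geo = [[abs(a - b) for b in positions_km] for a in positions_km]
gen = [[0.01 * d for d in row] for row in geo]

r, p = mantel(gen, geo)
print(f"r = {r:.3f}, R^2 = {r * r:.3f}, p = {p:.3f}")
```

In this constructed example r is close to 1 and the p-value is small. A result such as the one reported here (R<sup>2</sup> = 0.01, *p* = 0.26) would instead mean that permuted matrices correlate with geography about as well as the observed one.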
