*2.3. Accuracy of Genomic Prediction in Relation to Marker Sets and Pasmo Severity Datasets*

Genomic prediction models were constructed using RR-BLUP with pairwise combinations of the four marker sets and the six PS datasets. Statistical models for the 24 combinations were generated and evaluated for their accuracy (*r*) and relative efficiency (*RE*) using a five-fold random cross-validation scheme (Table 3). *RE* represents the relative efficiency of GP over direct phenotypic selection, which depends on the heritability of the selected trait. Direct phenotypic selection for a trait was considered to have a baseline efficiency of 1. Thus, *RE* values greater than 1 indicate GP models that are more efficient than direct phenotypic selection in one selection cycle [25–27]. Analysis of variance (ANOVA) indicated that *r* and *RE* both differed significantly among the four marker sets and among the six PS datasets; there was also a significant interaction effect between marker sets and PS datasets (Table S1). Owing to the significant marker × phenotype dataset interaction, multiple comparisons of the 24 combinations were performed. For all marker sets, the PS-mean models significantly outperformed those based on individual year datasets (Table 3). The SNP-500QTL marker set models generated significantly higher *r* and *RE* values than any other marker set (Figure 3). Interestingly, the SNP-67QTL derived models produced slightly but significantly higher values of *r* and *RE* than the SNP-134QTL models. The highest *r* and *RE* values were obtained for models combining the SNP-500QTL and PS-mean datasets (Table 3, Figure 3). Intriguingly, the SNP-52347 models yielded the lowest *r* and *RE* values despite including all QTL markers (Table 3, Figure 3); both BL and Bayesian ridge regression (BRR) corroborated this finding (Figure S1). No significant differences in *r* and *RE* values were observed among the three statistical models: RR-BLUP, BL and BRR (Figure S1).
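The evaluation scheme described above can be illustrated with a minimal sketch. It is not the authors' pipeline: it assumes that accuracy is the Pearson correlation between predicted and observed values in each held-out fold, uses plain ridge regression with a fixed, arbitrarily chosen penalty `lam` as a stand-in for RR-BLUP (where the penalty would come from estimated variance components), and runs on simulated toy genotypes. All function names and the simulated data are hypothetical.

```python
import math
import random


def pearson(a, b):
    """Pearson correlation between two equal-length lists."""
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    sab = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    saa = sum((x - ma) ** 2 for x in a)
    sbb = sum((y - mb) ** 2 for y in b)
    return sab / math.sqrt(saa * sbb)


def solve(a, b):
    """Solve the linear system a x = b by Gauss-Jordan elimination."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                f = m[r][col] / m[col][col]
                m[r] = [x - f * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def ridge_fit(X, y, lam):
    """Ridge regression on centred data; returns (intercept, marker effects)."""
    n, p = len(X), len(X[0])
    means = [sum(row[j] for row in X) / n for j in range(p)]
    y_mean = sum(y) / n
    Xc = [[row[j] - means[j] for j in range(p)] for row in X]
    yc = [v - y_mean for v in y]
    xtx = [[sum(r[i] * r[j] for r in Xc) + (lam if i == j else 0.0)
            for j in range(p)] for i in range(p)]
    xty = [sum(r[i] * v for r, v in zip(Xc, yc)) for i in range(p)]
    beta = solve(xtx, xty)
    return y_mean - sum(b * m for b, m in zip(beta, means)), beta


def cv_accuracy(X, y, lam=1.0, k=5, seed=0):
    """Mean held-out Pearson r over k random folds, plus the folds used."""
    idx = list(range(len(y)))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]
    accs = []
    for test in folds:
        held = set(test)
        train = [i for i in idx if i not in held]
        b0, beta = ridge_fit([X[i] for i in train], [y[i] for i in train], lam)
        pred = [b0 + sum(b * x for b, x in zip(beta, X[i])) for i in test]
        accs.append(pearson(pred, [y[i] for i in test]))
    return sum(accs) / k, folds


# Simulated toy data (hypothetical): 100 lines, 8 markers coded 0/1/2.
rng = random.Random(1)
effects = [1.0, -1.0, 0.5, -0.5, 1.5, -1.5, 1.0, -1.0]
X = [[rng.choice((0, 1, 2)) for _ in effects] for _ in range(100)]
y = [sum(e * g for e, g in zip(effects, row)) + rng.gauss(0, 0.5) for row in X]
r_mean, folds = cv_accuracy(X, y)
```

Each accession is held out exactly once, so every prediction is made by a model that never saw that accession's phenotype.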

#### *2.4. Sample Size of Training Populations versus Genomic Prediction Accuracy*

To find an optimal size for the TP, the relationship between TP size and prediction accuracy was analysed. TPs of various sizes from 18 to 351, corresponding to 5% to 95% of the total 370 accessions, were used to build models with the SNP-500QTL marker set and the PS-mean phenotypic dataset. The prediction accuracy increased significantly for TP sizes up to 100, followed by smaller accuracy gains with each additional increment in TP size (Figure 4). A GP accuracy >0.9 was obtained once the TP size reached 185 (Figure 4).
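The TP sizes quoted above can be reproduced with simple integer arithmetic. The sketch below assumes 5% steps and rounding down; the text gives only the endpoints (18 and 351), which rounding down matches, so both choices are assumptions rather than the reported design.

```python
TOTAL_ACCESSIONS = 370  # total number of accessions, from the text


def tp_size(percent, total=TOTAL_ACCESSIONS):
    """Training-population size for a given percentage, rounded down."""
    return total * percent // 100


# Assumed grid of 5% steps between the stated endpoints (5% and 95%).
tp_sizes = [tp_size(p) for p in range(5, 100, 5)]

# Size corresponding to half of the accessions.
half = tp_size(50)
```

Under these assumptions, the TP size of 185 at which accuracy exceeded 0.9 corresponds to exactly half of the 370 accessions.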


**Table 3.** Accuracy (*r*) and relative efficiency (*RE*) values of the 24 combinations representing the four marker sets and six pasmo severity (PS) datasets using RR-BLUP obtained using a random five-fold cross-validation.

| Marker set | PS dataset | *r* (x̄ ± *s*) <sup>1</sup> | *RE* (x̄ ± *s*) <sup>1</sup> |
|---|---|---|---|
| | PS-mean | 0.92 ± 0.02a | 1.84 ± 0.04a |

*Int. J. Mol. Sci.* **2018**, *19*, x FOR PEER REVIEW 6 of 18


<sup>1</sup> Different letters represent significant differences in multiple comparisons among the 24 combinations at the 0.05 probability level.

**Figure 3.** Accuracy (*r*) (**a**) and relative efficiency (*RE*) (**b**) of RR-BLUP prediction models built with combinations of four marker sets using the five-year average PS dataset (PS-mean) and random five-fold cross-validations. Letters above box plots indicate statistical significance (*p* < 0.05) for *r* and *RE* among marker sets.


**Figure 4.** Relationship between the genomic prediction accuracy (*r*) and the size of the training population based on the SNP-500QTL marker set, the PS-mean dataset and the RR-BLUP models. The dashed line represents a prediction accuracy of 0.9.

#### *2.5. Prediction Models of Pasmo Resistance*

All 370 accessions were used as a training population to build a prediction model using the SNP-500QTL genotypic dataset and the PS-mean phenotypic dataset because this combination outperformed all other models. The model was then employed to predict PS in each year (Table 4). Prediction accuracies (*r*) ranging from 0.71 to 0.81 and *RE* values of 1.42 to 1.62 were obtained when predicting PS for individual years (Table 4).
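As a quick arithmetic check, the (*r*, *RE*) pairs reported in this section can be compared directly. The sketch below only divides the reported values. It assumes that the lower and upper ends of the *r* range pair with the lower and upper ends of the *RE* range. The formula for *RE* is not given in this section, so the constant ratio is an observation about the numbers, not the authors' definition.

```python
# (r, RE) pairs quoted in this section; the two range endpoints are
# assumed to pair low-with-low and high-with-high.
pairs = [
    (0.92, 1.84),  # Table 3, PS-mean row
    (0.98, 1.96),  # model predicting PS-means of all 370 accessions
    (0.77, 1.55),  # Table 4, PS-2016
    (0.71, 1.42),  # lower end of the individual-year range
    (0.81, 1.62),  # upper end of the individual-year range
]

# Ratio RE / r for each pair, rounded to two decimals.
ratios = [round(re / r, 2) for r, re in pairs]
```

Every ratio is 2 to within rounding of the reported values, which is what a fixed heritability-dependent divisor of about 0.5 would produce.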

A prediction accuracy as high as 0.98 and an *RE* value of 1.96 were obtained when the model was used to predict PS-means of the 370 accessions (Table 4). A linear relationship was observed between the observed (*y*) and predicted PS (*x*): *y* = 1.0522*x* − 0.3267 (*R*<sup>2</sup> = 0.96) (Figure 5a). Based on this equation, the average width of the 95% prediction interval, delimited by the two red dashed lines, was less than 1 (0.97 on average) on the PS rating scale (Figure 5a).

NPQTL in the 370 accessions for the 500 QTL set was tallied. A significant linear correlation between PS-mean and NPQTL (*r* = 0.86 or *R*<sup>2</sup> = 0.73) was observed (Figure 5b). This correlation was lower than but close to the accuracy of the GP model with SNP-500QTL and higher than that of the GP models using other marker sets (Table 3). However, the simple linear regression equation (*y* = −0.0262*x* + 11.934) of the observed PS (*y*) on NPQTL (*x*) had a large standard deviation for each predicted value, with an average prediction interval width of 2.70, nearly three times the average prediction interval width of the GP model; that is, the NPQTL model had a higher prediction error than the GP model.
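The two regression equations and interval widths can be compared with a short calculation. The sketch below encodes the reported equations and back-calculates an approximate residual standard deviation from each average 95% prediction interval width. It assumes the normal approximation (width ≈ 2 × 1.96 × *s*), which ignores the small leverage terms of the exact formula and is reasonable only because *n* = 370 is large. The function names are hypothetical.

```python
from statistics import NormalDist

# Two-sided 95% normal quantile (about 1.96); a large-sample stand-in
# for the t quantile used in the exact prediction-interval formula.
Z95 = NormalDist().inv_cdf(0.975)


def observed_from_gp(predicted_ps):
    """Reported regression of observed PS on GP-predicted PS (Figure 5a)."""
    return 1.0522 * predicted_ps - 0.3267


def observed_from_npqtl(npqtl):
    """Reported regression of observed PS on NPQTL (Figure 5b)."""
    return -0.0262 * npqtl + 11.934


def implied_residual_sd(avg_width):
    """Approximate residual SD implied by an average 95% interval width."""
    return avg_width / (2 * Z95)


sd_gp = implied_residual_sd(0.97)      # GP model, average width 0.97
sd_npqtl = implied_residual_sd(2.70)   # NPQTL model, average width 2.70
width_ratio = 2.70 / 0.97              # how much wider the NPQTL interval is
```

Under this approximation the implied residual standard deviations are about 0.25 and 0.69 PS units, and the NPQTL interval is about 2.8 times as wide as the GP interval, in line with the "nearly three times" stated above.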

**Table 4.** Accuracy (*r*) and relative efficiency (*RE*) of genomic prediction for pasmo severity in different years using the RR-BLUP model built with the SNP-500QTL marker set and the PS-mean phenotypic data using all 370 accessions as training data set.


| PS dataset | *r* | *RE* |
|---|---|---|
| PS-2016 | 0.77 | 1.55 |

**Figure 5.** Relationship of observed pasmo severity (PS) with PS predicted by a GP model (**a**,**c**) or with PS predicted by the number of QTL with positive-effect alleles (NPQTL) (**b**,**d**). (**a**) Linear regression of observed PS (*y*) to predicted PS (*x*) using the genomic prediction model built with the PS-mean dataset and the SNP-500QTL marker set of all 370 accessions as training data set. (**b**) Linear regression of observed PS (*y*) to NPQTL (*x*) in the 370 flax accessions. (**c**) Relationship of observed PS of 93 randomly chosen accessions with the PS predicted by the genomic model constructed with the SNP-500QTL marker set and PS-mean dataset when a random subset of 277 accessions was used as training population. (**d**) Relationship of observed PS of 93 randomly chosen accessions with the PS predicted by NPQTL (Figure S2). The red dashed lines represent the upper and lower boundaries of the 95% prediction intervals, that is, the value of a new sample is expected to lie within the interval for 95% of samples. The grey band represents the 95% confidence interval, that is, 95% of such intervals are expected to include the true value of the population mean.
