**1. Introduction**

Flax (*Linum usitatissimum* L.) is an important food and fibre crop grown in cooler regions of the world, such as Canada [1]. Pasmo, caused by the fungus *Septoria linicola*, is one of the most widespread diseases of flax, reducing seed and oil yield as well as fibre quality and durability [2]. Because the disease is now present in all flax production areas of North America and in other parts of the world, developing resistant cultivars is the most viable and effective option for its control. Resistance to pasmo has low heritability [3] and is quantitatively inherited [4]. Large variation in pasmo disease severity was observed in the flax core collection, which can be capitalized upon to develop resistant cultivars [3]. Phenotypic recurrent selection is typically used to develop cultivars with improved resistance, with selection usually based on phenotypic assessments of resistance under field conditions [5]. However, field assessment of pasmo severity (PS) in germplasm and breeding lines is costly and is heavily influenced by the environment owing to strong genotype × environment (G × E) interactions [3,4].

With the advances in molecular marker development over the last decade, marker-assisted breeding strategies have been increasingly pursued. One such strategy involves identifying quantitative trait loci (QTL) in biparental mapping populations and using markers to efficiently backcross QTL into elite breeding materials [6]. This so-called marker-assisted recurrent selection (MARS), or simply marker-assisted selection (MAS), characterizes many breeding programs that employ molecular markers to select non-phenotyped individuals for crossing and for downstream selection in segregating populations [7]. This method is suitable for traits with monogenic or oligogenic architectures but has limited use for quantitative traits controlled by many genes of small effect [8]. Genomic selection (GS) or prediction (GP) is an alternative marker-assisted breeding strategy better suited to polygenic quantitative traits, especially those with low heritability, because it uses all marker effects across the entire genome to calculate genomic estimated breeding values (GEBVs) for individual plant selection [9,10].

In GP, a training population (TP) is genotyped with genome-wide markers and phenotyped for the trait(s) under selection; statistical models that best predict the breeding values from the marker data are then applied to select non-phenotyped germplasm. GP has been used to select for disease resistance in several crops. For resistance to *Fusarium* head blight (FHB) in wheat, a typically quantitatively inherited trait with predominantly additive genetic variation, GP had a significantly higher accuracy than pedigree-based information alone [11]. The feasibility of GP has also been studied for selection of wheat rust resistance and was found to be particularly effective when validation lines had at least one close relative among the reference lines [12]. The implementation of GP for northern leaf blight resistance in maize, a trait with a complex genetic architecture, resulted in superior gains and reduced breeding cycle time to ≤80% of the phenotypic cycle [13]. Despite these successful examples, the use of GP to improve disease resistance in crops remains challenging for two reasons: (i) selection for major resistance genes can be ephemeral due to changes in pathogen races; and (ii) breeding for minor resistance genes with small effects may face the considerable complexities encountered in GP [14].
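The train-then-predict workflow described above can be sketched in a few lines of Python. This is only an illustrative sketch on simulated data, not the analysis performed in this study: the marker matrix, the simulated phenotype, the split into 90 training lines and 30 candidates, the ridge-type model and the penalty value `lam=100.0` are all assumptions, and `fit_ridge` and `predict_gebv` are hypothetical helper names.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical toy data: 120 lines, 500 biallelic markers coded 0/1/2.
n_lines, n_markers = 120, 500
X = rng.integers(0, 3, size=(n_lines, n_markers)).astype(float)
true_effects = rng.normal(0.0, 0.1, size=n_markers)
y = X @ true_effects + rng.normal(0.0, 1.0, size=n_lines)  # simulated trait

# Training population (genotyped and phenotyped) vs. candidates (genotyped only).
train = np.arange(0, 90)
cand = np.arange(90, n_lines)


def fit_ridge(X_train, y_train, lam):
    """Fit a ridge model; return marker means, trait mean and marker effects."""
    x_mean = X_train.mean(axis=0)
    y_mean = y_train.mean()
    Z = X_train - x_mean
    # Dual form: solve an n x n system, convenient when markers outnumber lines.
    K = Z @ Z.T + lam * np.eye(Z.shape[0])
    beta = Z.T @ np.linalg.solve(K, y_train - y_mean)
    return x_mean, y_mean, beta


def predict_gebv(X_new, x_mean, y_mean, beta):
    """Predict breeding values for lines that have marker data only."""
    return y_mean + (X_new - x_mean) @ beta


x_mean, y_mean, beta = fit_ridge(X[train], y[train], lam=100.0)
gebv = predict_gebv(X[cand], x_mean, y_mean, beta)

# For a severity-type trait, lower predicted values are preferred (assumption).
top5 = cand[np.argsort(gebv)[:5]]

# Predictive ability: correlation between predictions and held-back phenotypes.
r = np.corrcoef(gebv, y[cand])[0, 1]
print("selected candidates:", top5)
print("predictive ability r =", round(float(r), 3))
```

In practice the candidates would have no phenotypes; here their simulated values are held back only so that a predictive ability can be computed.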

Fast-evolving genotyping platforms have been a game-changer in the implementation of GP, allowing the production of large numbers of genome-wide markers, whereas progress in phenotyping has not been accompanied by similar cost reductions or quantum leaps in throughput. In a given population, the number of marker effects to be estimated (*p*) is usually far greater than the sample size (*n*), leading to an infinite number of possible marker effect estimates [15]; this is the so-called "large *p*, small *n* problem" (*p* >> *n*) encountered when applying markers to predict phenotypes [11]. Several GP statistical models have been proposed to address this issue [16]. For example, ridge-regression best linear unbiased prediction (RR-BLUP) is a mixed linear model that treats markers as random effects. The covariance between markers is assumed to be zero, and the variance of each marker is assumed to be the total genetic variance divided by the number of markers. Because this variance is equal for all markers, many more marker effects can be estimated than there are phenotypic records [17]. Unlike RR-BLUP, the Bayesian LASSO (BL) assumes that markers have unequal variances and performs continuous shrinkage and variable selection simultaneously, with small-effect markers shrunk more severely than larger-effect loci. In the *p* >> *n* setting, LASSO will select at most *n* − 1 variables and set the effects of the remaining predictors to zero [18]. Although these models solve the problem statistically, improving the accuracy and efficiency of GP by reducing the number of genome-wide markers would be advantageous because any increment in TP size comes at a cost [19–22]. A genome-wide association study (GWAS) is an approach to identify genome-wide markers linked to QTL, resulting in a limited number of favourable genetic loci responsible for traits of interest [23].
For example, GP of crown rust resistance in *Lolium perenne* demonstrated the ability of GWAS to identify and rank markers, which enabled the identification of a small subset of single nucleotide polymorphisms (SNPs) that could achieve predictive abilities close to those attained using the complete marker set [24]. The use of GWAS thus removes a large proportion of unrelated markers in the construction of prediction models.
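The "large *p*, small *n*" issue and the equal-variance assumption of RR-BLUP can be illustrated numerically. The following is a minimal sketch on simulated data, assuming known variance components (`sigma2_g` and `sigma2_e` are made-up values; in practice they would be estimated, for example by restricted maximum likelihood). It shows that the centred marker matrix has rank below *p*, so that least squares has no unique solution, whereas the ridge penalty implied by a common marker variance gives a unique set of marker effects.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical dimensions: far more markers (p) than phenotypic records (n).
n, p = 50, 400
X = rng.integers(0, 3, size=(n, p)).astype(float)  # markers coded 0/1/2
Z = X - X.mean(axis=0)                             # column-centred markers
y = rng.normal(size=n)
y = y - y.mean()                                   # centred phenotype

# The rank of Z is at most n - 1 < p, so Z'Z is singular and ordinary least
# squares admits infinitely many marker-effect solutions.
rank = np.linalg.matrix_rank(Z)

# Equal marker variance as described in the text: total genetic variance
# divided by the number of markers (variance components are assumed here).
sigma2_g, sigma2_e = 1.0, 1.0
sigma2_marker = sigma2_g / p
lam = sigma2_e / sigma2_marker                     # ridge penalty

# Primal form: a p x p system that is invertible because lam > 0.
beta_primal = np.linalg.solve(Z.T @ Z + lam * np.eye(p), Z.T @ y)

# Equivalent dual form: an n x n system, cheaper when p >> n.
beta_dual = Z.T @ np.linalg.solve(Z @ Z.T + lam * np.eye(n), y)

print("rank of Z:", rank, "of", p, "markers")
print("ridge penalty:", lam)
print("primal and dual agree:", np.allclose(beta_primal, beta_dual))
```

The dual form is the one usually preferred in the *p* >> *n* setting, since its cost grows with the number of lines and not with the number of markers.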

The only empirical GP study published to date in flax, which used bi-parental populations to evaluate yield, oil content and fatty acid composition traits, indicated that GP could increase genetic gain per unit time in linseed breeding. The GP results significantly exceeded those from direct phenotypic selection, especially for traits with low broad-sense heritability [25]. Resistance to flax pasmo is polygenic. Our previous study reported 500 non-redundant QTL for pasmo resistance (PR) from 370 diverse flax accessions of a core collection based on five-year pasmo field assessments; of those, 134 QTL were statistically stable in all five years and 67 had relatively stable and large effects [4].

The objective of this study was to evaluate the potential of QTL markers in GP and to compare GP efficiency across different marker sets, including genome-wide SNPs and QTL markers, in order to provide a realistic and highly accurate model for germplasm evaluation and parent selection in pasmo resistance breeding.
