*2.2. Validation Phase*

The clinical characteristics of the patients (*n* = 802) are summarized in Table S3. The subgroup of patients treated with NACT comprised 168 patients. In total, 371 patients received adjuvant cytotoxic therapy. Patients with localized disease and good prognosis were untreated (*n* = 83), and a portion of patients was treated only with hormonal therapy (*n* = 311). The mean follow-up of the patients was 76 ± 30 months.

Of the 52 variants subjected to genotyping, attempts to optimize detection techniques failed for 10 variants (5 indels and 5 SNVs), which therefore could not be further evaluated for clinical associations. One variant (rs2893007) was additionally included in the list based on haplotype tagging (*r* <sup>2</sup> = 1) to replace the variant rs11764054, whose analysis failed. No tagging variants (*r* <sup>2</sup> > 0.8) were found to replace the rest of the failed variants. In the end, 43 variants were successfully genotyped. No variant significantly departed from Hardy–Weinberg equilibrium, and less than 1% of theoretical data points were missing due to uncertainty in genotype calling or an absent signal. MAFs of variants in the validation phase did not substantially differ from those observed in the testing phase (Table 3).
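As an illustration of the Hardy–Weinberg check mentioned above, the following minimal sketch applies a one-degree-of-freedom chi-square goodness-of-fit test to genotype counts. The source does not state which test or software was used, so the function name `hwe_chi_square` and the example counts are hypothetical and are not data from this study.

```python
import math


def hwe_chi_square(n_aa, n_ab, n_bb):
    """Chi-square test (1 d.f.) for departure from Hardy-Weinberg equilibrium.

    n_aa, n_ab, n_bb are observed counts of the common homozygote,
    the heterozygote, and the rare homozygote (illustrative inputs).
    Returns (chi2, p_value).
    """
    n = n_aa + n_ab + n_bb
    if n == 0:
        raise ValueError("at least one genotype must be observed")

    # Allele frequencies estimated from the genotype counts.
    p = (2 * n_aa + n_ab) / (2 * n)
    q = 1.0 - p

    # Genotype counts expected under Hardy-Weinberg proportions.
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_aa, n_ab, n_bb)

    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)

    # Survival function of the chi-square distribution with 1 d.f.
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value


if __name__ == "__main__":
    # Hypothetical counts, chosen only to demonstrate the call.
    chi2, p_value = hwe_chi_square(36, 48, 16)
    print(f"chi2 = {chi2:.3f}, p = {p_value:.3f}")
```

For variants with very few rare homozygotes, an exact test is often preferred over the chi-square approximation; the sketch above shows only the simplest form of the check.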


**Table 3.** Distribution of genotypes for variants assessed in the validation phase.

<sup>1</sup> Reference number in dbSNP (https://www.ncbi.nlm.nih.gov/snp/); <sup>2</sup> Genotypes do not sum up to 802 due to missing data; <sup>3</sup> MAF = minor allele frequency.

We further evaluated associations of variants with the response and DFS of patients in the validation set. For six variants with less frequent homozygous genotypes (*n* < 5), the recessive genetic model was used to preserve the statistical power of comparisons. The variants associated with response in both the testing and validation phases are listed in Table 4; these variants are considered replicated and of putative clinical importance. However, none of these associations passed the false discovery rate (FDR) correction for multiple testing (*q* = 0.0025) and, thus, none can be deemed statistically significant after such correction.
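To make the multiple-testing step concrete, the sketch below shows one common FDR procedure, the Benjamini–Hochberg step-up method. The source does not specify which FDR procedure produced the reported threshold, so the choice of Benjamini–Hochberg, the function name `benjamini_hochberg`, and the example *p*-values are assumptions for illustration only.

```python
def benjamini_hochberg(p_values, alpha=0.05):
    """Benjamini-Hochberg step-up procedure (assumed here, not confirmed by the source).

    Returns a list of booleans, in the input order, marking which
    hypotheses are rejected at FDR level alpha.
    """
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])

    # Find the largest rank k whose p-value satisfies p_(k) <= (k / m) * alpha.
    k_max = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * alpha:
            k_max = rank

    # Reject every hypothesis ranked at or below k_max.
    rejected = [False] * m
    for rank, idx in enumerate(order, start=1):
        if rank <= k_max:
            rejected[idx] = True
    return rejected


if __name__ == "__main__":
    # Hypothetical p-values, not results from this study.
    example = [0.001, 0.008, 0.039, 0.041, 0.042, 0.06, 0.074, 0.205, 0.212, 0.216]
    print(benjamini_hochberg(example))
```

Under this procedure, a nominally significant association (*p* < 0.05) can still fail correction when many variants are tested, which is the situation described in the text.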


**Table 4.** Validated variants in ABC transporters significantly associating with the response of patients to the neoadjuvant cytotoxic therapy.

<sup>1</sup> Numbers of patients with specified genotypes divided by response to the neoadjuvant treatment.

Subsequently, associations of variants with DFS of all patients and of patients stratified according to the received therapy were evaluated. Significant results for all patients with complete follow-up data (*n* = 744) are displayed in Figure 2a. Results for patients treated with cytotoxic therapy (*n* = 371) are presented in Figure 2b. No significant association was observed in the subgroup of patients treated only with hormonal therapy (*n* = 312).

Of these variants, rs74859514 did not pass the gene dosage condition (Figure 2b). None of the associations passed the FDR correction for multiple testing (*q* = 0.0023) and, thus, none can be considered statistically validated.

**Figure 2.** Kaplan–Meier plots with validated associations of variants with disease-free survival (DFS) in all patients, unstratified (**a**), or in the subgroup of patients treated with cytotoxic therapy (**b**). The blue line represents the common homozygous genotype, the green line the heterozygote, and the magenta line the rare homozygote. The violet color represents rare variant carriers. Significance was evaluated by the log-rank test; numbers represent individuals in compared groups.
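The log-rank test named in the caption of Figure 2 can be sketched for two genotype groups as follows. This is a minimal standard-library illustration under stated assumptions, not the authors' analysis pipeline: the function name `logrank_two_groups` and all follow-up times and event indicators are hypothetical.

```python
import math


def logrank_two_groups(times_a, events_a, times_b, events_b):
    """Two-group log-rank test.

    times_*  : follow-up times (e.g., months) for each patient
    events_* : 1 if the event (e.g., relapse) occurred, 0 if censored
    Returns (chi2, p_value) with 1 degree of freedom.
    """
    data = [(t, e, 0) for t, e in zip(times_a, events_a)]
    data += [(t, e, 1) for t, e in zip(times_b, events_b)]

    # Distinct times at which at least one event occurred.
    event_times = sorted({t for t, e, _ in data if e})

    obs_minus_exp = 0.0  # observed minus expected events in group A
    variance = 0.0

    for t in event_times:
        # Patients still at risk just before time t.
        n_a = sum(1 for ti, _, g in data if ti >= t and g == 0)
        n_b = sum(1 for ti, _, g in data if ti >= t and g == 1)
        # Events occurring exactly at time t.
        d_a = sum(1 for ti, e, g in data if ti == t and e and g == 0)
        d_b = sum(1 for ti, e, g in data if ti == t and e and g == 1)
        n = n_a + n_b
        d = d_a + d_b

        obs_minus_exp += d_a - d * n_a / n
        if n > 1:
            # Hypergeometric variance of the event count in group A.
            variance += d * (n_a / n) * (n_b / n) * (n - d) / (n - 1)

    if variance == 0:
        return 0.0, 1.0

    chi2 = obs_minus_exp ** 2 / variance
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value


if __name__ == "__main__":
    # Hypothetical data: group A relapses early, group B late.
    chi2, p = logrank_two_groups([1, 2, 3, 4, 5], [1, 1, 1, 1, 1],
                                 [10, 11, 12, 13, 14], [1, 1, 1, 1, 1])
    print(f"chi2 = {chi2:.2f}, p = {p:.4f}")
```

Comparing three genotype groups, as in the plots with separate lines for both homozygotes and the heterozygote, would require the multi-group generalization of this statistic, which the sketch does not cover.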

We further examined the effect of molecular subtype on prognosis by stratifying patients according to their intrinsic molecular subtype. Associations with DFS were then calculated separately for each subtype (Table 5 and Figure S1).


**Table 5.** Validated associations of variants in ABC transporters with DFS of patients treated with cytotoxic therapy according to their molecular subtypes.

<sup>1</sup> Numbers of patients for each genotype stratified by molecular subtypes are displayed; HER2 = HER2 enriched, TNBC = triple negative breast cancer. <sup>2</sup> *p*-value derived from the log-rank test (significant results in bold).

This analysis showed that prognostic associations differ according to the molecular subtype. In the whole group of patients, rs9282562 in *ABCA7* and rs17548783 in *ABCA13* were prognostic only for the HER2 enriched and triple negative (TNBC) subtypes, respectively. In the subgroup treated with cytotoxic therapy, single nucleotide polymorphisms (SNPs) were again prognostic for patients with HER2 enriched and TNBC subtypes (rs74859514 in *ABCA13*). Carriers of the rare variant in *ABCG8* rs34198326 had better DFS than those with the common homozygous genotype; this association was significant only in patients with the luminal B subtype.

To clarify associations of variants with gene expression, we used transcriptomic data from previous gene expression profiling [14] and compared them with variants that were significantly associated with DFS or response to NACT in the validation set (*n* = 168 patients for whom gene expression data were available). We also analyzed expression quantitative trait loci (eQTL) associations of these variants using gene expression in healthy tissues available on the GTEx portal (https://www.gtexportal.org). The SNP rs17548783 was significantly associated with *ABCA13* transcript levels in tumors assessed in the previous study [14] (Table 6), but no eQTL were found in the GTEx database. Likewise, no eQTL were found for rs2275032 in *ABCA4*, rs71428357 in *ABCA12*, rs3210441 in *ABCB5*, and rs34198326 in *ABCG8*. Significant results from eQTL analysis are summarized in Table 7 and Figure S2.

**Table 6.** Association of single nucleotide polymorphism (SNP) rs17548783 in the ABCA13 transporter with intratumoral transcript levels.


<sup>1</sup> Transcript levels expressed as Ct (cycle threshold) value of ABCA13 subtracted from mean Ct of reference genes [14]. <sup>2</sup> S.D. = Standard deviation.


**Table 7.** eQTL analysis of SNPs significantly associating with patients' DFS or response to neoadjuvant cytotoxic therapy (NACT).

<sup>1</sup> *p*-value of the most significant association is shown. <sup>2</sup> Significant results in more than three different tissues. <sup>3</sup> The highest expression is seen for the TT genotype in whole blood and esophageal mucosa; the opposite, i.e., the highest expression for the CC genotype, is seen in the rest of the tissues. <sup>4</sup> Alternative eQTL.
