#### *2.4. Statistical Analysis*

The statistical analyses were performed in IBM SPSS Statistics 24. The significance level for all statistical tests was set to 0.05. We checked for outliers using the more liberal definition of extreme outliers ("outer fences": Q3 + 3 × IQR) [31]; by this criterion, all participants could be included in the analysis.
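The outer-fence criterion can be illustrated with a minimal sketch. The text states only the upper fence (Q3 + 3 × IQR); the symmetric lower fence (Q1 − 3 × IQR) is included here as an assumption. The function names and the scores are hypothetical, and quartile conventions differ between software packages, so SPSS may return slightly different fences for the same data.

```python
import statistics

def outer_fences(values, k=3.0):
    """Return (lower, upper) fences: Q1 - k*IQR and Q3 + k*IQR."""
    # "inclusive" is one of several quartile conventions (an assumption here).
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr

def extreme_outliers(values, k=3.0):
    """Return all values lying outside the outer fences."""
    lower, upper = outer_fences(values, k)
    return [v for v in values if v < lower or v > upper]

# Made-up threshold scores, for illustration only (not study data).
scores = [6.5, 7.0, 7.25, 8.0, 8.5, 9.0, 9.25, 10.0, 11.5, 30.0]
fences = outer_fences(scores)
flagged = extreme_outliers(scores)
print(fences, flagged)
```

With `k=1.5` the same function yields the stricter inner fences, which flag mild outliers as well.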

Participants were classified as below or above the respective cut-off value on each of the two established thresholds (Sniffin' Sticks and olfactometer-based), yielding a 2 × 2 table. Pearson's chi-square and exact tests were used to analyze the association between the two classifications. In addition, group differences in the Sniffin' Sticks scores and the olfactometer-based threshold were analyzed with Mann–Whitney U tests.
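The classification and the tests named above can be sketched as follows. This is an illustration under stated assumptions, not the SPSS procedure used in the study: the data are made up, values exactly at a cut-off are assigned to the "above" cell, the exact test is taken to be Fisher's exact test (the text does not name it), and the Mann–Whitney *p*-value uses the normal approximation with tie correction and without continuity correction. The cut-offs (9 and 80 ppb) are those given later in this section.

```python
import math

def classify_2x2(sticks, olf, sticks_cut=9, olf_cut=80):
    """Rows: Sniffin' Sticks below / at-or-above cut-off; columns: olfactometer."""
    table = [[0, 0], [0, 0]]
    for s, o in zip(sticks, olf):
        table[int(s >= sticks_cut)][int(o >= olf_cut)] += 1
    return table

def pearson_chi2(table):
    """Pearson chi-square for a 2x2 table (df = 1); assumes no empty margin."""
    (a, b), (c, d) = table
    n = a + b + c + d
    rows, cols = [a + b, c + d], [a + c, b + d]
    chi2 = sum((table[i][j] - rows[i] * cols[j] / n) ** 2 / (rows[i] * cols[j] / n)
               for i in range(2) for j in range(2))
    # Survival function of the chi-square distribution with 1 degree of freedom.
    return chi2, math.erfc(math.sqrt(chi2 / 2.0))

def fisher_exact(table):
    """Two-sided Fisher exact p: sum of all tables as likely or less likely."""
    (a, b), (c, d) = table
    r1, r2, c1 = a + b, c + d, a + c
    def prob(x):
        return math.comb(r1, x) * math.comb(r2, c1 - x) / math.comb(r1 + r2, c1)
    p_obs = prob(a)
    support = range(max(0, c1 - r2), min(r1, c1) + 1)
    return min(1.0, sum(prob(x) for x in support if prob(x) <= p_obs * (1 + 1e-9)))

def mann_whitney_u(x, y):
    """Two-sided Mann-Whitney U (normal approximation, tie-corrected)."""
    pooled = sorted((v, g) for g, vals in enumerate((x, y)) for v in vals)
    n1, n2, n = len(x), len(y), len(x) + len(y)
    rank_sum_x, tie_term, i = 0.0, 0.0, 0
    while i < n:
        j = i
        while j < n and pooled[j][0] == pooled[i][0]:
            j += 1
        midrank = (i + 1 + j) / 2.0  # mean of ranks i+1 .. j
        rank_sum_x += midrank * sum(1 for k in range(i, j) if pooled[k][1] == 0)
        tie_term += (j - i) ** 3 - (j - i)
        i = j
    u1 = rank_sum_x - n1 * (n1 + 1) / 2.0
    u = min(u1, n1 * n2 - u1)
    sigma = math.sqrt(n1 * n2 / 12.0 * ((n + 1) - tie_term / (n * (n - 1))))
    z = (u - n1 * n2 / 2.0) / sigma
    return u, math.erfc(abs(z) / math.sqrt(2.0))

# Made-up data, for illustration only (not study data).
sticks = [6, 7, 8, 8.5, 9, 9.5, 10, 11, 12, 7.5]   # Sniffin' Sticks scores
olf = [120, 95, 90, 60, 40, 70, 30, 85, 20, 100]   # olfactometer thresholds (ppb)

table = classify_2x2(sticks, olf)
chi2, p_chi2 = pearson_chi2(table)
p_fisher = fisher_exact(table)
below = [o for s, o in zip(sticks, olf) if s < 9]
above = [o for s, o in zip(sticks, olf) if s >= 9]
u, p_u = mann_whitney_u(below, above)
print(table, chi2, p_chi2, p_fisher, u, p_u)
```

With expected cell counts this small, the chi-square approximation is unreliable, which is the usual reason for reporting an exact test alongside it.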

A Pearson correlation was computed between the Sniffin' Sticks-based and olfactometer-based threshold for n-butanol to compare the methods.

Next, the two established thresholds were correlated with the exposure lab-based threshold using further Pearson correlations. The *p*-values of all correlations were adjusted (Bonferroni method) for the total number of correlations computed. For these analyses, Bonferroni-adjusted *p*-values are reported in addition to the unadjusted ones.
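As a minimal sketch of the two steps described above, the following computes a Pearson correlation coefficient and applies the Bonferroni adjustment, which multiplies each *p*-value by the number of comparisons and caps the result at 1. The function names and all numbers are hypothetical; the number of comparisons is taken to be the length of the list of *p*-values, which is an assumption about how the family of tests is defined.

```python
import math

def pearson_r(x, y):
    """Pearson product-moment correlation coefficient of two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def bonferroni(p_values):
    """Bonferroni adjustment: p * m, capped at 1, with m = number of tests."""
    m = len(p_values)
    return [min(1.0, p * m) for p in p_values]

# Made-up values, for illustration only (not study data).
r = pearson_r([1.0, 2.0, 3.0, 4.0], [2.1, 3.9, 6.2, 7.8])
raw_p = [0.004, 0.020, 0.300]      # hypothetical unadjusted p-values
adjusted_p = bonferroni(raw_p)
print(r, adjusted_p)
```

A correlation remains significant after adjustment only if its adjusted *p*-value is still at or below 0.05, which is equivalent to testing each unadjusted *p*-value against 0.05 divided by the number of comparisons.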

The experimental data from the ammonia exposure were analyzed using full-factorial analyses of variance (ANOVAs), with time as the repeated-measures factor and group as the between-subjects factor. Separate models were calculated for each of the two grouping factors: the Sniffin' Sticks threshold (cut-off value: 9, see Sniffin' Sticks norms) and the olfactometer-based threshold (cut-off: 80 ppb, see DIN EN 13725 [7]). If the assumption of sphericity was violated, Greenhouse–Geisser-corrected degrees of freedom were used. Significant interaction effects were further analyzed using Bonferroni-adjusted post hoc tests.
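For orientation, the Greenhouse–Geisser correction can be written out for a design of this kind. The notation is an assumption introduced here and does not appear in the original: $k$ is the number of measurement times, $g$ the number of groups (here $g = 2$), $N$ the total number of participants, and $\hat\varepsilon$ the estimated sphericity parameter. Both the numerator and the denominator degrees of freedom of the within-subjects tests are multiplied by $\hat\varepsilon$:

$$
\begin{aligned}
F_{\text{time}} &\sim F\big(\hat\varepsilon\,(k-1),\ \hat\varepsilon\,(k-1)(N-g)\big),\\
F_{\text{time}\times\text{group}} &\sim F\big(\hat\varepsilon\,(k-1)(g-1),\ \hat\varepsilon\,(k-1)(N-g)\big),\\
\tfrac{1}{k-1} &\le \hat\varepsilon \le 1 .
\end{aligned}
$$

With $\hat\varepsilon = 1$ (sphericity holds) the uncorrected test is recovered; smaller values shrink the degrees of freedom and make the test more conservative. The between-subjects test of group is unaffected by the correction.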

### **3. Results**

#### *3.1. Results of the Psychometric Threshold Assessments*

Table 2 presents the descriptive statistics of the three olfactory measures of n-butanol sensitivity for the total sample and after applying the respective cut-offs. Unsurprisingly, when a cut-off was applied based on one of the thresholds, Mann–Whitney U tests indicated a significant difference between the resulting groups in that threshold. Moreover, participants who were more and less sensitive in the Sniffin' Sticks test also differed significantly in their olfactometer-based threshold. The groups did not differ in psychological variables relevant for odor effects [29], such as negative affectivity and self-reported chemical sensitivity (see supplement Table S1).


**Table 2.** Description of total sample and classified subgroups.

Note. IQR = interquartile range, T = threshold, \* *p* ≤ 0.05 for the subgroup comparison using Mann–Whitney U tests.
