*2.8. Statistical Testing*

We tested how DT and ER changed across intervals 1–4 while additionally controlling for stimulus orientation and ambiguity. We performed a repeated-measures ANOVA with interval (1–4), ambiguity (HA and LA), and orientation (Left and Right) as within-subject factors. In general, ANOVA assumes homogeneity of variance: the population variances of the dependent variable must be equal across groups. This assumption may be relaxed when the sample size is equal for each group. In this study, we used a within-subject design with repeated measures. Therefore, the sample size was equal for every condition, and we did not check for variance homogeneity. If the tested samples violated the normality condition, we applied the Greenhouse–Geisser correction to the ANOVA results. For significant main effects, we performed post hoc analyses using parametric or nonparametric tests, depending on sample normality, which was assessed using the Shapiro–Wilk test. All test types are specified in the Results section and in the figure captions. Statistical analysis was performed in IBM SPSS Statistics.
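For readers unfamiliar with the Greenhouse–Geisser correction, the following is a minimal sketch of how the correction factor ε can be estimated for a single within-subject factor (for example, the four intervals). It is an illustration only and not the SPSS procedure used in the study; the function names and the `samples[subject][condition]` layout are assumptions introduced here. The corrected test uses the same *F*-value with both degrees of freedom multiplied by ε.

```python
def greenhouse_geisser_epsilon(samples):
    """Estimate epsilon for samples[subject][condition] (one within-subject factor)."""
    n, k = len(samples), len(samples[0])
    means = [sum(s[c] for s in samples) / n for c in range(k)]
    # Sample covariance matrix between the k repeated measures.
    cov = [
        [sum((s[a] - means[a]) * (s[b] - means[b]) for s in samples) / (n - 1)
         for b in range(k)]
        for a in range(k)
    ]
    # Double-centre the covariance matrix (remove row, column and grand means).
    row = [sum(r) / k for r in cov]
    grand = sum(row) / k
    centred = [
        [cov[a][b] - row[a] - row[b] + grand for b in range(k)]
        for a in range(k)
    ]
    trace = sum(centred[a][a] for a in range(k))
    sum_sq = sum(v * v for r in centred for v in r)
    return trace ** 2 / ((k - 1) * sum_sq)


def corrected_df(epsilon, n_subjects, n_levels):
    """Degrees of freedom of the F-test after scaling by epsilon."""
    df_effect = epsilon * (n_levels - 1)
    df_error = epsilon * (n_levels - 1) * (n_subjects - 1)
    return df_effect, df_error
```

The estimate lies between 1/(k − 1) and 1 for k levels; a value of 1 leaves the degrees of freedom unchanged, and a factor with only two levels always yields 1.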

Statistical analyses of brain activity were based on the subject-level wavelet power, averaged over trials and over TOI1. Contrasts between the four intervals were tested for statistical significance using a permutation test combined with cluster-based correction for multiple comparisons. Specifically, *F*-tests compared the four sets of wavelet power for every (channel, frequency) pair. Pairs that passed the threshold corresponding to a *p*-value of 0.001 (one-tailed) were labeled together with their adjacent pairs and collected into separate negative and positive clusters. The minimum required number of neighbors was set to 2. The *F*-values within each cluster were summed, and the maximum cluster sum was entered into the permutation framework as the test statistic for the correction. A cluster was considered significant if its *p*-value was below 0.01. The number of permutations was 2000.
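The logic of the cluster-based permutation test can be sketched in a few lines. The example below is a simplified, one-dimensional illustration and not the FieldTrip implementation used in the study: items are adjacent only to their immediate neighbors along a single axis, the minimum-neighbor criterion is omitted, and the cluster-forming threshold is passed in directly as a critical *F*-value (`f_crit`, a hypothetical parameter that the caller would derive from the chosen *p*-threshold). All function names and the `data[subject][condition][item]` layout are assumptions. Adding 1 to the numerator and denominator of the *p*-value is a common convention assumed here.

```python
import random


def rm_anova_f(samples):
    """One-way repeated-measures F statistic for samples[subject][condition]."""
    n, k = len(samples), len(samples[0])
    grand = sum(sum(s) for s in samples) / (n * k)
    cond_means = [sum(s[c] for s in samples) / n for c in range(k)]
    subj_means = [sum(s) / k for s in samples]
    ss_cond = n * sum((m - grand) ** 2 for m in cond_means)
    ss_err = sum(
        (samples[i][c] - cond_means[c] - subj_means[i] + grand) ** 2
        for i in range(n)
        for c in range(k)
    )
    if ss_err == 0:
        return float("inf") if ss_cond > 0 else 0.0
    return (ss_cond / (k - 1)) / (ss_err / ((k - 1) * (n - 1)))


def f_map(data):
    """F-value for every item; data[subject][condition][item]."""
    n_items = len(data[0][0])
    return [
        rm_anova_f([[cond[j] for cond in subj] for subj in data])
        for j in range(n_items)
    ]


def cluster_sums(f_values, f_crit):
    """Sum the F-values over each run of adjacent supra-threshold items."""
    sums, run = [], []
    for f in f_values:
        if f > f_crit:
            run.append(f)
        elif run:
            sums.append(sum(run))
            run = []
    if run:
        sums.append(sum(run))
    return sums


def cluster_permutation_test(data, f_crit, n_perm=2000, seed=0):
    """Return (cluster sum, p-value) for every observed cluster."""
    rng = random.Random(seed)
    observed = cluster_sums(f_map(data), f_crit)
    null_max = []
    for _ in range(n_perm):
        # Shuffle the condition labels independently within each subject.
        shuffled = [rng.sample(subj, len(subj)) for subj in data]
        null_max.append(max(cluster_sums(f_map(shuffled), f_crit), default=0.0))
    return [
        (s, (1 + sum(m >= s for m in null_max)) / (1 + n_perm))
        for s in observed
    ]
```

Because each observed cluster is compared with the distribution of the maximum cluster sum over permutations, the family-wise error rate is controlled across all items without a separate correction for each one. For (channel, frequency) data, the run detection in `cluster_sums` would be replaced by a search for connected components over the channel neighborhood and adjacent frequencies.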

A similar procedure was followed for the source-level results. We performed a cluster-corrected within-subject permutation test on the trial-averaged and TOI1-averaged source power distributions to determine significant differences between the four intervals [30,31]. The threshold for the *F*-test comparisons was *p* = 0.005. The *p*-threshold for the cluster was 0.025. The number of permutations was 2000. Finally, for each subject, we calculated the average source power within the region of the identified cluster for each of the four intervals.
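The final averaging step amounts to a masked mean over the sources that belong to the cluster. A minimal sketch, assuming a hypothetical `power[subject][interval][source]` layout and a Boolean `cluster_mask` with one entry per source (both names are introduced here for illustration):

```python
def cluster_mean_power(power, cluster_mask):
    """Average source power inside the cluster for each subject and interval."""
    inside = [i for i, flag in enumerate(cluster_mask) if flag]
    if not inside:
        raise ValueError("cluster_mask selects no sources")
    return [
        [sum(interval[i] for i in inside) / len(inside) for interval in subject]
        for subject in power
    ]
```

The result holds one value per subject and interval, which is the form needed for subsequent per-subject comparisons.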

All the described operations were performed in MATLAB using the FieldTrip toolbox [21,32].
