#### *2.1. Participants*

Twenty-one women with FM were enrolled in this cross-sectional study. The sample size and statistical power were calculated using the PASS software (version 11.0; PASS; Kaysville, Utah). With two measurements per participant, the sample provides 98% power to detect an intraclass correlation of 0.95 under the alternative hypothesis when the intraclass correlation under the null hypothesis is 0.75, using an F-test with a significance level of 0.05.

Participants had to meet the following inclusion criteria: (a) being a woman between 35 and 65 years old, (b) having been diagnosed with FM by a rheumatologist according to the criteria established by the American College of Rheumatology [35], and (c) being able to understand the physical fitness protocols. Participants were excluded if they (a) were pregnant, (b) were enrolled in another clinical trial or research study that could affect the results, or (c) had any condition for which exercise is contraindicated.

All the participants gave written informed consent. The Research Ethics Committee of the University of Extremadura approved the protocols of the current study (approval reference: 51/2021).

#### *2.2. Procedure*

The Spanish version of the Revised Fibromyalgia Impact Questionnaire (FIQR) was administered [36]. This instrument is composed of 21 items divided into three domains (function, overall impact, and symptoms). The maximum score is 100, which corresponds to the worst overall impact. In addition, age and anthropometric measurements were acquired using a Tanita Body Composition Analyzer BC-418 MA (Tanita Corp., Tokyo, Japan).

Two physical fitness tests, (1) the 3MBWT and (2) the TUG, were performed under single and dual-task conditions. In the dual-task condition, participants counted backward by twos from a random number lower than 100 while performing the physical fitness tests.

The 3 m Backward Walk Test (3MBWT) was performed according to the procedure proposed by Carter et al. [23]. A distance of three meters was measured with black tape establishing the start and finish. Participants were asked to place their heels on the start mark. Then, they had to walk backward as fast and safely as possible at the "go" signal. Running was not allowed, and they could look behind themselves if they wished.

In the Timed Up and Go (TUG) test, participants had to get up from a chair without armrests, walk a distance of 3 m without running, turn around a cone, walk back to the chair, and sit down [37].

Stopwatch and automatic timer records were obtained simultaneously. For the TUG, the Chronopic (Chronojump, BoscoSystem®, Barcelona, Spain) time was obtained using a DIN A4-sized contact platform placed on the back of the chair, which opened and closed the circuit to record the test time [9,11]. For the 3MBWT, a DIN A2-sized contact platform on the start line was combined with a photocell on the end line. Physical tests were repeated after seven days to avoid a learning effect [11,18,38,39]. Participants performed three trials for each condition (single and dual-task), and the order of the TUG test and the 3MBWT was randomized.

#### *2.3. Statistical Analysis*

Statistical analysis was conducted using the Statistical Package for the Social Sciences (SPSS, version 24.0; IBM Corp., Armonk, NY, USA) software. Based on the results of the Shapiro–Wilk test, parametric tests were employed. Statistical significance was established at the *p* ≤ 0.05 level. To estimate the intraclass correlation coefficient (ICC) and its 95% confidence interval for the 3MBWT in the single and dual-task conditions at test and retest, the ICC(3,1) model (two-way mixed effects, consistency, single rater/measurement) was used following the recommendations by Weir [40] and Koo [41]. The ICC was interpreted according to the guideline proposed by Koo [41]: a value lower than 0.50 indicates "poor" reliability, a value between 0.50 and 0.75 indicates "moderate" reliability, a value between 0.75 and 0.90 indicates "good" reliability, and a value higher than 0.90 indicates "excellent" reliability.
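As an illustration of the ICC(3,1) model named above, the following minimal Python sketch computes the coefficient from the mean squares of a two-way ANOVA, using the standard formula (MS_rows − MS_error) / (MS_rows + (k − 1) × MS_error). It is not the SPSS procedure used in the study and it omits the confidence interval. The function names `icc_3_1` and `classify_icc` are hypothetical, and the handling of the exact boundary values 0.75 and 0.90 is an assumption, since the classification does not state which category they belong to.

```python
def icc_3_1(scores):
    """ICC(3,1): two-way mixed effects, consistency, single measurement.

    scores: list of rows, one row per participant, one column per session
            (for example [test, retest]).
    """
    n = len(scores)          # number of participants
    k = len(scores[0])       # number of sessions
    grand = sum(sum(row) for row in scores) / (n * k)
    row_means = [sum(row) / k for row in scores]
    col_means = [sum(row[j] for row in scores) / n for j in range(k)]

    # Sums of squares of the two-way ANOVA without replication
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((v - grand) ** 2 for row in scores for v in row)
    ss_error = ss_total - ss_rows - ss_cols

    ms_rows = ss_rows / (n - 1)
    ms_error = ss_error / ((n - 1) * (k - 1))
    return (ms_rows - ms_error) / (ms_rows + (k - 1) * ms_error)


def classify_icc(icc):
    """Reliability label; boundary handling at 0.75 and 0.90 is an assumption."""
    if icc < 0.50:
        return "poor"
    if icc < 0.75:
        return "moderate"
    if icc <= 0.90:
        return "good"
    return "excellent"
```

Because the consistency form ignores systematic differences between sessions, a retest that is shifted by a constant relative to the test still yields an ICC of 1.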

The standard error of measurement (SEM) was calculated using the following formula:

$$\text{SEM} = \text{SD} \times \sqrt{1 - \text{ICC}} \tag{1}$$

The minimal detectable change (MDC) was obtained according to the formula:

$$\text{MDC} = 1.96 \times \text{SEM} \times \sqrt{2} \tag{2}$$

The SEM and MDC were also expressed as percentages according to the following formula: SEM% or MDC% = (SEM or MDC/mean) × 100, where the mean is the average of the test and retest values.
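Equations (1) and (2) and the percentage conversion can be chained as in the short Python sketch below. The function names are hypothetical, and the numbers in the usage note are placeholders chosen for easy arithmetic, not study data. The sketch takes the SD as an input and makes no assumption about which SD is used.

```python
import math


def sem(sd, icc):
    """Equation (1): SEM = SD * sqrt(1 - ICC)."""
    return sd * math.sqrt(1.0 - icc)


def mdc(sem_value):
    """Equation (2): MDC = 1.96 * SEM * sqrt(2)."""
    return 1.96 * sem_value * math.sqrt(2.0)


def as_percentage(value, mean):
    """SEM% or MDC%: the value relative to the test-retest mean, times 100."""
    return value / mean * 100.0
```

For example, with a placeholder SD of 0.5 s and an ICC of 0.91, `sem(0.5, 0.91)` gives approximately 0.15 s and `mdc(0.15)` gives approximately 0.42 s. With a placeholder mean of 3.0 s, these correspond to roughly 5% and 14%.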

To identify the level of agreement between the test and retest, and between the measuring devices, in the 3MBWT under single and dual-task conditions, Bland–Altman plots were constructed [42].
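The quantities behind a Bland–Altman plot can be sketched as follows. The sketch assumes the conventional construction: the bias is the mean of the paired differences and the limits of agreement are the bias ± 1.96 × the SD of the differences. The text does not state the exact settings used in the study, and the function name `bland_altman` is hypothetical.

```python
import statistics


def bland_altman(first, second):
    """Return the points and reference lines of a Bland-Altman plot.

    first, second: paired measurements (test vs. retest, or two devices).
    """
    diffs = [a - b for a, b in zip(first, second)]        # y-axis values
    means = [(a + b) / 2.0 for a, b in zip(first, second)]  # x-axis values
    bias = statistics.mean(diffs)
    sd_diff = statistics.stdev(diffs)                     # sample SD (n - 1)
    return {
        "means": means,
        "diffs": diffs,
        "bias": bias,
        "lower": bias - 1.96 * sd_diff,  # lower limit of agreement
        "upper": bias + 1.96 * sd_diff,  # upper limit of agreement
    }
```

Plotting `diffs` against `means` and drawing horizontal lines at `bias`, `lower`, and `upper` reproduces the usual layout of the plot.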

Pearson's product–moment correlation coefficient (r) was used to explore concurrent validity by comparing the 3MBWT and the TUG. Finally, the relationship between the 3MBWT and the impact of the disease was analyzed using the total FIQR score. Cohen's recommendations [43] were followed to interpret the correlation coefficient: a coefficient ≥ 0.5 was considered strong, a coefficient between 0.35 and 0.5 moderate, and a coefficient ≤ 0.35 poor.
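The correlation step and its interpretation can be illustrated with the Python sketch below. The function names are hypothetical. Applying the thresholds to the absolute value of r is an assumption, since the text does not say how negative coefficients are classified.

```python
import math


def pearson_r(x, y):
    """Pearson's product-moment correlation between two paired samples."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def interpret_r(r):
    """Label the strength of r; using |r| is an assumption of this sketch."""
    size = abs(r)
    if size >= 0.5:
        return "strong"
    if size > 0.35:
        return "moderate"
    return "poor"
```

Under these thresholds, a coefficient of exactly 0.5 is labeled strong and a coefficient of exactly 0.35 is labeled poor.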
