**2. Materials and Methods**

The study was conducted in the largest county of the federal state of Tyrol, Austria. Of the total 71 elementary schools in the county, 15 schools were selected via a random number generator and received information about the study. One school declined to participate due to organizational problems. The final sample, therefore, consisted of 14 schools that participated in data collection throughout the 4-year observation period. In order to track participants throughout their entire elementary school time, only students who were in first grade at baseline were eligible for participation. In addition, participants needed to be able to complete a physical fitness test battery, and children with mental, neurological, or physical diagnoses were excluded from the study. This resulted in a sample size of 392 children (55.4% male; age: 6.9 ± 0.5 years). The study protocol was approved by the Institutional Review Board of the University of Innsbruck (certificate of good standing, 16/2014), the school authorities of the federal state of Tyrol, and the school board of each participating school. Written parental consent was obtained prior to baseline data collection and children provided oral assent at the time of data collection. All study procedures were in accordance with the ethical standards of the Declaration of Helsinki (as amended in 2013).

Participants completed anthropometric measurements and physical fitness tests during each fall and spring semester over their four years in elementary school, which resulted in up to eight measurements throughout the entire observation period. Baseline data collection occurred during the school entry evaluation in October 2014, and the final follow-up measurements were completed in June 2018, when children were in their final grade (fourth grade) of elementary school. To be included in the analysis, participants needed to provide valid and complete data for at least five measurements, including the baseline and the final follow-up assessment.
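The inclusion rule described above can be illustrated with a minimal sketch. It assumes that each child's record is stored as eight chronologically ordered slots, with `None` marking a missing or invalid measurement; this encoding and the function name `is_eligible` are hypothetical and not part of the study.

```python
from typing import Optional, Sequence

N_WAVES = 8          # up to eight measurements (fall and spring, grades 1 to 4)
MIN_VALID_WAVES = 5  # minimum number of valid measurements stated in the text


def is_eligible(waves: Sequence[Optional[float]]) -> bool:
    """Return True if a child's record meets the stated inclusion rule.

    `waves` holds one entry per measurement occasion in chronological order;
    None marks a missing or invalid measurement (an assumed encoding).
    """
    if len(waves) != N_WAVES:
        raise ValueError(f"expected {N_WAVES} measurement slots, got {len(waves)}")
    has_baseline = waves[0] is not None   # baseline assessment is required
    has_final = waves[-1] is not None     # final follow-up is required
    n_valid = sum(value is not None for value in waves)
    return has_baseline and has_final and n_valid >= MIN_VALID_WAVES
```

A record with valid data at baseline, at the final follow-up, and at three occasions in between would pass this check, whereas a record missing either end point would not, regardless of how many other measurements it contains.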

Data collection occurred in the participating school's gymnasium during regular class time in a single session. Anthropometric measurements and fitness tests were administered by exercise science graduate students, who were well trained in conducting these measurements in a pediatric population during the course of a research seminar prior to data collection. A total of 14 students were involved in the measurements throughout the 4-year study period, with 6 to 7 students present during each measurement session in the schools. An overview of the procedures for each testing session is provided in Figure 1.


*Int. J. Environ. Res. Public Health* **2022**, *19*, x FOR PEER REVIEW 3 of 12

**Figure 1.** Data collection procedure at each measurement time.

Body height (cm) was measured with a portable stadiometer (SECA® 217, Hamburg, Germany) and weight (kg) was measured with a calibrated digital scale (SECA® 803, Hamburg, Germany) to the nearest 0.1 cm and 0.1 kg, respectively, with children wearing gym clothes and barefoot. Body mass index (BMI) was calculated (kg/m<sup>2</sup>) and converted to BMI percentiles (BMIPCT) using German reference values [34]. Children with a BMIPCT above the 90th percentile were classified as overweight/obese. For the statistical analyses, quartiles of baseline BMI percentiles were established (Quartile 1: BMIPCT < 29.0; Quartile 2: 29 ≤ BMIPCT < 50.2; Quartile 3: 50.2 ≤ BMIPCT < 76; Quartile 4: BMIPCT > 76.0).
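As an illustration of the anthropometric calculations, the following minimal sketch computes BMI from the measured height and weight and assigns a baseline BMIPCT quartile using the cut points given above. The conversion from BMI to BMIPCT requires the German reference tables [34], which are not reproduced here, so the sketch takes BMIPCT as an input. The function names are hypothetical, and assigning a BMIPCT of exactly 76.0 to Quartile 4 is an assumption, since the stated bounds leave that value unassigned.

```python
def bmi(weight_kg: float, height_cm: float) -> float:
    """Body mass index in kg/m^2 from weight in kg and height in cm."""
    if weight_kg <= 0 or height_cm <= 0:
        raise ValueError("weight and height must be positive")
    height_m = height_cm / 100.0
    return weight_kg / height_m ** 2


def bmipct_quartile(bmipct: float) -> int:
    """Baseline BMIPCT quartile (1 to 4) using the cut points from the text.

    Assumption: a value of exactly 76.0 is placed in Quartile 4.
    """
    if bmipct < 29.0:
        return 1
    if bmipct < 50.2:
        return 2
    if bmipct < 76.0:
        return 3
    return 4


def is_overweight_or_obese(bmipct: float) -> bool:
    """Classification used in the text: BMIPCT above the 90th percentile."""
    return bmipct > 90.0
```

For example, a child weighing 25.0 kg with a height of 125.0 cm has a BMI of 25.0 / 1.25² = 16.0 kg/m<sup>2</sup>.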

Upon the completion of anthropometric measurements, participants completed the German Motor Test (DMT6-18) [35], which has been shown to provide valid and reliable information on physical fitness in children and adolescents [35,36]. The DMT6-18 consists of eight test items that assess cardiorespiratory endurance, muscular endurance, muscular strength, power, speed and agility, and balance and flexibility. Specifically, participants performed a 6 min run, sit-ups, push-ups, a standing long jump, a 20 m sprint, 20 s sideways jumping, backwards balancing, and a stand and reach test, with practice trials and measured attempts as specified in the test manual. Fitness tests were administered in random order after a standardized 5 min warm-up, except for the 20 m sprint, which was completed at the beginning, and the 6 min run, which was completed at the end of the test session. In addition to raw performance values, the DMT6-18 provides sex- and age-standardized scores. The average of these scores is used as an indicator of overall physical fitness, with a value of 100 indicating average physical fitness for the respective age and sex; higher scores indicate above-average physical fitness and lower scores indicate below-average physical fitness [35]. As for baseline BMIPCT, quartiles for baseline physical fitness were established based on overall physical fitness scores (Quartile 1: overall physical fitness < 100; Quartile 2: 100 ≤ overall physical fitness < 105; Quartile 3: 105 ≤ overall physical fitness < 108; Quartile 4: overall physical fitness ≥ 108).
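The scoring step can be sketched as follows, assuming that the eight sex- and age-standardized item scores are already available from the DMT6-18 norm tables (which are not reproduced here). The function names are hypothetical; the quartile cut points are those stated above.

```python
from statistics import fmean
from typing import Sequence

N_TEST_ITEMS = 8  # number of DMT6-18 test items stated in the text


def overall_fitness(standardized_scores: Sequence[float]) -> float:
    """Overall physical fitness as the mean of the standardized item scores."""
    if len(standardized_scores) != N_TEST_ITEMS:
        raise ValueError(f"expected {N_TEST_ITEMS} standardized item scores")
    return fmean(standardized_scores)


def fitness_quartile(score: float) -> int:
    """Baseline fitness quartile (1 to 4) using the cut points from the text."""
    if score < 100:
        return 1
    if score < 105:
        return 2
    if score < 108:
        return 3
    return 4
```

With this scoring, a child whose item scores average exactly 100 falls into Quartile 2, because Quartile 1 is defined as strictly below 100.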

Statistical Analysis. Normal distribution of the data was confirmed prior to statistical analyses. Cross-sectional associations between BMIPCT and components of physical fitness were examined via Pearson correlation analysis. Linear mixed models (LMMs) were used to determine change in BMIPCT and overall physical fitness throughout the observation period in order to account for different time intervals between measurement periods. Subsequently, ANOVA was used to examine differences in the development of BMIPCT and physical fitness across quartiles of baseline BMIPCT and baseline physical fitness, respectively. Additionally, quantile regression analyses were performed to determine the effect of change in physical fitness and BMIPCT on physical fitness and BMIPCT at follow-up, respectively, across baseline quartiles of BMIPCT and baseline quartiles of physical fitness. In addition to change in BMIPCT or physical fitness (based on LMM), baseline BMIPCT and physical fitness were included in the regression models. Secondary analyses included sex as a covariate to examine potential sex-specific associations. All statistical tests were performed in SPSS V26.0 software (SPSS Inc., IBM Corp., Armonk, NY, USA) with the significance level set at α = 0.05.
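For readers unfamiliar with the first step of the analysis, the Pearson correlation coefficient can be computed as in the minimal sketch below. The analyses in this study were run in SPSS; this standard-library Python version only illustrates the statistic itself, and the function name `pearson_r` is hypothetical. The mixed models and quantile regressions are not reproduced here.

```python
import math
from typing import Sequence


def pearson_r(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson product-moment correlation between two equal-length samples."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    dev_x = [xi - mean_x for xi in x]  # deviations from the mean of x
    dev_y = [yi - mean_y for yi in y]  # deviations from the mean of y
    cross = sum(a * b for a, b in zip(dev_x, dev_y))
    ss_x = sum(a * a for a in dev_x)
    ss_y = sum(b * b for b in dev_y)
    if ss_x == 0 or ss_y == 0:
        raise ValueError("correlation is undefined for a constant sample")
    return cross / math.sqrt(ss_x * ss_y)
```

In the context of this study, `x` would hold BMIPCT values and `y` the scores of one fitness component measured at the same time point.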
