#### *2.1. Participants*

The participants in this study were community-dwelling women aged 50–89 years who had not been diagnosed as having postmenopausal osteoporosis. As described in a previous report [19], we randomly recruited participants from a local population between October 2014 and June 2017 to build a cohort for an epidemiology study on locomotor function termed the Obuse study. Briefly, we randomly selected men and women from among 11,326 citizens aged 50–89 years who were registered in a cooperating town office and asked them to participate in this study. Requests were made until the number of consenting participants reached approximately 50 for each age group and gender (8 groups: men and women in their 50s, 60s, 70s, and 80s). The final Obuse study cohort contained 415 participants (212 women and 203 men). The participation rate was 32.0%. Of the 212 women, 168 were included in this study after excluding those taking osteoporosis drugs (28 women), receiving hormonal therapy (7 women), premenopausal (6 women), or unable to perform all of the physical performance tests (3 women) (Figure 1).

**Figure 1.** Participant selection flowchart.

All participants were surveyed after providing written consent in accordance with the Declaration of Helsinki. This study was conducted after review by our ethics committee (approval number: 2729). The authors declare no conflict of interest.

#### *2.2. Osteoporosis Diagnosis*

L2-L4 spine (L2-4), bilateral total hip, and bilateral femoral neck BMD were measured using dual-energy X-ray absorptiometry (Prodigy; GE Healthcare, Chicago, IL, USA). The T-scores for each site were calculated based on the manufacturer-provided reference [20]. The smallest T-score was treated as the representative value among the 5 measurement sites. Based on WHO diagnostic criteria, T-score ≥ −1 was classified as healthy, −2.5 < T-score < −1 was classified as osteopenia, and T-score ≤ −2.5 was judged as osteoporosis [1]. FRAX scores were also calculated using patient information and T-scores.
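The classification rule above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the software used in the study: the function name `classify_bmd` and the example T-scores are hypothetical, and only the cut-offs (−1 and −2.5) and the use of the lowest of the five site T-scores come from the text.

```python
def classify_bmd(t_scores):
    """Classify BMD status from site T-scores using the lowest value.

    Cut-offs follow the WHO criteria quoted in the text:
    T >= -1 healthy, -2.5 < T < -1 osteopenia, T <= -2.5 osteoporosis.
    """
    lowest = min(t_scores)  # representative value = smallest T-score
    if lowest <= -2.5:
        return "osteoporosis"
    if lowest < -1.0:
        return "osteopenia"
    return "healthy"


# Hypothetical T-scores for one participant (illustrative values only):
# L2-4, left total hip, right total hip, left femoral neck, right femoral neck
example_t_scores = [-0.8, -1.2, -1.4, -2.6, -2.1]
print(classify_bmd(example_t_scores))
```

Because the lowest site decides the category, a single low femoral neck value is enough to classify a participant as having osteoporosis even when the spine T-score is in the healthy range.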

#### *2.3. Physical Performance Tests*

We measured grip strength, knee extension muscle strength, and one-leg standing time with eyes open. As diagnostic criteria for the recently established locomotive syndrome, two-step test (TST), stand-up test, and Locomo25 scores were evaluated as well [21]. Grip strength in kilograms was determined using a Jamar Hydraulic Hand Dynamometer (Performance Health, Chicago, IL, USA) to obtain the mean values for each side. Knee extension strength was measured with the Leg Extension/Curl Rehab 5530 (HUR, Kokkola, Finland), with measurements taken for both lower limbs, averaged, and divided by body weight (% weight). One-leg standing time was assessed once for each side, with an upper limit of 60 s. The average value of the left and right sides (seconds) was used.

The TST, stand-up test, and Locomo25 are evaluation items for locomotive syndrome proposed by the Japanese Orthopaedic Association. The TST is performed by taking 2 maximum-stride steps and calculating the distance (centimeters) divided by body height (centimeters) [21]. The stand-up test consists of standing up from a sitting position. Participants progressively rise from boxes of 40, 30, 20, and 10 cm in height with both legs or one leg. The tasks were performed in the following order, from easiest to most difficult: both legs 40 cm → 30 cm → 20 cm → 10 cm → one leg 40 cm → 30 cm → 20 cm → 10 cm. The most difficult task completed was used as the subject's evaluation value. A score of 1 point was allotted for the first task, with 1 additional point given for each subsequent task [21]. Locomo25 is a questionnaire survey consisting of 25 items about pain and difficulties in daily life during the previous month. Each item is graded from 0 to 4 points, for a maximum total score of 100 points. A higher score indicates less activity [21]. The questionnaires were mailed to each participant's home before the screening and collected at the screening venue.
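The three locomotive syndrome scores described above can be expressed as the following Python sketch. It is an illustration only: the function names, the task encoding in `STAND_UP_TASKS`, and the example inputs are assumptions introduced here, while the scoring rules (distance divided by height, 1 point for the easiest task plus 1 per subsequent task, 25 items graded 0 to 4) are taken from the text.

```python
# Stand-up tasks ordered from easiest to most difficult, as listed in the text.
# Each entry is (legs used, box height in cm); this encoding is hypothetical.
STAND_UP_TASKS = [
    ("both", 40), ("both", 30), ("both", 20), ("both", 10),
    ("one", 40), ("one", 30), ("one", 20), ("one", 10),
]


def two_step_score(distance_cm, height_cm):
    """Two-step test value: distance of 2 maximum strides / body height."""
    return distance_cm / height_cm


def stand_up_score(legs, box_height_cm):
    """Score of the most difficult task completed (1 = easiest, 8 = hardest)."""
    return STAND_UP_TASKS.index((legs, box_height_cm)) + 1


def locomo25_total(item_scores):
    """Sum of 25 items, each graded 0-4 (total range 0-100)."""
    if len(item_scores) != 25:
        raise ValueError("Locomo25 requires exactly 25 item scores")
    if any(not 0 <= s <= 4 for s in item_scores):
        raise ValueError("each item must be graded from 0 to 4")
    return sum(item_scores)


# Illustrative values only, not study data.
print(two_step_score(210, 150))
print(stand_up_score("one", 40))
print(locomo25_total([1] * 25))
```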

#### *2.4. Statistical Analysis*

Based on T-score, participants were classified as having healthy BMD, osteopenia, or osteoporosis. Fracture probabilities after 10 years were estimated based on the FRAX computer-based algorithm [8]. The clinical factors of FRAX included age, gender, height, weight, prior fragility fracture, parental history of hip fracture, current smoking habit, glucocorticoid use, rheumatoid arthritis, other causes of secondary osteoporosis, alcohol consumption of 3 units or more per day, and total hip T-score. Next, associations between each physical performance test and T-scores were evaluated by Spearman's correlation coefficient. Univariate logistic regression analysis was employed to detect physical performance tests related to low BMD. The objective variable was the presence of osteoporosis (i.e., T-score ≤ −2.5), and the explanatory variables were each physical performance test. Next, stepwise logistic regression analysis was performed using the explanatory variable items whose *p*-value was < 0.2 in univariate analysis. This analysis method was chosen to identify factors that were useful for simple screening by clinicians. The best model selected was evaluated by receiver operating characteristic (ROC) curve analysis. If the area under the ROC curve was ≥ 0.7, the combination of exams was considered appropriate for osteoporosis screening. Afterwards, matrices of positive/negative likelihood ratios were constructed for combinations of osteoporosis detection items, whereby a positive likelihood ratio of ≥ 5.0 was considered useful for a suspected diagnosis and a negative likelihood ratio of ≤ 0.2 was judged as useful for an exclusion diagnosis. Likelihood ratios between 0.2 and 5.0 were interpreted as having no screening value. We used R software version 3.6.1 (The R Foundation for Statistical Computing, Vienna, Austria) and EZR Version 2.4-0 [22] for statistical analyses. *p*-values of < 0.05 were considered statistically significant.
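For readers unfamiliar with likelihood ratios, the sketch below shows how a positive and a negative likelihood ratio follow from a 2 × 2 table and how the thresholds stated above (≥ 5.0 and ≤ 0.2) would be applied. It is written in Python for illustration, whereas the analyses in this study were run in R and EZR. The function names and the example counts are hypothetical: only the group totals (36 with and 132 without osteoporosis) correspond to the cohort, and the split into true/false positives is invented.

```python
def likelihood_ratios(tp, fp, fn, tn):
    """Return (LR+, LR-) from the counts of a 2x2 screening table.

    LR+ = sensitivity / (1 - specificity)
    LR- = (1 - sensitivity) / specificity
    """
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    lr_pos = sensitivity / (1 - specificity) if specificity < 1 else float("inf")
    lr_neg = (1 - sensitivity) / specificity if specificity > 0 else float("inf")
    return lr_pos, lr_neg


def interpret(lr_pos, lr_neg):
    """Apply the cut-offs used in this study to a pair of likelihood ratios."""
    return {
        "useful_for_suspected_diagnosis": lr_pos >= 5.0,
        "useful_for_exclusion_diagnosis": lr_neg <= 0.2,
    }


# Hypothetical counts for one BMI/TST threshold combination (not study data):
# 12 of 36 women with osteoporosis screen positive, 6 of 132 without do.
lr_pos, lr_neg = likelihood_ratios(tp=12, fp=6, fn=24, tn=126)
print(round(lr_pos, 2), round(lr_neg, 2))
print(interpret(lr_pos, lr_neg))
```

In this invented example the rule would help to raise suspicion of osteoporosis (positive likelihood ratio above 5.0) but would not help to exclude it, since the negative likelihood ratio stays well above 0.2.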

## **3. Results**

#### *3.1. Baseline Data and Osteoporosis Prevalence*

Table 1 shows the baseline characteristics of the participants in this study. The mean ± standard deviation age of the cohort was 68.2 ± 10.6 (range: 51–88) years. Height and weight decreased with age, while BMI increased. L2-4 BMD dropped remarkably from the 60s, and femoral neck BMD was notably low in the 80s. Table 2 presents the prevalence of osteoporosis by age group. Of the 168 participants, 46 (27.4%) had healthy BMD, 86 (51.2%) had osteopenia, and 36 (21.4%) had osteoporosis. According to FRAX, the incidence of osteoporosis after 10 years and the risk of fractures due to falls both increased with age.



Note: Values are presented as the mean ± standard deviation. Abbreviation: BMI, body mass index.

**Table 2.** Prevalence of osteoporosis and osteopenia and 10-year probability of fractures calculated by FRAX.


Notes: Values are presented as the number (prevalence). † The 10-year probability of fracture (%) is expressed as the mean ± standard deviation.

#### *3.2. Physical Performance Test Results and Correlations with BMD*

Physical performance diminished with age, especially in women aged 70 years and above (Table 3). L2-4 T-score had a significant but weak positive correlation with grip strength (Table 4). Femoral neck BMD was significantly correlated with all physical performance tests apart from the stand-up test. Total hip BMD was significantly correlated with all physical performance tests. Both types of femoral BMD exhibited moderate positive correlations with grip strength and TST, while displaying moderate negative associations with age.

**Table 3.** Results of physical performance tests by age stratum.


Note: Values are presented as the mean ± standard deviation.


**Table 4.** Correlations between bone mineral density and physical performance.

Notes: Values represent Spearman's rho (correlation coefficient). \* *p* < 0.05. Abbreviation: BMI, body mass index.

#### *3.3. Physical Performance Tests Associated with Osteoporosis*

Age, BMI, grip strength, one-leg standing time, and TST were significantly associated with osteoporosis in univariate analysis (Table 5). Multivariate analysis revealed significant associations for BMI and TST with osteoporosis (both *p* < 0.01).



Notes: Values are presented as the odds ratio (95% confidence interval). \* *p* < 0.05. Abbreviation: BMI, body mass index.

#### *3.4. Osteoporosis Screening by Physical Performance Tests*

Screening for osteoporosis using the combination of BMI and TST was judged as valid by ROC analysis, with an area under the curve of 0.73 (95% confidence interval 0.64–0.82) (Figure 2). Table 6 displays a positive likelihood ratio matrix with incremental values of BMI and TST. For cases of TST ≤ 1.30 and BMI ≤ 23.4, TST ≤ 1.32 and BMI ≤ 22.4, TST ≤ 1.34 and BMI ≤ 21.6, or TST < 1.24 and any BMI, the positive likelihood ratio exceeded 5.0 and osteoporosis could therefore be suspected. On the other hand, no negative likelihood ratios of < 0.2 were detected (Table 7).
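The threshold combinations reported above can be written as a simple decision rule. The Python sketch below is illustrative only: the function name `osteoporosis_suspected` and the example inputs are hypothetical, and the rule merely flags the BMI/TST combinations whose positive likelihood ratio exceeded 5.0 in this cohort. Because no useful negative likelihood ratio was found, a `False` result should not be read as excluding osteoporosis.

```python
# (TST upper limit, BMI upper limit) pairs reported in the text.
THRESHOLDS = [(1.30, 23.4), (1.32, 22.4), (1.34, 21.6)]


def osteoporosis_suspected(tst, bmi):
    """Return True if the TST/BMI pair falls in a region with LR+ > 5.0.

    tst: two-step test value (stride distance / body height)
    bmi: body mass index in kg/m^2
    """
    if tst < 1.24:  # suspected regardless of BMI
        return True
    return any(tst <= tst_max and bmi <= bmi_max
               for tst_max, bmi_max in THRESHOLDS)


# Illustrative inputs only, not study data.
print(osteoporosis_suspected(tst=1.20, bmi=30.0))
print(osteoporosis_suspected(tst=1.31, bmi=23.0))
print(osteoporosis_suspected(tst=1.33, bmi=21.0))
```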

**Figure 2.** Receiver operating characteristic curve for detecting osteoporosis with the combination of body mass index and two-step test.

**Table 6.** Calculations of positive likelihood ratios for combinations of body mass index and two-step test.


Notes: Leftmost column shows BMI values and top row shows two-step test scores. Values represent positive likelihood ratios. Shaded values indicate ratio ≥ 5.0. Abbreviations: BMI, body mass index; TST, two-step test.

**Table 7.** Calculations of negative likelihood ratios for combinations of body mass index and two-step test.


Notes: Leftmost column shows BMI values and top row shows two-step test scores. Values represent negative likelihood ratios. Abbreviations: BMI, body mass index; TST, two-step test.

## **4. Discussion**

According to the results of this study of postmenopausal women aged 50–89 years not treated for bone loss, 21.4% had latent osteoporosis and 51.2% had osteopenia. After adjustment for the age distribution in Japan, the rates of osteopenia and osteoporosis were estimated as 50.5% and 21.9%, respectively. L2-4 BMD correlated significantly with grip strength, while total femur and femoral neck BMD correlated with almost all physical performance tests. The combination of BMI and TST appeared useful to identify possible osteoporosis; this condition may be suspected for TST ≤ 1.30 and BMI ≤ 23.4, TST ≤ 1.32 and BMI ≤ 22.4, TST ≤ 1.34 and BMI ≤ 21.6, or TST < 1.24 regardless of BMI.

We observed that aging affected not only BMD, but also physical performance. Each physical performance test result decreased with age, with marked declines from the age of 70 years. Many studies have described the relationship between muscle and bone [9,10,16,18]. Furthermore, Tachiki et al. [23] reported that maximal muscle strength related to BMD more strongly than did muscle mass because it was an index including stimulation of bone. Similarly in this study, the correlations between many physical performance tests and BMD were considered the result of interactions involving muscle and bone.

Interestingly, TST and Locomo25, which have been used to diagnose locomotive syndrome, showed significant associations with femoral BMD. Locomotive syndrome is a concept defined as a decrease in movement capabilities as proposed by the Japanese Orthopaedic Association [21]. One cause of locomotive syndrome is osteoporosis. No reports have shown a significant association between locomotive syndrome diagnosis and BMD to date. However, lower limb function was found to significantly correlate with locomotion ability, and the amount of activity in daily life was related to BMD [24–26]. The present study also supports a relationship between locomotive syndrome and osteoporosis.

Lastly, TST appeared useful as a screening tool for osteoporosis. TST is an examination in which the subject takes two steps at maximum width without losing balance. Earlier studies revealed that TST correlated significantly with such lower limb functions as 6 min walking distance and maximum 10 m walking speed [27] as well as with the ability to perform activities of daily living [28]. Ashe et al. [29] also described that muscle power correlated more strongly with bone density than did maximum muscle tension or muscle mass. Our results suggest that TST may be an indicator of osteoporosis since it can reflect lower limb power more directly than can maximum knee extensor strength or balance of standing on one leg. The influence of TST increased considerably after multivariate analysis, such that the effect of age might have been absorbed. It was also relevant that the cohort's age range was 50–89 years and did not include young adults with less frequent osteoporosis.

We observed that the combination of BMI and TST provided clinically useful thresholds for osteoporosis screening. Although Table 6 showed no single combination of note, each participant has one BMI value and therefore only 1 applicable TST threshold. TST and BMI can be easily and inexpensively tested anywhere. Using them, it may be possible to encourage residents to undergo osteoporosis testing before symptom onset. On the other hand, no clinically effective negative likelihood ratios were detected in the cohort, indicating that the possibility of osteoporosis in postmenopausal women cannot be excluded by any particular body function test. In the osteoporosis high-risk group with a FOSTA score of less than −4, the positive likelihood ratio was 2.2 and the negative likelihood ratio was 0.5, indicating an inadequacy in detecting osteoporosis.

This study had several limitations. First, there was potential bias in participant selection. Random sampling from a resident registry was presumed to be an effective way to construct a study population that faithfully reproduced the target cohort. However, the process of passive participation may have contributed to a high non-participation rate and incomplete removal of sampling bias. Nonetheless, our passive participation method that randomly selected subjects from the general population could create a study group that was more reflective of the actual conditions of community-dwelling residents than could an active participation method, by which volunteers were recruited. Another limitation was the existence of regional characteristics. As the local government in our study was located in a suburban area in Japan, which likely differed from the environment of urban areas, our target population might not be qualitatively representative of the general Japanese population and the cut-off range for detecting osteoporosis could be slightly wider. Such regional differences may become more pronounced when race is taken into consideration. The physical characteristics of the cohort could also have limited the study. The small number of patients with very low BMI might have influenced the results; thus, if BMI is higher than 21, it should be assessed in combination with TST for osteoporosis screening. Furthermore, smoking was excluded from the list of factors associated with osteoporosis because few participants smoked. Lastly, this was a small study due to human resource and financial constraints. Larger scale, multiregional surveys are needed.
