Predictive Ability of Systems of Postural Control for 1-Year Risk of Falls and Frailty in Community-Dwelling Older Adults: A Preliminary Study
Round 1
Reviewer 1 Report (Previous Reviewer 1)
Comments and Suggestions for Authors
The authors have clearly addressed prior concerns and have added data to the study. No further concerns at this time.
Author Response
> The authors have clearly addressed prior concerns and have added data to the study. No further concerns at this time.
Response: Thank you for your thoughtful confirmation and for your kind understanding of our study.
Reviewer 2 Report (Previous Reviewer 2)
Comments and Suggestions for Authors
In Section 2.3.1 or Section 4 (Discussion, Limitations), you could consider adding a short note on whether strategies were used to minimize recall bias in the 1-year fall history assessment.
In Section 4 (Discussion), you could add a paragraph interpreting the clinical relevance of the small effect sizes for the postural control-outcome associations, to link the statistics to practical application.
Add visual markers (e.g., *p < 0.05, **p < 0.01) for significant p-values in Tables 2–5, and standardize the decimal places for effect sizes to improve readability.
Author Response
> In Section 2.3.1 or 4. Discussion -Limitations, you can consider add a short note on whether strategies were used to minimize recall bias in 1-year fall history assessment.
Response: Thank you for this valuable suggestion. We have revised the manuscript accordingly. In Section 2.3.1 (Assessment of falls), we added the following sentence:
“If participants were uncertain about the criteria for a fall or the circumstances of their fall, they were encouraged to ask the researchers for clarification.”
In addition, in the Discussion (Limitations), we added the sentence:
“In addition, no specific strategies were implemented to minimize recall bias in the assessment of falls.”
> In Section 4 Discussion, you can add a paragraph interpreting the clinical relevance of small effect sizes for postural control-outcome associations, to link statistics to practical application
Response: In accordance with your suggestion, we have added the following paragraph in the Discussion section to interpret the clinical relevance of small effect sizes:
“Regarding the associations between balance function and falls/frailty, some relationships evaluated by effect sizes were observed, with values around 0.3 at most, indicating a moderate magnitude. This suggests that falls and frailty are not solely related to balance function but also to multiple physical and psychological conditions, highlighting the need for comprehensive assessment and intervention in clinical practice.”
> Add visual markers, e.g., *p < 0.05, **p < 0.01 for significant p-values in Tables 2–5 and standardize decimal places for effect sizes to improve readability.
Response: Thank you for your helpful comments. We have revised the tables as follows:
- In Tables 2 and 3, we added visual markers (*p < 0.05, **p < 0.01) to indicate statistical significance.
- Effect sizes are consistently presented with three decimal places throughout the tables.
- For Tables 4 and 5 (ROC analyses), we did not add p-values because all indices were derived using bootstrap resampling. Instead, we report AUCs with 95% CIs, and significance was determined based on whether the 95% CI excluded 0.5.
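The decision rule in the last item above can be illustrated with a minimal sketch. It assumes a percentile bootstrap, scores oriented so that higher values point toward the positive class, and made-up data; the function names (`auc`, `bootstrap_auc_ci`, `excludes_chance`) and the number of resamples are illustrative and are not taken from the manuscript.

```python
import random


def auc(scores, labels):
    """AUC as the share of positive/negative pairs ranked correctly (ties count 0.5)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def bootstrap_auc_ci(scores, labels, n_boot=2000, alpha=0.05, seed=0):
    """Percentile bootstrap CI for the AUC (labels must be 0/1 with both classes present)."""
    if not 0 < sum(labels) < len(labels):
        raise ValueError("labels must contain both classes")
    rng = random.Random(seed)
    n = len(scores)
    aucs = []
    while len(aucs) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        ys = [labels[i] for i in idx]
        # Skip resamples in which one class is missing: the AUC is undefined there.
        if 0 < sum(ys) < n:
            aucs.append(auc([scores[i] for i in idx], ys))
    aucs.sort()
    lo = aucs[int(alpha / 2 * n_boot)]
    hi = aucs[min(n_boot - 1, int((1 - alpha / 2) * n_boot))]
    return lo, hi


def excludes_chance(lo, hi, chance=0.5):
    """True when the whole interval lies on one side of the chance level."""
    return lo > chance or hi < chance


if __name__ == "__main__":
    # Hypothetical scores and outcomes, for illustration only.
    scores = [3, 2, 3, 1, 2, 0, 1, 3, 0, 2]
    labels = [1, 1, 1, 0, 1, 0, 0, 1, 0, 0]
    lo, hi = bootstrap_auc_ci(scores, labels)
    print(auc(scores, labels), (lo, hi), excludes_chance(lo, hi))
```

With few events, as in this study, the interval is typically wide, so an interval that excludes 0.5 should still be read alongside its width.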
Reviewer 3 Report (Previous Reviewer 4)
Comments and Suggestions for Authors
Thank you. I have read the response carefully and have looked at my own and other reviewers' feedback. You seem to have tacked on a construct that you were looking at feasibility and validity - but there are no feasibility objectives and not much stress on feasibility in the results and discussion. You could revisit this if you wish.
Author Response
> Thank you. I have read the response carefully and have looked at my own and other reviewers' feedback. You seem to have tacked in a construct that you were looking at feasibiity and validity - but there are no feasibility objectives and not much stress on feasibility in the results and discussion. You could revisit this if you wish.
Response: Thank you for pointing this out. We agree with your observation that feasibility objectives and related discussion were not sufficiently addressed in the manuscript. To avoid any confusion, we have removed the explicit reference to “feasibility” from the study objectives. The revised objectives now focus solely on validity and associations, which are supported by the presented results and discussion.
This manuscript is a resubmission of an earlier submission. The following is a list of the peer review reports and author responses from that submission.
Round 1
Reviewer 1 Report
Comments and Suggestions for Authors
Article Feedback:
This manuscript presents the results of a prospective cohort study examining the relationships between postural control subsystems and incidence of falls and progression to frailty in community-dwelling older adults over a 1-year period. According to the authors, this concept has been studied previously but not over this time period or in a prospective study format. Thank you for your work and desire to publish your findings. This reviewer has concerns about the detail provided in the manuscript, including the introduction, assessments, results, and discussion. The statistical analysis lacks analysis of correlation. The authors do assess predictive validity, but the number of subjects with frailty was very limited, and the analysis lacks justification at this sample size. Conclusions are drawn regarding frailty even though only 8 of 101 participants had “frailty” based on the authors' limited assessment of frailty (physical only).
Title/Abstract
In this reviewer’s opinion, the choice of words, “components of balance function” or “balance function components”, does not accurately reflect what is being measured by the Brief-BESTest. The Brief-BESTest measures postural control through 6 subcomponents (i.e., biomechanical constraints, stability limits/verticality, anticipatory postural adjustments, postural responses, sensory orientation, and gait stability). The authors describe the BESTest well on lines 52-53. Suggest using this approach consistently throughout the paper. The TUG (component 6) is the only measure of “functional balance” in this reviewer’s opinion. In addition, the researchers appear to be analyzing the predictive validity of the Brief-BESTest for fall risk and frailty using ROC and AUC values. ROC and AUC analysis does not directly determine an association between balance subcomponents/total score and fall risk or frailty onset. It assesses how well the balance tool can differentiate between different classes, in this case frail/non-frail and faller/non-faller. Consider changing “associations” in the title.
Introduction
The introduction is lacking sufficient detail and review of literature. The authors do not provide a sufficient definition of frailty and do not justify the use of a physical frailty measure only with frailty being multidimensional. Specifically, the authors do not provide a definition of frailty, nor do they review literature already published in this area. They do note (lines 42-43) that frailty is a known risk factor for falls but they did not use the frailty measure as a predictor of falls but instead as a longitudinal outcome for persons who have impaired postural control. It is not clear from the introduction what their analysis is adding to the existing body of knowledge/literature summarized in lines 57 – 72. Suggest a more detailed review of Magnani et al (61-63) which appears to have addressed part of their research question already. Their justification of the benefits of their analysis is not adequate or they did not address some of what they state is lacking (lines 73 – 78).
Materials and Methods
Study Design: Please clarify which study you are reporting on in this manuscript. You note an observational study and a 1-year prospective cohort study.
Participants: Lacking detail on how participants were recruited from the “public facilities”. Clarify what “public facilities” are for readers not from Japan. For inclusion criteria #4 (absences of missing data in the relevant variables), please clarify what the “relevant variable” addressed (lines 102-103) – is this participant characteristics or baseline measures, or data across time-period of study. How did you address missing data for those enrolled. Provide some additional detail on scoring of Rapid Dementia Screening Test (RDST) in text or table.
Assessment
Fall Assessment: Need to provide more detail on how falls were defined and recorded. How was a “fall” defined? How were the participants classified as “faller” or “non-faller”? Also consider discussing the significant limitations of recalling falls over a one-year period. Were the participants asked at the start of the prospective study to record falls? If so, how?
Frailty Status: Please clearly define type of frailty measured in this section as physical frailty. You do note this in the discussion (line 310). Why measure only physical frailty? Is there evidence that other types of frailty are impacted by balance dysfunction as well? Is the BESTest measuring the same thing as physical frailty – are they associated?
Statistical Analysis:
Why did you include those rated as pre-frail in the non-frail category? Is that not a distinct category (line 146)? Did you analyze this subgroup before combining them with non-frail? Is there literature to support this decision?
Results
Overview of the follow-up: In your description of the fallers (lines 186-187), how many individuals had more than one fall. Was this an important differential? With only 8% of your cohort developing frailty in the 1-year study period, how do you justify the use of your statistical analysis with this very small sample (lines 194-195).
Minimally important change anchored to frail status: In my opinion, the interpretation by the authors for this data is not completely accurate based on words “…associated with…” in the following phrase, “…a decrease of approximately 6.5 points was associated with frailty…” (lines 236-237). My interpretation of this data, even with minimal subjects (n=8) developing frailty in study population, is that a decrease of at least 6.478 points in the total postural control score is a minimally important change that may indicate an increased risk of developing frailty – not associated with frailty. Important to consider the CI as well.
Discussion
This reviewer is quite confused by the results presented and the discussion. There seems to be a lack of agreement between the two. In addition, the authors use the term “associated” throughout the discussion. The authors found differences in baseline and 1-year follow-up between faller/non-faller (small effect size) and frailty/non-frailty (larger effect size than faller/non-faller). The AUC and MIC values were predictive of frailty only, which is not what is stated in the discussion.
Tables/Figures:
In Tables 2 and 3, suggest using a symbol for all measures with a significant p-value and including the effect size interpretation in the description below the table.
Author Response
Please see the attachment.
Author Response File:
Author Response.pdf
Reviewer 2 Report
Comments and Suggestions for Authors
This manuscript, through a prospective cohort design, clarified the specific associations of different components of balance function with falls and frailty in community-dwelling elderly individuals. The proposed MIC value of a 7-point decrease in the Brief-BESTest total score for frailty transition has potential clinical reference value and provides a quantitative basis for early intervention. The study sample was from Japan, which is also the country with the most severe aging problem, and thus is representative. The writing is relatively good.
I have the following minor revision suggestions:
1. Methods and Data Analysis:
The sample size is relatively small (n=101), and only 8 participants transitioned to frailty, which may limit statistical power. A more detailed explanation is needed to justify why a small sample size can yield reliable results.
The control methods for confounding factors (such as underlying diseases, exercise habits) were not mentioned.
2. Results and Implications:
The possible mechanism for the lack of association between postural response and frailty needs to be further explained in the discussion (e.g., insufficient sensitivity of the measurement tool?).
3. Limitations:
The follow-up period (1 year) was relatively short, and long-term associations need further verification.
The distinction between single and multiple falls was not made.
Author Response
Please see the attachment.
Author Response File:
Author Response.pdf
Reviewer 3 Report
Comments and Suggestions for Authors
The manuscript entitled “Associations between Balance Function Components and 1-Year Risk of Falls and Frailty Onset in Community-Dwelling Older Adults” is valuable and beneficial, particularly in terms of analyzing the components of balance related to falls and the onset of frailty. I have some questions and comments below:
- How can you control for participants who experience significant events during the follow-up period, such as strokes or other neurological conditions?
- Please add more information regarding the sample size calculation and sampling technique.
- How can you control other confounders related to falls, such as medication use or participation in an exercise program?
- I highly recommend adding the clinical implication to the discussion part.
Author Response
Please see the attachment.
Author Response File:
Author Response.pdf
Reviewer 4 Report
Comments and Suggestions for Authors
Abstract
Please consider adding Brief BESTest to the key words. Line 30 is vague - addressing various balance components - with what?
Introduction
Line 44 - we are trying to move away from 'fear of falling' to concerns about falling. These can be protective as well as negative.
Lines 51-65 - I would give a little more detail about the components of the BriefBESTest by name - for example, many clinicians will be more familiar with terms like single leg stand, functional reach, condition 4 on the mCTSIB and TUG simple. It will save readers from having to familiarise themselves with the test itself. Which country is Magnani from and is it reasonable to think their normative data might be helpful in Japan?
Please define which definition of falls you are using and motivate the choice (e.g., WHO, World Falls Guidelines etc).
Methods
Line 92 please state which version of Helsinki you are using.
Line 98 please justify why age 65y when WHO defines older adulthood as starting at 60y. How was the sample size derived and how does this speak to power? Attrition during the study?
Line 99 - what are public facilities? Hospitals? Clinics? More detail regarding the setting is required.
Line 106 RDST might be well known in Japan but in the West could be things like MiniCog etc - please give a few more details.
Morbidity data in table 1 - how many of them had multi- or comorbidities? Unusual at that age just to have diabetes, for example, and not hypertension.
Line 109 - was renewed consent taken at one year?
Line 118 again, fall definition as explained to the participants is important. Did it include slips, trips, near misses, falls associated with medical events like syncope and stroke?
Line 150 - if doing repeated tests, was a Bonferroni correction used to adjust? Otherwise description of stats is good.
Results
Line 186 - no mention of attrition - so nobody died or moved or similar? Very unusual in an older adult population. Otherwise clearly explained and presented.
Discussion
Line 247 you use the word hypothesis - did I miss this earlier in the methods?
Lines 271-275 - fall prevalence could also be related to how a fall is defined and described to participants, don't you think?
Line 285 and elsewhere when TUG is discussed: TUG measures more than just gait stability, don't you think? ability to transition from sit to stand and back again, turn around at the end of 3m etc. It may be a little simplistic the way you have reported/interpreted it.
Line 291 - not everyone would agree regarding TUG. I do take the point but possibly a more nuanced discussion is warranted. See: Christopher, A., Kraft, E., Olenick, H., Kiesling, R., & Doty, A. (2021). The reliability and validity of the Timed Up and Go as a clinical tool in individuals with and without disabilities across a lifespan: A systematic review: Psychometric properties of the Timed Up and Go. Disability and rehabilitation, 43(13), 1799-1813. And: Rydwik, E., Bergland, A., Forsén, L., & Frändin, K. (2011). Psychometric properties of timed up and go in elderly people: a systematic review. Physical & Occupational Therapy in Geriatrics, 29(2), 102-125. I wonder if you pulled out the specific TUG times and looked at those it might be helpful or interesting?
Overall the discussion and conclusion reflect the results.
Did I miss what method you used to ascertain if falls occurred during the interval between the two assessments? Calendars, recall etc? Could have introduced bias of various types.
Author Response
Please see the attachment.
Author Response File:
Author Response.pdf
Round 2
Reviewer 1 Report
Comments and Suggestions for Authors
The authors of this manuscript have taken the original feedback constructively in many areas including but not limited to use of "postural control systems", defining frailty, defining what a "fall" was considered, acknowledging limitations of self-reported falls, clarity of purpose and value of the findings of this study, and expanded description of the limitations of their work.
Unfortunately, I am still very concerned that the data presentation and the conclusions being drawn are not accurately stated throughout the manuscript. The use of group comparison of median/mean values allows for conclusions regarding associations between Brief BESTest scores (each item and/or total score) and two groups/ordinal categories (faller/non-faller, frailty/no frailty). The use of ROC analysis provides the ability to assess predictive performance (sensitivity/specificity/MIC values) of a diagnostic test (Brief BESTest). A high AUC, which in this study was shown only with the frailty analysis (using total score), suggests good predictive ability of the Brief BESTest to accurately identify individuals at higher or lower risk of new-onset frailty. Yet, great caution needs to be taken with their conclusions on frailty as they only had 8 of 99 individuals in the study develop new-onset frailty. I have concerns about this manuscript being published without further revision and/or their consultation with a statistician who can assist them to improve the accurate and clear interpretation of their findings.
Review items:
Abstract
Line 23 - "...to evaluate the onset of falls and the frailty." This portion of sentence needs to be grammatically corrected (e.g., "...to evaluate falls and new-onset frailty."). I don't think you can say to evaluate onset of falls because you did not document whether the participants had a prior history of falls. For frailty, you excluded those at baseline that were already considered frail.
For both fall and frailty conclusions, please accurately clarify your conclusions are accurate regarding an association (group comparisons) versus prediction (ROC/AUC/MIC) - comparison of group means allows for conclusions about association and ROC analysis allows for suggestions of predictive ability of the Brief BESTest to accurately identify individuals at higher or lower risk of falls or new-onset frailty.
Methodology
2.3.1. Fall Assessment
Lines 150-156 - The authors have more clearly defined what a fall is considered. Did they consider excluding falls that were due to a medical incident (e.g., orthostasis, vasovagal, being pushed/bumped, etc.).
2.4. Statistical Analysis
Lines 183-184 - "Those classified as frail at baseline were excluded from the analysis of frailty." Was this also done for fallers? Your study included those who had fallen before and within the 1 year study period. This will impact interpretation.
Lines 201 - 203 - Reasoning for not doing a multi-comparison correction is not appropriate since you have stated hypotheses. If you stated that this is an exploratory study to generate hypotheses that would be more appropriate.
Lines 204-205 - Strongly suggest clarifying what an ROC analysis provides in first sentence of this paragraph. The ROC curve in this study allows you to assess predictive performance (sensitivity/specificity/MIC values) of a diagnostic test (Brief BESTest - you looked at baseline test as a whole and by area of postural control AND change from baseline to 1-year) in classifying individuals 1 year later into categories (faller/non-faller AND no frailty/new-onset frailty).
Discussion:
Please correct assumptions made from analysis (association - group differences vs prediction - ROC/AUC/MIC) in discussion and conclusion sections of paper. A few examples provided below.
Lines 294-295 - For sentence, "Furthermore, a decrease of seven points or more in the total Brief-BESTest score may represent the MIC for predicting new-onset frailty.", you have rounded the MIC value inaccurately. This is also done on line 359 and 389. The reported MIC value in Table 5 is -6.478. If you want to round to whole number, the MIC would be 6, not 7. In other places in manuscript such as in the Abstract line 26, in Results line 279, you state value as 6.5, which is accurately rounded. Suggest consistency in reporting MIC value for falls as well (results 1.5, discussion 2.0).
In paragraph 1, you are stating that significant differences in the group comparisons lead to the conclusion that the items are predictive in nature. This is NOT accurate. There is an association between Brief BESTest items/or total scores and fall/non-faller or frail/non-frail categories. The last sentence is an accurate predictive conclusion using the MIC value but as noted above if you are going to use a whole number the value should be 6, not 7.
Check paragraph 3 thoroughly for interpretation errors. For example, lines 321 - 323 and lines 342-345 are referencing differences in scores between groups as predictive versus an association. ROC/AUC/MIC analysis can demonstrate prediction accuracy. On lines 333 - 334, you note that MIC analysis for falls was predictive with MIC 1.5 points but the AUC values do not support this conclusion as you accurately note in your results section.
Tables:
Table 2 - Is the data presented accurate or presented with enough decimal values (score range is so small (0 to 3) that additional decimals are likely needed. For example, the median non-faller and faller baseline S6 values are equivalent (3.0) but the p-value is significant for a difference between scores. But for the S5 item, the values are also both 3.0 for faller and non-faller groups, but p-value is different than for S6. Need to provide more decimals if the values are truly different. This occurs in other areas of the table as well.
Table 3 - same comments as above. May need more decimal values. Also, there appears to be an error in p values listed below the table - the last p-value should be ***p<0.001?
Author Response
> The authors of this manuscript have taken the original feedback constructively in many areas including but not limited to use of "postural control systems", defining frailty, defining what a "fall" was considered, acknowledging limitations of self-reported falls, clarity of purpose and value of the findings of this study, and expanded description of the imitations of their work.
Unfortunately, I am still very concerned that the data presentation and the conclusions being drawn are not accurately stated throughout the manuscript. The use of group comparison of median/mean values allows for conclusions regarding associations between Brief BESTest scores (each item and/or total score) and two groups/ordinal categories (faller/non-faller, frailty/no frailty). The use of ROC analysis provides the ability to assess predictive performance (sensitivity/specificity/MIC values) of a diagnostic test (Brief BESTest). A high AUC, which in this study was shown only with the frailty analysis (using total score), suggests good predictive ability of the Brief BESTest to accurately identify individuals at higher or lower risk of new-onset frailty. Yet, great caution needs to be taken with their conclusions on frailty as they only had 8 of 99 individuals in the study develop new-onset frailty. I have concerns about this manuscript being published without further revision and/or their consultation with a statistician who can assist them to improve the accurate and clear interpretation of their findings.
Response:
Thank you for your detailed and constructive feedback. Your comments were very helpful in clarifying our manuscript.
In light of the advice we received, we revised the manuscript as extensively as possible and resubmitted it. We clarified in the Methods, Results, and Discussion that group comparisons examine associations, whereas AUC and MIC estimates evaluate predictive performance. As you repeatedly noted, the number of participants with new-onset frailty was very small; therefore, we added our most recent longitudinal data and expanded the inclusion criteria for participants. Nevertheless, the data remain limited, so we now explicitly position this work as a preliminary study to inform a future large-scale cohort, and we have revised the title and stated aims accordingly. With input from colleagues experienced in statistics and given the limited sample size, we adopted conservative analytical procedures. Whereas earlier versions also analyzed the total Brief-BESTest score, because clinical utility metrics such as the MIC for the total score were secondary, we now focus on how the six postural control systems are associated with—and may predict—frailty and falls.
Please see below for further details.
> Abstract
Line 23 - "...to evaluate the onset of falls and the frailty." This portion of sentence needs to be grammatically corrected (e.g., "...to evaluate falls and new-onset frailty."). I don't think you can say to evaluate onset of falls because you did not document whether the participants had a prior history of falls. For frailty, you excluded those at baseline that were already considered frail.
Response:
As you correctly pointed out, for both falls and frailty we revised the outcomes so that they are defined irrespective of baseline status: the presence of falls during the one-year follow-up and frailty status at one year. We have adjusted the wording throughout the manuscript accordingly.
> For both fall and frailty conclusions, please accurately clarify your conclusions are accurate regarding an association (group comparisons) versus prediction (ROC/AUC/MIC) - comparison of group means allows for conclusions about association and ROC analysis allows for suggestions of predictive ability of the Brief BESTest to accurately identify individuals at higher or lower risk of falls or new-onset frailty.
Response:
Thank you for your advice. In the revised manuscript, we carefully distinguished between association and prediction in both wording and interpretation. We have made corresponding revisions in the Methods, Results (including subheadings), and Discussion sections.
> Methodology
2.3.1. Fall Assessment
Lines 150-156 - The authors have more clearly defined what a fall is considered. Did they consider excluding falls that were due to a medical incident (e.g., orthostasis, vasovagal, being pushed/bumped, etc.).
Response:
We sincerely appreciate this important comment. As you correctly pointed out, one limitation of our study is that falls were not prospectively monitored at specific time points (e.g., by telephone confirmation). In addition, we did not collect detailed information on the causes of falls in this study, and we have explicitly clarified this in the revised manuscript. In future cohort studies, we intend to improve fall assessment by implementing prospective monitoring methods and by gathering more detailed information on the causes of falls. We believe these improvements will strengthen the validity of future research.
> 2.4. Statistical Analysis
Lines 183-184 - "Those classified as frail at baseline were excluded from the analysis of frailty." Was this also done for fallers? Your study included those who had fallen before and within the 1 year study period. This will impact interpretation.
Response:
In this revision, we set the inclusion criteria without applying baseline conditions for either frailty or falls. Therefore, we have been careful to avoid expressions such as “onset of frailty” or “new occurrence of falls.”
> Lines 201 - 203 - Reasoning for not doing a multi-comparison correction is not appropriate since you have stated hypotheses. If you stated that this is an exploratory study to generate hypotheses that would be more appropriate.
Response:
Thank you for this thoughtful suggestion. In the previous version, we did not apply multiple comparison adjustments; however, given the limited sample size and the risk of overestimation, we have now introduced multiple comparison correction in the group comparisons. Specifically, for the item-level analyses of the Brief-BESTest, we applied a False Discovery Rate (FDR) adjustment using the Benjamini–Hochberg procedure. In contrast, we did not apply multiple comparison correction to the analyses of the total Brief-BESTest score, as these were considered secondary and involved a limited number of comparisons.
In future studies, if sufficient data on the occurrence of frailty and falls become available, it would be ideal to conduct logistic regression analyses including all Brief-BESTest items. Furthermore, it would be desirable to incorporate covariates beyond balance function, and we consider this an important task for future research. By doing so, we will be able to evaluate which postural control systems have the most prominent influence on frailty and falls.
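The Benjamini–Hochberg adjustment mentioned in this response can be sketched as follows. This is a generic implementation of the standard step-up procedure, not the authors' code; the function name `benjamini_hochberg` and the six p-values (one per Brief-BESTest item) are hypothetical.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the original order."""
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, scaling each by m / rank and
    # enforcing that adjusted values never increase as the rank decreases.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


if __name__ == "__main__":
    # Hypothetical raw p-values for items S1..S6, for illustration only.
    raw = [0.004, 0.030, 0.210, 0.012, 0.650, 0.048]
    for item, (p, q) in enumerate(zip(raw, benjamini_hochberg(raw)), start=1):
        flag = "significant" if q < 0.05 else "not significant"
        print(f"S{item}: p={p:.3f}, adjusted={q:.3f} ({flag} at FDR 0.05)")
```

An item is then declared significant when its adjusted value falls below the chosen FDR level, which is less conservative than a Bonferroni correction across the same six comparisons.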
> Lines 204-205 - Strongly suggest clarifying what an ROC analysis provides in first sentence of this paragraph. The ROC curve in this study allows you to assess predictive performance (sensitivity/specificity/MIC values) of a diagnostic test (Brief BESTest - you looked at baseline test as a whole and by area of postural control AND change from baseline to 1-year) in classifying individuals 1 year later into categories (faller/non-faller AND no frailty/new-onset frailty).
Response:
Thank you very much for this helpful suggestion. In accordance with your comment, we have revised the manuscript to explicitly state that the ROC analyses were conducted to examine the predictive performance of each item of the Brief-BESTest.
> Discussion:
Please correct assumptions made from analysis (association - group differences vs prediction - ROC/AUC/MIC) in discussion and conclusion sections of paper. A few examples provided below.
Lines 294-295 - For sentence, "Furthermore, a decrease of seven points or more in the total Brief-BESTest score may represent the MIC for predicting new-onset frailty.", you have rounded the MIC value inaccurately. This is also done on line 359 and 389. The reported MIC value in Table 5 is -6.478. If you want to round to whole number, the MIC would be 6, not 7. In other places in manuscript such as in the Abstract line 26, in Results line 279, you state value as 6.5, which is accurately rounded. Suggest consistency in reporting MIC value for falls as well (results 1.5, discussion 2.0).
In paragraph 1, you are stating that significant differences in the group comparisons lead to the conclusion that the items are predictive in nature. This is NOT accurate. There is an association between Brief BESTest items/or total scores and fall/non-faller or frail/non-frail categories. The last sentence is an accurate predictive conclusion using the MIC value but as noted above if you are going to use a whole number the value should be 6, not 7.
Check paragraph 3 thoroughly for interpretation errors. For example, lines 321 - 323 and lines 342-345 are referencing differences in scores between groups as predictive versus an association. ROC/AUC/MIC analysis can demonstrate prediction accuracy. On lines 333 - 334, you note that MIC analysis for falls was predictive with MIC 1.5 points but the AUC values do not support this conclusion as you accurately note in your results section.
Response:
Thank you very much for carefully reviewing our manuscript. To clarify interpretation and align with your comments, we removed the analyses and discussion of the total Brief-BESTest score and focused on item-level performance. We revised the Discussion and Conclusions to consistently distinguish associations (group differences) from prediction (ROC/AUC/MIC). We also adopted a uniform rounding policy for MIC values: MICs are reported to one decimal place using conventional rounding. For integer expressions, when the scale is continuous we round to the nearest integer; however, for ordinal scales in which only integer values are possible, we interpret the MIC as the smallest integer change that exceeds the calculated threshold. In the revised manuscript, we highlight S6 (Timed Up and Go test) for falls, with an MIC of −1.285. Because the S6 score is ordinal (0–3 scale), a 1-point decrease would not exceed this threshold; therefore, the MIC was interpreted as corresponding to a decrease of at least 2 points. Importantly, we have clarified throughout the Abstract, Results, and Discussion that although this MIC was statistically significant, its predictive value is limited due to the low AUC, and therefore its clinical utility should be interpreted with caution.
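The rounding policy described above can be sketched as follows. This is a minimal illustration, not the code used in the analysis: the function names are hypothetical, "conventional rounding" is assumed to mean round-half-away-from-zero, and "exceeds" is read strictly, so an MIC that is exactly an integer would map to the next integer.

```python
import math
from decimal import Decimal, ROUND_HALF_UP


def mic_one_decimal(mic: float) -> float:
    """Report an MIC to one decimal place.

    Assumption: 'conventional rounding' means halves round away from zero,
    which is what ROUND_HALF_UP does in the decimal module.
    """
    return float(Decimal(str(mic)).quantize(Decimal("0.1"), rounding=ROUND_HALF_UP))


def ordinal_mic_points(mic: float) -> int:
    """Smallest whole-point change whose magnitude strictly exceeds |MIC|,
    keeping the sign of the MIC (negative = decrease)."""
    points = math.floor(abs(mic)) + 1
    return -points if mic < 0 else points


# MIC reported for S6 (Timed Up and Go test) and falls in the response above.
s6_mic = -1.285
print(mic_one_decimal(s6_mic))     # value as reported to one decimal place
print(ordinal_mic_points(s6_mic))  # whole-point change on the 0-3 ordinal scale
```

Under these assumptions, a 1-point decrease does not exceed the threshold of 1.285, so the smallest qualifying change on the ordinal scale is a 2-point decrease.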
> Tables:
Table 2 - Are the data presented accurately and with enough decimal places? The score range is so small (0 to 3) that additional decimals are likely needed. For example, the median non-faller and faller baseline S6 values are equivalent (3.0), but the p-value is significant for a difference between scores. For the S5 item, the values are also both 3.0 for the faller and non-faller groups, but the p-value is different from that for S6. Need to provide more decimals if the values are truly different. This occurs in other areas of the table as well.
Table 3 - same comments as above. May need more decimal values. Also, there appears to be an error in p values listed below the table - the last p-value should be ***p<0.001?
Response:
Thank you for pointing this out. We agree that when only the median values are shown, some results appear to have no differences despite significant p-values. Because each item is scored on a 0–3 ordinal scale, presenting values with two decimal places would not be meaningful. Instead, we clarified the distribution of scores by presenting the interquartile ranges, which more appropriately reflect variability. In addition, we revised the analyses to incorporate multiple comparison correction, which makes the interpretation of these results more cautious and robust.
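To illustrate why interquartile ranges can show a difference that identical medians hide on a 0-3 ordinal scale, here is a small sketch with hypothetical scores. The values are invented for illustration and are not study data, and `median_iqr` is a helper name assumed for this example.

```python
import statistics

# Hypothetical item scores on a 0-3 ordinal scale (NOT study data).
non_fallers = [3, 3, 3, 3, 3, 3, 2, 3]
fallers = [3, 3, 3, 2, 2, 3, 1, 3]


def median_iqr(scores):
    """Return (median, Q1, Q3) using the inclusive quartile method."""
    q1, _, q3 = statistics.quantiles(scores, n=4, method="inclusive")
    return statistics.median(scores), q1, q3


print("non-fallers:", median_iqr(non_fallers))
print("fallers:    ", median_iqr(fallers))
```

Both groups have a median of 3, but the lower quartile differs, so the distributions are not the same even though the medians are.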
Reviewer 4 Report
Comments and Suggestions for AuthorsThank you for your response. It is a pity that this vulnerable population was not re-consented at follow-up. In addition, the lack of prospective falls monitoring is very problematic. I realise that neither can be managed post hoc.
Author Response
Thank you very much for your thoughtful comments. As you noted, particularly with the small number of participants who developed frailty after one year, we recognize the limitations of our analyses. We therefore revised the title and positioned this work explicitly as a preliminary study. In addition, for the present revision, we added our most recent data and slightly expanded the inclusion criteria to increase the sample size. In Japan, the prevalence of frailty in community-based surveys is often less than 10%, which means that large-scale cohort studies are required to establish frailty as a reliable outcome. We view the present study as a preliminary step to confirm the validity of our research perspective and to inform the design of future investigations. We also appreciate your comment regarding fall assessment. As our current evaluation was not sufficient, we plan to devote more effort to improving the assessment of falls and to implement more robust monitoring methods in future cohort studies.
