Article
Peer-Review Record

Understanding Risk Factors of Recurrent Anxiety Symptomatology in an Older Population with Mild to Severe Depressive Symptoms: A Bayesian Approach

Appl. Sci. 2024, 14(16), 7258; https://doi.org/10.3390/app14167258 (registering DOI)
by Eduardo Maekawa 1,2,*, Mariana Mendes de Sá Martins 3, Carina Akemi Nakamura 3,4, Ricardo Araya 5, Tim J. Peters 6, Pepijn Van de Ven 1,2,† and Marcia Scazufca 3,7,†
Reviewer 1:
Reviewer 2: Anonymous
Reviewer 3: Anonymous
Reviewer 4: Anonymous
Submission received: 29 July 2024 / Revised: 13 August 2024 / Accepted: 16 August 2024 / Published: 18 August 2024
(This article belongs to the Special Issue Novel Approaches for Machine Learning in Healthcare Applications)

Round 1

Reviewer 1 Report

Comments and Suggestions for Authors

The authors have presented a study to determine the risk factors of recurrent anxiety symptomatology in an older population with mild to severe depressive symptoms. The Bayesian network (BN) method is used to identify the risk factors based on the survey data. Compared with other ML methods, the BN method has the advantage of providing explainability and the order of importance of the predictor variables in causing recurrent anxiety symptoms. The results of this study show that the variables "Not being able to stop or control worrying", "Becoming easily annoyed or irritable", and "Trouble relaxing", along with "depressive symptomatology severity", were the most important predictors for understanding recurrent anxiety symptomatology in this population. With the BN, only four predictor variables are needed to determine the association with the outcome, whereas ML methods use many variables and can be prone to overfitting.

The authors should mention which software tools they used for analysing the data using BN and other ML methods.

I find the BN approach used in this study quite revealing and innovative, and it can be applied in fields other than health studies.

I recommend that the manuscript be accepted for publication.

Author Response

Thank you for your thoughtful review. We appreciate the time and effort invested in evaluating our work.

The authors should mention which software tools they used for analysing the data using BN and other ML methods.

RESPONSE: We have added information on the software used to the manuscript.

Reviewer 2 Report

Comments and Suggestions for Authors

Although the research goals are clearly expressed, there is a lack of discussion on what are the critical aspects and limitations of the current models of study of the recurrence of anxiety symptomatology. The authors should describe in the introductory section what these limitations are and how they intend to address them.

In Section 2, it is necessary to insert an architectural scheme or a flow diagram of the proposed BN model. The model is described in an unstructured way; I recommend that the authors revise the description to highlight the sequence of the individual steps of the model.

The results in Table 6 show that the BN does not perform better than the other three methods, only comparably. Furthermore, the mean AUC obtained with the BN is lower than that obtained with the other three methods, and its standard deviation (0.056) is the highest. Therefore, a thorough discussion to justify these results is necessary.

The concluding section should include a brief discussion of possible future research perspectives.

Author Response

First of all, we would like to thank the reviewer for their considerate and valuable feedback.

Comment 1: Although the research goals are clearly expressed, there is a lack of discussion on what are the critical aspects and limitations of the current models of study of the recurrence of anxiety symptomatology. The authors should describe in the introductory section what these limitations are and how they intend to address them.

RESPONSE: We have addressed this point by expanding on the current models used to study the recurrence of anxiety symptomatology. Specifically, we described their techniques and discussed their limitations in the paragraphs from lines 34 to 55. Additionally, we provided a more detailed explanation of how we intend to address these issues in the paragraphs from lines 66 to 83.

Comment 2: In Section 2, it is necessary to insert an architectural scheme or a flow diagram of the proposed BN model. The model is described in an unstructured way; I recommend that the authors revise the description to highlight the sequence of the individual steps of the model.
RESPONSE: Thank you for your helpful suggestion. We have addressed this by adding a flowchart (Figure 1) that outlines the main steps in developing the methodology. Additionally, we have reorganised the main methods section into subsections corresponding to the flowchart, improving the overall structure and readability.

Comment 3: The results in Table 6 show that the BN does not perform better than the other three methods, only comparably. Furthermore, the mean AUC obtained with the BN is lower than that obtained with the other three methods, and its standard deviation (0.056) is the highest. Therefore, a thorough discussion to justify these results is necessary.
RESPONSE: We have addressed this point in the discussion section, beginning in the paragraph on line 443. Additionally, in response to another reviewer’s comments, we included an additional machine learning model and expanded the comparison to highlight the advantages of the proposed model over the machine learning models.

Comment 4: The concluding section should include a brief discussion of possible future research perspectives.
RESPONSE: We have added a new Section 7 (Future Work), where we outline future research directions involving multiple mental health issues. In response to recommendations provided by reviewer 4, this section discusses the consideration of comorbidities using an extended methodology of the Bayesian network known as the object-oriented Bayesian network—a modular technique that integrates and unifies different systems within a single network.

 

Reviewer 3 Report

Comments and Suggestions for Authors

This study aimed to model the recurrence of anxiety symptoms in an older population over a five-month period. The data comprised baseline socio-demographic and general health information from adults aged 60 years or older who exhibited at least mild depressive symptoms. A Bayesian network model was employed to examine the relationship between baseline data and the recurrence of anxiety symptoms. The main issues with the paper are as follows:

1. The primary theoretical tool used in this study is the Bayesian approach. However, the abstract only briefly mentions that "understanding the factors influencing its recurrence is important for improved management," which does not clearly convey the research motivation.

2. The introduction is too simplistic and lacks a detailed literature review, particularly regarding the development and research motivation for Bayesian networks, machine learning, and anxiety.

3. In Section 3, statistical analyses such as F-tests and N-tests could be included based on the comparative analysis results.

4. The structure of the paper could be more concise, for example, by combining the last two sections.

5. The references do not include recent works from the past two years, which diminishes the quality of the literature review.

Comments on the Quality of English Language

Minor editing of English language required.

Author Response

First of all, we would like to thank the reviewer for their considerate and valuable feedback.

1. The primary theoretical tool used in this study is the Bayesian approach. However, the abstract only briefly mentions that "understanding the factors influencing its recurrence is important for improved management," which does not clearly convey the research motivation.
RESPONSE: We have addressed this by including in the abstract the last two sentences in bold: “Anxiety in older individuals is understudied despite its prevalence. Investigating its occurrence can be challenging, yet understanding the factors influencing its recurrence is important. Gaining insights into these factors through an explainable, probabilistic approach can enhance improved management. A Bayesian network (BN) is well-suited for this purpose” (lines 3-4).

2. The introduction is too simplistic and lacks a detailed literature review, particularly regarding the development and research motivation for Bayesian networks, machine learning, and anxiety.
RESPONSE: Thank you for this suggestion. We have addressed it by expanding the research motivation in the Introduction section (lines 66-83) and incorporating 10 new references from 2024, which also address point 5. Additionally, we have elaborated on the techniques used in anxiety studies, discussed their limitations, and included more references (lines 34-55).

3. In Section 3, statistical analyses such as F-tests and N-tests could be included based on the comparative analysis results.
RESPONSE: Regarding this point, our assumption is that the F-test is related to variance. We apologise for our lack of knowledge about the N-test; despite our efforts to find relevant information through searches, we did not find anything useful related to our work.

That said, our understanding is that these approaches aim to derive p-values for comparing model fits, typically applied to nested models. In such cases, models are compared by excluding certain variables in one model that are included in another, within the same framework (e.g., logistic regression).

In contrast, our study involves evaluating AUCs from different models that are not necessarily nested but employ various fitting techniques on the same dataset. Consequently, instead of focusing on p-values from these tests, we prefer to emphasise the magnitude of differences in AUCs observed in both training and test datasets. We present information about variability through standard deviations to support our analysis.
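As an illustration of the comparison described above, the following minimal sketch computes an AUC for each test fold and summarises each model by its mean and standard deviation. The fold data and the model names `model_a` and `model_b` are hypothetical placeholders, not results from the manuscript, and the helper functions `auc` and `summarise` are assumptions introduced for illustration only.

```python
from statistics import mean, stdev


def auc(labels, scores):
    """Area under the ROC curve via the pairwise (Mann-Whitney) formulation.

    Counts the fraction of positive/negative pairs in which the positive
    case receives the higher score; ties count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


def summarise(fold_aucs):
    """Return (mean, sample standard deviation) of per-fold AUCs.

    Using the sample standard deviation is an assumption of this sketch.
    """
    return mean(fold_aucs), stdev(fold_aucs)


# Hypothetical per-fold (labels, predicted scores) for two non-nested models.
folds = {
    "model_a": [
        ([1, 0, 1, 0], [0.9, 0.2, 0.6, 0.4]),
        ([1, 1, 0, 0], [0.8, 0.3, 0.5, 0.1]),
    ],
    "model_b": [
        ([1, 0, 1, 0], [0.7, 0.6, 0.5, 0.4]),
        ([1, 1, 0, 0], [0.9, 0.6, 0.5, 0.2]),
    ],
}

for name, fold_data in folds.items():
    per_fold = [auc(labels, scores) for labels, scores in fold_data]
    m, s = summarise(per_fold)
    print(f"{name}: mean AUC = {m:.3f}, SD = {s:.3f}")
```

Because the models are evaluated on the same folds, the magnitude of the difference in mean AUC can be read alongside the fold-to-fold spread without assuming that the models are nested.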


4. The structure of the paper could be more concise, for example, by combining the last two sections.
RESPONSE: We have addressed this point by merging and summarising the two subsections in the Discussion section. Additionally, we have included a flowchart (Figure 1) in Section 2.3 that provides an overview of the methodology. This flowchart relates to newly grouped subsections, improving the overall structure and readability.

5. The references do not include recent works from the past two years, which diminishes the quality of the literature review.
RESPONSE: We acknowledge this was a weakness and thank the reviewer for highlighting this. As discussed in point 2, we have added 15 additional references, including 10 from 2024.

Reviewer 4 Report

Comments and Suggestions for Authors

The article is dedicated to the practical application of Bayesian networks. The topic of the article is relevant. The structure of the article conforms to the format accepted by MDPI for research papers (Introduction (including analysis of analogs), Models and Methods, Results, Discussion, Conclusions). The level of English is acceptable. The article is easy to read. The figures in the article are of acceptable quality. The article cites 27 sources, many of which are outdated. The References section is poorly formatted.

The following comments and recommendations can be made about the article:

1. The Support Vector Machine (SVM) method shows good results compared to other algorithms with small training datasets and is also used in analog circuit fault diagnosis with wavelet transform as a preprocessor, achieving high classification accuracy. It is one of the most popular learning methods applied to classification and regression tasks. The Relevance Vector Machine (RVM) method can also be noted. Unlike SVM, this method provides probabilities for the object's membership to a particular class. For example, if SVM says "x belongs to class A," RVM would say "x belongs to class A with probability p and to class B with probability 1-p." Why not apply RVM to the problem? It would be a more modern approach compared to the Bayesian network.

2. Decision trees are visually more intuitive, simpler, and easier for engineers to understand and interpret. Unlike other classification methods, decision tree classifiers allow for root cause analysis based on data; one can trace the path from the end state to the initiation, following the sequence and chronology of event relationships. Decision trees are very robust to noisy and incomplete data. However, they require pruning parameters to reduce the risk of overfitting. This approach is simpler and more understandable than the one used by the authors and is more suitable for solving the relatively simple problem presented in the article.

3. Object-oriented Bayesian networks provide an approach for achieving a hierarchical model representation, with each level corresponding to an abstraction level, showing encapsulated nodes for the current object level. This approach reduces the complexity of constructing BBNs and increases the likelihood of model reuse. The methodology for real-time diagnosis of complex systems with recurring structures is proposed using an object-oriented Bayesian network. For a system with a specific situation, the operator can input some known experience information into additional informational levels of subnetworks for additional information and common cause failures. This BBN variant is suitable for describing the problem studied by the authors and is less commonly used. Overall, in this version, the scientific novelty of the article is minimal.

Author Response

First of all, we would like to thank the reviewer for their considerate and valuable feedback.

Comment: The article cites 27 sources, many of which are outdated. The References section is poorly formatted.

RESPONSE: We have added 15 additional references, including 10 from 2024. The references are formatted using the LaTeX template provided, and hence we assume they are in the required format or can be changed to the required format during final editing.

The following comments and recommendations can be made about the article:
1. The Support Vector Machine (SVM) method shows good results compared to other algorithms with small training datasets and is also used in analog circuit fault diagnosis with wavelet transform as a preprocessor, achieving high classification accuracy. It is one of the most popular learning methods applied to classification and regression tasks. The Relevance Vector Machine (RVM) method can also be noted. Unlike SVM, this method provides probabilities for the object's membership to a particular class. For example, if SVM says "x belongs to class A," RVM would say "x belongs to class A with probability p and to class B with probability 1-p." Why not apply RVM to the problem? It would be a more modern approach compared to the Bayesian network.
RESPONSE: We thank the reviewer for their interesting suggestion. We acknowledge the important role of SVMs in dealing with small datasets and have indeed used SVMs in previous work ("A Machine Learning approach to optimize the assessment of depressive symptomatology", ScienceDirect).

RVMs are an appealing extension of SVMs that indeed result in sparsity. However, this sparsity is in the space of basis functions, which are transformations of the input data. Hence, it yields sparsity in the input data points considered in the model output, not sparsity in the features considered in the model output, which is what interpretability requires. For this reason, we had not initially considered RVMs a suitable approach in our particular application.
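The distinction between sparsity over data points and sparsity over features can be made concrete with the standard RVM formulation. The following is a sketch in assumed notation: the symbols N, D, K, w_n, and alpha_n are introduced here for illustration and do not come from the manuscript.

```latex
% Assumed notation: N training points x_n in R^D, kernel K, weights w_n,
% and one precision hyperparameter alpha_n per weight.
y(\mathbf{x}; \mathbf{w}) = w_0 + \sum_{n=1}^{N} w_n \, K(\mathbf{x}, \mathbf{x}_n),
\qquad
p(\mathbf{w} \mid \boldsymbol{\alpha}) = \prod_{n=0}^{N} \mathcal{N}\!\left(w_n \mid 0, \alpha_n^{-1}\right)
```

When a hyperparameter alpha_n grows large during training, the corresponding weight w_n is driven to zero and the training point x_n drops out of the sum. Each retained kernel term K(x, x_n) still generally depends on all D input features, so pruning weights does not by itself remove features.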

However, we implemented an RVM in our work and added it as another ML comparison with our approach (discussed in Section 2.6.3, lines 253-261, and Section 4, lines 443-461). We understand the reviewer’s request for more modern approaches. The use of BNs falls within the wider field of probabilistic machine learning (of which RVMs are a very interesting example), and our use of BNs demonstrates that more traditional tools often provide comparable performance to popular state-of-the-art algorithms such as SGD and XGBoost.


2. Decision trees are visually more intuitive, simpler, and easier for engineers to understand and interpret. Unlike other classification methods, decision tree classifiers allow for root cause analysis based on data; one can trace the path from the end state to the initiation, following the sequence and chronology of event relationships. Decision trees are very robust to noisy and incomplete data. However, they require pruning parameters to reduce the risk of overfitting. This approach is simpler and more understandable than the one used by the authors and is more suitable for solving the relatively simple problem presented in the article.
RESPONSE: Whilst decision trees indeed provide a visually intuitive representation of the classification problem at hand, overfitting cannot always be prevented. This has led to the popularity of random forests, but their explainability is reduced due to the use of many decision trees in parallel.

However, a more important reason for us not to focus on decision trees as a structure to increase explainability is their inability to capture uncertainty in a probabilistically meaningful way. The latter is an important reason for us to use Bayesian networks instead.

We have also expanded the discussion on model comparison, emphasizing the advantages of Bayesian networks for investigating key factors related to recurrent anxiety (lines 443-474). Specifically, Bayesian networks provide simultaneous benefits: they offer explainability and probabilistic reasoning, capture complex interrelationships between features, and do not require the assumption of minimal multicollinearity between predictors. These combined features make Bayesian networks a compelling choice for our research.

3. Object-oriented Bayesian networks provide an approach for achieving a hierarchical model representation, with each level corresponding to an abstraction level, showing encapsulated nodes for the current object level. This approach reduces the complexity of constructing BBNs and increases the likelihood of model reuse. The methodology for real-time diagnosis of complex systems with recurring structures is proposed using an object-oriented Bayesian network. For a system with a specific situation, the operator can input some known experience information into additional informational levels of subnetworks for additional information and common cause failures. This BBN variant is suitable for describing the problem studied by the authors and is less commonly used. Overall, in this version, the scientific novelty of the article is minimal.
RESPONSE: Whilst the use of object-oriented Bayesian networks is an interesting suggestion and worthy of further exploration, we feel it is not in scope for the current paper. Although the dataset contains 122 features, only 9 of these were related to the outcome of interest, thus significantly limiting the opportunity for hierarchical models. However, integrating object-oriented Bayesian networks fits really well within the wider scope of our research. We propose that we add this suggestion to the future work section within the wider context of combining separate Bayesian networks for different purposes (e.g. identifying different mental health issues) into one object-oriented Bayesian network.

The scientific novelty of our paper lies in identifying the key factors influencing the recurrence of anxiety disorders through the use of probabilistic measures provided by an explainable model. This model offers clinicians valuable insights, enabling them to prioritise actions to prevent relapse. Our approach, which involves learning the network structure using bootstrap and constraint-based methods, is unique and, to our knowledge, has not been reported before. Furthermore, our method for ranking the importance of predictors in this population is a significant contribution to the field.

Upon reflection, we feel we have insufficiently highlighted this as a novel aspect of the paper. To address this, we have included a Contributions section (lines 476-489) in the manuscript. We thank the reviewer for highlighting novelty as a concern. 

Round 2

Reviewer 2 Report

Comments and Suggestions for Authors

The authors have taken into account all my suggestions. I consider this paper publishable in the current version.

Reviewer 4 Report

Comments and Suggestions for Authors

I have formulated the following comments on the previous version of the article:

1. The Support Vector Machine (SVM) method shows good results compared to other algorithms with small training datasets and is also used in analog circuit fault diagnosis with wavelet transform as a preprocessor, achieving high classification accuracy. It is one of the most popular learning methods applied to classification and regression tasks. The Relevance Vector Machine (RVM) method can also be noted. Unlike SVM, this method provides probabilities for the object's membership to a particular class. For example, if SVM says "x belongs to class A," RVM would say "x belongs to class A with probability p and to class B with probability 1-p." Why not apply RVM to the problem? It would be a more modern approach compared to the Bayesian network.

2. Decision trees are visually more intuitive, simpler, and easier for engineers to understand and interpret. Unlike other classification methods, decision tree classifiers allow for root cause analysis based on data; one can trace the path from the end state to the initiation, following the sequence and chronology of event relationships. Decision trees are very robust to noisy and incomplete data. However, they require pruning parameters to reduce the risk of overfitting. This approach is simpler and more understandable than the one used by the authors and is more suitable for solving the relatively simple problem presented in the article.

3. Object-oriented Bayesian networks provide an approach for achieving a hierarchical model representation, with each level corresponding to an abstraction level, showing encapsulated nodes for the current object level. This approach reduces the complexity of constructing BBNs and increases the likelihood of model reuse. The methodology for real-time diagnosis of complex systems with recurring structures is proposed using an object-oriented Bayesian network. For a system with a specific situation, the operator can input some known experience information into additional informational levels of subnetworks for additional information and common cause failures. This BBN variant is suitable for describing the problem studied by the authors and is less commonly used. Overall, in this version, the scientific novelty of the article is minimal.

The authors have addressed all my comments. I found their responses quite convincing. I support the publication of the current version of the article. I wish the authors creative success.
