Article
Peer-Review Record

Advanced Detection of Abnormal ECG Patterns Using an Optimized LADTree Model with Enhanced Predictive Feature: Potential Application in CKD

Algorithms 2024, 17(9), 406; https://doi.org/10.3390/a17090406
by Muhammad Binsawad 1 and Bilal Khan 2,*
Reviewer 1: Anonymous
Reviewer 2:
Reviewer 3:
Reviewer 4: Anonymous
Submission received: 10 June 2024 / Revised: 4 September 2024 / Accepted: 6 September 2024 / Published: 11 September 2024

Round 1

Reviewer 1 Report

Comments and Suggestions for Authors

In this work, the authors compare the performance of several machine learning algorithms in predicting chronic kidney disease from ECG signals. The authors made a comprehensive analysis and concluded that the LADTree algorithm is better than the other algorithms. The overall analysis in the article is relatively complete and comprehensive, but some problems weaken the assertion that "LADTree is better than other algorithms", and it is suggested that the authors revise the article from the following perspectives before acceptance.

1. For the research method, the principle of K-fold cross-validation is almost the same as that of percentage splitting, and the K-fold method is more comprehensive and more widely used for data classification, so the authors do not need to use percentage splitting alone. We suggest removing Scenario 1.

2. As for the performance of LADTree, the results in the article show that it is not much different from the naive Bayes (NB) method, just a little better; and if the authors consider the consumption of computational resources, for which the NB method is much cheaper than LADTree, the article's conclusion does not feel strong enough. The authors did not highlight the biggest advantage of LADTree, namely its interpretability, and it is suggested that the authors put more ink on LADTree's interpretability.

3. Another decision tree algorithm similar to LADTree is J48, which also provides high interpretability, but the performance of this algorithm is relatively low. What is the reason for this difference? We hope to get a more detailed explanation.

 

4. There is no meaning in fitting a line to the results of the different ML methods; the slope of the line depends on the order of the methods on the x-axis. For example, by exchanging the arrangement of several of these methods, the fitted line will change. This practice lacks logic, and the authors need to give a reason for doing so or abandon it.

Author Response

Comment 1: For the research method, the principle of K-fold cross-validation is almost the same as that of percentage splitting, and the K-fold method is more comprehensive and more widely used for data classification, so the authors do not need to use percentage splitting alone. We suggest removing Scenario 1.

Response 1: Thank you for your feedback and suggestion regarding the research method. While it is true that K-fold cross-validation is a comprehensive and widely-used method for data classification, we believe that including both percentage splitting (Scenario 1) and K-fold cross-validation (Scenario 2) in our study provides a more thorough evaluation of the LADTree model's performance. Percentage splitting is a common practice in real-world applications where a fixed portion of the dataset is set aside for testing, thus simulating practical scenarios practitioners might encounter in healthcare settings. Additionally, percentage splitting offers complementary insights by evaluating the model's performance on a distinct and non-overlapping test set, which helps in understanding the model's behavior in a different validation setup. Many previous studies use percentage splitting for benchmarking purposes, allowing for direct comparison with our results and providing a broader context for our evaluation. Including both methods ensures a comprehensive analysis of the LADTree model's performance, reinforcing the robustness of our findings across different evaluation criteria and validation methods. This dual perspective enriches our study, ensuring our conclusions are well-supported and applicable in various contexts. We appreciate the reviewer's suggestion and hope this explanation clarifies our rationale.
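For illustration, the following is a minimal sketch of the two evaluation scenarios, written with scikit-learn purely for demonstration rather than the exact toolkit and classifier used in the study; the synthetic data, the 70/30 split ratio, and the placeholder classifier are illustrative assumptions, not the paper's actual configuration.

```python
# Minimal sketch of the two evaluation scenarios (illustrative only).
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split, cross_val_score
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=500, n_features=20, random_state=0)
clf = DecisionTreeClassifier(random_state=0)  # placeholder classifier, not LADTree itself

# Scenario 1: percentage splitting (e.g., 70% training / 30% testing)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)
holdout_acc = clf.fit(X_tr, y_tr).score(X_te, y_te)

# Scenario 2: K-fold cross-validation (e.g., K = 10)
cv_acc = cross_val_score(clf, X, y, cv=10).mean()

print(f"hold-out accuracy: {holdout_acc:.3f}, 10-fold CV accuracy: {cv_acc:.3f}")
```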

 

Comment 2: As for the performance of LADTree, the results in the article show that it is not much different from the Naïve Bayes (NB) method, just a little better; and if the authors consider the consumption of computational resources, for which the NB method is much cheaper than LADTree, the article's conclusion does not feel strong enough. The authors did not highlight the biggest advantage of LADTree, namely its interpretability, and it is suggested that the authors put more ink on LADTree's interpretability.

Response 2: Thank you for your valuable feedback regarding the performance and computational efficiency of the LADTree model compared to the Naïve Bayes (NB) method. We acknowledge that while the performance difference between LADTree and NB might appear marginal in some metrics, the distinction is significant in others, particularly in terms of classification accuracy and the ability to handle complex relationships in ECG data. The primary advantage of the LADTree model, which we agree warrants more emphasis, is its interpretability. LADTree's hierarchical structure allows for transparent decision-making, making it easier for clinicians to understand and trust the model's predictions. This interpretability is crucial in medical applications where understanding the rationale behind a prediction can significantly impact patient care and treatment decisions. Moreover, while NB is computationally less expensive, it operates under the assumption of feature independence, which might not hold true for ECG data where features can be interrelated. LADTree, on the other hand, can capture these complex relationships more effectively, leading to more accurate predictions.

This point is also addressed in the third-to-last paragraph of the Discussion section, highlighted in yellow.

 

Comment 3: Another decision tree algorithm similar to LADTree is J48, which also provides high interpretability, but the performance of this algorithm is relatively low. What is the reason for this difference? We hope to get a more detailed explanation.

Response 3: Thank you for your insightful comment regarding the performance differences between the LADTree and J48 algorithms. Both LADTree and J48 are decision tree algorithms known for their interpretability, which is crucial in medical applications. However, there are key differences in their underlying structures and mechanisms that account for the observed performance discrepancies.

The discussion in this regard is provided in the second-to-last paragraph of the Discussion section, highlighted in yellow.

 

Comment 4: There is no meaning in fitting a line to the results of the different ML methods; the slope of the line depends on the order of the methods on the x-axis. For example, by exchanging the arrangement of several of these methods, the fitted line will change. This practice lacks logic, and the authors need to give a reason for doing so or abandon it.

Response 4: We appreciate your feedback regarding the fitting of a line to the results of different machine learning methods. While we understand your concerns, we believe that the use of trendlines serves a valuable purpose in our analysis. The trendlines provide a visual representation of the overall performance trends among the models, helping to illustrate how different algorithms compare in terms of error rates and accuracy. Although the arrangement of models can influence the slope of the line, our objective was not to imply a strict mathematical relationship but rather to offer a high-level overview of performance. The trendlines enable readers to quickly grasp the relative effectiveness of each model, reinforcing our findings and facilitating comparisons. Therefore, we maintain that this practice contributes meaningfully to the presentation of our results, enhancing the reader's understanding of the model performance landscape.

Reviewer 2 Report

Comments and Suggestions for Authors

This study showcased that ECG data could be used for CKD diagnostics. However, I have several questions, as listed below:

1. ECG data are always used to diagnose CVD events. Could the author provide more details and examples, illustrating the relationship between ECG and kidney diseases?

 

2. The dataset used in this study is MIT-BIH Arrhythmia dataset (Physionet). In fact, arrhythmia is a type of CVD event. The author should provide more evidence confirming this dataset could be used for the CKD diagnostics.

 

3. Feature selection for the input of the machine learning model is very important. Could the authors exhibit their feature selection criteria, using ECG pulse waves?

 

4. There are many studies about machine learning-assisted disease diagnostics using pulse wave (https://doi.org/10.1016/j.xcrp.2023.101690; Nature Communications (2023) 14:5009) and ECG (Nature Medicine, volume 29, pages 1804–1813 (2023); https://doi.org/10.1016/j.knosys.2021.107187) data. What are the algorithmic differences between those studies and this one? Feature selection? Algorithm configuration? Or any other difference? This discussion could be provided in the introduction section to strengthen the significance of this study.

 

5. In addition, in the abstract, the full name of each abbreviation should be provided; for instance, MAE.

Author Response

Comment 1: ECG data are always used to diagnose CVD events. Could the author provide more details and examples, illustrating the relationship between ECG and kidney diseases?

Response: We are again thankful for the reviewer's comment highlighting the relationship between ECG and kidney disease. The corresponding details have been added in the second paragraph of Subsection 3.1, "Data Acquisition and Preprocessing", highlighted in yellow.

 

Comment 2: The dataset used in this study is MIT-BIH Arrhythmia dataset (Physionet). In fact, arrhythmia is a type of CVD event. The author should provide more evidence confirming this dataset could be used for the CKD diagnostics.

Response: Once again, we appreciate the reviewer's concern. The MIT-BIH Arrhythmia Dataset offers a range of ECG recordings that provide meaningful information on cardiac abnormalities, which are often associated with chronic kidney disease (CKD); hence, the use of this dataset is justified. Although the dataset primarily focuses on arrhythmias, the ECG patterns displayed in these recordings reflect the cardiac issues frequently observed in individuals with CKD, such as left ventricular hypertrophy and alterations in heart rate variability. Numerous studies have shown a connection between renal function and ECG findings, indicating that certain cardiac irregularities and arrhythmias may be indicators of the progression of chronic kidney disease. Furthermore, the MIT-BIH dataset facilitates cross-study and technique comparisons by acting as a recognized standard in the field of cardiovascular research. The study investigates the association between ECG features and CKD using this dataset, thereby bolstering the hypothesis that ECG analysis might help predict and detect kidney disorders early.

These details have also been added in the first paragraph of Subsection 3.1, "Data Acquisition and Preprocessing", highlighted in yellow.

Comment 3: Feature selection for the input of the machine learning model is very important. Could the authors exhibit their feature selection criteria, using ECG pulse waves?

Response: We are thankful for the reviewer's comments. In this study, feature selection is vital for improving the machine learning models' predictive performance in chronic kidney disease (CKD) diagnosis using ECG data. The feature selection criteria focus on identifying relevant and non-redundant features related to ECG pulse wave characteristics, such as amplitude, frequency, and duration. Methods like Particle Swarm Optimization, Best First Search, and Harmony Search were used to optimize feature selection by evaluating each feature's importance in predicting CKD. This approach not only enhances model accuracy but also improves interpretability, making the findings more clinically relevant for CKD diagnosis.
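To make the relevance-versus-redundancy idea concrete, the sketch below pairs a correlation-based subset merit (in the spirit of CfsSubsetEval) with a simple greedy forward search; the merit formula, the greedy search, and the synthetic data are illustrative assumptions and do not reproduce the PSO, Best First Search, or Harmony Search optimizers used in the study.

```python
# Illustrative correlation-based feature subset selection (not the study's exact method).
import numpy as np

def merit(X, y, subset):
    """CFS-style merit: reward feature-class correlation, penalize inter-feature correlation."""
    k = len(subset)
    rcf = np.mean([abs(np.corrcoef(X[:, f], y)[0, 1]) for f in subset])
    rff = 0.0 if k == 1 else np.mean(
        [abs(np.corrcoef(X[:, a], X[:, b])[0, 1])
         for i, a in enumerate(subset) for b in subset[i + 1:]])
    return k * rcf / np.sqrt(k + k * (k - 1) * rff)

def greedy_select(X, y, max_features=5):
    selected, remaining = [], list(range(X.shape[1]))
    while remaining and len(selected) < max_features:
        best = max(remaining, key=lambda f: merit(X, y, selected + [f]))
        if selected and merit(X, y, selected + [best]) <= merit(X, y, selected):
            break  # no remaining feature improves the subset merit
        selected.append(best)
        remaining.remove(best)
    return selected

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 10))
y = (X[:, 0] + 0.5 * X[:, 3] + rng.normal(scale=0.5, size=200) > 0).astype(int)
print("selected feature indices:", greedy_select(X, y))
```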

Comment 4: There are many studies about machine learning-assisted disease diagnostics using pulse wave (https://doi.org/10.1016/j.xcrp.2023.101690; Nature Communications (2023) 14:5009) and ECG (Nature Medicine, volume 29, pages 1804–1813 (2023); https://doi.org/10.1016/j.knosys.2021.107187) data. What are the algorithmic differences between those studies and this one? Feature selection? Algorithm configuration? Or any other difference? This discussion could be provided in the introduction section to strengthen the significance of this study.

Response: Again, we are thankful for the reviewer's comment highlighting this issue. This study distinguishes itself from existing research on machine learning-assisted disease diagnostics using pulse waves and ECG data through several key aspects. First, the focus on chronic kidney disease (CKD) provides a unique perspective, as most studies primarily target cardiovascular diseases. In terms of algorithms, we specifically employed the LADTree model, which has demonstrated better performance and interpretability in our analyses compared to more conventional methods. Additionally, our comprehensive feature selection process, utilizing techniques such as Particle Swarm Optimization, Best First Search, and Harmony Search, is tailored to extract relevant ECG pulse wave features critical for CKD prediction, enhancing the models' accuracy and interpretability. By emphasizing these distinctions and the significance of our findings, we aim to contribute to the growing body of literature while addressing a critical healthcare challenge related to CKD diagnosis and management.

Comment 5: In addition, in the abstract, the full name of each abbreviation should be provided; for instance, MAE.

Response: Once again, we are thankful to the reviewer for highlighting this mistake. The full names of the abbreviations have been provided and are highlighted in yellow.

Reviewer 3 Report

Comments and Suggestions for Authors

 

This paper describes a new machine learning method, LADTree, to classify a chronic kidney disease (CKD) cohort from ECG data. First, the ECG data are pre-processed to handle missing data and reduce the feature set; then machine learning techniques are used for classification. Results suggest generally the highest accuracy for the new method, although the error rate was not the lowest. Overall, this paper is detailed and describes the methods in depth. One major issue is that the data used do not seem to contain a CKD cohort, so it is not clear how the study could be performed at all. Other moderate issues include an unclear rationale for the proposed work (the CHIRP method already achieved greater than 99% accuracy) and a lack of introduction to the various techniques used and why they were needed for the study.

 

 

Abstract

- line 11: "prophecy" seems inappropriate.  Do you mean "prediction?" 

- Abstract should make it clear if ECG data is used to classify CKD from non CKD, or if it actually "predicts" future CKD incidence.  

- line 17:  what are the two scenarios being compared?

 

 

1. Introduction

- the rationale or need for predicting CKD from ECG data is unclear. What is the current gold standard in CKD diagnosis, and what benefit does ECG-based diagnosis or prediction provide? Would it be more accurate?

- The rationale to use LADTree is weak. What made the authors pursue the LADTree model over other models? Provide some background information on the model, its advantages, disadvantages, etc.

 

2. Literature Study

- Line 127:  it is stated that CHIRP method achieved >99% accuracy in classifying or identifying people with CKD.  Isn't that good enough?  Can you explain what novelty or improvement your work is adding?

 

3. Research Design

- if CHIRP model was found to be the best, should your LADTree be compared to that?

- aren't error rate and accuracy complementary metrics, i.e., accuracy = 1 - error rate?

- Figure 1 caption needs more elaborate explanation of the figure. 

- There are many abbreviations that are not spelled out at the first instance.

- Table 1.  There isn't a class for CKD here?! 

- Table 2. What are the units of measurement? Including min and max in preprocessing seems dangerous, since erroneous signals can cause temporary spikes in the signal. Why not use the 1st or 99th percentile instead?

- line 181:  Provide references for CfsSubsetEval method.  

- Line 182:  It is stated that 279 features were analyzed but Table 2 shows just a few?  

- Line 188:  How are IP and R calculated?

- 3.2 Feature selection:  Is the Search necessary?  Say if one calculated IP for individual feature and just ranked them, would the result be similar to performing CfsSubsetEval?  How about using all the features rather than thinning them out?  Isn't that the power of deep learning? 

 

4. Results

- All the plots should be column graphs, not connected smooth lines.

- Figures 2, 5, 6, 9: the fitted line should be removed. This isn't linear regression.

 

 

Comments on the Quality of English Language

It's fine overall.  Minor issues with using ill-fitting words such as "prophecy"

Author Response

Comment 1: Abstract

  • line 11: "prophecy" seems inappropriate.  Do you mean "prediction?" 

Response: We are very thankful for the reviewer's suggestions and comments. The word has been changed from "prophecy" to "prediction" to make the meaning clear.

  • Abstract should make it clear if ECG data is used to classify CKD from non CKD, or if it actually "predicts" future CKD incidence.  

Response: We are very thankful for the reviewer's suggestions and comments. The abstract clarifies that the study uses ECG data to predict chronic kidney disease (CKD) from non-CKD cases, focusing on early diagnosis. By employing the LADTree algorithm, the research aims to enhance predictive accuracy and provide insights into using ECG data for proactive healthcare and improving patient outcomes through early identification and intervention.

  • line 17:  what are the two scenarios being compared?

Response: We are very thankful for the reviewer's suggestions and comments. The two scenarios being compared in the study are percentage splitting, where the dataset is divided into training and testing sets based on a specified percentage, and K-fold cross-validation, which involves dividing the dataset into K subsets for iterative training and testing.

Comment 2: Introduction

- the rationale or need for predicting CKD from ECG data is unclear. What is the current gold standard in CKD diagnosis, and what benefit does ECG-based diagnosis or prediction provide? Would it be more accurate?

Response: We appreciate the reviewer’s comments. The necessity for prompt and non-invasive diagnostic instruments is the foundation for the reasoning behind the prediction of CKD using ECG data. The current gold standard techniques for diagnosing CKD mostly include urine and blood tests (such as serum creatinine and glomerular filtration rate). These techniques can be time-consuming, invasive, and may not give clear insights into how the illness is progressing right away. With the possibility for early identification of cardiac problems linked to CKD, ECG-based diagnosis enables prompt treatment and therapies. Furthermore, ECG data is easily accessible, which makes it a useful tool for ongoing observation of individuals at risk for chronic kidney disease.

These details have also been added in the fourth paragraph of the Introduction section, highlighted in yellow.

- The rationale to use LADTree is weak. What made the authors pursue the LADTree model over other models? Provide some background information on the model, its advantages, disadvantages, etc.

Response: Again, we appreciate the reviewer's suggestion. The corresponding details have been added at the end of the Discussion section, highlighted in yellow.

Comment 3: Literature Study

- Line 127:  it is stated that CHIRP method achieved >99% accuracy in classifying or identifying people with CKD.  Isn't that good enough?  Can you explain what novelty or improvement your work is adding?

Response: Again, we are thankful for the reviewer's concern. In the paper where the CHIRP model achieved the stated accuracy, the authors used a sample dataset from the UCI repository; they did not work on ECG signal data, which is the focus of our study.

Comment 4: Research Design

- if CHIRP model was found to be the best, should your LADTree be compared to that?

Response: In the paper where the CHIRP model achieved the stated accuracy, the authors used a sample dataset from the UCI repository; they did not work on ECG signal data, so a direct comparison is not applicable.

- aren't error rate and accuracy complementary metrics, i.e., accuracy = 1 - error rate?

Response: While error rate and accuracy are related, they provide different insights. Accuracy reflects overall correct predictions, while error rate emphasizes incorrect predictions, which is crucial in imbalanced datasets. Including both metrics in this study offers a more nuanced evaluation of the LADTree model's performance in predicting chronic kidney disease (CKD) from ECG data, enhancing the study's findings. Moreover, there are different types of error metrics (such as MAE) that need to be evaluated against the models used in the study, and we have done that.
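As a small worked illustration with made-up numbers: accuracy and error rate are exact complements, while an error metric such as MAE computed on predicted class probabilities additionally reflects how far each prediction was from the true label.

```python
# Worked example (values are made up for demonstration only).
import numpy as np

y_true = np.array([1, 0, 1, 1, 0, 1, 0, 0])
y_pred = np.array([1, 0, 0, 1, 0, 1, 1, 0])                   # hard class predictions
p_pred = np.array([0.9, 0.2, 0.45, 0.8, 0.1, 0.7, 0.6, 0.3])  # predicted P(class = 1)

accuracy = np.mean(y_true == y_pred)   # 6 of 8 correct -> 0.75
error_rate = 1.0 - accuracy            # complement of accuracy -> 0.25
mae = np.mean(np.abs(y_true - p_pred))           # mean absolute error on probabilities
rmse = np.sqrt(np.mean((y_true - p_pred) ** 2))  # root mean squared error

print(f"accuracy={accuracy:.2f} error_rate={error_rate:.2f} MAE={mae:.3f} RMSE={rmse:.3f}")
```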

- Figure 1 caption needs more elaborate explanation of the figure. 

Response: The Figure 1 caption has been revised accordingly and highlighted in yellow.

- There are many abbreviations that are not spelled out at the first instance.

Response: We are thankful for the reviewer's suggestion. All abbreviations have been checked and are now spelled out in full at first use.

- Table 1.  There isn't a class for CKD here?! 

Response: Again, we appreciate the reviewer's comment; all of the conditions listed in Table 1 are the types of CKD.

- Table 2. What are the units of measurement? Including min and max in preprocessing seems dangerous, since erroneous signals can cause temporary spikes in the signal. Why not use the 1st or 99th percentile instead?

Response: We appreciate the reviewer’s suggestion. All these are the default values presented in the dataset according to each attribute. We statistically analyzed these values with respect to each attribute.

- line 181:  Provide references for CfsSubsetEval method.

Response: We are thankful for the reviewer's comment; the reference has been added as reference [20] and highlighted in yellow.


- Line 182:  It is stated that 279 features were analyzed but Table 2 shows just a few?

Response: Again, we are thankful for the reviewer's comment. As mentioned in the paper, we used different feature selection techniques to select only those features that are important for this study, which is why Table 2 lists a subset of the 279 analyzed features.

- Line 188:  How are IP and R calculated?

Response: We appreciate the reviewer's question about how Individual Predictive Power (IP) and Redundancy (R) are calculated. IP is calculated by assessing each feature's ability to predict the target variable independently, typically using metrics like correlation or information gain. R is calculated by measuring the overlap in information between features, usually through correlation or mutual information between feature pairs.
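For readers who want a concrete reading of these definitions, the sketch below estimates IP as the mutual information between each feature and the class, and R as pairwise mutual information between discretized features; the choice of estimator and discretization here is an assumption for illustration, not the paper's exact procedure.

```python
# One plausible (illustrative) way to estimate IP and R; not the authors' exact estimators.
import numpy as np
from sklearn.feature_selection import mutual_info_classif
from sklearn.metrics import mutual_info_score

rng = np.random.default_rng(0)
X = rng.normal(size=(300, 5))
y = (X[:, 0] + X[:, 1] > 0).astype(int)

# IP: how well each feature predicts the class on its own
ip = mutual_info_classif(X, y, random_state=0)

# R: pairwise information overlap between features (via quartile-binned mutual information)
bin_feature = lambda v: np.digitize(v, np.quantile(v, [0.25, 0.5, 0.75]))
r = np.array([[mutual_info_score(bin_feature(X[:, i]), bin_feature(X[:, j]))
               for j in range(X.shape[1])] for i in range(X.shape[1])])

print("IP per feature:", np.round(ip, 3))
print("mean pairwise redundancy:", round(float(r[np.triu_indices(5, k=1)].mean()), 3))
```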

- 3.2 Feature selection:  Is the Search necessary?  Say if one calculated IP for individual feature and just ranked them, would the result be similar to performing CfsSubsetEval?  How about using all the features rather than thinning them out?  Isn't that the power of deep learning?

Response: We are thankful to the reviewer’s comment. Feature selection, including using CfsSubsetEval, is necessary to enhance model performance and efficiency by removing redundant and irrelevant features. Calculating IP and ranking features can provide valuable insights, but it may not capture feature interactions as effectively as CfsSubsetEval. Using all features without thinning can lead to overfitting and increased computational complexity, especially in traditional ML models. While deep learning can handle larger feature sets, optimal feature selection still improves performance and interpretability.

Comment 5: Results

- All the plots should be column graphs, not connected smooth lines. Figures 2, 5, 6, 9: the fitted line should be removed. This isn't linear regression.

Response: We appreciate the reviewer’s suggestions. The use of connected smooth lines in the plots and fitted lines in Figures 2, 5, 6, and 9 provides visual clarity and helps in identifying trends and patterns across different machine learning models. While column graphs and linear regression plots are standard, the chosen visualization methods emphasize the comparative performance and relationships between models, facilitating better understanding. These visual aids are intended to highlight key insights and differences in the models' performance, supporting the study's analysis and conclusions.

Reviewer 4 Report

Comments and Suggestions for Authors

The authors present LADTree, a ML model to predict chronic kidney disease from ECG data.

The paper should be extensively revised. 

In particular:
1. Clinical background is not clear. How ECG data may predict kidney disease?

2. The aim of the work is not well presented. The "novelty" of LADTree model is not immediately clear

3. The introduction is not well presented. For instance, it lacks logical coherence: while it is focusing on ECG data, a sentence about deep learning shifts the focus to something else

4. The authors use ECG data. Numerical data? Which are the extracted features from ECG signals? The acronyms are not explained. 

5. Which is the clinical meaning of the extracted features?

6. Mathematical formulas are not readable

7. I did not understand the comparative analysis between the different ML models. Why did the authors use a linear model to fit the relationship between the ML types and MAE (or another type of metric)? ML type is not a numerical variable, I did not catch the point of this analysis

8. The classification task is not immediately clear: it is not explained in the right section

9. Which are the limits of the study?

Comments on the Quality of English Language

English must be extensively revised to improve text comprehension. 

Author Response

Comment 1: Clinical background is not clear. How ECG data may predict kidney disease?

Response: We appreciate the reviewer's question about how ECG data may predict kidney disease. This is discussed in the first two paragraphs of Subsection 3.1, "Data Acquisition and Preprocessing", highlighted in yellow.

Comment 2: The aim of the work is not well presented. The "novelty" of LADTree model is not immediately clear.

Response: We are thankful for the reviewer's comment. The aim of the work and the novelty of LADTree are presented in the fifth paragraph of the Introduction section, highlighted in yellow.

Comment 3: The introduction is not well presented. For instance, it lacks logical coherence: while it is focusing on ECG data, a sentence about deep learning shifts the focus to something else.

Response: Again, we are thankful for the reviewer's suggestions. The Introduction section has been updated accordingly, and the updated text is highlighted in yellow.

Comment 4: The authors use ECG data. Numerical data? Which are the extracted features from ECG signals? The acronyms are not explained.

Response: Once again, we are thankful for the reviewer's suggestion highlighting this shortcoming of the paper. The selected features presented in Table 2 are now described properly and highlighted in yellow.

Comment 5: Which is the clinical meaning of the extracted features?

Response: We appreciate the reviewer's concern. The clinical meaning of each extracted feature has been added to the description column of Table 2, highlighted in yellow.

Comment 6: Mathematical formulas are not readable.

Response: We are grateful to the reviewer for this comment. The mathematical formulas are clearly written as required. These are not in image format; each equation is written in text/formula style.

Comment 7: I did not understand the comparative analysis between the different ML models. Why did the authors use a linear model to fit the relationship between the ML types and MAE (or another type of metric)? ML type is not a numerical variable; I did not catch the point of this analysis.

Response: We appreciate the reviewer for this comment. The comparative analysis between the different ML models in the study aimed to provide a clear and visual understanding of how various models perform relative to each other across different metrics, such as MAE. The use of a linear model to fit the relationship between the ML types and MAE, despite ML type being a categorical variable, was intended to illustrate trends and highlight differences in performance. This approach allowed for a straightforward comparison and visualization of each model's predictive accuracy. By fitting a line, we aimed to provide a simplified view of the general performance trend across the models, which can be useful for identifying which models consistently perform better or worse. Although ML type is not a numerical variable, the line fitting serves as a heuristic tool to aid in interpreting the comparative performance data, rather than implying a precise numerical relationship.

Comment 8: The classification task is not immediately clear: it is not explained in the right section.

Response: We appreciate the reviewer’s feedback. However, the classification task in this study is centered on predicting chronic kidney disease (CKD) based on ECG signal data. While we recognize that the classification task may not have been explicitly detailed in the relevant section, it is indeed a fundamental aspect of the research. The study aims to classify individuals into CKD and non-CKD categories using various machine learning models, including the proposed LADTree model. This classification process is established through the analysis of specific ECG-derived features, allowing for a more precise diagnosis of CKD.

Comment 9: Which are the limits of the study?

Response: Once again, we are thankful to the reviewer for highlighting the limitations of the study. The limitations have been added at the end of the Discussion section, highlighted in yellow.

Round 2

Reviewer 1 Report

Comments and Suggestions for Authors

The authors have solved all the problems; the current version is of high quality.

Author Response

Comment 1: The authors have solved all the problems; the current version is of high quality.

Response: We are thankful to the reviewer for accepting our response to the comments.

Reviewer 2 Report

Comments and Suggestions for Authors

The authors resolved my questions as well. 

Author Response

Comment 1: The authors resolved my questions as well. 

Response: We are thankful to the reviewer for accepting our responses to the comments.

Reviewer 3 Report

Comments and Suggestions for Authors

INTRO

- there is no explanation of what a LADTree model is.

 

 

Table 1

- I still don't understand your response.  Table 1 describes different cardiac conditions, not kidney conditions.  Does the MIT-BIH dataset include diagnosis or classification of the data into non-CKD vs. CKD? Otherwise, how is this study even possible?

 

Table 2:  can you provide appropriate units for each attribute?

 

IP and R:  Please include equations or more detailed description to help reader understand how these were calculated.

 

Figures:  I still stand by that the use of curved lines in this way is inappropriate and misrepresents the results. The use of linear regression line in this regard is clearly wrong.

 

 

 

Author Response

Comment 1: (Introduction) There is no explanation of what a LADTree model is.

Response 1: We appreciate the reviewer's comment. The LADTree model is now briefly described at the start of the fifth paragraph of the Introduction section, highlighted in yellow. The details of LADTree are presented in Subsection 3.5, Proposed Methodology (LADTree).

Comment 2: (Table 1) I still don't understand your response.  Table 1 describes different cardiac conditions, not kidney conditions.  Does the MIT-BIH dataset include diagnosis or classification of the data into non-CKD vs. CKD? Otherwise, how is this study even possible?

Response 2: We appreciate the reviewer’s comments. The confusion is understandable, but let us clarify. Although Table 1 describes different cardiac conditions and not kidney conditions, the MIT-BIH Arrhythmia dataset is relevant to the study of chronic kidney disease (CKD) because of the well-documented links between certain cardiac abnormalities and CKD. Studies have shown that patients with CKD frequently exhibit specific cardiac issues such as left ventricular hypertrophy (LVH), decreased heart rate variability, and distinctive ECG changes like prolonged QT intervals and peaked T waves due to hyperkalemia. These cardiac abnormalities, which are reflected in the MIT-BIH dataset, provide valuable insights into the cardiovascular complications associated with CKD. The dataset's comprehensive ECG recordings allow researchers to analyze patterns and features that may indicate the progression of CKD. Therefore, even though the dataset focuses on arrhythmias, the cardiac data it provides is crucial for understanding and predicting CKD, making the study feasible and relevant.

Comment 3: (Table 2) Can you provide appropriate units for each attribute?

Response 3: We are thankful for the reviewer's suggestion. Units for each attribute have been added and highlighted in yellow.

Comment 4: IP and R:  Please include equations or more detailed description to help reader understand how these were calculated.

Response 4: We appreciate the reviewer's suggestion. The details and equations for IP and R have been added in the second and third paragraphs of Subsection 3.2, Feature Selection, highlighted in yellow.

Comment 5: Figures:  I still stand by that the use of curved lines in this way is inappropriate and misrepresents the results. The use of linear regression line in this regard is clearly wrong.

Response 5: We appreciate and agree with the reviewer's comments. In all the figures, the curved lines have been replaced with column bars, and the trend-line analysis, which was not suitable for this study, has been removed.

Reviewer 4 Report

Comments and Suggestions for Authors

The authors have revised the paper based on the feedback they received.

However, further revisions should be made.

1. lines 178-180: References missing, please add them

2. clinical background: it should be presented in the introduction, to let the reader understand the clinical focus of the study

3. once you have defined an acronym, you can use it in the whole document (for example, line 196)

4. lines 67-70: please, reformulate

5. Regarding the use of a linear model to fit the relationship between MAE and ML types: if you aim to visually emphasize the comparative analysis between the several ML models, you can use a bar chart.

6. Regarding the classification task: Please make the classification task clear by explaining it in the right section.

Comments on the Quality of English Language

Moderate editing of English language required

Author Response

Comment 1: Lines 178-180: References missing, please add them.

Response 1: We are thankful to the reviewer for highlighting this omission; the references have been added as [19] and [20], highlighted in yellow.

Comment 2: Clinical background: it should be presented in the introduction, to let the reader understand the clinical focus of the study.

Response 2: We are thankful for the reviewer's comment. The clinical background of the study is presented in the Introduction section, highlighted in yellow in the first, third, and fifth paragraphs, following the flow of the study.

Comment 3: Once you have defined an acronym, you can use it in the whole document (for example, line 196).

Response 3: Again, we agree with the reviewer's suggestion. The issue has been resolved, and the changes are highlighted in yellow.

Comment 4: lines 67-70: please, reformulate.

Response 4: We are thankful for the reviewer's comment. In the indicated lines, the models are only named to indicate that they are used in this study; they are properly referenced in Table 4, so there is no need to reformulate them here.

Comment 5: Regarding the use of a linear model to fit the relationship between MAE and ML types: if you aim to visually emphasize the comparative analysis between the several ML models, you can use a bar chart.

Response 5: Once again, we are thankful to the reviewer for this suggestion. All the figures have been converted to bar charts accordingly.

Comment 6: Regarding the classification task: Please, make the classification task clear by explaining it in the right section. 

Response 6: We appreciate the reviewer's suggestion. The classification task has been added as Subsection 3.4, highlighted in yellow.

Round 3

Reviewer 3 Report

Comments and Suggestions for Authors

One cannot claim that this study predicts CKD when the data do not positively include ECG from CKD patients! The entire paper is very misleading, starting with the title. A major and extensive revision (including the title) must be made to make it absolutely clear that the model predicts abnormal ECG, not CKD. Please remove any mention of "CKD prediction" in the context of the study design or the results of this study.

Author Response

Comment 1: One cannot claim that this study predicts CKD when the data do not positively include ECG from CKD patients! The entire paper is very misleading, starting with the title. A major and extensive revision (including the title) must be made to make it absolutely clear that the model predicts abnormal ECG, not CKD. Please remove any mention of "CKD prediction" in the context of the study design or the results of this study.

Response 1: We appreciate the reviewer's suggestions and comments. The manuscript title and text have been revised accordingly, and the revised text in the various sections of the manuscript is highlighted in yellow.

Round 4

Reviewer 3 Report

Comments and Suggestions for Authors

Thank you for following the suggested modifications. While great progress has been made, there are still too many instances where the authors claim that the study "predicts CKD using ECG signals." Please carefully go through the new and old writing and fix those.

What I noticed is listed below, but this is by no means complete.

Title: It is still misleading. I believe a more transparent title might be: "Advanced Detection of Abnormal ECG Patterns Using an Optimized LADTree Model with Enhanced Predictive Feature: Potential Application in CKD". Your data do not contain abnormal ECG indicative of CKD.

 

Abstract:  

-remove "indicative of CKD prediction" from "This study compares a unique strategy for abnormal ECG patterns indicative of CKD prediction using the ...."

 

Introduction

- lines 87 to 90 need to be revised. The work does not use abnormal ECG associated with CKD.

- lines 95 to 96: please revise. These models are not for CKD prediction, only abnormal ECG prediction.

 

Research Design

- line 161-162: revise as "This study aims to enhance the detection of abnormal ECG patterns, known to be associated with CKD, by employing the LADTree model with enhanced predictive features. Note that this study does not utilize data from CKD patients, but provides a novel methodology to detect abnormal ECG that is often associated with CKD."

- Figure 1 caption: change "predicting CKD" to "predicting abnormal ECG pattern"

- line 186 to 189:  revise to  "The study investigates the association between ECG features and cardiac anomalies" 

- line 190:  It might be useful to add at the beginning, "Since a large ECG dataset from CKD patients is difficult to acquire, we sought instead to use a publicly available dataset that contained normal and abnormal ECG signals with known cardiac anomalies."

- line 366-367:  remove "associated with CKD"

- line 422-423:  remove "associated with CKD"

- line 433-435:  remove "associated with CKD"

- line 445-446:  remove "associated with CKD"

- please search for "associated with CKD" and delete.

 

I think you get the idea. Please take a thorough look and remove all claims that this study predicts CKD. You can add in the discussion that your study is relevant since CKD often presents with abnormal ECG, but you cannot claim that your method can predict CKD.

 

 

 

 

 

Comments on the Quality of English Language

The authors should consult an English editor to catch all instances of claims that the study predicts CKD. It is important that no false claims are made accidentally.

Author Response

Comment 1: Title: It is still misleading. I believe a more transparent title might be: "Advanced Detection of Abnormal ECG Patterns Using an Optimized LADTree Model with Enhanced Predictive Feature: Potential Application in CKD". Your data do not contain abnormal ECG indicative of CKD.

Response 1: We appreciate the reviewer's suggestion regarding the title. The suggested title matches the study; therefore, we have updated the title accordingly.

Comment 2: Abstract:  -remove "indicative of CKD prediction" from "This study compares a unique strategy for abnormal ECG patterns indicative of CKD prediction using the ...."

Response 2: We are thankful for the reviewer's suggestions and comments. The abstract has been updated accordingly, and the changes are highlighted in yellow.

Comment 3: Introduction

- lines 87 to 90 need to be revised. The work does not use abnormal ECG associated with CKD.

- lines 95 to 96: please revise. These models are not for CKD prediction, only abnormal ECG prediction.

Response 3: Once again, we are thankful for and agree with the reviewer's comments and suggestions. The Introduction section has been updated accordingly.

Comment 4: Research Design

- line 161-162: revise as "This study aims to enhance the detection of abnormal ECG patterns, known to be associated with CKD, by employing the LADTree model with enhanced predictive features. Note that this study does not utilize data from CKD patients, but provides a novel methodology to detect abnormal ECG that is often associated with CKD."

- Figure 1 caption: change "predicting CKD" to "predicting abnormal ECG pattern"

- line 186 to 189:  revise to  "The study investigates the association between ECG features and cardiac anomalies" 

- line 190:  It might be useful to add at the beginning, "Since a large ECG dataset from CKD patients is difficult to acquire, we sought instead to use a publicly available dataset that contained normal and abnormal ECG signals with known cardiac anomalies."

- line 366-367:  remove "associated with CKD"

- line 422-423:  remove "associated with CKD"

- line 433-435:  remove "associated with CKD"

- line 445-446:  remove "associated with CKD"

- please search for "associated with CKD" and delete.

Response 4: We are thankful to the reviewer for such significant comments and suggestions. The manuscript has been updated accordingly, and all the updated text is highlighted in yellow.

Comment 5: (Comments on the Quality of English Language) The authors should consult an English editor to catch all instances of claims that the study predicts CKD. It is important that no false claims are made accidentally.

Response 5: Again, we are thankful to the reviewer for this comment. The manuscript has been proofread for English language quality, and mistakes have been corrected accordingly.
