Peer-Review Record

Enhanced Precision in Chinese Medical Text Mining Using the ALBERT+Bi-LSTM+CRF Model

Appl. Sci. 2024, 14(17), 7999; https://doi.org/10.3390/app14177999

by Tianshu Fang^1,2, Yuanyuan Yang^1,* and Lixin Zhou^1,2

Reviewer 1: Anonymous

Reviewer 2: Anonymous

Submission received: 14 July 2024 / Revised: 31 August 2024 / Accepted: 3 September 2024 / Published: 7 September 2024

(This article belongs to the Section Computing and Artificial Intelligence)

Round 1

Reviewer 1 Report

Comments and Suggestions for Authors

The paper "Enhanced Precision in Chinese Medical Text Mining Using the ALBERT+Bi-LSTM+CRF Model" presents a robust and innovative approach to medical text mining. However, several areas could benefit from further elaboration and correction. First, the reliance on public datasets and the relatively limited real medical text annotations could impact the generalizability of the results. Future studies should focus on incorporating more diverse and extensive real-world datasets. Second, while the authors mention the potential scalability of their model, they do not provide detailed insights into the computational requirements and potential bottlenecks. A more in-depth analysis of the scalability aspects, including computational overhead and resource requirements, would be beneficial. Third, the practical integration of the proposed model into existing clinical workflows is not extensively discussed. Future research should explore the implementation challenges and solutions for integrating the model into hospital management systems, including the potential impact on clinical decision-making processes. Additionally, the authors could expand on how their model compares with other state-of-the-art models in terms of efficiency and accuracy in real-world applications. Lastly, while the paper is well-referenced, drawing on a wide range of sources to support its methodology and findings, some references are relatively dated. The inclusion of more recent studies would strengthen the literature review and provide a more current context for the research. Addressing these areas would enhance the overall robustness and applicability of the research findings.

Comments on the Quality of English Language

The quality of English language in the paper is generally good, with clear and coherent sentence structures. However, there are a few areas that could benefit from further proofreading.

Author Response

Reviewer 1

Response to Reviewers

Thank you for pointing these out. We agree with your comments. Our responses to each issue raised by the reviewers are marked in bold and underlined.

Does the introduction provide sufficient background and include all relevant references?

Response: Up-to-date references are added and discussed (Page 2, Subsection 1, Lines 65-77).

Is the research design appropriate?

Response: The research design presented in the article is improved based on the suggestions of the reviewers (Page 8, Subsection 4.3.1, Lines 277-285).

Are the methods adequately described?

Response: More remarks and explanations are added to the text (Page 5, Subsection 3, Lines 177-181).

Are the conclusions supported by the results?

Response: The conclusion section is revised based on the suggestions of the reviewers to focus on what should be presented further (Page 12, Subsection 5.1, Lines 401-408; Page 13, Subsection 5.2, Lines 434-448, 462-469).

Comments and Suggestions for the Authors

Comment 1: The paper “Enhanced Precision in Chinese Medical Text Mining Using the ALBERT+Bi-LSTM+CRF Model” presents a robust and innovative approach to medical text mining. However, several areas could benefit from further elaboration and correction.

First, the reliance on public datasets and the relatively limited real medical text annotations could impact the generalizability of the results. Future studies should focus on incorporating more diverse and extensive real-world datasets.

Response 1: In this article, we focus on reducing the model size while maintaining the precision of the BERT algorithm. In addition, we are actively advancing collaboration projects with multiple hospitals to obtain more real medical data, which will enable the model to be applied in real scenarios. More remarks and explanations are added where necessary to explain the details further (Page 13, Subsection 5.2, Lines 434-448).

Comment 2: Second, while the authors mention the potential scalability of their model, they do not provide detailed insights into the computational requirements and potential bottlenecks. A more in-depth analysis of the scalability aspects, including computational overhead and resource requirements, would be beneficial.

Response 2: Computational requirements and potential bottlenecks are important issues, but we cannot foresee them in advance. Identifying them requires running the proposed algorithm repeatedly, because of the characteristics of the Chinese language and aspects it shares with other languages. We acknowledge the importance of these issues for practical implementation and will address them in our future work with a detailed analysis of computational overhead and resource demands (Page 6, Subsection 3.1, Lines 206-209).

Comment 3: Third, the practical integration of the proposed model into existing clinical workflows is not extensively discussed. Future research should explore the implementation challenges and solutions for integrating the model into hospital management systems, including the potential impact on clinical decision-making processes.

Response 3: More explanations and remarks discussing practical applications have been added. Future studies will focus on these aspects to improve the proposed method and explore its practical use (Page 13, Subsection 5.2, Lines 434-448).

Comment 4: Additionally, the authors could expand on how their model compares with other state-of-the-art models in terms of efficiency and accuracy in real-world applications.

Response 4: A table is provided to present the comparison results showing how effective the proposed algorithm is (Page 8, Subsection 4.3.1, Lines 277-285).

Comment 5: Lastly, while the paper is well-referenced, drawing on a wide range of sources to support its methodology and findings, some references are relatively dated. The inclusion of more recent studies would strengthen the literature review and provide a more current context for the research. Addressing these areas would enhance the overall robustness and applicability of the research findings.

Response 5: More up-to-date references are added and discussed (Page 2, Subsection 1, Lines 65-77; Pages 14-15, Subsection References, Lines 505-518, 521-524, 539-542).

Comments on the Quality of English Language

The quality of English language in the paper is generally good, with clear and coherent sentence structures. However, there are a few areas that could benefit from further proofreading.

Response: Proofreading has been conducted, and the revised text has been improved.

Author Response File: Author Response.pdf

Reviewer 2 Report

Comments and Suggestions for Authors

The description of data preprocessing and annotation processes is somewhat ambiguous. For example, the paper mentions using BIO annotation rules but does not provide sufficient detail on how these were applied, especially in complex cases where entities may overlap or be nested. Clearer explanations and examples would help clarify these important steps in the methodology.

The experimental section compares the proposed ALBERT+Bi-LSTM+CRF model primarily against a baseline ALBERT model. However, it does not include comparisons with other state-of-the-art models used in medical text mining, such as transformer-based models or other deep learning techniques. This makes it difficult to gauge the relative performance of the proposed model within the broader research context.

The paper presents quantitative results (e.g., F1-score, Precision, Recall) but lacks a detailed analysis of error cases. Understanding where the model fails or underperforms could provide valuable insights into its limitations and areas for improvement. The absence of this analysis is a missed opportunity to strengthen the paper.

Author Response

Reviewer 2

Response to Reviewers

Thank you for pointing these out. We agree with your comments. Our responses to each issue raised by the reviewers are marked in bold and underlined.

Does the introduction provide sufficient background and include all relevant references?

Response: Up-to-date references are added and discussed (Page 2, Subsection 1, Lines 65-77).

Is the research design appropriate?

Response: The research design presented in the article is improved based on the suggestions of the reviewers (Page 8, Subsection 4.3.1, Lines 277-285).

Are the methods adequately described?

Response: More remarks and explanations are added to the text (Page 5, Subsection 3, Lines 177-181).

Is the result clear?

Response: A table was provided to demonstrate the comparative results of the effectiveness of the proposed algorithm (page 8, section 4.3.1, lines 277-285).

Comments and Suggestions for the Authors

Comment 1: The description of data preprocessing and annotation processes is somewhat ambiguous. For example, the paper mentions using BIO annotation rules but does not provide sufficient detail on how these were applied, especially in complex cases where entities may overlap or be nested. Clearer explanations and examples would help clarify these important steps in the methodology.

Response 1: A detailed explanation of the BIO rule has been added. We did not encounter any issues related to entity overlap or nesting in the dataset. However, in general practical applications this remains an unresolved issue, so from a practical perspective it will be an important topic in this field (page 5, subsection 3, lines 177-181).
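
For readers unfamiliar with the BIO rule discussed in this exchange, the following is a minimal sketch of how flat, non-overlapping entity spans map to BIO tags, and why nested or overlapping entities cannot be encoded in a single tag sequence. It is an illustration only, not the authors' preprocessing code: the function name `spans_to_bio`, the label `DISEASE`, and the example sentence are all hypothetical, and character-level tokenization is assumed.

```python
from typing import List, Tuple

def spans_to_bio(tokens: List[str], spans: List[Tuple[int, int, str]]) -> List[str]:
    """Convert entity spans (start, end-exclusive, label) into BIO tags.

    Assumes flat annotation: a span that overlaps or nests inside another
    is rejected, because one tag per token cannot represent both entities.
    """
    tags = ["O"] * len(tokens)
    for start, end, label in sorted(spans):
        if not 0 <= start < end <= len(tokens):
            raise ValueError(f"span ({start}, {end}) is out of range")
        if any(tag != "O" for tag in tags[start:end]):
            raise ValueError("overlapping or nested span; flat BIO cannot encode it")
        tags[start] = f"B-{label}"          # first token of the entity
        for i in range(start + 1, end):
            tags[i] = f"I-{label}"          # remaining tokens of the entity
    return tags

# Hypothetical example: one character per token, one disease mention (糖尿病).
tokens = list("患者有糖尿病史")
spans = [(3, 6, "DISEASE")]
bio_tags = spans_to_bio(tokens, spans)
for token, tag in zip(tokens, bio_tags):
    print(token, tag)
```

Handling nested entities would require a different scheme (for example, one tag layer per nesting level or span-based prediction), which is outside what this sketch covers.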

Comment 2: The experimental section compares the proposed ALBERT+Bi-LSTM+CRF model primarily against a baseline ALBERT model. However, it does not include comparisons with other state-of-the-art models used in medical text mining, such as transformer-based models or other deep learning techniques. This makes it difficult to gauge the relative performance of the proposed model within the broader research context.

Response 2: A table is provided to display the comparison results of the effectiveness of the algorithm (page 8, section 4.3.1, lines 277-285).

Comment 3: The paper presents quantitative results (e.g., F1-score, Precision, Recall) but lacks a detailed analysis of error cases. Understanding where the model fails or underperforms could provide valuable insights into its limitations and areas for improvement. The absence of this analysis is a missed opportunity to strengthen the paper.

Response 3: An analysis of error cases and additional annotations have been added (pages 10-12, section 4.3.2, lines 330-381).
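
As background to the metrics named in Comment 3, the sketch below shows one common way to compute entity-level Precision, Recall, and F1 from BIO tag sequences, together with the missed and spurious entities that an error analysis would inspect. It assumes exact-match scoring (an entity counts as correct only if its boundaries and label both match); the record does not state which scoring convention the paper uses, and the function names and sample tags are hypothetical.

```python
from typing import List, Set, Tuple

Entity = Tuple[int, int, str]  # (start, end-exclusive, label)

def extract_entities(tags: List[str]) -> Set[Entity]:
    """Collect entities from a BIO tag sequence."""
    entities: Set[Entity] = set()
    start, label = None, None
    for i, tag in enumerate(tags + ["O"]):   # sentinel closes a trailing entity
        continues = tag.startswith("I-") and label == tag[2:]
        if start is not None and not continues:
            entities.add((start, i, label))
            start, label = None, None
        if tag.startswith("B-"):
            start, label = i, tag[2:]
    return entities

def precision_recall_f1(gold: Set[Entity], pred: Set[Entity]) -> Tuple[float, float, float]:
    """Exact-match entity-level scores."""
    tp = len(gold & pred)
    precision = tp / len(pred) if pred else 0.0
    recall = tp / len(gold) if gold else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1

# Hypothetical gold and predicted tags for one sentence.
gold_tags = ["O", "B-DISEASE", "I-DISEASE", "O", "B-DRUG"]
pred_tags = ["O", "B-DISEASE", "I-DISEASE", "O", "O"]
gold, pred = extract_entities(gold_tags), extract_entities(pred_tags)
precision, recall, f1 = precision_recall_f1(gold, pred)
missed = gold - pred      # false negatives, candidates for error analysis
spurious = pred - gold    # false positives
print(precision, recall, f1, missed, spurious)
```

Listing `missed` and `spurious` per entity type is a simple starting point for the kind of error-case analysis the reviewer requested.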

Round 2

Reviewer 2 Report

Comments and Suggestions for Authors

The authors have made the necessary revisions, and I have no further comments.
