Peer-Review Record

DATLMedQA: A Data Augmentation and Transfer Learning Based Solution for Medical Question Answering

Appl. Sci. 2021, 11(23), 11251; https://doi.org/10.3390/app112311251
by Shuohua Zhou 1,* and Yanping Zhang 2
Reviewer 1: Anonymous
Reviewer 2: Anonymous
Reviewer 3: Anonymous
Submission received: 16 October 2021 / Revised: 19 November 2021 / Accepted: 20 November 2021 / Published: 26 November 2021

Round 1

Reviewer 1 Report

The paper is well detailed on the approach of using BERT, GPT-2, and T5-small to construct a QA system for disease knowledge purposes. However, some concerns could be better addressed by the authors: All these algorithms are computationally heavy; how much resource is required for such an application to be used by health sectors? Does the dataset account for new diseases like COVID-19? Why not use BERT embeddings instead of word2vec for s1 and s2 in T5?

Author Response

Dear reviewer,

 

Thank you for taking the time to review this manuscript. We have thoroughly revised the manuscript based on your comments; the revised text is marked in red. Below we provide our responses point by point.

Concern #1: All these algorithms are computationally heavy. How much resource is required for such an application to be used by health sectors?

 

Concern #2: Does the dataset account for new diseases like COVID-19?

Author response: Yes. COVID-19 is included in the disease dataset (pre-training dataset).

 

Concern #3: Why not use BERT embeddings instead of word2vec for s1 and s2 in T5?

Author response: Word2vec is more efficient because it is a shallow network, whereas BERT is a much deeper network with heavier matrix computation; in our experiments, the two embeddings produced very similar results.

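To illustrate the efficiency argument, here is a minimal sketch (not the authors' code): word2vec embedding amounts to a per-token table lookup plus an average, while a BERT embedding requires a full multi-layer transformer forward pass. The checkpoint names and the use of s1/s2 as a sentence pair compared by cosine similarity are assumptions for the example.

```python
# A minimal sketch, not the authors' code: word2vec is a per-token table
# lookup plus an average, while BERT needs a full multi-layer transformer
# forward pass. Checkpoint names below are assumptions.
import numpy as np
import torch
from gensim.models import KeyedVectors
from transformers import AutoModel, AutoTokenizer

def w2v_sentence_vec(kv: KeyedVectors, sentence: str) -> np.ndarray:
    """Shallow route: look up each token's vector and average them."""
    vecs = [kv[w] for w in sentence.lower().split() if w in kv]
    return np.mean(vecs, axis=0)

def bert_sentence_vec(tok, model, sentence: str) -> np.ndarray:
    """Deep route: run the full transformer stack, then mean-pool the last layer."""
    inputs = tok(sentence, return_tensors="pt")
    with torch.no_grad():
        out = model(**inputs)
    return out.last_hidden_state.mean(dim=1).squeeze().numpy()

def cosine(a: np.ndarray, b: np.ndarray) -> float:
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

# Hypothetical usage for a sentence pair s1/s2:
# kv = KeyedVectors.load_word2vec_format("GoogleNews-vectors-negative300.bin", binary=True)
# tok = AutoTokenizer.from_pretrained("bert-base-uncased")
# bert = AutoModel.from_pretrained("bert-base-uncased")
# s1, s2 = "What causes diabetes?", "What are the causes of diabetes?"
# print(cosine(w2v_sentence_vec(kv, s1), w2v_sentence_vec(kv, s2)))
# print(cosine(bert_sentence_vec(tok, bert, s1), bert_sentence_vec(tok, bert, s2)))
```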
Author Response File: Author Response.docx

Reviewer 2 Report

The current research paper proposes a deep learning, transformer-based methodology for handling natural language questions. The paper is generally well organized and of good technical quality, and the experiments are well conducted. However, the following observations should be taken into account:

(1.) The original contribution with respect to the state of the art should be more clearly emphasized within the Introduction;

(2.) All the symbols within equation (1) should be explained;

(3.) More comparisons with the state of the art results could be illustrated in section 4, "Experiments and Results".

Author Response

Dear reviewer,

 

Thank you for taking the time to review this manuscript. We have thoroughly revised the manuscript based on your comments; the revised text is marked in red. Below we provide our responses point by point.

 

Concern #1: The original contribution with respect to the state of the art should be more clearly emphasized within the Introduction.

Author response: Thanks for the comment. As suggested, we have revised the Introduction section and emphasized the advancement of our work compared to the SOTA.

 

Concern #2: All the symbols within equation (1) should be explained.

Author response: We extended the explanation of equation (1) in the manuscript and highlighted the revision.

 

Concern #3: More comparisons with the state-of-the-art results could be illustrated in Section 4, "Experiments and Results".

Author response: Thanks for the comment. We have updated Section 4; the comparative experiments with the SOTA are shown in Table 5. Currently there is limited research related to these datasets, and we will be more than happy to extend our experiments as more SOTA results become available.

Author Response File: Author Response.docx

Reviewer 3 Report

The authors propose a new method for a medical question answering system that combines three recent variants of the Transformer architecture (BERT, GPT-2, and T5-small).
The novelty of the manuscript is clearly expressed, is sufficient for publication, and has the potential for significant impact in the field. The paper is well written and organized.
The results are clearly presented, and a comprehensive experimental study supports the initial thesis with results that outperform the current SOTA.
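For orientation only, the following is a hypothetical sketch of such a three-model pipeline. The division of roles (GPT-2 for question augmentation, T5-small for answer generation, with BERT omitted for brevity) is inferred from the paper's title and is an assumption, not the authors' implementation; their actual code is linked in the author response below.

```python
# Hypothetical sketch of a GPT-2 + T5-small QA pipeline. The role assignment
# is an assumption inferred from the paper's title, not the authors' code;
# a BERT component (e.g., for retrieval or answer selection) is omitted.
from transformers import (
    GPT2LMHeadModel,
    GPT2Tokenizer,
    T5ForConditionalGeneration,
    T5Tokenizer,
)

gpt2_tok = GPT2Tokenizer.from_pretrained("gpt2")
gpt2 = GPT2LMHeadModel.from_pretrained("gpt2")
t5_tok = T5Tokenizer.from_pretrained("t5-small")
t5 = T5ForConditionalGeneration.from_pretrained("t5-small")

def augment_question(question: str, n: int = 3) -> list[str]:
    """Data augmentation: sample GPT-2 continuations as extra question variants."""
    ids = gpt2_tok.encode(question, return_tensors="pt")
    outs = gpt2.generate(
        ids,
        do_sample=True,
        top_p=0.9,
        num_return_sequences=n,
        max_length=ids.shape[1] + 20,
        pad_token_id=gpt2_tok.eos_token_id,
    )
    return [gpt2_tok.decode(o, skip_special_tokens=True) for o in outs]

def answer(question: str, context: str) -> str:
    """Transfer learning: generate an answer with T5-small given question + context."""
    ids = t5_tok(f"question: {question} context: {context}", return_tensors="pt").input_ids
    out = t5.generate(ids, max_length=64)
    return t5_tok.decode(out[0], skip_special_tokens=True)
```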

The dataset used requires citation of the paper "PubMedQA: A Dataset for Biomedical Research Question Answering" (in the references, not only as a data availability statement).
Since the proposed method outperforms the current SOTA, the whole code implementation of the proposed method should be provided to facilitate further research.

All table captions should be briefly extended to summarize the tables' content.

Some minor issues and suggestions:

Lines 168-169: unfinished sentence
Line 174: maybe state more precisely what 'good' results are
Line 178: the term 'golden dataset' may be unclear to some readers
Formatting issue on page 5 (there is a huge blank space due to a big table that moved to the next page)
Line 333: typo, fin-tune -> fine-tune

Author Response

Dear reviewer,

Thank you for taking the time to review this manuscript. We have thoroughly revised the manuscript based on your comments; the revised text is marked in red. Below we provide our responses point by point.

Concern #1: The dataset used requires citation of the paper "PubMedQA: A Dataset for Biomedical Research Question Answering" (in the references, not only as a data availability statement).

Author response: Thanks for the comment. We have added the paper to our reference list (#53) and cited it in our manuscript.

 

Concern #2: Since the proposed method outperforms the current SOTA, the whole code implementation of the proposed method should be provided to facilitate further research.

Author response: Here is the GitHub link for our implementation and data: https://github.com/ShuohuaZhou-NLPer/Question_Answering

The link is also included in our manuscript.

 

Concern #3: All table captions should be briefly extended to summarize the tables' content.

Author response: We have extended the captions to better summarize the content.

 

Concern #4: Some minor issues and suggestions:

Lines 168-169: unfinished sentence

Line 174: maybe state more precisely what 'good' results are

Line 178: the term 'golden dataset' may be unclear to some readers

Formatting issue on page 5 (there is a huge blank space due to a big table that moved to the next page)

Line 333: typo, fin-tune -> fine-tune

Author response: Thank you for your valuable feedback. We have provided more information in the manuscript for the issues on Line 174 and Line 178, and have fixed the other issues as well. We also thoroughly examined the writing with another round of proofreading and highlighted the corrections in our manuscript.

Author Response File: Author Response.docx
