MARIE: A Context-Aware Term Mapping with String Matching and Embedding Vectors
Round 1
Reviewer 1 Report
Reviewer’s report for “MARIE: A Context-aware Term Mapping with String Matching and Embedding Vectors”
By Han Kyul Kim et al.
Manuscript ID: applsci-981401
In this manuscript, Kim and co-workers introduce MARIE, an unsupervised learning tool designed to find standardized clinical terminologies for queries such as a hospital’s own codes. By incorporating both string matching methods and term embedding vectors generated by BioBERT, it uses both structural and contextual information to calculate similarity measures between source and target terms. Compared to previous term mapping methods, the proposed method shows improved mapping accuracy. It can be extended to incorporate any string matching or term embedding method. The authors demonstrate the usefulness of the method by mapping medical terms from SNUH. MARIE provides an effective and practical term mapping method for text data standardization and pre-processing.
In general, the manuscript is well written and addresses a very important issue. I do have a few questions and suggestions regarding the writing of the manuscript.
MAJOR POINTS
Q1: L 90: In relation to string matching methods, the description of the R/O method is unclear.
Q2: What does the term “aggregate” on L91 stand for?
Q3: I would suggest a brief description of the “Embedding vector” concept.
Q4: How is the weighting parameter alpha in eqn (1) determined?
MINOR POINTS
L 43: “Theses” should be “these”.
L 111: …in equation (1) below.
Author Response
Please see the attachment.
Author Response File: Author Response.docx
Reviewer 2 Report
The abstract is a bit too long (212 words) and should be shortened. The Introduction and Methods are written clearly, and all significant previous research is cited adequately. The scheme describing the overall mapping process of MARIE is easy for the reader to understand. The code is available online, which is excellent for interested parties. The dataset is large enough that the results can be considered reliable. The potential and limitations of the method, as well as the plans for further research, are satisfactorily explained.
The article should be accepted in its present form, with the addition of a rewritten abstract.
Author Response
Please see the attachment.
Author Response File: Author Response.docx
Reviewer 3 Report
The paper entitled “MARIE: A Context-aware Term Mapping with String Matching and Embedding Vectors” meets the necessary standards for publication in this journal.
I recommend that the authors take care when writing the references. For example, the references [4-6] (line 40), [9-11] (line 41), and [12-14] (line 48) are misspelled.
Attention should be paid to how the references are written. The bibliographic entries are not formatted uniformly according to the instructions of the journal. For example, references 4, 8, 10, 11, 12, 16, 18, 24, 27, 31, 33, 37 and 38 are not written in bold.
Please check the entire manuscript carefully for possible typographical errors.
Final Conclusion: The paper meets the necessary standards for publication. I consider that, with minor revision, this paper can be published in Applied Sciences.
Author Response
Please see the attachment.
Author Response File: Author Response.docx