Article
Peer-Review Record

Knowledge-Enhanced Prompt Learning for Few-Shot Text Classification

Big Data Cogn. Comput. 2024, 8(4), 43; https://doi.org/10.3390/bdcc8040043
by Jinshuo Liu and Lu Yang *
Reviewer 1:
Reviewer 2: Anonymous
Reviewer 3:
Submission received: 17 March 2024 / Revised: 11 April 2024 / Accepted: 12 April 2024 / Published: 18 April 2024
(This article belongs to the Special Issue Artificial Intelligence and Natural Language Processing)

Round 1

Reviewer 1 Report

Comments and Suggestions for Authors

See the major comments in the attached file.

Comments for author File: Comments.pdf

Comments on the Quality of English Language

See the minor comments in the attached file.

Author Response

Please see the response letter in the attached file.

Author Response File: Author Response.pdf

Reviewer 2 Report

Comments and Suggestions for Authors

This paper introduces SKPT, a knowledge-enhanced prompt learning approach for few-shot text classification, which allows the incorporation of external knowledge. The topic and results presented are interesting, and the document is well-organized and clearly written. The proposed approach was compared with various baselines. I recommend including some insights from the ablation study in the conclusions. Below are some specific points for improvement:

-In the introduction, please include references to the AG’s News and DBpedia datasets.

-Could you expand on the discussion of the ablation study? There is a marginal gain with respect to some models (for example, model 3).

- In line 94, is the term "slow" correct?

- In line 214, "Related Word" needs a reference or a footnote with its URL.

- Please provide descriptions for the terms e_h and e_t in equations 2 and 3.

- In Tables 1 and 2, add the name of the approach along with the term "ours".

Author Response

Please see the response letter in the attached file.

Author Response File: Author Response.pdf

Reviewer 3 Report

Comments and Suggestions for Authors

Knowledge-enhanced Prompt Learning for Few-shot Text Classification

The paper presents a novel approach, based on prompt tuning, for enhancing Transformer-based models for text classification using structured knowledge. In particular, the authors propose SKPT (Structured Knowledge Prompt Tuning), a methodology structured in three components: i) prompt template construction; ii) prompt verbalizer construction; and iii) training strategies. In the first component, the method extracts open entities and relations from text data via OpenIE and incorporates this knowledge as triples in a cloze-based prompt template. A prompt verbalizer is then used to expand and filter the label words. Finally, the structured knowledge constraints are applied during the training of a Transformer model trained with the Masked Language Modeling (MLM) objective, i.e., RoBERTa. To do so, the loss function is modified to take into account: i) a structured knowledge loss and ii) a cross-entropy loss over the entire prompt.
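To make the cloze-based setup concrete, the scoring step can be sketched as follows. This is an illustrative example, not the authors' implementation: the vocabulary, verbalizer mapping, and logits are hypothetical, and the sketch only shows how a verbalizer aggregates label-word probabilities at the [MASK] position into class scores.

```python
import numpy as np

# Toy vocabulary standing in for the MLM's full vocabulary (hypothetical).
vocab = ["world", "politics", "sports", "game", "science"]

# Hypothetical verbalizer: each class label maps to several label words.
verbalizer = {"World": ["world", "politics"], "Sports": ["sports", "game"]}

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def classify(mask_logits):
    """Score each class by the mean probability of its label words
    at the [MASK] position, then pick the highest-scoring class."""
    probs = softmax(mask_logits)
    idx = {w: i for i, w in enumerate(vocab)}
    scores = {cls: np.mean([probs[idx[w]] for w in words])
              for cls, words in verbalizer.items()}
    return max(scores, key=scores.get)

# Hypothetical MLM logits where sports-related words dominate:
print(classify(np.array([0.1, 0.2, 3.0, 2.5, 0.0])))  # -> Sports
```

In the full method, the cross-entropy loss over these class scores would be combined with the structured knowledge loss derived from the OpenIE triples.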

The authors tested the approach on two topic classification benchmarks, AG's News and DBpedia, and compared it with three different baseline approaches: i) simple fine-tuning; ii) simple prompt tuning; iii) Knowledgeable Prompt-tuning (KPT).
The results showed that SKPT generally outperforms the baseline methods in almost all configurations and tasks. In particular, the authors showed that their approach is particularly effective in low-resource settings, i.e., when tested in 5- and 10-shot scenarios.

Main strengths:
-- The paper is well-written and interesting to read. The theoretical framework is well-explained, providing a solid foundation for understanding the proposed methodology.

-- The idea of enhancing a Transformer model through prompt tuning with external knowledge is intriguing and aligns with recent trends in leveraging additional information to improve model performance. This approach, as demonstrated in the study, proves particularly effective in low-resource scenarios, showing the potential to achieve comparable results with smaller models or with less data compared to fine-tuning bigger models on bigger datasets.

-- I think that the Related Work section enriches the paper by providing context within the broader landscape of this study.

-- The experimental setup is well designed, featuring a diverse range of baselines for fair comparison with the SKPT method.

Main weaknesses:
-- While the use of a model trained on the Masked Language Modeling (MLM) function aligns with the methodology proposed in the paper, it would have been beneficial to explore the effectiveness of SKPT with different types of models, particularly larger generative models. Given the current focus of prompt tuning research on decoder-based models trained on the Language Modeling (LM) function, examining SKPT's performance with such models could provide valuable insights into its versatility and applicability across diverse architectures.

-- I have some doubts about the comparison with the fine-tuning baseline, particularly in few-shot scenarios. The authors mention fine-tuning RoBERTa on the tasks with a limited number of samples (5 to 20 shots). This raises doubts about the fairness of comparison since traditional fine-tuning typically requires a larger amount of training data for optimal performance. To address this, it would have been beneficial to expand the fine-tuning baseline to include varying sizes of training data (e.g., 5 to 1000 samples) to determine the threshold where the performance of simple fine-tuning converges with that of the proposed SKPT approach.

Minor issues:
-- In the abstract and on page 2, the authors assert that "Prompt learning is used to formalize the text classification task into masked language modeling." However, this statement is somewhat misleading. Prompt learning serves as a mechanism to formalize various tasks, including classification, regression, and generation tasks, into a generative framework. It is not inherently tied to masked language modeling (MLM). Utilizing the MLM approach to address a classification task represents just one specific application of prompt learning methodology.

Author Response

Please see the response letter in the attached file.

Author Response File: Author Response.pdf

Round 2

Reviewer 1 Report

Comments and Suggestions for Authors

The paper has great improvements after the revision. The additional details and expanded discussion significantly enhanced its quality. I believe it is a good paper to be published.

One comment: In Equation (4), please verify whether $\sum_{v\in V}$ should be changed to $\sum_{v\in V_{no}}$.

Comments on the Quality of English Language

One comment: Consider combining lines 290-298 into a single paragraph.

Author Response

Please see the response letter in the attached file.

Author Response File: Author Response.pdf
