Article

Performance Study on Extractive Text Summarization Using BERT Models

by Shehab Abdel-Salam and Ahmed Rafea *
Computer Science Department, The American University in Cairo, AUC Avenue, Cairo 11835, Egypt
* Author to whom correspondence should be addressed.
Information 2022, 13(2), 67; https://doi.org/10.3390/info13020067
Submission received: 24 December 2021 / Revised: 21 January 2022 / Accepted: 26 January 2022 / Published: 28 January 2022
(This article belongs to the Special Issue Novel Methods and Applications in Natural Language Processing)

Abstract

The task of summarization can be categorized into two approaches: extractive and abstractive. Extractive summarization selects salient sentences from the original document to form a summary, while abstractive summarization interprets the original document and generates a summary in its own words. The task of generating a summary, whether extractive or abstractive, has been studied in the literature using statistical-, graph-, and deep learning-based approaches. Deep learning has achieved promising performance in comparison to the classical approaches, and with the advancement of neural architectures such as the attention network (commonly known as the transformer), there are potential areas of improvement for the summarization task. The introduction of the transformer architecture and its encoder model, BERT, improved performance on downstream NLP tasks. BERT (Bidirectional Encoder Representations from Transformers) is modeled as a stack of encoders and comes in different sizes, such as BERT-base with 12 encoder layers and BERT-large with 24 encoder layers; we focus on BERT-base for the purpose of this study. The objective of this paper is to study the performance of variants of BERT-based models on text summarization through a series of experiments, and to propose "SqueezeBERTSum", a summarization model fine-tuned with the SqueezeBERT encoder variant, which achieved competitive ROUGE scores, retaining 98% of the BERTSum baseline model's performance with 49% fewer trainable parameters.
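To make the extractive pipeline in the abstract concrete, the following is a minimal, illustrative sketch, not the trained SqueezeBERTSum model: it shows how a SqueezeBERT encoder can produce one [CLS] embedding per sentence, which a scoring head then ranks for sentence selection. The `squeezebert/squeezebert-uncased` checkpoint from Hugging Face Transformers is assumed, and the linear scorer, the example sentences, and the top-k cut-off are hypothetical stand-ins; in the actual model the scoring layers are trained on summarization data, so only the data flow is shown here.

```python
# Illustrative, untrained sketch of BERTSum-style extractive scoring with a
# SqueezeBERT encoder (assumes the public Hugging Face checkpoint below).
import torch
from transformers import AutoModel, AutoTokenizer

MODEL_NAME = "squeezebert/squeezebert-uncased"  # assumed public checkpoint
tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
encoder = AutoModel.from_pretrained(MODEL_NAME)

# Hypothetical example document, already split into sentences.
sentences = [
    "Extractive summarization selects salient sentences from the source document.",
    "The weather in Cairo was pleasant last weekend.",
    "BERT encodes each token with bidirectional self-attention over the whole input.",
]

# BERTSum-style input: wrap every sentence in [CLS] ... [SEP] so the encoder
# emits one [CLS] hidden state per sentence.
text = " ".join(f"{tokenizer.cls_token} {s} {tokenizer.sep_token}" for s in sentences)
inputs = tokenizer(text, return_tensors="pt", add_special_tokens=False,
                   truncation=True, max_length=512)

with torch.no_grad():
    hidden = encoder(**inputs).last_hidden_state      # (1, seq_len, hidden_size)

# Gather the hidden state of each [CLS] token as that sentence's embedding.
cls_mask = inputs["input_ids"][0] == tokenizer.cls_token_id
sentence_vecs = hidden[0][cls_mask]                    # (num_sentences, hidden_size)

# Untrained stand-in for the fine-tuned summarization head: linear layer + sigmoid.
scorer = torch.nn.Linear(encoder.config.hidden_size, 1)
scores = torch.sigmoid(scorer(sentence_vecs)).squeeze(-1)

# Select the top-k sentences (k = 2 here), kept in document order, as the summary.
top_idx = torch.topk(scores, k=2).indices.sort().values
print(" ".join(sentences[int(i)] for i in top_idx))
```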
Keywords: extractive summarization; deep learning models; recurrent neural networks; supervised learning; transformers; BERT; DistilBERT; SqueezeBERT

Share and Cite

MDPI and ACS Style

Abdel-Salam, S.; Rafea, A. Performance Study on Extractive Text Summarization Using BERT Models. Information 2022, 13, 67. https://doi.org/10.3390/info13020067


