Leveraging Semantic Text Analysis to Improve the Performance of Transformer-Based Relation Extraction

Evans, Marie-Therese Charlotte; Latifi, Majid; Ahsan, Mominul; Haider, Julfikar

doi:10.3390/info15020091

¹ Solution Consultant, IDHL Group, Central House, Otley Road, Harrogate HG3 1UF, UK

² Department of Computer Science, University of York, Deramore Lane, York YO10 5GH, UK

³ Department of Engineering, Manchester Metropolitan University, John Dalton Building, Chester Street, Manchester M1 5GD, UK

* Author to whom correspondence should be addressed.

Information 2024, 15(2), 91; https://doi.org/10.3390/info15020091

Submission received: 12 January 2024 / Revised: 2 February 2024 / Accepted: 2 February 2024 / Published: 6 February 2024

(This article belongs to the Section Information Applications)

Abstract

Keyword extraction from Knowledge Bases underpins the definition of relevancy in Digital Library search systems. However, it is the task of Joint Relation Extraction that populates the Knowledge Bases from which results are retrieved. Recent work focuses on fine-tuned, pre-trained Transformers, yet F1 scores for scientific literature reach just 53.2, versus 69 in the general domain. This research demonstrates that existing work fails to evidence the rationale for its optimisations to fine-tuned classifiers. Instead, emerging research subjectively adopts the common belief that Natural Language Processing techniques fail to derive context and shared knowledge. In fact, global context and shared knowledge account for just 10.4% and 11.2% of total relation misclassifications, respectively. In this work, the novel use of semantic text analysis objectively identifies the challenges facing Transformer-based classification in Joint Relation Extraction. This is the first known work to quantify that pipelined error propagation accounts for 45.3% of total relation misclassifications, making it the most significant challenge in this domain. More specifically, Part-of-Speech tagging highlights the misclassification of complex noun phrases, which accounts for 25.47% of relation misclassifications. Furthermore, this study identifies two limitations in the purported bidirectionality of the Bidirectional Encoder Representations from Transformers (BERT) Pre-trained Language Model. Firstly, right-to-left relations are misclassified at double the rate of left-to-right relations. Secondly, a failure to recognise local context through determiners and prepositions contributes to 16.04% of misclassifications. Finally, the annotation scheme of the only dataset used in existing research, Scientific Entities, Relations and Coreferences (SciERC), is shown to be marred by ambiguity: two asymmetric relations within this dataset achieve recall rates of only 10% and 29%.
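
As a rough illustration of the metrics quoted in the abstract, the sketch below computes per-relation recall and a micro-averaged F1 score over directed relation triples. It is a minimal example under stated assumptions, not the evaluation code used in the study: the function names `per_relation_recall` and `micro_f1`, the `(head, tail, label)` triple format, and the toy data are all hypothetical, and a prediction is assumed to count as correct only on an exact match, so reversing the direction of an asymmetric relation is scored as an error.

```python
from collections import Counter


def per_relation_recall(gold, predicted):
    """Recall per relation label: matched gold triples / all gold triples."""
    predicted_set = set(predicted)
    totals = Counter()
    hits = Counter()
    for triple in gold:
        label = triple[2]
        totals[label] += 1
        # Exact match on (head, tail, label); direction matters.
        if triple in predicted_set:
            hits[label] += 1
    return {label: hits[label] / totals[label] for label in totals}


def micro_f1(gold, predicted):
    """Micro-averaged F1 over all triples, pooled across labels."""
    gold_set, predicted_set = set(gold), set(predicted)
    true_positives = len(gold_set & predicted_set)
    if true_positives == 0:
        return 0.0
    precision = true_positives / len(predicted_set)
    recall = true_positives / len(gold_set)
    return 2 * precision * recall / (precision + recall)


# Toy data invented for illustration only; not drawn from any dataset.
gold = [
    ("parser", "treebank", "USED-FOR"),
    ("BLEU", "translation", "EVALUATE-FOR"),
    ("lexicon", "parser", "PART-OF"),
    ("tagger", "corpus", "USED-FOR"),
]
predicted = [
    ("parser", "treebank", "USED-FOR"),
    ("translation", "BLEU", "EVALUATE-FOR"),  # direction reversed: an error
    ("lexicon", "parser", "PART-OF"),
]

print(per_relation_recall(gold, predicted))
print(round(micro_f1(gold, predicted), 4))
```

Reporting recall per label, rather than only a pooled F1, is what makes a weak relation type visible: in the toy data the reversed triple drives recall for its label to zero while the pooled score stays moderate.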

Keywords: Joint Relation Extraction (JRE); digital libraries; Named Entity Recognition (NER); Relation Extraction (RE); Pre-trained Language Model; transformer; SciBERT; Scientific Entities, Relations and Coreferences (SciERC); PL-Marker; semantic text analysis; global context
