Search Results (2)

Search Parameters:
Keywords = BIOES tagging scheme

18 pages, 1241 KB  
Article
Cross-Lingual Transfer of Named Entity Markup with Large Language Models
by Vladimir Barakhnin, Rustam Mussabayev, Davlatyor Mengliev, Alexander Krassovitskiy, Alymzhan Toleu, Daniil Lyutaev, Iskander Akhmetov and Bahodir Ibragimov
Informatics 2026, 13(5), 70; https://doi.org/10.3390/informatics13050070 - 7 May 2026
Viewed by 940
Abstract
This paper investigates the problem of cross-lingual named entity recognition (NER), which involves automatically identifying entities such as persons, organizations, locations, and other structured elements in text. High-quality NER typically requires manually annotated corpora; however, for many low-resource languages, such data are scarce and costly to produce. The study addresses the following question: can annotated sentences in one language be used to transfer NER markup to their machine-translated counterparts in other languages? To explore this, we propose an approach based on a large language model (LLM) that performs two tasks simultaneously: translating a source sentence and generating BIOES-formatted entity tags for the translated output. To improve robustness and reduce semantic drift, a back-translation step is incorporated to verify meaning preservation by comparing the reconstructed source sentence with the original. The proposed method is compared with two baseline approaches: (1) annotation projection via machine translation and (2) automatic tagging using pre-existing NER tools. Performance is evaluated using standard metrics, including precision, recall, and F1-score. Experimental results demonstrate that the LLM-based approach provides a practical and efficient mechanism for transferring NER annotations across languages. While the method achieves strong and balanced performance, its quality remains influenced by translation accuracy and adherence to annotation constraints. Methodologically, the approach can be considered relatively language-independent, as it relies on general LLM capabilities, a universal tagging scheme, and multilingual semantic representations rather than language-specific model training.
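For readers unfamiliar with the BIOES scheme named in the abstract, the following is a minimal sketch of how entity spans map to BIOES tags and how a generated tag sequence could be checked against the scheme's constraints. It is not the authors' implementation: the function names `spans_to_bioes` and `is_valid_bioes`, the span format, and the `PER`/`LOC` labels are assumptions chosen for illustration.

```python
from typing import List, Tuple

# Hypothetical span format: (start index, end index inclusive, entity type).
Span = Tuple[int, int, str]


def spans_to_bioes(n_tokens: int, spans: List[Span]) -> List[str]:
    """Encode non-overlapping entity spans as BIOES tags."""
    tags = ["O"] * n_tokens
    for start, end, label in spans:
        if start == end:
            # Single-token entity.
            tags[start] = f"S-{label}"
        else:
            tags[start] = f"B-{label}"
            for i in range(start + 1, end):
                tags[i] = f"I-{label}"
            tags[end] = f"E-{label}"
    return tags


def is_valid_bioes(tags: List[str]) -> bool:
    """Return True if the sequence respects BIOES transition constraints."""
    open_label = None  # type of the entity currently being read, if any
    for tag in tags:
        prefix, _, label = tag.partition("-")
        if tag == "O":
            if open_label is not None:
                return False
        elif prefix == "S" and label:
            if open_label is not None:
                return False
        elif prefix == "B" and label:
            if open_label is not None:
                return False
            open_label = label
        elif prefix == "I" and label:
            if open_label != label:
                return False
        elif prefix == "E" and label:
            if open_label != label:
                return False
            open_label = None
        else:
            return False  # unknown or malformed tag
    return open_label is None  # an entity left open is invalid


tokens = ["Ada", "Lovelace", "visited", "London"]
tags = spans_to_bioes(len(tokens), [(0, 1, "PER"), (3, 3, "LOC")])
print(tags)  # ['B-PER', 'E-PER', 'O', 'S-LOC']
```

A check of this kind is one plausible way to detect LLM output that violates the tagging scheme, for example a `B-PER` tag that is never closed by a matching `E-PER`.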

13 pages, 1576 KB  
Article
Chinese Named Entity Recognition in Football Based on ALBERT-BiLSTM Model
by Qi An, Bingyu Pan, Zhitong Liu, Shutong Du and Yixiong Cui
Appl. Sci. 2023, 13(19), 10814; https://doi.org/10.3390/app131910814 - 28 Sep 2023
Cited by 10 | Viewed by 3099
Abstract
Football is one of the most popular sports in the world, giving rise to a wide range of research topics related to its off- and on-the-pitch performance. The extraction of football entities from football news helps to construct sports frameworks, integrate sports resources, and capture the dynamics of the sport in a timely manner through visual text mining results, including the connections among football players, football clubs, and football competitions. It also makes it more convenient to observe and analyze the developmental tendencies of football. Therefore, in this paper, we constructed a 1,000,000-word Chinese corpus in the field of football and proposed a BiLSTM-based model for named entity recognition. The ALBERT-BiLSTM combination model of deep learning is used for entity extraction of football textual data. Based on the BiLSTM model, we introduced ALBERT as a pre-training model to extract character-level features and enhance the generalization ability of word embedding vectors. We then compared the results of two different annotation schemes, BIO and BIOE, and two deep learning models, ALBERT-BiLSTM-CRF and ALBERT-BiLSTM. It was verified that BIOE tagging was superior to BIO, and that the ALBERT-BiLSTM model was more suitable for football datasets. The precision, recall, and F-score of the model were 85.4%, 83.47%, and 84.37%, respectively.
(This article belongs to the Special Issue Application of Machine Learning in Text Mining)
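The difference between the BIO and BIOE schemes compared in this abstract can be illustrated with a small conversion sketch. It assumes the common reading of BIOE, in which the last token of a multi-token entity is tagged `E-` and a single-token entity keeps its `B-` tag; the abstract does not spell out its exact convention. The function name `bio_to_bioe` and the `PLAYER`/`CLUB` labels are hypothetical.

```python
from typing import List


def bio_to_bioe(tags: List[str]) -> List[str]:
    """Convert BIO tags to BIOE by marking the final token of each
    multi-token entity with an E- prefix (assumed convention)."""
    out = list(tags)
    for i, tag in enumerate(tags):
        if not tag.startswith("I-"):
            continue
        label = tag[2:]
        next_tag = tags[i + 1] if i + 1 < len(tags) else "O"
        # The entity ends here unless the next tag continues the same entity.
        if next_tag != f"I-{label}":
            out[i] = f"E-{label}"
    return out


bio = ["B-PLAYER", "I-PLAYER", "I-PLAYER", "O", "B-CLUB"]
print(bio_to_bioe(bio))
# ['B-PLAYER', 'I-PLAYER', 'E-PLAYER', 'O', 'B-CLUB']
```

Marking entity ends explicitly gives a sequence model an extra boundary signal, which is one possible reason a BIOE variant could outperform plain BIO on a given dataset.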
