MDPI - Publisher of Open Access Journals

18 pages, 1241 KB

Open AccessArticle

Cross-Lingual Transfer of Named Entity Markup with Large Language Models

by Vladimir Barakhnin, Rustam Mussabayev, Davlatyor Mengliev, Alexander Krassovitskiy, Alymzhan Toleu, Daniil Lyutaev, Iskander Akhmetov and Bahodir Ibragimov

Informatics 2026, 13(5), 70; https://doi.org/10.3390/informatics13050070 - 7 May 2026

Viewed by 940

Abstract

This paper investigates the problem of cross-lingual named entity recognition (NER), which involves automatically identifying entities such as persons, organizations, locations, and other structured elements in text. High-quality NER typically requires manually annotated corpora; however, for many low-resource languages, such data are scarce and costly to produce. The study addresses the following question: can annotated sentences in one language be used to transfer NER markup to their machine-translated counterparts in other languages? To explore this, we propose an approach based on a large language model (LLM) that performs two tasks simultaneously: translating a source sentence and generating BIOES-formatted entity tags for the translated output. To improve robustness and reduce semantic drift, a back-translation step is incorporated to verify meaning preservation by comparing the reconstructed source sentence with the original. The proposed method is compared with two baseline approaches: (1) annotation projection via machine translation and (2) automatic tagging using pre-existing NER tools. Performance is evaluated using standard metrics, including precision, recall, and F1-score. Experimental results demonstrate that the LLM-based approach provides a practical and efficient mechanism for transferring NER annotations across languages. While the method achieves strong and balanced performance, its quality remains influenced by translation accuracy and adherence to annotation constraints. Methodologically, the approach can be considered relatively language-independent, as it relies on general LLM capabilities, a universal tagging scheme, and multilingual semantic representations rather than language-specific model training. Full article

► Show Figures

Figure 1

13 pages, 1576 KB

Open AccessArticle

Chinese Named Entity Recognition in Football Based on ALBERT-BiLSTM Model

by Qi An, Bingyu Pan, Zhitong Liu, Shutong Du and Yixiong Cui

Appl. Sci. 2023, 13(19), 10814; https://doi.org/10.3390/app131910814 - 28 Sep 2023

Cited by 10 | Viewed by 3099

Abstract

Football is one of the most popular sports in the world, arousing a wide range of research topics related to its off- and on-the-pitch performance. The extraction of football entities from football news helps to construct sports frameworks, integrate sports resources, and timely capture the dynamics of the sports through visual text mining results, including the connections among football players, football clubs, and football competitions, and it is of great convenience to observe and analyze the developmental tendencies of football. Therefore, in this paper, we constructed a 1000,000-word Chinese corpus in the field of football and proposed a BiLSTM-based model for named entity recognition. The ALBERT-BiLSTM combination model of deep learning is used for entity extraction of football textual data. Based on the BiLSTM model, we introduced ALBERT as a pre-training model to extract character and enhance the generalization ability of word embedding vectors. We then compared the results of two different annotation schemes, BIO and BIOE, and two deep learning models, ALBERT-BiLSTM-CRF and ALBERT BiLSTM. It was verified that the BIOE tagging was superior than BIO, and the ALBERT-BiLSTM model was more suitable for football datasets. The precision, recall, and F-Score of the model were 85.4%, 83.47%, and 84.37%, correspondingly. Full article

(This article belongs to the Special Issue Application of Machine Learning in Text Mining)

► Show Figures

Figure 1

Search Results (2)

Further Information

Guidelines

MDPI Initiatives

Follow MDPI

Saved Queries

Search Filter Reset All

Years

Feature Papers

Subjects

Journals

Article Types

Countries / Regions

Search Results (2)

Further Information

Guidelines

MDPI Initiatives

Follow MDPI