Search Results (876)

Search Parameters:
Keywords = multilingual

11 pages, 670 KB  
Review
Supporting Primary Care Communication on Vaccination in Multilingual and Culturally Diverse Settings: Lessons from South Tyrol, Italy
by Christian J. Wiedermann, Giuliano Piccoliori and Adolf Engl
Epidemiologia 2025, 6(3), 50; https://doi.org/10.3390/epidemiologia6030050 - 2 Sep 2025
Viewed by 116
Abstract
Background: Vaccine hesitancy is a major threat to public health. As part of efforts to increase vaccine uptake, the focus is on optimizing the quality of communication among healthcare workers. Physician shortages and workloads create time constraints, making communication interventions in primary care challenging. This study aimed to propose strategies to improve communication between general practitioners and vaccine-hesitant individuals. This narrative review addresses the specific needs of general practitioners for effective communication and proposes strategies to combat vaccine hesitancy in culturally and linguistically diverse regions. Methods: Systematic searches of EMBASE and PubMed were performed using terms related to vaccine hesitancy, communication strategies, primary care, and cultural diversity. Additionally, the websites of major health organizations were searched for relevant reports and guidelines. Selection criteria were based on the relevance and quality of the selected studies. Results: The findings highlight the importance of empathy, transparency, and personalized information in communication strategies. The need for communication training and addressing policy and workload barriers for healthcare providers is significant. The proposed strategy includes regular communication skills and cultural competency workshops, language training, the development of multilingual resources, implementation of telemedicine services, and active community engagement. Conclusions: Policy recommendations advocate for increased primary care resources, support from general practitioner unions, and the integration of digital tools. These strategies are essential to improve vaccine uptake and public health outcomes by enhancing the capacity of general practitioners to effectively engage with vaccine-hesitant patients. Full article

20 pages, 487 KB  
Article
NLP and Text Mining for Enriching IT Professional Skills Frameworks
by Danial Zare, Luis Fernandez-Sanz, Vera Pospelova and Inés López-Baldominos
Appl. Sci. 2025, 15(17), 9634; https://doi.org/10.3390/app15179634 - 1 Sep 2025
Viewed by 156
Abstract
The European e-Competence Framework (e-CF) and the European Skills, Competences, Qualifications and Occupations (ESCO) classification are two key initiatives developed by the European Commission to support skills transparency, mobility, and interoperability across labour and education systems. While e-CF defines essential competences for ICT professionals through a structured framework, it provides only a limited number of illustrative skills and knowledge examples for each competence. In contrast, ESCO offers a rich, multilingual taxonomy of skills and knowledge, each accompanied by a detailed description, alternative labels, and links to relevant occupations. This paper explores the possibility of enriching the e-CF framework by linking it to relevant ESCO ICT skills using text embedding (MPNet) and cosine similarity. This approach extends each e-CF competence to 15–25 semantically aligned skills and knowledge items, each with a full description officially translated into all EU languages, instead of the present 4–10 brief examples. This significantly improves the clarity, usability, and interpretability of e-CF competences for the various stakeholders. Furthermore, since ESCO terminology serves as the foundation for labour market analysis across the EU, establishing this linkage provides a valuable bridge between the e-CF competence model and real-time labour market intelligence, a connection that is not currently available. The results of this study offer practical insights into the application of semantic technologies to the enhancement and mutual alignment of European ICT skills frameworks. Full article
(This article belongs to the Special Issue Natural Language Processing (NLP) and Applications—2nd Edition)
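As a rough illustration of the linking step described in this abstract, the sketch below embeds an e-CF competence description and candidate ESCO skills with an MPNet sentence encoder and ranks them by cosine similarity. The checkpoint name, the toy strings, and the top-k cutoff are assumptions, not details taken from the paper.

```python
# Minimal sketch of the embedding-and-similarity linking step described above.
# Assumptions (not from the paper): the MPNet checkpoint, the toy strings, and
# the top-k cutoff; the authors' actual pipeline and thresholds may differ.
from sentence_transformers import SentenceTransformer, util

model = SentenceTransformer("sentence-transformers/all-mpnet-base-v2")

# One e-CF competence description (toy text) and a handful of ESCO skill entries.
competence = "Systematically develops and maintains ICT applications and components."
esco_skills = [
    "use software libraries",
    "debug software",
    "develop ICT test suite",
    "manage localisation",
    "analyse big data",
]

comp_emb = model.encode(competence, convert_to_tensor=True)
skill_embs = model.encode(esco_skills, convert_to_tensor=True)

# Cosine similarity between the competence and every candidate ESCO skill.
scores = util.cos_sim(comp_emb, skill_embs)[0]

# Keep the best-matching skills (the paper reports 15-25 per competence;
# k=3 here only because the toy list is short).
top_k = scores.topk(k=3)
for score, idx in zip(top_k.values, top_k.indices):
    print(f"{esco_skills[int(idx)]}  (cos sim = {float(score):.3f})")
```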

23 pages, 1233 KB  
Article
Decoding the Digits: How Number Notation Influences Cognitive Effort and Performance in Chinese-to-English Sight Translation
by Xueyan Zong, Lei Song and Shanshan Yang
Behav. Sci. 2025, 15(9), 1195; https://doi.org/10.3390/bs15091195 - 1 Sep 2025
Viewed by 124
Abstract
Numbers present persistent challenges in interpreting, yet cognitive mechanisms underlying notation-specific processing remain underexplored. While eye-tracking studies in visually-assisted simultaneous interpreting have advanced number research, they predominantly examine Arabic numerals in non-Chinese contexts—neglecting notation diversity increasingly prevalent in computer-assisted interpreting systems where Automatic Speech Recognition outputs vary across languages. Addressing these gaps, this study investigated how number notation (Arabic digits vs. Chinese character numbers) affects trainee interpreters’ cognitive effort and performance in Chinese-to-English sight translation. Employing a mixed-methods design, we measured global (task-level) and local (number-specific) eye movements alongside expert assessments, output analysis, and subjective assessments. Results show that Chinese character numbers demand significantly greater cognitive effort than Arabic digits, evidenced by more and longer fixations, more extensive saccadic movements, and a larger eye-voice span. Concurrently, sight translation quality decreased markedly with Chinese character numbers, with more processing attempts yet lower accuracy and fluency. Subjective workload ratings confirmed higher mental, physical, and temporal demands in Task 2. These findings reveal an effort-quality paradox where greater cognitive investment in processing complex notations leads to poorer outcomes, and highlight the urgent need for notation-specific training strategies and adaptive technologies in multilingual communication. Full article
(This article belongs to the Section Cognition)

14 pages, 657 KB  
Article
Pretrained Models Against Traditional Machine Learning for Detecting Fake Hadith
by Jawaher Alghamdi, Adeeb Albukhari and Thair Al-Dala’in
Electronics 2025, 14(17), 3484; https://doi.org/10.3390/electronics14173484 - 31 Aug 2025
Viewed by 204
Abstract
The proliferation of fake news, particularly in sensitive domains like religious texts, necessitates robust authenticity verification methods. This study addresses the growing challenge of authenticating Hadith, where traditional methods relying on the analysis of the chain of narrators (Isnad) and the content (Matn) are increasingly strained by the sheer volume in circulation. To combat this issue, machine learning (ML) and natural language processing (NLP) techniques, specifically through transfer learning, are explored to automate Hadith classification into Genuine and Fake categories. This study utilizes an imbalanced dataset of 8544 Hadiths, with 7008 authentic and 1536 fake Hadiths, to systematically investigate the collective impact of both linguistic and contextual features, particularly the chain of narrators (Isnad), on Hadith authentication. For the first time in this specialized domain, state-of-the-art pre-trained language models (PLMs) such as Multilingual BERT (mBERT), CamelBERT, and AraBERT are evaluated alongside classical algorithms like logistic regression (LR) and support vector machine (SVM) for Hadith authentication. Our best-performing model, AraBERT, achieved a 99.94% F1-score when including the chain of narrators, demonstrating the profound effectiveness of contextual elements (Isnad) in significantly improving accuracy. This provides novel insights into the indispensable role of computational methods in Hadith authentication and reinforces the traditional scholarly emphasis on the Isnad. This research represents a significant advancement in combating misinformation in this important field. Full article
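The sketch below illustrates only the classical side of the comparison described above: character n-gram TF-IDF features fed to logistic regression and a linear SVM. The file name, column layout, and hyperparameters are assumptions; the PLM side of the study fine-tunes models such as AraBERT with a standard sequence-classification head.

```python
# Sketch of the classical-ML side of the comparison (TF-IDF features + LR/SVM).
# Assumptions: the CSV layout ("text" combining Isnad + Matn, "label" in {0, 1})
# and all hyperparameters; the dataset itself is not reproduced here.
import pandas as pd
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.svm import LinearSVC
from sklearn.model_selection import train_test_split
from sklearn.metrics import f1_score

df = pd.read_csv("hadith.csv")          # hypothetical file
X_train, X_test, y_train, y_test = train_test_split(
    df["text"], df["label"], test_size=0.2, stratify=df["label"], random_state=42
)

# Character n-grams are a common choice for Arabic text because they are
# robust to rich morphology and orthographic variation.
vec = TfidfVectorizer(analyzer="char_wb", ngram_range=(2, 5), min_df=2)
Xtr, Xte = vec.fit_transform(X_train), vec.transform(X_test)

for name, clf in [("LR", LogisticRegression(max_iter=1000)), ("SVM", LinearSVC())]:
    clf.fit(Xtr, y_train)
    print(name, "F1 =", round(f1_score(y_test, clf.predict(Xte)), 4))
```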

29 pages, 1104 KB  
Article
Deaf and Indigenous Curricula and Eco-Pedagogies: Hybridizing Languacultures and Biocultures for Sustainable STEAM Education Founded on Collaboration, Mutualism, and Symbiosis
by Michael E. Skyer and Melanie McKay-Cody
Educ. Sci. 2025, 15(9), 1132; https://doi.org/10.3390/educsci15091132 - 30 Aug 2025
Viewed by 289
Abstract
STEM ideologies provoke environmental destruction that uniquely targets deaf, disabled, and Indigenous people. Our analysis counteracts harms caused by governmental, industrial, and educational agents who weaponize STEM ideologies against downstream people, animals, plants, environments, and biogeochemical entities. We explore two research questions via a theoretical framework about biocultural deaf gains and deaf/Indigenous languacultures to center the arts in STEAM. As a result, we synthesized a conceptual framework called Deaf and Indigenous Curricula and Eco-pedagogies (DICE), which are multimodal, multilingual approaches to STEAM education emphasizing place-based ecology and the arts, including knowledge emanating from Indigenous Deaf Cultures, Indigenous sign languages, and epistemologists who are deaf, disabled, women, and Indigenous (singly or in combination). DICE is designed to reinvigorate communities and ecologies at risk of destruction from colonialism and run-amok capitalism. Within and across Indigenous and Deaf lifeworlds, our model explores collaboration, mutualism, and symbiosis. These are situated in examples drawn from the research, abductive reasoning, our life histories, and the creative works of Deaf Indigenous scientists and artists. In sum, alongside uprising Indigenous voices, deaf hands shall rise in solidarity to aid Earth’s defense. Full article
(This article belongs to the Special Issue Full STEAM Ahead! in Deaf Education)

18 pages, 3723 KB  
Article
Empowering Weak Languages Through Cross-Language Hyperlink Recommendation
by Nhu Nguyen, Hideaki Takeda and Lakshan Karunathilake
Information 2025, 16(9), 749; https://doi.org/10.3390/info16090749 - 29 Aug 2025
Viewed by 301
Abstract
Wikipedia is an important platform for promoting language inclusivity and sharing global knowledge. However, while languages with more resources have a lot of content, languages with fewer resources face challenges in accessibility and cultural representation. To help address this gap, we use multilingual datasets and neural graph collaborative filtering to recommend missing hyperlinks, helping to improve low-resource languages on Wikipedia. By encouraging cross-language collaboration, this method strengthens the connections and content of these languages, promoting cultural sustainability and digital inclusion. Experimental results show significant improvement in recommendation quality, with clear benefits for weaker languages. This highlights the role of recommender systems in preserving unique cultural aspects, building connections between language communities, and supporting fair knowledge sharing in a globalized world. Full article
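To make the recommend-missing-hyperlinks idea concrete, the sketch below factorizes a toy article-to-target link matrix and scores absent links. It is a deliberately simplified stand-in (truncated SVD) for the neural graph collaborative filtering model the paper actually uses, and all of the data below is invented.

```python
# Simplified stand-in for the recommendation step: factorize the article-to-target
# link matrix and score links that are absent in the low-resource edition.
# The paper uses neural graph collaborative filtering (NGCF); plain truncated SVD
# is used here only to keep the sketch short. All data below is invented.
import numpy as np
from sklearn.decomposition import TruncatedSVD

articles = ["Coffee", "Tea", "Espresso", "Green_tea"]
targets = ["Caffeine", "Brewing", "Italy", "Japan", "Antioxidant"]

# 1 = hyperlink already present in the low-resource edition (toy data).
links = np.array([
    [1, 1, 0, 0, 0],   # Coffee
    [1, 1, 0, 1, 1],   # Tea
    [1, 0, 1, 0, 0],   # Espresso
    [0, 1, 0, 1, 1],   # Green_tea
])

svd = TruncatedSVD(n_components=2, random_state=0)
article_vecs = svd.fit_transform(links)          # latent article factors
target_vecs = svd.components_.T                  # latent target factors
scores = article_vecs @ target_vecs.T            # predicted link affinity

# Recommend the highest-scoring *missing* link per article.
for i, art in enumerate(articles):
    missing = np.where(links[i] == 0)[0]
    best = missing[np.argmax(scores[i, missing])]
    print(f"{art}: suggest linking to {targets[best]} (score {scores[i, best]:.2f})")
```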

30 pages, 21387 KB  
Article
An Intelligent Docent System with a Small Large Language Model (sLLM) Based on Retrieval-Augmented Generation (RAG)
by Taemoon Jung and Inwhee Joe
Appl. Sci. 2025, 15(17), 9398; https://doi.org/10.3390/app15179398 - 27 Aug 2025
Viewed by 378
Abstract
This study designed and empirically evaluated a method to enhance information accessibility for museum and art gallery visitors using a small Large Language Model (sLLM) based on the Retrieval-Augmented Generation (RAG) framework. Over 199,000 exhibition descriptions were collected and refined, and a question-answering dataset consisting of 102,000 pairs reflecting user personas was constructed to develop DocentGemma, a domain-optimized language model. This model was fine-tuned through Low-Rank Adaptation (LoRA) based on Google’s Gemma2-9B and integrated with FAISS and OpenSearch-based document retrieval systems within the LangChain framework. Performance evaluation was conducted using a dedicated Q&A benchmark for the docent domain, comparing the model against five commercial and open-source LLMs (including GPT-3.5 Turbo, LLaMA3.3-70B, and Gemma2-9B). DocentGemma achieved an accuracy of 85.55% and a perplexity of 3.78, demonstrating competitive performance in language generation and response accuracy within the domain-specific context. To enhance retrieval relevance, a Spatio-Contextual Retriever (SC-Retriever) was introduced, which combines semantic similarity and spatial proximity based on the user’s query and location. An ablation study confirmed that integrating both modalities improved retrieval quality, with the SC-Retriever achieving a recall@1 of 53.45% and a Mean Reciprocal Rank (MRR) of 68.12, representing a 17.5–20% gain in search accuracy compared to baseline models such as GTE and SpatialNN. System performance was further validated through field deployment at three major exhibition venues in Seoul (the Seoul History Museum, the Hwan-ki Museum, and the Hanseong Baekje Museum). A user test involving 110 participants indicated high response credibility and an average satisfaction score of 4.24. To ensure accessibility, the system supports various output formats, including multilingual speech and subtitles. This work illustrates a practical application of integrating LLM-based conversational capabilities into traditional docent services and suggests potential for further development toward location-aware interactive systems and AI-driven cultural content services. Full article
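The abstract does not spell out how the SC-Retriever fuses its two signals, so the sketch below simply takes a weighted combination of cosine similarity and a distance-based proximity score. The weight, the distance scale, and the toy exhibit data are assumptions; the paper's SC-Retriever may combine the modalities differently.

```python
# Hedged sketch of the idea behind a spatio-contextual retriever: rank exhibit
# documents by a weighted mix of semantic similarity to the query and spatial
# proximity to the visitor. The fusion weight, distance scale, and embeddings
# below are assumptions, not the paper's actual formulation.
import numpy as np

def cosine(a, b):
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

def proximity(visitor_xy, exhibit_xy, scale=10.0):
    # Decays from 1.0 (same spot) toward 0.0 as the distance (in metres) grows.
    d = np.linalg.norm(np.asarray(visitor_xy) - np.asarray(exhibit_xy))
    return float(np.exp(-d / scale))

def sc_score(query_emb, doc_emb, visitor_xy, exhibit_xy, alpha=0.7):
    return alpha * cosine(query_emb, doc_emb) + (1 - alpha) * proximity(visitor_xy, exhibit_xy)

# Toy example: two exhibit descriptions, a query embedding, and a visitor position.
rng = np.random.default_rng(0)
query = rng.normal(size=8)
docs = {
    "Hanseong Baekje pottery": (rng.normal(size=8), (2.0, 1.0)),
    "Joseon-era city map": (rng.normal(size=8), (40.0, 35.0)),
}
visitor = (3.0, 2.0)

ranked = sorted(docs.items(),
                key=lambda kv: sc_score(query, kv[1][0], visitor, kv[1][1]),
                reverse=True)
for name, _ in ranked:
    print(name)
```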

19 pages, 437 KB  
Article
Research on Generation and Quality Evaluation of Earthquake Emergency Language Service Contingency Plan Based on Chain-of-Thought Prompt Engineering for LLMs
by Wenyan Zhang, Kai Zhang, Ti Li and Wenhua Deng
Inventions 2025, 10(5), 74; https://doi.org/10.3390/inventions10050074 - 26 Aug 2025
Viewed by 347
Abstract
China frequently experiences natural disasters, making emergency language services a key link in information transmission, cross-lingual communication, and resource coordination during disaster relief. Traditional contingency plans rely on manual experience, which results in low efficiency, limited coverage, and insufficient dynamic adaptability. Large language models (LLMs), with their advantages in semantic understanding, multilingual adaptation, and scalability, provide new technical approaches for emergency language services. Our study establishes the country’s first generative evaluation index system for emergency language service contingency plans, covering eight major dimensions. Through an evaluation of 11 mainstream large language models, including Deepseek, we find that these models perform excellently in precise service stratification and resource network stereoscopic coordination but show significant shortcomings in legal/regulatory frameworks and mechanisms for dynamic evolution. It is recommended to construct a more comprehensive emergency language service system by means of targeted data augmentation, multi-model collaboration, and human–machine integration so as to improve cross-linguistic communication efficiency in emergencies and reduce secondary risks caused by information transmission barriers. Full article
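The authors' actual prompts and eight-dimension rubric are not given in the abstract; the snippet below is only an illustrative chain-of-thought style prompt for drafting such a plan, with the reasoning steps and section names assumed.

```python
# Illustrative chain-of-thought style prompt for drafting a contingency plan.
# The staged reasoning steps and the section names are assumptions inspired by
# the abstract, not the authors' actual prompt or evaluation dimensions.
scenario = "M6.8 earthquake near a multilingual border prefecture; power and roads disrupted."

prompt = f"""You are an emergency language-service planner.
Scenario: {scenario}

Reason step by step before writing the plan:
1. List the affected population groups and the languages/dialects they use.
2. Identify the critical messages for the first 72 hours (warnings, shelter, medical).
3. Map each message to channels (SMS, radio, sign language video, community interpreters).
4. Note legal/regulatory constraints and how the plan will be updated as the situation evolves.

After the reasoning, output the contingency plan with sections:
Service Stratification, Resource Coordination, Legal Basis, Dynamic Evolution Mechanism.
"""

print(prompt)  # send this string to any LLM chat endpoint of your choice
```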

44 pages, 900 KB  
Article
MetaFFI-Multilingual Indirect Interoperability System
by Tsvi Cherny-Shahar and Amiram Yehudai
Software 2025, 4(3), 21; https://doi.org/10.3390/software4030021 - 26 Aug 2025
Viewed by 354
Abstract
The development of software applications using multiple programming languages has increased in recent years, as it allows the selection of the most suitable language and runtime for each component of the system and the integration of third-party libraries. However, this practice involves complexity and error proneness, due to the absence of an adequate system for the interoperability of multiple programming languages. Developers are compelled to resort to workarounds, such as library reimplementation or language-specific wrappers, which are often dependent on C as the common denominator for interoperability. These challenges render the use of multiple programming languages a burdensome and demanding task that necessitates highly skilled developers for implementation, debugging, and maintenance, and raise doubts about the benefits of interoperability. To overcome these challenges, we propose MetaFFI, introducing a fully in-process, plugin-oriented, runtime-independent architecture based on a minimal C abstraction layer. It provides deep binding without relying on a shared object model, virtual machine bytecode, or manual glue code. This architecture is scalable (O(n) integration for n languages) and supports true polymorphic function and object invocation across languages. MetaFFI is based on leveraging FFI and embedding mechanisms, which minimize restrictions on language selection while still enabling full-duplex binding and deep integration. This is achieved by exploiting the less restrictive shallow binding mechanisms (e.g., Foreign Function Interface) to offer deep binding features (e.g., object creation, methods, fields). MetaFFI provides a runtime-independent framework to load and xcall (Cross-Call) foreign entities (e.g., getters, functions, objects). MetaFFI uses Common Data Types (CDTs) to pass parameters and return values, including objects and complex types, and even cross-language callbacks and dynamic calling conventions for optimization. The indirect interoperability approach of MetaFFI has the significant advantage of requiring only 2n mechanisms to support n languages, compared to direct interoperability approaches that need n² mechanisms. We developed and tested a proof of concept tool interoperating three languages (Go, Python, and Java), on Windows and Ubuntu. To evaluate the approach and the tool, we conducted a user study, with promising results. The MetaFFI framework is available as open source software, including its full source code and installers, to facilitate adoption and collaboration across academic and industrial communities. Full article
(This article belongs to the Topic Software Engineering and Applications)

29 pages, 848 KB  
Article
Applying Additional Auxiliary Context Using Large Language Model for Metaphor Detection
by Takuya Hayashi and Minoru Sasaki
Big Data Cogn. Comput. 2025, 9(9), 218; https://doi.org/10.3390/bdcc9090218 - 25 Aug 2025
Viewed by 353
Abstract
Metaphor detection is challenging in natural language processing (NLP) because it requires recognizing nuanced semantic shifts beyond literal meaning, and conventional models often falter when contextual cues are limited. We propose a method to enhance metaphor detection by augmenting input sentences with auxiliary context generated by ChatGPT. In our approach, ChatGPT produces semantically relevant sentences that are inserted before, after, or on both sides of a target sentence, allowing us to analyze the impact of context position and length on classification. Experiments on three benchmark datasets (MOH-X, VUA_All, VUA_Verb) show that this context-enriched input consistently outperforms the no-context baseline across accuracy, precision, recall, and F1-score, with the MOH-X dataset achieving the largest F1 gain. These improvements are statistically significant based on two-tailed t-tests. Our findings demonstrate that generative models can effectively enrich context for metaphor understanding, highlighting context placement and quantity as critical factors. Finally, we outline future directions, including advanced prompt engineering, optimizing context lengths, and extending this approach to multilingual metaphor detection. Full article
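A minimal sketch of the augmentation step described above: auxiliary sentences are generated and placed before, after, or on both sides of the target sentence before it is passed to the metaphor classifier. The generate_context placeholder and prompt wording are assumptions; the paper obtains the auxiliary context from ChatGPT.

```python
# Sketch of the context-augmentation step: generate auxiliary sentences and place
# them before, after, or on both sides of the target sentence before classification.
# `generate_context` is a placeholder for the ChatGPT call used in the paper; the
# prompt wording and position options are assumptions based on the abstract.
def generate_context(sentence: str, n_sentences: int = 2) -> str:
    # Placeholder: in the paper this is produced by ChatGPT with a prompt such as
    # "Write {n_sentences} sentences that could naturally surround: '{sentence}'".
    return " ".join(f"[generated context sentence {i + 1}]" for i in range(n_sentences))

def augment(sentence: str, position: str = "both") -> str:
    ctx = generate_context(sentence)
    if position == "before":
        return f"{ctx} {sentence}"
    if position == "after":
        return f"{sentence} {ctx}"
    return f"{ctx} {sentence} {ctx}"   # "both"

target = "The lawyer shredded the witness's story."
for pos in ("before", "after", "both"):
    print(pos, "->", augment(target, pos))
# The augmented string is then fed to the metaphor classifier in place of the
# original sentence.
```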

21 pages, 728 KB  
Article
Resolving Linguistic Asymmetry: Forging Symmetric Multilingual Embeddings Through Asymmetric Contrastive and Curriculum Learning
by Lei Meng, Yinlin Li, Wei Wei and Caipei Yang
Symmetry 2025, 17(9), 1386; https://doi.org/10.3390/sym17091386 - 25 Aug 2025
Viewed by 473
Abstract
The pursuit of universal, symmetric semantic representations within large language models (LLMs) faces a fundamental challenge: the inherent asymmetry of natural languages. Different languages exhibit vast disparities in syntactic structures, lexical choices, and cultural nuances, making the creation of a truly shared, symmetric embedding space a non-trivial task. This paper aims to address this critical problem by introducing a novel framework to forge robust and symmetric multilingual sentence embeddings. Our approach, named DACL (Dynamic Asymmetric Contrastive Learning), is anchored in two powerful asymmetric learning paradigms: Contrastive Learning and Dynamic Curriculum Learning (DCL). We extend Contrastive Learning to the multilingual context, where it asymmetrically treats semantically equivalent sentences from different languages (positive pairs) and sentences with distinct meanings (negative pairs) to enforce semantic symmetry in the target embedding space. To further refine this process, we incorporate Dynamic Curriculum Learning, which introduces a second layer of asymmetry by dynamically scheduling training instances from easy to hard. This dual-asymmetric strategy enables the model to progressively master complex cross-lingual relationships, starting with more obvious semantic equivalences and advancing to subtler ones. Our comprehensive experiments on benchmark cross-lingual tasks, including sentence retrieval and cross-lingual classification (XNLI, PAWS-X, MLDoc, MARC), demonstrate that DACL significantly outperforms a wide range of established baselines. The results validate our dual-asymmetric framework as a highly effective approach for forging robust multilingual embeddings, particularly excelling in tasks involving complex linguistic asymmetries. Ultimately, this work contributes a novel dual-asymmetric learning framework that effectively leverages linguistic asymmetry to achieve robust semantic symmetry across languages. It offers valuable insights for developing more capable, fair, and interpretable multilingual LLMs, emphasizing that deliberately leveraging asymmetry in the learning process is a highly effective strategy. Full article
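The exact DACL loss and curriculum schedule are not given in the abstract; the sketch below shows a standard in-batch contrastive (InfoNCE-style) loss over translation pairs, plus an easy-to-hard ordering stub standing in for the dynamic curriculum. The temperature, batch shapes, and difficulty scores are assumptions.

```python
# Standard InfoNCE-style contrastive loss over translation pairs, as a stand-in
# for the contrastive component described above; the temperature, batch
# construction, and curriculum difficulty scores are all assumptions.
import torch
import torch.nn.functional as F

def contrastive_loss(src_emb, tgt_emb, temperature=0.05):
    # src_emb[i] and tgt_emb[i] embed the same sentence in two languages
    # (positive pair); every other row in the batch serves as a negative.
    src = F.normalize(src_emb, dim=-1)
    tgt = F.normalize(tgt_emb, dim=-1)
    logits = src @ tgt.T / temperature          # (batch, batch) similarity matrix
    labels = torch.arange(src.size(0))          # the diagonal holds the positives
    return F.cross_entropy(logits, labels)

# Curriculum stub: train on pairs sorted from easy to hard (difficulty scores assumed).
pairs = [("She runs fast.", "Elle court vite.", 0.1),
         ("Time is a thief.", "Le temps est un voleur.", 0.8)]
curriculum = sorted(pairs, key=lambda p: p[2])   # easy examples first
print([p[0] for p in curriculum])

# Toy check with random embeddings in place of the encoder output.
batch = torch.randn(4, 768)
print(float(contrastive_loss(batch, batch + 0.01 * torch.randn(4, 768))))
```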

36 pages, 590 KB  
Review
Machine Translation in the Era of Large Language Models: A Survey of Historical and Emerging Problems
by Duygu Ataman, Alexandra Birch, Nizar Habash, Marcello Federico, Philipp Koehn and Kyunghyun Cho
Information 2025, 16(9), 723; https://doi.org/10.3390/info16090723 - 25 Aug 2025
Viewed by 980
Abstract
Historically regarded as one of the most challenging tasks on the path to achieving complete artificial intelligence (AI), machine translation (MT) research has seen continuous devotion over the past decade, resulting in cutting-edge architectures for the modeling of sequential information. While the majority of statistical models traditionally relied on the idea of learning from parallel translation examples, recent research exploring self-supervised and multi-task learning methods extended the capabilities of MT models, eventually allowing the creation of general-purpose large language models (LLMs). In addition to versatility in providing translations useful across languages and domains, LLMs can in principle perform any natural language processing (NLP) task given a sufficient amount of task-specific examples. While LLMs now reach a point where they can both replace and augment traditional MT models, the extent of their advantages and the ways in which they leverage translation capabilities across multilingual NLP tasks remains a wide area for exploration. In this literature survey, we present an introduction to the current position of MT research with a historical look at different modeling approaches to MT, how these might be advantageous for the solution of particular problems, and which problems are solved or remain open in regard to recent developments. We also discuss the connection of MT models leading to the development of prominent LLM architectures, how they continue to support LLM performance across different tasks by providing a means for cross-lingual knowledge transfer, and the redefinition of the task with the possibilities that LLM technology brings. Full article
(This article belongs to the Special Issue Human and Machine Translation: Recent Trends and Foundations)

102 pages, 17708 KB  
Review
From Detection to Understanding: A Systematic Survey of Deep Learning for Scene Text Processing
by Zhandong Liu, Ruixia Song, Ke Li and Yong Li
Appl. Sci. 2025, 15(17), 9247; https://doi.org/10.3390/app15179247 - 22 Aug 2025
Viewed by 549
Abstract
Scene text understanding, serving as a cornerstone technology for autonomous navigation, document digitization, and accessibility tools, has witnessed a paradigm shift from traditional methods relying on handcrafted features and multi-stage processing pipelines to contemporary deep learning frameworks capable of learning hierarchical representations directly from raw image inputs. This survey distinctly categorizes modern scene text recognition (STR) methodologies into three principal paradigms: two-stage detection frameworks that employ region proposal networks for precise text localization, single-stage detectors designed to optimize computational efficiency, and specialized architectures tailored to handle arbitrarily shaped text through geometric-aware modeling techniques. Concurrently, an in-depth analysis of text recognition paradigms elucidates the evolutionary trajectory from connectionist temporal classification (CTC) and sequence-to-sequence models to transformer-based architectures, which excel in contextual modeling and demonstrate superior performance. In contrast to prior surveys, this work uniquely emphasizes several key differences and contributions. Firstly, it provides a comprehensive and systematic taxonomy of STR methods, explicitly highlighting the trade-offs between detection accuracy, computational efficiency, and geometric adaptability across different paradigms. Secondly, it delves into the nuances of text recognition, illustrating how transformer-based models have revolutionized the field by capturing long-range dependencies and contextual information, thereby addressing challenges in recognizing complex text layouts and multilingual scripts. Furthermore, the survey pioneers the exploration of critical research frontiers, such as multilingual text adaptation, enhancing model robustness against environmental variations (e.g., lighting conditions, occlusions), and devising data-efficient learning strategies to mitigate the dependency on large-scale annotated datasets. By synthesizing insights from technical advancements across 28 benchmark datasets and standardized evaluation protocols, this study offers researchers a holistic perspective on the current state-of-the-art, persistent challenges, and promising avenues for future research, with the ultimate goal of achieving human-level scene text comprehension. Full article

24 pages, 1496 KB  
Article
The Gradual Cyclical Process in Adaptive Gamified Learning: Generative Mechanisms for Motivational Transformation, Cognitive Advancement, and Knowledge Construction Strategy
by Liwei Ding and Hongfeng Zhang
Appl. Sci. 2025, 15(16), 9211; https://doi.org/10.3390/app15169211 - 21 Aug 2025
Viewed by 369
Abstract
The integration of gamification into digital learning environments is reshaping educational models, advancing towards more adaptive and personalized teaching evolution. However, within large Chinese corpora, the transition mechanism from passive participation to adaptive gamified learning remains underexplored in a systematic manner. This study fills this gap by utilizing LDA topic modeling and sentiment analysis techniques to delve into user comment data on the Bilibili platform. The results extract five major themes, which include multilingual task-driven learning, early-age programming thinking cultivation, modular English competency certification, cross-domain cognitive integration and psychological safety, as well as ubiquitous intelligent educational environments. The analysis reveals that most themes exhibit highly positive emotions, particularly in applications for early childhood education, while learning models that involve certification mechanisms and technological dependencies tend to provoke emotional fluctuations. Nevertheless, learners still experience certain challenges and pressures when faced with frequent cognitive tasks. In an innovative manner, this study proposes a theoretical framework based on Self-Determination Theory and Connectivism to analyze how motivation satisfaction drives cognitive restructuring, thereby facilitating the process of adaptive learning. This model demonstrates the evolutionary logic of learners’ cross-disciplinary knowledge integration and metacognitive strategy optimization, providing empirical support for the gamification learning transformation mechanism in China’s digital education sector and extending the research framework for personalized teaching and self-regulation in educational technology. Full article
(This article belongs to the Special Issue Adaptive E-Learning Technologies and Experiences)
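As a rough illustration of the two analysis steps named above, the sketch below runs LDA topic extraction and a trivial lexicon-based polarity score over toy English comments. The study itself analyses Chinese Bilibili comments, and the vectorizer settings and lexicon here are assumptions.

```python
# Sketch of the two analysis steps named above: LDA topic extraction and a simple
# sentiment score per comment. The toy English comments, the tiny lexicon, and all
# parameters are assumptions; the study processes Chinese Bilibili comments.
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.decomposition import LatentDirichletAllocation

comments = [
    "the coding game keeps my kid motivated and happy",
    "certification tasks feel stressful but the badges help",
    "love the english vocabulary quests, very fun",
    "too many timed tasks, pressure is exhausting",
]

vec = CountVectorizer(stop_words="english")
X = vec.fit_transform(comments)

lda = LatentDirichletAllocation(n_components=2, random_state=0).fit(X)
terms = vec.get_feature_names_out()
for k, topic in enumerate(lda.components_):
    top = [terms[i] for i in topic.argsort()[-4:][::-1]]
    print(f"topic {k}: {', '.join(top)}")

# Trivial lexicon-based polarity as a stand-in for the sentiment-analysis step.
positive = {"happy", "love", "fun", "help", "motivated"}
negative = {"stressful", "pressure", "exhausting"}
for c in comments:
    words = set(c.split())
    print(c[:30], "->", len(words & positive) - len(words & negative))
```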

18 pages, 3632 KB  
Article
Multilingual Mobility: Audio-Based Language ID for Automotive Systems
by Joowon Oh and Jeaho Lee
Appl. Sci. 2025, 15(16), 9209; https://doi.org/10.3390/app15169209 - 21 Aug 2025
Viewed by 402
Abstract
With the growing demand for natural and intelligent human–machine interaction in multilingual environments, automatic language identification (LID) has emerged as a crucial component in voice-enabled systems, particularly in the automotive domain. This study proposes an audio-based LID model that identifies the spoken language directly from voice input without requiring manual language selection. The model architecture leverages two types of feature extraction pipelines: a Variational Autoencoder (VAE) and a pre-trained Wav2Vec model, both used to obtain latent speech representations. These embeddings are then fed into a multi-layer perceptron (MLP)-based classifier to determine the speaker’s language among five target languages: Korean, Japanese, Chinese, Spanish, and French. The model is trained and evaluated using a dataset preprocessed into Mel-Frequency Cepstral Coefficients (MFCCs) and raw waveform inputs. Experimental results demonstrate the effectiveness of the proposed approach in achieving accurate and real-time language detection, with potential applications in in-vehicle systems, speech translation platforms, and multilingual voice assistants. By eliminating the need for predefined language settings, this work contributes to more seamless and user-friendly multilingual voice interaction systems. Full article
(This article belongs to the Section Computing and Artificial Intelligence)
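A reduced sketch of the classification stage: mean-pooled MFCC features fed to a small MLP. The paper's full model uses VAE and Wav2Vec representations rather than simple MFCC pooling, and the file names, labels, and hyperparameters below are assumptions for illustration only.

```python
# Reduced sketch of the classification stage: mean-pooled MFCC features into an MLP.
# The paper's full model uses VAE and Wav2Vec representations; file paths, labels,
# and every hyperparameter below are assumptions for illustration only.
import numpy as np
import librosa
from sklearn.neural_network import MLPClassifier

LANGS = ["ko", "ja", "zh", "es", "fr"]

def mfcc_features(path: str, sr: int = 16000, n_mfcc: int = 13) -> np.ndarray:
    y, _ = librosa.load(path, sr=sr)
    mfcc = librosa.feature.mfcc(y=y, sr=sr, n_mfcc=n_mfcc)   # (n_mfcc, frames)
    return mfcc.mean(axis=1)                                  # mean-pool over time

# Hypothetical manifest: (audio file, language label) pairs.
manifest = [("clip_ko_001.wav", "ko"), ("clip_fr_002.wav", "fr")]  # ...

X = np.stack([mfcc_features(p) for p, _ in manifest])
y = [LANGS.index(lang) for _, lang in manifest]

clf = MLPClassifier(hidden_layer_sizes=(64, 32), max_iter=500, random_state=0)
clf.fit(X, y)

# At inference time the same features are computed from the in-car microphone input.
print(LANGS[clf.predict(mfcc_features("clip_unknown.wav").reshape(1, -1))[0]])
```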
