Search Results (32)

Search Parameters:
Keywords = HuggingFace

20 pages, 17596 KB  
Article
Enhanced Facial Realism in Personalized Diffusion Models: A Memory-Optimized DreamBooth Implementation for Consumer Hardware
by Sandeep Gupta, Kanad Ray, Shamim Kaiser, Sazzad Hossain and Jocelyn Faubert
Algorithms 2026, 19(4), 257; https://doi.org/10.3390/a19040257 - 27 Mar 2026
Viewed by 342
Abstract
Despite significant progress in general-purpose diffusion models capable of producing high-quality media, such models remain difficult to run on consumer hardware. We present a memory-optimized DreamBooth framework designed for consumer-grade GPUs with 16 GB of VRAM that enables end-to-end image personalization and addresses several limitations of existing solutions. Our system reduces peak GPU memory from 22 GB (baseline DreamBooth) to 14.2 GB through novel hierarchical memory management, including attention slicing, Variational Autoencoder (VAE) tiling, gradient accumulation, and gradient checkpointing, integrated within the Hugging Face Accelerate ecosystem. The framework further incorporates state-of-the-art techniques for preserving facial features and a comprehensive automated quality management system. The result is a complete end-to-end pipeline whose quantitative performance (LPIPS: 0.139, SSIM: 0.879, identity: 0.852, and FID: 23.1) is competitive with that of methods requiring significantly more hardware resources. Full article
(This article belongs to the Section Algorithms for Multidisciplinary Applications)
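The gradient-accumulation component of the memory budget described above can be illustrated framework-free: process small micro-batches one at a time, sum their gradients, and apply a single optimizer step, so only a micro-batch ever needs to fit in memory. The toy model and numbers below are illustrative, not from the paper.

```python
# Gradient accumulation: reach the effective batch size of full-batch training
# while only ever holding one micro-batch. Toy model: y = w * x, squared loss.

def grad(w, x, y):
    # d/dw (w*x - y)^2 = 2 * (w*x - y) * x
    return 2.0 * (w * x - y) * x

def train_step(w, micro_batches, lr=0.01):
    """One optimizer step over several micro-batches, averaging their gradients."""
    acc = 0.0
    n = 0
    for batch in micro_batches:          # each micro-batch fits in (simulated) VRAM
        for x, y in batch:
            acc += grad(w, x, y)
            n += 1
    return w - lr * acc / n              # single update, same as one full batch

data = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0), (4.0, 8.0)]
full = train_step(0.0, [data])                    # one big batch
accum = train_step(0.0, [data[:2], data[2:]])     # two micro-batches
print(full, accum)                                # identical updates
```

The same idea underlies gradient accumulation in Accelerate-style training loops: the memory saving comes purely from deferring the optimizer step, not from changing the mathematics of the update.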

33 pages, 5023 KB  
Article
Recommender Systems: Emerging Trends from Four Decades of Research Using Bibliometric Analysis and Transformer-Based Models
by Simona-Vasilica Oprea, Adela Bâra and Tudor Ghinea
Electronics 2026, 15(4), 763; https://doi.org/10.3390/electronics15040763 - 11 Feb 2026
Viewed by 1332
Abstract
Recommender systems represent an essential infrastructure for digital platforms. To understand their evolution, we analyze 15,944 Web of Science publications (1980–2025) using bibliometric techniques together with generative and transformer models for sentiment analysis and latent topic modeling. Our analysis yields three major findings. First, e-commerce recommendation research exhibits rapid growth in advanced representation techniques, with compound annual growth rates for contrastive learning (187%), graph neural networks (89%), and federated learning (72%). Second, algorithmic fairness and privacy preservation have emerged as critical research directions. Third, collaborative networks indicate a geographical shift, with Asia–Pacific regions becoming influential research hubs. The methodology integrates CAGR analysis with Latent Dirichlet Allocation (LDA, coherence score = 0.687) and BERTopic for thematic mapping and network analysis. Additionally, we employ sentiment analysis (VADER, TextBlob, and a sentiment analysis pipeline from Hugging Face Transformers) and temporal heatmaps to capture research narratives. Topic modeling with LDA identifies five core themes: (1) Collaborative Filtering; (2) Machine Learning and Educational Systems; (3) Web Services and Business Applications; (4) Content and Multimedia Recommendations; (5) Graph Neural Networks and Advanced Models. BERTopic provides eight more nuanced themes based on semantics. Citation patterns follow the Pareto principle, where the top 1% of articles account for 29.1% of all citations, confirming a highly skewed impact distribution. Notably, established keywords show declining trajectories, indicating a methodological evolution toward newer, deep learning and generative AI-based paradigms. Full article
(This article belongs to the Special Issue Data Mining and Recommender Systems)
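The growth rates quoted above presumably follow the standard compound-annual-growth-rate formula, CAGR = (end/start)^(1/years) − 1. A minimal sketch, with hypothetical publication counts (not the paper's data):

```python
# Compound annual growth rate over a publication-count time series.
def cagr(start_count, end_count, years):
    return (end_count / start_count) ** (1.0 / years) - 1.0

# Hypothetical: 5 papers on a topic in year 0, 135 papers 3 years later.
rate = cagr(5, 135, 3)
print(f"{rate:.0%}")  # 200%
```

Note that CAGR on small starting counts (as with emerging topics) can produce very large percentages, which is worth keeping in mind when comparing figures such as 187% and 72%.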

22 pages, 494 KB  
Article
LinguoNER: A Language-Agnostic Framework for Named Entity Recognition in Low-Resource Languages with a Focus on Yambeta
by Philippe Tamla, Stephane Donna, Tobias Bigala, Dilan Nde, Maxime Yves Julien Manifi Abouh and Florian Freund
Informatics 2026, 13(2), 31; https://doi.org/10.3390/informatics13020031 - 11 Feb 2026
Viewed by 689
Abstract
This paper presents LinguoNER, a practical and extensible framework for bootstrapping Named Entity Recognition (NER) in extremely low-resource languages, demonstrated on Yambeta, a Bantu language spoken by a minority community in Cameroon. Due to scarce digital resources and the absence of annotated corpora, Yambeta has remained largely underrepresented in Natural Language Processing (NLP). LinguoNER addresses this gap by providing a methodologically transparent end-to-end workflow that integrates corpus acquisition, gazetteer-driven automatic annotation, tokenizer training, transformer fine-tuning, and multi-level evaluation in settings where large-scale manual annotation is infeasible. Using a Bible-derived corpus as a linguistically stable starting point, we release the first publicly available Yambeta NER dataset (≈25,000 tokens) annotated with the CoNLL BIO scheme and a restricted entity schema (PER/LOC/ORG). Because labels are generated via dictionary-based annotation, the corpus is best characterized as silver-standard; credibility is strengthened through recorded dictionaries, transparency logs, expert-in-the-loop validation on sampled subsets, and complementary qualitative error analysis. We additionally train a dedicated Yambeta WordPiece tokenizer that preserves tone markers and diacritics, and fine-tune a bert-base-cased transformer for token classification. On a held-out test split, LinguoNER achieves strong token-level performance (Precision = 0.989, Recall = 0.981, F1 = 0.985), substantially outperforming a dictionary-only gazetteer baseline (ΔF1 ≈ 0.36). Per-entity-type evaluation further indicates improvements beyond surface-form matching, while remaining errors are linguistically motivated and primarily involve multi-word entity boundaries, agglutinative constructions, and tone-/diacritic-sensitive tokenization. 
We emphasize that results are restricted to a Bible domain and a limited label space, and should be interpreted as proof-of-concept evidence rather than claims of broad out-of-domain generalization. Overall, LinguoNER provides a reproducible blueprint for bootstrapping NER resources in underrepresented languages and supports future work on broader corpora sources (e.g., news, OPUS, JW300), additional African languages (e.g., Yoruba, Igbo, Bassa), and the iterative creation of expert-refined datasets and gold-standard subsets. Full article
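The gazetteer-driven silver annotation described above amounts to projecting a dictionary of known entities onto token sequences and emitting BIO tags, preferring longer (multi-word) matches. A minimal sketch; the gazetteer entries and person name are made up for illustration:

```python
# Dictionary-driven BIO annotation: project a gazetteer of known entities
# onto a token sequence, preferring longer matches (multi-word entities).

def bio_annotate(tokens, gazetteer):
    """gazetteer maps entity token tuples -> type, e.g. ('New', 'York'): 'LOC'."""
    tags = ["O"] * len(tokens)
    i = 0
    max_len = max(len(k) for k in gazetteer)
    while i < len(tokens):
        # Try the longest span first so multi-word entities win over prefixes.
        for span in range(min(max_len, len(tokens) - i), 0, -1):
            entry = tuple(tokens[i:i + span])
            if entry in gazetteer:
                etype = gazetteer[entry]
                tags[i] = f"B-{etype}"
                for j in range(i + 1, i + span):
                    tags[j] = f"I-{etype}"
                i += span
                break
        else:
            i += 1
    return tags

gaz = {("Yambeta",): "LOC", ("Jean", "Mbarga"): "PER"}  # illustrative entries
print(bio_annotate(["Jean", "Mbarga", "visited", "Yambeta", "."], gaz))
# ['B-PER', 'I-PER', 'O', 'B-LOC', 'O']
```

The longest-match-first loop is also why the paper's remaining errors cluster around multi-word entity boundaries: a gazetteer can only label spans it already contains.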

23 pages, 8263 KB  
Article
Uncertainty-Aware Deep Learning for Sugarcane Leaf Disease Detection Using Monte Carlo Dropout and MobileNetV3
by Pathmanaban Pugazhendi, Chetan M. Badgujar, Madasamy Raja Ganapathy and Manikandan Arumugam
AgriEngineering 2026, 8(1), 31; https://doi.org/10.3390/agriengineering8010031 - 16 Jan 2026
Viewed by 754
Abstract
Sugarcane diseases cause estimated global annual losses of over $5 billion. While deep learning shows promise for disease detection, current approaches lack transparency and confidence estimates, limiting their adoption by agricultural stakeholders. We developed an uncertainty-aware detection system integrating Monte Carlo (MC) dropout with MobileNetV3, trained on 2521 images across five categories: Healthy, Mosaic, Red Rot, Rust, and Yellow. The proposed framework achieved 97.23% accuracy with a lightweight architecture comprising 5.4 M parameters. It enabled 2.3 s inference while generating well-calibrated uncertainty estimates that were 4.0 times higher for misclassifications. High-confidence predictions (>70%) achieved 98.2% accuracy. Gradient-weighted Class Activation Mapping provided interpretable disease localization, and the system was deployed on Hugging Face Spaces for global accessibility. The model achieved comparatively higher recall for the Healthy and Red Rot classes. The inclusion of uncertainty quantification provides additional information that may support more informed decision-making in precision agriculture applications involving farmers and agronomists. Full article
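MC dropout, as used above, keeps dropout active at inference and runs T stochastic forward passes; the mean of the outputs is the prediction and their spread is the uncertainty. A self-contained toy sketch (the two-feature "network" and its weights are invented for illustration, not the MobileNetV3 model):

```python
# Monte Carlo dropout: T stochastic forward passes with dropout active;
# average the softmax outputs, use their variance as an uncertainty signal.
import math
import random

def softmax(logits):
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    s = sum(exps)
    return [e / s for e in exps]

def forward(x, weights, p_drop, rng):
    # Toy linear "network" with inverted dropout on the input features.
    kept = [xi / (1 - p_drop) if rng.random() > p_drop else 0.0 for xi in x]
    return softmax([sum(w * k for w, k in zip(row, kept)) for row in weights])

def mc_dropout(x, weights, p_drop=0.2, T=200, seed=0):
    rng = random.Random(seed)
    runs = [forward(x, weights, p_drop, rng) for _ in range(T)]
    mean = [sum(r[c] for r in runs) / T for c in range(len(weights))]
    var = [sum((r[c] - mean[c]) ** 2 for r in runs) / T for c in range(len(weights))]
    return mean, var

weights = [[2.0, 0.0], [0.0, 2.0]]          # two classes, two input features
mean, var = mc_dropout([1.0, 0.2], weights)
print(mean, var)  # class 0 dominates; the variance quantifies uncertainty
```

The paper's observation that uncertainty is roughly 4× higher on misclassifications corresponds to the variance term here being larger when the stochastic passes disagree.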

21 pages, 1991 KB  
Article
Zero-Shot Resume–Job Matching with LLMs via Structured Prompting and Semantic Embeddings
by Panagiotis Skondras, Panagiotis Zervas and Giannis Tzimas
Electronics 2025, 14(24), 4960; https://doi.org/10.3390/electronics14244960 - 17 Dec 2025
Viewed by 1799
Abstract
In this article, we present a tool for matching resumes to job posts and vice versa (job posts to resumes). With minor modifications, it may also be adapted to other domains where text matching is necessary. This tool may help organizations save time during the hiring process, as well as assist applicants by allowing them to match their resumes to job posts they have selected. To achieve text matching without any model training (zero-shot matching), we constructed dynamic structured prompts that consisted of unstructured and semi-structured job posts and resumes based on specific criteria, and we utilized the Chain of Thought (CoT) technique on the Mistral model (open-mistral-7b). In response, the model generated structured (segmented) job posts and resumes. The job posts and resumes were then cleaned and preprocessed. We utilized state-of-the-art sentence similarity models hosted on Hugging Face (nomic-embed-text-v1-5 and google-embedding-gemma-300m) through inference endpoints to create sentence embeddings for each resume and job post segment. We used the cosine similarity metric to determine the optimal matching, and the matching operation was applied to eleven different occupations. The results reached up to 87% accuracy for some of the occupations and underscore the potential of zero-shot techniques in text matching utilizing LLMs. The dataset we used was from indeed.com, and the Spring AI framework was used for the implementation of the tool. Full article
(This article belongs to the Special Issue Advances in Text Mining and Analytics)
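The final matching step above is just cosine similarity over embedding vectors. A minimal sketch with hypothetical 3-dimensional vectors standing in for the sentence-embedding outputs (real embeddings have hundreds of dimensions):

```python
# Cosine-similarity matching between a resume embedding and job-post embeddings.
import math

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb)

def best_match(resume_vec, job_vecs):
    """Return (index, score) of the job post most similar to the resume."""
    scores = [cosine(resume_vec, j) for j in job_vecs]
    idx = max(range(len(scores)), key=scores.__getitem__)
    return idx, scores[idx]

# Hypothetical embeddings, not outputs of the models named in the abstract.
resume = [0.9, 0.1, 0.2]
jobs = [[0.1, 0.9, 0.3], [0.85, 0.15, 0.25], [0.2, 0.2, 0.9]]
idx, score = best_match(resume, jobs)
print(idx, round(score, 3))  # job 1 is the closest match
```

Because cosine similarity ignores vector magnitude, it compares the direction of the embeddings only, which is the usual choice for sentence-embedding models.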

30 pages, 1663 KB  
Article
Deep Learning-Driven Integration of Multimodal Data for Material Property Predictions
by Vítor Costa, José Manuel Oliveira and Patrícia Ramos
Computation 2025, 13(12), 282; https://doi.org/10.3390/computation13120282 - 1 Dec 2025
Viewed by 1599
Abstract
Advancements in deep learning have revolutionized materials discovery by enabling predictive modeling of complex material properties. However, single-modal approaches often fail to capture the intricate interplay of compositional, structural, and morphological characteristics. This study introduces a novel multimodal deep learning framework for enhanced material property prediction, integrating textual (chemical compositions), tabular (structural descriptors), and image-based (2D crystal structure visualizations) modalities. Utilizing the Alexandria database, we construct a comprehensive multimodal dataset of 10,000 materials with symmetry-resolved crystallographic data. Specialized neural architectures (an FT-Transformer for tabular data, a Hugging Face Electra-based model for text, and a TIMM-based MetaFormer for images) generate modality-specific embeddings, which are fused through a hybrid strategy into a unified latent space. The framework predicts seven critical material properties, including electronic (band gap, density of states), thermodynamic (formation energy, energy above hull, total energy), magnetic (magnetic moment per volume), and volumetric (volume per atom) features, many governed by crystallographic symmetry. Experimental results demonstrated that multimodal fusion significantly outperforms unimodal baselines. Notably, the bimodal integration of image and text data showed significant gains, reducing the Mean Absolute Error for band gap by approximately 22.7% and for volume per atom by 22.4% compared to the average of the unimodal models. This combination also achieved a 28.4% reduction in Root Mean Squared Error for formation energy. The full trimodal model (tabular + images + text) yielded competitive, and in several cases the lowest, error metrics, particularly for band gap, magnetic moment per volume, and density of states per atom, confirming the value of integrating all three modalities.
This scalable, modular framework advances materials informatics, offering a powerful tool for data-driven materials discovery and design. Full article
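The fusion step described above, reduced to its simplest form, concatenates the modality-specific embeddings into one latent vector that a prediction head maps to a property value. A toy sketch; the embedding values, weights, and linear head are invented (the paper's hybrid fusion is more elaborate):

```python
# Late-fusion sketch: concatenate modality embeddings into one latent vector,
# then map it to a property value with a (here: linear) prediction head.

def fuse(*embeddings):
    fused = []
    for e in embeddings:
        fused.extend(e)
    return fused

def predict(fused, weights, bias=0.0):
    return bias + sum(w * f for w, f in zip(weights, fused))

tabular = [0.5, 1.0]   # stands in for an FT-Transformer embedding (illustrative)
text = [0.2]           # stands in for an Electra-based embedding (illustrative)
image = [0.1, 0.4]     # stands in for a MetaFormer embedding (illustrative)

z = fuse(tabular, text, image)
band_gap = predict(z, [1.0, 0.5, 2.0, -1.0, 0.2])
print(len(z), band_gap)
```

The appeal of this design is modularity: each encoder can be swapped independently, and ablating a modality is just dropping its slice of the fused vector.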

16 pages, 3476 KB  
Article
ROboMC: A Portable Multimodal System for eHealth Training and Scalable AI-Assisted Education
by Marius Cioca and Adriana-Lavinia Cioca
Inventions 2025, 10(6), 103; https://doi.org/10.3390/inventions10060103 - 11 Nov 2025
Viewed by 1256
Abstract
AI-based educational chatbots can expand access to learning, but many remain limited to text-only interfaces and fixed infrastructures, while purely generative responses raise concerns of reliability and consistency. In this context, we present ROboMC, a portable and multimodal system that combines a validated knowledge base with generative responses (OpenAI) and supports both text and voice interaction, ensuring reliability and flexibility in diverse educational scenarios. The system, developed in Django, integrates two response pipelines: local search using normalized keywords and fuzzy matching in the LocalQuestion database, and fallback to the generative model GPT-3.5-Turbo (OpenAI, San Francisco, CA, USA) with a prompt adapted exclusively for Romanian and an explicit disclaimer. All interactions are logged in AutomaticQuestion for later analysis, supported by a semantic encoder (SentenceTransformer paraphrase-multilingual-MiniLM-L12-v2, Hugging Face Inc., New York, NY, USA) that ensures search tolerance to variations in phrasing. Voice interaction is managed through gTTS (Google LLC, Mountain View, CA, USA) with integrated audio playback, while portability is achieved through deployment on a Raspberry Pi 4B (Raspberry Pi Foundation, Cambridge, UK) with microphone, speaker, and battery power. Voice input is enabled through a cloud-based speech-to-text component, the Google Web Speech API (Google LLC, Mountain View, CA, USA; language = “ro-RO”), accessed via the Python SpeechRecognition library (Anthony Zhang, open-source project), allowing users to interact by speaking. Preliminary tests showed average latencies of 120–180 ms on laptop and 250–350 ms on Raspberry Pi for validated responses, and 2.5–3.5 s on laptop and 4–6 s on Raspberry Pi for generative responses, timings considered acceptable for real educational scenarios.
A small-scale usability study (N ≈ 35) indicated good acceptability (SUS ~80/100), with participants valuing the balance between validated and generative responses, the voice integration, and the hardware portability. Although system validation was carried out in the eHealth context, its architecture allows extension to any educational field: depending on the content introduced into the validated database, ROboMC can be adapted to medicine, engineering, social sciences, or other disciplines, relying on ChatGPT only when no clear match is found in the local base, making it a scalable and interdisciplinary solution. Full article
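The local-first routing described above (normalized keywords, fuzzy matching against a validated base, generative fallback below a threshold) can be sketched with the standard library's `difflib`; the Q&A entries and threshold below are illustrative, not ROboMC's actual data:

```python
# Local-first routing: normalize the query, fuzzy-match against a validated
# Q&A base, and fall back to a generative model only below a threshold.
import difflib

LOCAL_QA = {  # stands in for the LocalQuestion table (contents illustrative)
    "what is telemedicine": "Telemedicine is the remote delivery of care...",
    "what does ehealth mean": "eHealth covers digital tools for health...",
}

def normalize(text):
    return " ".join(text.lower().replace("?", "").split())

def answer(query, threshold=0.8):
    q = normalize(query)
    best = max(LOCAL_QA, key=lambda k: difflib.SequenceMatcher(None, q, k).ratio())
    score = difflib.SequenceMatcher(None, q, best).ratio()
    if score >= threshold:
        return "local", LOCAL_QA[best]
    return "generative", None  # here the GPT-3.5-Turbo fallback would be called

print(answer("What is telemedicine?")[0])      # local
print(answer("Explain quantum computing")[0])  # generative
```

Routing validated questions locally is also what makes the reported 120–350 ms latencies possible: no network round trip to the generative model is needed for a match.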

20 pages, 68980 KB  
Article
Investigating the Role of Personality in Appearance Preferences for Huggable Communication Interfaces: A User-Centered Study
by Eleuda Nunez, Barbara Sienkiewicz, Valentina Ramirez Millan, Bipin Indurkhya and Kenji Suzuki
Electronics 2025, 14(21), 4295; https://doi.org/10.3390/electronics14214295 - 31 Oct 2025
Viewed by 793
Abstract
As alternative remote communication interfaces become increasingly common, ensuring that they seamlessly integrate into daily life has become a pressing design challenge. In this context, what should a huggable communication device look like—should it have arms or a face, or resemble a conventional pillow? This study investigates users’ preferences and personalities regarding the appearance of such interfaces for remote emotional interaction. As a case study, we present HugBits, a round, cushion-like device that transmits hugs through visual and tactile feedback. Drawing on the prior literature and a participatory design workshop, we developed seven shape variations and evaluated them through an online survey with 79 Polish participants. The results reveal a consistent preference for less anthropomorphic designs, with users valuing comfort, simplicity, and intuitive affordances such as areas to rest the head or wrap the arms around. Although personality traits did not significantly predict preferences, the findings highlight broader design criteria: huggable communication interfaces, intended to remain visible and available in shared spaces, must balance emotional expressiveness with social acceptability. These insights provide guidelines for designing emotionally engaging, user-centered mediated touch technologies. Full article

25 pages, 1777 KB  
Article
TwinGuard: Privacy-Preserving Digital Twins for Adaptive Email Threat Detection
by Taiwo Oladipupo Ayodele
J. Cybersecur. Priv. 2025, 5(4), 91; https://doi.org/10.3390/jcp5040091 - 29 Oct 2025
Viewed by 1491
Abstract
Email continues to serve as a primary vector for cyber-attacks, with phishing, spoofing, and polymorphic malware evolving rapidly to evade traditional defences. Conventional email security systems, often reliant on static, signature-based detection, struggle to identify zero-day exploits and protect user privacy in increasingly data-driven environments. This paper introduces TwinGuard, a privacy-preserving framework that leverages digital twin technology to enable adaptive, personalised email threat detection. TwinGuard constructs dynamic behavioural models tailored to individual email ecosystems, facilitating proactive threat simulation and anomaly detection without accessing raw message content. The system integrates a BERT–LSTM hybrid for semantic and temporal profiling, alongside federated learning, secure multi-party computation (SMPC), and differential privacy to enable collaborative intelligence while preserving confidentiality. Empirical evaluations were conducted using both synthetic AI-generated email datasets and real-world datasets sourced from Hugging Face and Kaggle. TwinGuard achieved 98% accuracy, 97% precision, and a false positive rate of 3%, outperforming conventional detection methods. The framework offers a scalable, regulation-compliant solution that balances security efficacy with strong privacy protection in modern email ecosystems. Full article
(This article belongs to the Special Issue Cybersecurity in the Age of AI and IoT: Challenges and Innovations)
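The federated-learning ingredient above relies on clients sharing only model weights, which a server aggregates; the canonical aggregation rule is FedAvg, a size-weighted average. A minimal sketch with hypothetical clients and 2-parameter "models":

```python
# Federated averaging: clients train locally and share only weights; the
# server averages them, so raw email content never leaves a client.

def fed_avg(client_weights, client_sizes):
    """Size-weighted average of per-client model weights (FedAvg)."""
    total = sum(client_sizes)
    dim = len(client_weights[0])
    return [
        sum(w[i] * n for w, n in zip(client_weights, client_sizes)) / total
        for i in range(dim)
    ]

# Three hypothetical clients with different amounts of local mail data.
clients = [[0.2, 1.0], [0.4, 0.8], [0.6, 0.6]]
sizes = [100, 300, 100]
print(fed_avg(clients, sizes))  # ≈ [0.4, 0.8]
```

In TwinGuard's setting this would be combined with SMPC and differential privacy so that even the shared weight updates leak as little as possible about individual mailboxes.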

19 pages, 706 KB  
Article
Exploring the Nexus of Opportunities and Challenges in Indigenous Language Podcasting Through Natural Language Processing of User-Generated Content
by Bukola Christiana Ajala, Abiodun Salawu, Israel Ayinla Fadipe and Yetunde Pesu Aromavo
Journal. Media 2025, 6(4), 179; https://doi.org/10.3390/journalmedia6040179 - 16 Oct 2025
Viewed by 1596
Abstract
Part of the relics of colonialism on the African continent is the loss of social identity caused by the adoption of colonial languages, leading to the endangered status of indigenous African languages. This qualitative study examines the potential and challenges of podcasting in indigenous African languages, with a focus on Yoruba. We conducted a sentiment analysis of the podcasts “I Speak Yoruba Too” and “Learn Yoruba Online” to assess the range of audience feedback. A total of 735 data points were gathered and preprocessed, and Hugging Face Transformers models were used to analyse the sentiment of the audience feedback. The analysis yielded 183 negative, 226 neutral, and 326 positive reviews. The word-cloud visualisation of the labels shows the words frequently used in the reviews, revealing both the challenges and the appreciation expressed by commenters. In-depth interviews were conducted with the hosts of the “I Speak Yoruba Too” and “Learn Yoruba Online” podcasts. The findings reveal that the challenges of podcasting include the absence of a standard Yoruba curriculum for foreign learners and time constraints. This paper argues that the deterministic nature of podcast technology offers opportunities to content creators and listeners, based on the medium’s flexibility and ease of access in facilitating language acquisition. Audience reviews and interview results also confirm the potential of the podcast to generate community building and social identity formation among learners. However, the monetisation of such digital products is often underexplored by both emerging and established podcasters. Full article

24 pages, 2394 KB  
Article
Extracting Emotions from Customer Reviews Using Text Mining, Large Language Models and Fine-Tuning Strategies
by Simona-Vasilica Oprea and Adela Bâra
J. Theor. Appl. Electron. Commer. Res. 2025, 20(3), 221; https://doi.org/10.3390/jtaer20030221 - 1 Sep 2025
Cited by 4 | Viewed by 3898
Abstract
User-generated content, such as product and app reviews, offers more than just sentiment. It provides a rich spectrum of emotional expression that reveals users’ experiences, frustrations and expectations. Traditional sentiment analysis, which typically classifies text as positive or negative, lacks the nuance needed to fully understand the emotional drivers behind customer feedback. In this research, we focus on fine-grained emotion classification using core emotions. By identifying specific emotions rather than sentiment polarity, we enable more actionable insights for e-commerce and app development, supporting strategies such as feature refinement, marketing personalization and proactive customer engagement. We leverage the Hugging Face Emotions dataset and adopt a two-phase modeling approach. In the first phase, we use a pre-trained DistilBERT model as a feature extractor and evaluate multiple classical classifiers (Logistic Regression, Support Vector Classifier, Random Forest) to establish performance baselines. In the second phase, we fine-tune the DistilBERT model end-to-end using the Hugging Face Trainer API, optimizing classification performance through task-specific adaptation. Training is tracked using the Weights & Biases (wandb) API. Comparative analysis highlights the substantial performance gains from fine-tuning, particularly in capturing informal or noisy language typical in user reviews. The final fine-tuned model is applied to a dataset of customers’ reviews, identifying the dominant emotions expressed. Our results demonstrate the practical value of emotion-aware analytics in uncovering the underlying “why” behind user sentiment, enabling more empathetic decision-making across product design, customer support and user experience (UX) strategy. Full article
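The phase-1 baseline above (a frozen encoder as feature extractor, a classical classifier on top) can be shown in miniature. The sketch below substitutes a nearest-centroid classifier and made-up 2-dimensional "features" for the real DistilBERT embeddings and the classifiers named in the abstract:

```python
# Phase-1 baseline in miniature: treat a frozen encoder's outputs as fixed
# feature vectors and fit a simple classifier (here: nearest centroid) on top.

def centroids(features, labels):
    sums, counts = {}, {}
    for f, y in zip(features, labels):
        acc = sums.setdefault(y, [0.0] * len(f))
        for i, v in enumerate(f):
            acc[i] += v
        counts[y] = counts.get(y, 0) + 1
    return {y: [v / counts[y] for v in s] for y, s in sums.items()}

def classify(feature, cents):
    def dist2(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    return min(cents, key=lambda y: dist2(feature, cents[y]))

# Hypothetical 2-d "encoder features" for joy vs. anger reviews.
X = [[0.9, 0.1], [0.8, 0.2], [0.1, 0.9], [0.2, 0.8]]
y = ["joy", "joy", "anger", "anger"]
cents = centroids(X, y)
print(classify([0.85, 0.15], cents))  # joy
```

Phase 2, end-to-end fine-tuning, differs precisely in that the encoder's weights also move, which is where the paper reports the largest gains on noisy review language.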

29 pages, 434 KB  
Article
Comparative Analysis of Natural Language Processing Techniques in the Classification of Press Articles
by Kacper Piasta and Rafał Kotas
Appl. Sci. 2025, 15(17), 9559; https://doi.org/10.3390/app15179559 - 30 Aug 2025
Cited by 2 | Viewed by 1722
Abstract
The study undertook a comprehensive review and comparative analysis of natural language processing techniques for news article classification, with a particular focus on Java language libraries. The dataset comprised in excess of 200,000 items of news metadata sourced from The Huffington Post. Traditional algorithms based on mathematical statistics, as well as deep machine learning approaches, were evaluated. The libraries chosen for tests were Apache OpenNLP, Stanford CoreNLP, Waikato Weka, and the Hugging Face ecosystem with the PyTorch backend. The efficacy of the trained models in forecasting specific topics was evaluated, and diverse methodologies for feature extraction and the analysis of word-vector representations were explored. The study considered aspects such as hardware resource management, implementation simplicity, learning time, and the quality of the resulting model in terms of detection, and it examined a range of techniques for attribute selection, feature filtering, vector representation, and the handling of imbalanced datasets. Advanced techniques for word selection and named entity recognition were employed. The study compared different models and configurations in terms of their performance and the resources they consumed. Furthermore, it addressed the difficulties encountered when processing lengthy texts with transformer neural networks, and it presented potential solutions such as sequence truncation and segment analysis. The elevated computational cost inherent to Java-based languages may present challenges in machine learning tasks. The OpenNLP model achieved 84% accuracy, Weka and CoreNLP attained 86% and 88%, respectively, and DistilBERT emerged as the top performer, with an accuracy rate of 92%. Deep learning models demonstrated superior performance, training time, and ease of implementation compared to conventional statistical algorithms. Full article
(This article belongs to the Special Issue Natural Language Processing (NLP) and Applications—2nd Edition)
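The segment-analysis workaround for transformer context limits mentioned above splits a long token sequence into overlapping windows, classifies each window, and aggregates (for example by majority vote). A sketch with illustrative window sizes; the 512-token limit matches typical BERT-family models, while the stride and voting rule are assumptions:

```python
# Segment analysis for long inputs: split a token sequence into overlapping
# windows that fit a transformer's context, classify each, and vote.

def windows(tokens, max_len=512, stride=256):
    out = []
    start = 0
    while True:
        out.append(tokens[start:start + max_len])
        if start + max_len >= len(tokens):
            break
        start += stride
    return out

def vote(labels):
    # Majority label across per-segment predictions.
    return max(set(labels), key=labels.count)

tokens = [f"tok{i}" for i in range(1100)]   # a 1100-token article
segs = windows(tokens)
print(len(segs), len(segs[0]), len(segs[-1]))
# Each segment would be classified independently; the article label is the vote:
print(vote(["POLITICS", "POLITICS", "SPORTS"]))  # POLITICS
```

The overlap (stride smaller than the window) ensures no topical passage is cut exactly at a segment boundary, at the cost of classifying some tokens twice.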

30 pages, 21387 KB  
Article
An Intelligent Docent System with a Small Large Language Model (sLLM) Based on Retrieval-Augmented Generation (RAG)
by Taemoon Jung and Inwhee Joe
Appl. Sci. 2025, 15(17), 9398; https://doi.org/10.3390/app15179398 - 27 Aug 2025
Cited by 5 | Viewed by 2857
Abstract
This study designed and empirically evaluated a method to enhance information accessibility for museum and art gallery visitors using a small Large Language Model (sLLM) based on the Retrieval-Augmented Generation (RAG) framework. Over 199,000 exhibition descriptions were collected and refined, and a question-answering dataset consisting of 102,000 pairs reflecting user personas was constructed to develop DocentGemma, a domain-optimized language model. This model was fine-tuned through Low-Rank Adaptation (LoRA) based on Google’s Gemma2-9B and integrated with FAISS and OpenSearch-based document retrieval systems within the LangChain framework. Performance evaluation was conducted using a dedicated Q&A benchmark for the docent domain, comparing the model against five commercial and open-source LLMs (including GPT-3.5 Turbo, LLaMA3.3-70B, and Gemma2-9B). DocentGemma achieved an accuracy of 85.55% and a perplexity of 3.78, demonstrating competitive performance in language generation and response accuracy within the domain-specific context. To enhance retrieval relevance, a Spatio-Contextual Retriever (SC-Retriever) was introduced, which combines semantic similarity and spatial proximity based on the user’s query and location. An ablation study confirmed that integrating both modalities improved retrieval quality, with the SC-Retriever achieving a recall@1 of 53.45% and a Mean Reciprocal Rank (MRR) of 68.12, representing a 17.5–20% gain in search accuracy compared to baseline models such as GTE and SpatialNN. System performance was further validated through field deployment at three major exhibition venues in Seoul (the Seoul History Museum, the Hwan-ki Museum, and the Hanseong Baekje Museum). A user test involving 110 participants indicated high response credibility and an average satisfaction score of 4.24. To ensure accessibility, the system supports various output formats, including multilingual speech and subtitles.
This work illustrates a practical application of integrating LLM-based conversational capabilities into traditional docent services and suggests potential for further development toward location-aware interactive systems and AI-driven cultural content services. Full article
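The abstract above describes the SC-Retriever as a fusion of semantic similarity and spatial proximity, but does not specify the fusion rule. The sketch below is a minimal, hypothetical illustration of one common approach: a weighted sum of cosine similarity and an exponentially decaying distance score. The weight `alpha`, the decay `scale`, the toy embeddings, and all function names are illustrative assumptions, not the paper's actual method.

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity between two embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb)

def spatial_proximity(user_pos, doc_pos, scale=100.0):
    """Map Euclidean distance (metres, assumed) to a (0, 1] proximity score."""
    return math.exp(-math.dist(user_pos, doc_pos) / scale)

def sc_score(query_emb, user_pos, doc, alpha=0.7):
    """Weighted fusion of semantic and spatial signals (alpha is illustrative)."""
    semantic = cosine_similarity(query_emb, doc["emb"])
    spatial = spatial_proximity(user_pos, doc["pos"])
    return alpha * semantic + (1 - alpha) * spatial

# Toy index: two exhibits with 2-D embeddings and floor coordinates.
exhibits = [
    {"id": "A", "emb": [1.0, 0.0], "pos": (0.0, 0.0)},
    {"id": "B", "emb": [0.9, 0.1], "pos": (300.0, 0.0)},
]
query = [1.0, 0.05]
user_pos = (10.0, 0.0)
ranked = sorted(exhibits, key=lambda d: sc_score(query, user_pos, d), reverse=True)
print([d["id"] for d in ranked])  # nearby exhibit A outranks the distant B
```

With both exhibits nearly tied semantically, the spatial term breaks the tie in favour of the exhibit closest to the visitor, which is the intuition behind combining the two modalities.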

19 pages, 2135 KB  
Article
Development of an Automotive Electronics Internship Assistance System Using a Fine-Tuned Llama 3 Large Language Model
by Ying-Chia Huang, Hsin-Jung Tsai, Hui-Ting Liang, Bo-Siang Chen, Tzu-Hsin Chu, Wei-Sho Ho, Wei-Lun Huang and Ying-Ju Tseng
Systems 2025, 13(8), 668; https://doi.org/10.3390/systems13080668 - 6 Aug 2025
Cited by 1 | Viewed by 1406
Abstract
This study develops and validates an artificial intelligence (AI)-assisted internship learning platform for automotive electronics based on the Llama 3 large language model, aiming to enhance pedagogical effectiveness within vocational training contexts. Addressing critical issues such as the persistent theory–practice gap and limited innovation capability prevalent in existing curricula, we leverage the natural language processing (NLP) capabilities of Llama 3 through fine-tuning based on transfer learning to establish a specialized knowledge base encompassing fundamental circuit principles and fault diagnosis protocols. The implementation employs the Hugging Face Transformers library with optimized hyperparameters, including a learning rate of 5 × 10⁻⁵ across five training epochs. Post-training evaluations revealed an accuracy of 89.7% on validation tasks (representing a 12.4% improvement over the baseline model), a semantic comprehension precision of 92.3% in technical question-and-answer assessments, a mathematical computation accuracy of 78.4% (highlighting this as a current limitation), and a latency of 6.3 s under peak operational workloads (indicating a system bottleneck). Although direct trials involving students were deliberately avoided, the platform’s technical feasibility was validated through multidimensional benchmarking against established models (BERT-base and GPT-2), confirming superior domain adaptability (F1 = 0.87) and enhanced error tolerance (σ² = 1.2). Notable limitations emerged in numerical reasoning tasks (Cohen’s d = 1.15 compared to human experts) and in real-time responsiveness deterioration when exceeding 50 concurrent users. The study concludes that Llama 3 demonstrates considerable promise for automotive electronics skills development.
Proposed future enhancements include integrating symbolic AI modules to improve computational reliability, implementing Kubernetes-based load balancing to ensure latency below 2 s at scale, and conducting longitudinal pedagogical validation studies with trainees. This research provides a robust technical foundation for AI-driven vocational education, especially suited to mechatronics fields that require close integration between theoretical knowledge and practical troubleshooting skills. Full article
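The abstract reports fine-tuning with the Hugging Face Transformers library at a learning rate of 5 × 10⁻⁵ over five epochs. The Hugging Face Trainer applies a linear decay schedule to the learning rate by default; the plain-Python sketch below illustrates that schedule under the stated hyperparameters. `STEPS_PER_EPOCH` and the optional warmup are illustrative assumptions (they depend on dataset and batch size, which the abstract does not give).

```python
# Linear learning-rate decay, as the Hugging Face Trainer applies by default.
# The paper states lr = 5e-5 over 5 epochs; steps per epoch is assumed here.
BASE_LR = 5e-5
EPOCHS = 5
STEPS_PER_EPOCH = 100  # assumed; depends on dataset size and batch size
TOTAL_STEPS = EPOCHS * STEPS_PER_EPOCH

def linear_lr(step, warmup=0):
    """Optional linear warmup, then linear decay from BASE_LR to zero."""
    if step < warmup:
        return BASE_LR * step / max(1, warmup)
    remaining = max(0, TOTAL_STEPS - step)
    return BASE_LR * remaining / max(1, TOTAL_STEPS - warmup)

schedule = [linear_lr(s) for s in range(TOTAL_STEPS + 1)]
print(schedule[0], schedule[TOTAL_STEPS // 2], schedule[-1])  # 5e-05 2.5e-05 0.0
```

In the real pipeline these values would be passed via `TrainingArguments(learning_rate=5e-5, num_train_epochs=5)`; the decay itself is handled internally by the Trainer.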

22 pages, 1763 KB  
Article
A FIT4NER Generic Approach for Framework-Independent Medical Named Entity Recognition
by Florian Freund, Philippe Tamla, Frederik Wilde and Matthias Hemmje
Information 2025, 16(7), 554; https://doi.org/10.3390/info16070554 - 29 Jun 2025
Viewed by 932
Abstract
This article focuses on assisting medical professionals in analyzing domain-specific texts and selecting and comparing Named Entity Recognition (NER) frameworks. It details the development and evaluation of a system that utilizes a generic approach alongside the structured Nunamaker methodology. This system empowers medical professionals to train, evaluate, and compare NER models across diverse frameworks, such as Stanford CoreNLP, spaCy, and Hugging Face Transformers, independent of their specific implementations. Additionally, it introduces a concept for modeling a general training and evaluation process. Finally, experiments using various ontologies from the CRAFT corpus are conducted to assess the effectiveness of the current prototype. Full article
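A framework-independent NER system of the kind described above typically hides each backend (Stanford CoreNLP, spaCy, Hugging Face Transformers) behind a common adapter interface and evaluates all of them with the same span-level metric. The sketch below is a hypothetical illustration of that pattern, not FIT4NER's actual code: the `NERAdapter` interface, the toy `KeywordAdapter` backend, and the example entities are all assumptions.

```python
from typing import List, Tuple

Entity = Tuple[int, int, str]  # (start offset, end offset, label)

class NERAdapter:
    """Hypothetical common interface; real adapters would wrap
    Stanford CoreNLP, spaCy, or Hugging Face pipelines."""
    def predict(self, text: str) -> List[Entity]:
        raise NotImplementedError

class KeywordAdapter(NERAdapter):
    """Toy stand-in for a framework-specific backend."""
    def __init__(self, lexicon):
        self.lexicon = lexicon  # {surface form: label}
    def predict(self, text):
        spans = []
        for word, label in self.lexicon.items():
            i = text.find(word)
            if i >= 0:
                spans.append((i, i + len(word), label))
        return sorted(spans)

def micro_f1(gold: List[Entity], pred: List[Entity]) -> float:
    """Exact-span micro F1, the usual NER evaluation measure."""
    tp = len(set(gold) & set(pred))
    if not gold or not pred or tp == 0:
        return 0.0
    precision = tp / len(pred)
    recall = tp / len(gold)
    return 2 * precision * recall / (precision + recall)

text = "Aspirin inhibits COX-1."
gold = [(0, 7, "CHEMICAL"), (17, 22, "PROTEIN")]
adapter = KeywordAdapter({"Aspirin": "CHEMICAL", "COX-1": "PROTEIN"})
pred = adapter.predict(text)
print(round(micro_f1(gold, pred), 2))  # 1.0
```

Because every backend returns the same `(start, end, label)` spans, the evaluation code never needs to know which framework produced them, which is the point of the generic approach.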
