MDPI - Publisher of Open Access Journals

18 pages, 3371 KB

Open Access Article

Fusing Geoscience Large Language Models and Lightweight RAG for Enhanced Geological Question Answering

by Bo Zhou and Ke Li

Geosciences 2025, 15(10), 382; https://doi.org/10.3390/geosciences15100382 - 2 Oct 2025

Mineral prospecting from vast geological text corpora is impeded by challenges in domain-specific semantic interpretation and knowledge synthesis. General-purpose Large Language Models (LLMs) struggle to parse the complex lexicon and relational semantics of geological texts, limiting their utility for constructing precise knowledge graphs (KGs). Our novel framework addresses this gap by integrating a domain-specific LLM, GeoGPT, with a lightweight retrieval-augmented generation architecture, LightRAG. Within this framework, GeoGPT automates the construction of a high-quality mineral-prospecting KG by performing ontology definition, entity recognition, and relation extraction. The LightRAG component then leverages this KG to power a specialized geological question-answering (Q&A) system featuring a dual-layer retrieval mechanism for enhanced precision and an incremental update capability for dynamic knowledge incorporation. The results indicate that the proposed method achieves a mean F1-score of 0.835 for entity extraction, representing a 17% to 25% performance improvement over general-purpose large models using generic prompts. Furthermore, the geological Q&A model, built upon the LightRAG framework with GeoGPT as its core, demonstrates a superior win rate against the DeepSeek-V3 and Qwen2.5-72B general-purpose large models by 8–29% in the geochemistry domain and 53–78% in the remote sensing geology domain. This study establishes an effective and scalable methodology for intelligent geological text analysis, enabling lightweight, high-performance Q&A systems that accelerate knowledge discovery in mineral exploration.
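
An entity-extraction F1-score such as the one reported above is conventionally derived from the overlap between predicted and gold-standard entities. The following is a minimal Python sketch under the assumption of exact-match scoring on (text, type) pairs; the abstract does not state the matching criterion, and the function name and sample entities are invented for illustration only.

```python
def entity_f1(gold, predicted):
    """Return (precision, recall, F1) for exact-match entity sets."""
    gold, predicted = set(gold), set(predicted)
    true_positives = len(gold & predicted)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    if precision + recall == 0:
        return 0.0, 0.0, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Hypothetical annotations: four gold entities, three predictions, two correct.
gold = {("pyrite", "Mineral"), ("granite", "Rock"),
        ("skarn", "DepositType"), ("fault", "Structure")}
predicted = {("pyrite", "Mineral"), ("granite", "Rock"), ("quartz", "Mineral")}
precision, recall, f1 = entity_f1(gold, predicted)
```

A mean F1 over several entity types would then be the average of the per-type values, assuming macro-averaging.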

8 pages, 658 KB

Open Access Brief Report

Mechanistically Explainable AI Model for Predicting Synergistic Cancer Therapy Combinations

by Han Si, Sanyam Kumar, Sneh Lata, Arshad Ahmad, Saurav Ghosh, Karen Stephansen, Deepti Nagarkar, Eda Zhou and Brandon W. Higgs

Curr. Oncol. 2025, 32(10), 548; https://doi.org/10.3390/curroncol32100548 - 30 Sep 2025

Abstract

This study introduces a Large Language Model (LLM)-based framework that combines drug combination data with a knowledge graph to predict synergistic oncology drug combinations with mechanistic insights. Using a retrieval-augmented generation (RAG) approach, over 50,000 in vitro drug pair assay results and 1631 human clinical trial and preclinical test entries were integrated to enhance predictive accuracy and explainability. Validation achieved an F1 score of 0.80, demonstrating the framework’s potential to streamline drug discovery and improve translational strategies in cancer treatment.

(This article belongs to the Special Issue Shaping the Future of Oncology: The Role of Generative AI in Clinical and Research Environments)

23 pages, 1167 KB

Open Access Article

Integrating RAG for Smarter Animal Certification Platforms

by Pedro Bilar Montero, Jonas Bulegon Gassen, Glênio Descovi, Tais Oltramari Barnasque, Gabriel Rodrigues da Silva, Felipe Amadori Machado, Gabriel Vieira Casanova, Vinícius Maran and Alencar Machado

Information 2025, 16(10), 843; https://doi.org/10.3390/info16100843 - 30 Sep 2025

Abstract

Large Language Models (LLMs) encounter significant challenges when applied in specialized domains that require precise and localized information. This problem is particularly critical in regulatory sectors, such as the animal health sector in Brazil, where professionals depend on complex and constantly updated legal norms to perform their work. The generic knowledge encapsulated in traditional LLMs is often insufficient to provide reliable support in these contexts, which can lead to inaccurate or outdated responses. To address this gap, this work presents a practical implementation of a Retrieval-Augmented Generation (RAG) system. We detail the integration of this system with the Plataforma de Defesa Sanitária Animal do Rio Grande do Sul (PDSA-RS), a real platform used for animal production certification. Our solution connects an LLM to an external knowledge base containing specific Brazilian legislation, allowing the model to retrieve relevant legal texts in real time to generate its responses. The principal objective is to demonstrate how this approach can produce accurate and contextually grounded answers for professionals in the veterinary field, assisting in decision-making processes for sanitary certification.

(This article belongs to the Section Artificial Intelligence)

47 pages, 3137 KB

Open Access Article

DietQA: A Comprehensive Framework for Personalized Multi-Diet Recipe Retrieval Using Knowledge Graphs, Retrieval-Augmented Generation, and Large Language Models

by Ioannis Tsampos and Emmanouil Marakakis

Computers 2025, 14(10), 412; https://doi.org/10.3390/computers14100412 - 29 Sep 2025

Abstract

Recipes available on the web often lack nutritional transparency and clear indicators of dietary suitability. While searching by title is straightforward, exploring recipes that meet combined dietary needs, nutritional goals, and ingredient-level preferences remains challenging. Most existing recipe search systems do not effectively support flexible multi-dietary reasoning in combination with user preferences and restrictions. For example, users may seek gluten-free and dairy-free dinners with suitable substitutions, or compound goals such as vegan and low-fat desserts. Recent systematic reviews report that most food recommender systems are content-based and often non-personalized, with limited support for dietary restrictions, ingredient-level exclusions, and multi-criteria nutrition goals. This paper introduces DietQA, an end-to-end, language-adaptable chatbot system that integrates a Knowledge Graph (KG), Retrieval-Augmented Generation (RAG), and a Large Language Model (LLM) to support personalized, dietary-aware recipe search and question answering. DietQA crawls Greek-language recipe websites to extract structured information such as titles, ingredients, and quantities. Nutritional values are calculated using validated food composition databases, and dietary tags are inferred automatically based on ingredient composition. All information is stored in a Neo4j-based knowledge graph, enabling flexible querying via Cypher. Users interact with the system through a chatbot-friendly natural language interface, where they can express preferences for ingredients, nutrients, dishes, and diets, and filter recipes based on multiple factors such as ingredient availability, exclusions, and nutritional goals. DietQA supports multi-diet recipe search by retrieving both compliant recipes and those adaptable via ingredient substitutions, explaining how each result aligns with user preferences and constraints. An LLM extracts intents and entities from user queries to support rule-based Cypher retrieval, while the RAG pipeline generates contextualized responses using the user query and preferences, retrieved recipes, statistical summaries, and substitution logic. The system integrates real-time updates of recipe and nutritional data, supporting up-to-date, relevant, and personalized recommendations. It is designed for language-adaptable deployment and has been developed and evaluated using Greek-language content. DietQA provides a scalable framework for transparent and adaptive dietary recommendation systems powered by conversational AI.
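
The rule-based Cypher retrieval step described above can be pictured as a function that turns extracted intents into a parameterized query. The sketch below is illustrative only: the node labels (`Recipe`, `Diet`, `Ingredient`), relationship types (`SUITABLE_FOR`, `HAS_INGREDIENT`), the `fat_g` property, and the shape of the `intent` dictionary are all assumptions, since the abstract does not describe DietQA's actual graph schema.

```python
def build_recipe_query(intent, limit=10):
    """Build a parameterized Cypher query from a (hypothetical) intent dict."""
    lines = ["MATCH (r:Recipe)"]
    params = {"limit": limit}

    # One MATCH per requested diet, so a recipe must satisfy all of them.
    for i, diet in enumerate(intent.get("diets", [])):
        key = f"diet{i}"
        lines.append(f"MATCH (r)-[:SUITABLE_FOR]->(:Diet {{name: ${key}}})")
        params[key] = diet

    conditions = []
    # Exclude recipes that contain any unwanted ingredient.
    for i, ingredient in enumerate(intent.get("excluded_ingredients", [])):
        key = f"ex{i}"
        conditions.append(
            f"NOT (r)-[:HAS_INGREDIENT]->(:Ingredient {{name: ${key}}})"
        )
        params[key] = ingredient

    # Optional nutritional ceiling (assumed property name).
    if "max_fat_g" in intent:
        conditions.append("r.fat_g <= $max_fat_g")
        params["max_fat_g"] = intent["max_fat_g"]

    if conditions:
        lines.append("WHERE " + " AND ".join(conditions))
    lines.append("RETURN r.title AS title")
    lines.append("LIMIT $limit")
    return "\n".join(lines), params


query, params = build_recipe_query(
    {"diets": ["vegan"], "excluded_ingredients": ["peanut"], "max_fat_g": 5}
)
```

Passing user values as parameters rather than interpolating them into the query text keeps the generated Cypher stable across requests and avoids injection through free-text user input.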

(This article belongs to the Special Issue Natural Language Processing (NLP) and Large Language Modelling (2nd Edition))

25 pages, 2538 KB

Open Access Article

Fic2Bot: A Scalable Framework for Persona-Driven Chatbot Generation from Fiction

by Sua Kang, Chaelim Lee, Subin Jung and Minsu Lee

Electronics 2025, 14(19), 3859; https://doi.org/10.3390/electronics14193859 - 29 Sep 2025

Abstract

This paper presents Fic2Bot, an end-to-end framework that automatically transforms raw novel text into in-character chatbots by combining scene-level retrieval with persona profiling. Unlike conventional RAG-based systems that emphasize factual accuracy but neglect stylistic coherence, Fic2Bot ensures both factual grounding and consistent persona expression without any manual intervention. The framework integrates (1) Major Entity Identification (MEI) for robust coreference resolution, (2) scene-structured retrieval for precise contextual grounding, and (3) stylistic and sentiment profiling to capture linguistic and emotional traits of each character. Experiments conducted on novels from diverse genres show that Fic2Bot achieves robust entity resolution, more relevant retrieval, highly accurate speaker attribution, and stronger persona consistency in multi-turn dialogues. These results highlight Fic2Bot as a scalable and domain-agnostic framework for persona-driven chatbot generation, with potential applications in interactive roleplaying, language and literary studies, and entertainment.

(This article belongs to the Special Issue Feature Papers in Artificial Intelligence)

26 pages, 7003 KB

Open Access Article

Agentic Search Engine for Real-Time Internet of Things Data

by Abdelrahman Elewah, Khalid Elgazzar and Said Elnaffar

Sensors 2025, 25(19), 5995; https://doi.org/10.3390/s25195995 - 28 Sep 2025

Abstract

The Internet of Things (IoT) has enabled a vast network of devices to communicate over the Internet. However, the fragmentation of IoT systems continues to hinder seamless data sharing and coordinated management across platforms. Moreover, there is currently no actual search engine for IoT data. Existing IoT search engines are considered device discovery tools, providing only metadata about devices rather than enabling access to IoT application data. While efforts such as IoTCrawler have striven to support IoT application data, they have largely failed due to the fragmentation of IoT systems and the heterogeneity of IoT data. To address this, we recently introduced SensorsConnect, a unified framework designed to facilitate interoperable content and sensor data sharing among collaborative IoT systems, inspired by how the World Wide Web (WWW) enabled shared and accessible information spaces for humans. This paper presents the IoT Agentic Search Engine (IoT-ASE), a real-time semantic search engine tailored specifically for IoT environments. IoT-ASE leverages LLMs and Retrieval-Augmented Generation (RAG) techniques to address the challenges of navigating and searching vast, heterogeneous streams of real-time IoT data. This approach enables the system to process complex natural language queries and return accurate, contextually relevant results in real time. To evaluate its effectiveness, we implemented a hypothetical deployment in the Toronto region, simulating a realistic urban environment using a dataset composed of 500 services and over 37,000 IoT-like data entries. Our evaluation shows that IoT-ASE achieved 92% accuracy in retrieving intent-aligned services and consistently generated concise, relevant, and preference-aware responses, outperforming generalized outputs produced by systems such as Gemini. These results underscore the potential of IoT-ASE to make real-time IoT data both accessible and actionable, supporting intelligent decision-making across diverse application domains.

(This article belongs to the Special Issue Recent Trends in AI-Based Intelligent Sensing Systems and IoTs)

21 pages, 4052 KB

Open Access Article

Enhancing Geological Knowledge Engineering with Retrieval-Augmented Generation: A Case Study of the Qin–Hang Metallogenic Belt

by Jianhua Ma, Yongzhang Zhou, Luhao He, Qianlong Zhang, Muhammad Atif Bilal and Yuqing Zhang

Minerals 2025, 15(10), 1023; https://doi.org/10.3390/min15101023 - 26 Sep 2025

Abstract

This study presents a domain-adapted retrieval-augmented generation (RAG) pipeline that integrates geological knowledge with large language models (LLMs) to support intelligent question answering in the metallogenic domain. Focusing on the Qin–Hang metallogenic belt in South China, we construct a bilingual question-answering (QA) corpus derived from 615 authoritative geological publications, covering topics such as regional tectonics, ore-forming processes, structural evolution, and mineral resources. Using the ChatGLM3-6B language model and LangChain framework, we embed the corpus into a semantic vector database via Sentence-BERT and FAISS, enabling dynamic retrieval and grounded response generation. The RAG-enhanced model significantly outperforms baseline LLMs (including ChatGPT-4, Bing, and Gemini) in a comparative evaluation using BLEU, precision, recall, and F1 metrics, achieving an F1 score of 0.8689. The approach demonstrates high domain adaptability and reproducibility. All datasets and code are openly released to facilitate application in other metallogenic belts. This work illustrates the potential of LLM-based knowledge engineering to support digital geoscientific research and smart mining.
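
The embed, retrieve, and ground sequence described above can be illustrated with a small self-contained sketch. To stay dependency-free, it swaps in a toy bag-of-words vector for the Sentence-BERT encoder and a brute-force cosine search for the FAISS index, so it shows the shape of the pipeline rather than the authors' implementation. All function names, the prompt wording, and the sample passages are assumptions made for illustration.

```python
import math
from collections import Counter


def embed(text):
    """Toy stand-in for a sentence encoder: L2-normalized word counts."""
    counts = Counter(text.lower().split())
    norm = math.sqrt(sum(c * c for c in counts.values()))
    return {word: c / norm for word, c in counts.items()} if norm else {}


def cosine(a, b):
    """Cosine similarity of two normalized sparse vectors (dot product)."""
    return sum(value * b.get(word, 0.0) for word, value in a.items())


def retrieve(query, passages, k=2):
    """Brute-force top-k search; a vector index would replace this loop."""
    q = embed(query)
    scored = sorted(passages, key=lambda p: cosine(q, embed(p)), reverse=True)
    return scored[:k]


def build_prompt(query, passages, k=2):
    """Ground the generator by placing retrieved passages in the prompt."""
    context = "\n".join(f"- {p}" for p in retrieve(query, passages, k))
    return f"Answer using only the context below.\nContext:\n{context}\nQuestion: {query}"


# Invented sample passages, not taken from the paper's corpus.
passages = [
    "the belt hosts copper and tungsten deposits",
    "granite intrusions date to the jurassic",
    "sentence embeddings map text to vectors",
]
top = retrieve("which deposits does the belt host", passages, k=1)
prompt = build_prompt("which deposits does the belt host", passages, k=1)
```

In a real deployment the prompt would be passed to the language model; here it is only constructed, which is enough to see how retrieval constrains what the generator is shown.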

(This article belongs to the Special Issue Application of Big Data Mining, Machine Learning and Artificial Intelligence in Geoscience, 2nd Edition)

22 pages, 701 KB

Open Access Article

CuBE: A Customizable Bounds Evaluation Framework for Automated Assessment of RAG Systems in Government Services

by Bolun Yang, Xuhong Yu, Xin Zheng, Jing Nong, Zhentao Liu, Xinmin Dai and Xiaoyao Xie

Appl. Sci. 2025, 15(19), 10447; https://doi.org/10.3390/app151910447 - 26 Sep 2025

Abstract

Retrieval-Augmented Generation (RAG) systems are increasingly adopted in government services, yet different administrations have varying customization needs and lack standardized methods to evaluate performance. In particular, general-purpose evaluation approaches fail to show how well a system meets domain-specific expectations. This paper presents CuBE (Customizable Bounds Evaluation), a tailored evaluation framework for RAG systems in public administration. CuBE integrates large language model (LLM) scoring, customizable evaluation dimensions, and a bounded scoring paradigm with baseline and upper-bound reference sets, enhancing fairness, consistency, and interpretability. We further introduce Lightweight Targeted Assessment (LTA) to support efficient customization. CuBE is validated on GSIA (Guizhou Provincial Government Service Center Intelligent Assistant) by using four state-of-the-art language models. The results show that CuBE produces robust, stable, and model-agnostic evaluations while reducing reliance on manual annotation and facilitating system optimization and rapid iteration. Moreover, CuBE informs parameter settings, enabling developers to design RAG systems that better meet customizer needs. This study establishes a replicable paradigm for trustworthy and efficient evaluation of RAG systems in complex government service scenarios.

30 pages, 2461 KB

Open Access Article

RAGMed: A RAG-Based Medical AI Assistant for Improving Healthcare Delivery

by Rajvardhan Patil, Manideep Abbidi and Sherri Fannon

AI 2025, 6(10), 240; https://doi.org/10.3390/ai6100240 - 24 Sep 2025

Abstract

Electronic Health Records (EHRs) have enhanced access to medical information but have also introduced challenges for healthcare providers, such as increased documentation workload and reduced face-to-face interaction with patients. To mitigate these issues, we propose RAGMed, a Retrieval-Augmented Generation (RAG)-based AI assistant designed to deliver automated and clinically grounded responses to frequently asked patient questions. This system combines a vector database for semantic retrieval with the generative capabilities of a large language model to provide accurate, reliable answers without requiring direct physician involvement. In addition to patient-facing support, the assistant facilitates appointment scheduling and assists clinicians by summarizing clinical notes, thereby streamlining healthcare workflows. To evaluate the influence of retrieval quality on overall system performance, we compare two embedding models, gte-large and all-MiniLM-L6-v2, using real-world medical queries. The models are assessed within the RAG-Triad Framework, focusing on context relevance, answer relevance, and factual groundedness. The results indicate that gte-large, owing to its higher-dimensional embeddings, retrieves more informative context, resulting in more accurate and trustworthy responses. These findings underscore both the potential of RAG-based systems to alleviate physician workload and enhance the efficiency and accessibility of healthcare delivery, and the importance of the dimensionality of the embedding model, which directly influences the relevance, accuracy, and contextual understanding of retrieved information. This prototype is intended for the retrieval-augmented answering of medical FAQs and general informational queries, and is not designed for diagnostic use or treatment recommendations without professional validation.

(This article belongs to the Section Medical & Healthcare AI)

27 pages, 504 KB

Open Access Article

Speaking with the Past: Constructing AI-Generated Historical Characters for Cultural Heritage and Learning

by Boaventura DaCosta

Heritage 2025, 8(9), 387; https://doi.org/10.3390/heritage8090387 - 18 Sep 2025

Abstract

Recent advances in generative artificial intelligence (AI) have enabled the creation of AI-generated characters modeled after historical figures, offering new opportunities for reflective and interactive engagement in both cultural heritage and education. This study explores the development and evaluation of a large language model representation of Joseph Lister (1827–1912), a pioneer of antiseptic surgery, within a retrieval-augmented generation framework. The purpose was to examine the model’s accuracy, authenticity, and reliability, highlighting challenges, best practices, and ethical considerations. Drawing on primary and secondary sources, including Lister’s writings, the model was constructed using OpenAI’s GPT-4o and refined through iterative validation. Prompts were categorized by cognitive complexity, and responses were evaluated against historical materials. The findings revealed a strong fidelity to Lister’s voice, with appropriate tone, diction, and temporal limits. Moreover, the model demonstrated behavioral control, reflective depth, and consistency across the different prompts. However, minor lapses in temporal framing and occasional embellishments were noted. The findings suggest that, when developed with care, AI-generated characters can support ethically grounded, historically sensitive learning experiences. At the same time, this approach warrants continued scrutiny and underscores the need for further interdisciplinary research and responsible implementation.

32 pages, 3609 KB

Open Access Article

BPMN-Based Design of Multi-Agent Systems: Personalized Language Learning Workflow Automation with RAG-Enhanced Knowledge Access

by Hedi Tebourbi, Sana Nouzri, Yazan Mualla, Meryem El Fatimi, Amro Najjar, Abdeljalil Abbas-Turki and Mahjoub Dridi

Information 2025, 16(9), 809; https://doi.org/10.3390/info16090809 - 17 Sep 2025

Abstract

The intersection of Artificial Intelligence (AI) and education is revolutionizing learning and teaching in this digital era, with Generative AI and large language models (LLMs) providing even greater possibilities for the future. The digital transformation of language education demands innovative approaches that combine pedagogical rigor with explainable AI (XAI) principles, particularly for low-resource languages. This paper presents a novel methodology that integrates Business Process Model and Notation (BPMN) with Multi-Agent Systems (MAS) to create transparent, workflow-driven language tutors. Our approach uniquely embeds XAI through three mechanisms: (1) BPMN’s visual formalism that makes agent decision-making auditable, (2) Retrieval-Augmented Generation (RAG) with verifiable knowledge provenance from textbooks of the National Institute of Languages of Luxembourg, and (3) human-in-the-loop validation of both content and pedagogical sequencing. To ensure realism in learner interaction, we integrate speech-to-text and text-to-speech technologies, creating an immersive, human-like learning environment. The system simulates intelligent tutoring through agents’ collaboration and dynamic adaptation to learner progress. We demonstrate this framework through a Luxembourgish language learning platform where specialized agents (Conversational, Reading, Listening, QA, and Grammar) operate within BPMN-modeled workflows. The system achieves high response faithfulness (0.82) and relevance (0.85) according to RAGA metrics, while speech integration using Whisper STT and Coqui TTS enables immersive practice. Evaluation with learners showed 85.8% satisfaction with contextual responses and 71.4% engagement rates, confirming the effectiveness of our process-driven approach. This work advances AI-powered language education by showing how formal process modeling can create pedagogically coherent and explainable tutoring systems. The architecture’s modularity supports extension to other low-resource languages while maintaining the transparency critical for educational trust. Future work will expand curriculum coverage and develop teacher-facing dashboards to further improve explainability.

(This article belongs to the Section Information Applications)

29 pages, 3212 KB

Open Access Article

An Innovative Retrieval-Augmented Generation Framework for Stage-Specific Knowledge Translation in Biomimicry Design

by Hsueh-Kuan Chen and Hung-Hsiang Wang

Biomimetics 2025, 10(9), 626; https://doi.org/10.3390/biomimetics10090626 - 17 Sep 2025

Abstract

Converting biological strategies into practical design principles during the Discover–Abstract phase of the Biomimicry Design Spiral (BDS) presents a considerable obstacle, particularly for designers lacking a biological background. This research introduces a Retrieval-Augmented Generation (RAG) framework that combines a specialized AskNature database of 2106 documents with a locally executed Llama 3.1 large language model (LLM) to fill this void. The innovation of this study lies in integrating the BDS with a stage-specific RAG–LLM framework. Unlike BioTRIZ or SAPPhIRE, which require specialized expertise, our approach provides designers with semantically precise and biologically grounded strategies that can be directly translated into practical design principles. A quasi-experimental study with 30 industrial design students assessed three setups (LLM-only, RAG-Small, and RAG-Large) throughout six biomimicry design stages. Performance was assessed via expert evaluations of text and design concept quality, along with a review of retrieval diversity. Findings indicate that RAG-Large consistently yielded superior text quality in stages with high cognitive demands. It also retrieved a more varied array of high-specificity biological ideas and facilitated more coherent incorporation of functional, aesthetic, and semantic aspects in design results. This framework diminishes cognitive burden, boosts the relevance and originality of inspirations, and provides a reproducible, stage-specific AI assistance model for closing the knowledge translation gap in biomimicry design, though its current validation is limited to a small sample and a single task domain.

(This article belongs to the Section Biomimetic Design, Constructions and Devices)

19 pages, 1599 KB

Open Access Article

Enhancing Clinical Named Entity Recognition via Fine-Tuned BERT and Dictionary-Infused Retrieval-Augmented Generation

by Soumya Challaru Sreenivas, Saqib Chowdhury and Mohammad Masum

Electronics 2025, 14(18), 3676; https://doi.org/10.3390/electronics14183676 - 17 Sep 2025

Abstract

Clinical notes often contain unstructured text filled with abbreviations, non-standard terminology, and inconsistent phrasing, which pose significant challenges for automated medical information extraction. Named Entity Recognition (NER) plays a crucial role in structuring this data by identifying and categorizing key clinical entities such as symptoms, medications, and diagnoses. However, traditional and even transformer-based NER models often struggle with ambiguity and fail to produce clinically interpretable outputs. In this study, we present a hybrid two-stage framework that enhances medical NER by integrating a fine-tuned BERT model for initial entity extraction with a Dictionary-Infused Retrieval-Augmented Generation (DiRAG) module for terminology normalization. Our approach addresses two critical limitations in current clinical NER systems: lack of contextual clarity and inconsistent standardization of medical terms. The DiRAG module combines semantic retrieval from a UMLS-based vector database with lexical matching and prompt-based generation using a large language model, ensuring precise and explainable normalization of ambiguous entities. The fine-tuned BERT model achieved an F1 score of 0.708 on the MACCROBAT dataset, outperforming several domain-specific baselines, including BioBERT and ClinicalBERT. The integration of the DiRAG module further improved the interpretability and clinical relevance of the extracted entities. Through qualitative case studies, we demonstrate that our framework not only enhances clarity but also mitigates common issues such as abbreviation ambiguity and terminology inconsistency.

(This article belongs to the Special Issue Advances in Text Mining and Analytics)

16 pages, 2128 KB

Open Access Article

Secure Multifaceted-RAG: Hybrid Knowledge Retrieval with Security Filtering

by Grace Byun, Shinsun Lee, Nayoung Choi and Jinho D. Choi

Information 2025, 16(9), 804; https://doi.org/10.3390/info16090804 - 16 Sep 2025

Abstract

Existing Retrieval-Augmented Generation (RAG) systems face challenges in enterprise settings due to limited retrieval scope and data security risks. When relevant internal documents are unavailable, the system struggles to generate accurate and complete responses. Additionally, using closed-source Large Language Models (LLMs) raises concerns about exposing proprietary information. To address these issues, we propose the Secure Multifaceted-RAG (SecMulti-RAG) framework, which retrieves not only from internal documents but also from two supplementary sources: pre-generated expert knowledge for anticipated queries and on-demand external LLM-generated knowledge. To mitigate security risks, we adopt a local open-source generator and selectively utilize external LLMs only when prompts are deemed safe by a filtering mechanism. This approach enhances completeness, prevents data leakage, and reduces costs. In our evaluation on a report generation task in the automotive industry, SecMulti-RAG significantly outperforms traditional RAG, achieving 79.3–91.9% win rates across correctness, richness, and helpfulness in LLM-based evaluation and 56.3–70.4% in human evaluation. This highlights SecMulti-RAG as a practical and secure solution for enterprise RAG.
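
The selective use of an external LLM described above amounts to a gate in front of the third knowledge source. The sketch below shows one possible shape of that gate. The keyword list, function names, and the idea of a keyword check are placeholders, because the abstract does not say how the filtering mechanism decides that a prompt is safe; a trained classifier could sit behind the same interface.

```python
# Hypothetical list of terms that mark a prompt as unsafe to send outside.
SENSITIVE_TERMS = {"confidential", "internal", "proprietary"}


def is_safe(prompt):
    """Placeholder filter: safe if no sensitive term appears in the prompt."""
    words = {word.strip(".,;:!?").lower() for word in prompt.split()}
    return words.isdisjoint(SENSITIVE_TERMS)


def gather_context(query, internal_docs, expert_notes, external_llm):
    """Collect context from up to three sources, gating the external one."""
    context = list(internal_docs) + list(expert_notes)
    used_external = False
    if is_safe(query):
        # Only prompts that pass the filter ever leave the local environment.
        context.append(external_llm(query))
        used_external = True
    return context, used_external


# Stub standing in for a call to an external LLM service.
calls = []


def fake_external_llm(query):
    calls.append(query)
    return "external background knowledge"


safe_ctx, safe_used = gather_context(
    "What is a frontal crash test?", ["doc"], [], fake_external_llm
)
unsafe_ctx, unsafe_used = gather_context(
    "Summarize the confidential brake report.", ["doc"], ["note"], fake_external_llm
)
```

The collected context would then be handed to the local open-source generator, so that the final answer is always produced inside the enterprise boundary.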

(This article belongs to the Special Issue Advanced Retrieval-Augmented Generation Systems Based on Large Language Models)

59 pages, 3482 KB

Open Access Feature Paper Article

Empirical Evaluation of Reasoning LLMs in Machinery Functional Safety Risk Assessment and the Limits of Anthropomorphized Reasoning

by Padma Iyenghar

Electronics 2025, 14(18), 3624; https://doi.org/10.3390/electronics14183624 - 12 Sep 2025

Cited by 1 | Viewed by 341

Abstract

Transparent reasoning and interpretability are essential for AI-supported risk assessment, yet it remains unclear whether large language models (LLMs) can provide reliable, deterministic support for safety-critical tasks or merely simulate reasoning through plausible outputs. This study presents a systematic, multi-model empirical evaluation of reasoning-capable LLMs applied to machinery functional safety, focusing on Required Performance Level (PL_r) estimation as defined by ISO 13849-1 and ISO 12100. Six state-of-the-art models (Claude-opus, o3-mini, o4-mini, GPT-5-mini, Gemini-2.5-flash, DeepSeek-Reasoner) were evaluated across six prompting strategies and two dataset variants: canonical ISO-style hazards (Variant 1) and engineer-authored free-text scenarios (Variant 2). Results show that rule-grounded prompting consistently stabilizes performance, achieving ceiling-level accuracy in Variant 1 and restoring reliability under lexical variability in Variant 2. In contrast, unconstrained chain-of-thought reasoning (CoT) and CoT together with Retrieval-Augmented Generation (RAG) introduce volatility, overprediction biases, and model-dependent degradations. Safety-critical coverage was quantified through per-class F1 and recall of PL_r class e, confirming that only rule-grounded prompts reliably captured rare but high-risk hazards. Latency analysis demonstrated that rule-only prompts were both the most accurate and the most efficient, while CoT strategies incurred 2–10× overhead. A confusion/rescue analysis of retrieval interactions further revealed systematic noise mechanisms such as P-inflation and F-drift, showing that retrieval can either destabilize or rescue cases depending on model family. Intermediate severity/frequency/possibility (S/F/P) reasoning steps were found to diverge from ISO-consistent logic, reinforcing critiques that LLM “reasoning” reflects surface-level continuation rather than genuine inference. 
All reported figures include 95% confidence intervals, t-intervals across runs (r = 5) for accuracy and timing, and class-stratified bootstrap CIs for Micro/Macro/Weighted-F₁ and per-class metrics. Overall, this study establishes a rigorous benchmark for evaluating LLMs in functional safety workflows such as PL_r determination. It shows that deterministic, safety-critical classification requires strict rule-constrained prompting and careful retrieval governance, rather than reliance on assumed model reasoning abilities.

(This article belongs to the Special Issue New Insights into Natural Language Processing and Large Language Models)

Search Results (168)
