Search Results (1,449)

Search Parameters:
Keywords = large language models (LLMs)

20 pages, 1978 KB  
Article
StressSpeak: A Speech-Driven Framework for Real-Time Personalized Stress Detection and Adaptive Psychological Support
by Laraib Umer, Javaid Iqbal, Yasar Ayaz, Hassan Imam, Adil Ahmad and Umer Asgher
Diagnostics 2025, 15(22), 2871; https://doi.org/10.3390/diagnostics15222871 (registering DOI) - 12 Nov 2025
Abstract
Background: Stress is a critical determinant of mental health, yet conventional monitoring approaches often rely on subjective self-reports or physiological signals that lack real-time responsiveness. Recent advances in large language models (LLMs) offer opportunities for speech-driven, adaptive stress detection, but existing systems are limited to retrospective text analysis, monolingual settings, or detection-only outputs. Methods: We developed a real-time, speech-driven stress detection framework that integrates audio recording, speech-to-text conversion, and linguistic analysis using transformer-based LLMs. The system provides multimodal outputs, delivering recommendations in both text and synthesized speech. Nine LLM variants were evaluated on five benchmark datasets under zero-shot and few-shot learning conditions. Performance was assessed using accuracy, precision, recall, F1-score, and misclassification trends (false-negatives and false-positives). Real-time feasibility was analyzed through latency modeling, and user-centered validation was conducted across domains. Results: Few-shot fine-tuning improved model performance across all datasets, with Large Language Model Meta AI (LLaMA) and Robustly Optimized BERT Pretraining Approach (RoBERTa) achieving the highest F1-scores and reduced false-negatives, particularly for suicide risk detection. Latency analysis revealed a trade-off between responsiveness and accuracy, with delays ranging from ~2 s for smaller models to ~7.6 s for LLaMA-7B on 30 s audio inputs. Multilingual input support and multimodal output enhanced inclusivity. User feedback confirmed strong usability, accessibility, and adoption potential in real-world settings. Conclusions: This study demonstrates that real-time, LLM-powered stress detection is both technically robust and practically feasible. By combining speech-based input, multimodal feedback, and user-centered validation, the framework advances beyond traditional detection-only models toward scalable, inclusive, and deployment-ready digital mental health solutions. Full article
(This article belongs to the Section Machine Learning and Artificial Intelligence in Diagnostics)
16 pages, 309 KB  
Article
Large Language Models as Coders of Pragmatic Competence in Healthy Aging: Preliminary Results on Reliability, Limits, and Implications for Human-Centered AI
by Arianna Boldi, Ilaria Gabbatore and Francesca M. Bosco
Electronics 2025, 14(22), 4411; https://doi.org/10.3390/electronics14224411 (registering DOI) - 12 Nov 2025
Abstract
Pragmatics concerns how people use language and other expressive means, such as nonverbal and paralinguistic cues, to convey intended meaning in the context. Difficulties in pragmatics are common across distinct clinical conditions, motivating validated assessments such as the Assessment Battery for Communication (ABaCo); whether Large Language Models (LLMs) can serve as reliable coders remains uncertain. In this exploratory study, we used Generative Pre-trained Transformer (GPT)-4o as a rater on 2025 item × dimension units drawn from the responses given by 10 healthy older adults (M = 69.8) to selected ABaCo items. Expert human coders served as the reference standard to compare GPT-4o scores. Agreement metrics included exact agreement, Cohen’s κ, and a discrepancy audit by pragmatic act. Agreement was 89.1% with κ = 0.491. Errors were non-random across acts (χ2(12) = 69.4, p < 0.001). After Benjamini–Hochberg False Discovery Rate correction across 26 cells, only two categories remained significant: false positives concentrated in Command and false negatives in Deceit. Missing prosodic and gestural cues likely exacerbate command-specific failures. In conclusion, in text-only settings, GPT-4o can serve as a supervised second coder for healthy-aging assessments of pragmatic competence, under human oversight. Safe clinical deployment requires population-specific validation and multimodal inputs that recover nonverbal cues. Full article
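The agreement metrics reported above — exact agreement and Cohen's κ — can be computed as a short sketch, assuming the two raters' scores are given as equal-length lists of categorical labels (the example labels are illustrative, not the study's data):

```python
from collections import Counter

def exact_agreement(a, b):
    """Fraction of items on which the two raters assign the same score."""
    return sum(x == y for x, y in zip(a, b)) / len(a)

def cohens_kappa(a, b):
    """Cohen's kappa: agreement corrected for chance agreement."""
    n = len(a)
    po = exact_agreement(a, b)  # observed agreement
    ca, cb = Counter(a), Counter(b)
    # expected agreement if raters labeled independently at their marginal rates
    pe = sum(ca[k] * cb[k] for k in set(a) | set(b)) / (n * n)
    return (po - pe) / (1 - pe)
```

With balanced but uncorrelated labels, κ drops to 0 even when raw agreement is 50%, which is why κ = 0.491 alongside 89.1% agreement signals only moderate chance-corrected reliability.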
20 pages, 2260 KB  
Article
Construction of a Person–Job Temporal Knowledge Graph Using Large Language Models
by Zhongshan Zhang, Junzhi Wang, Bo Li, Xiang Lin and Mingyu Liu
Big Data Cogn. Comput. 2025, 9(11), 287; https://doi.org/10.3390/bdcc9110287 (registering DOI) - 12 Nov 2025
Abstract
Person–job data are multi-source, heterogeneous, and strongly temporal, making knowledge modeling and analysis challenging. We present an automated approach for constructing a Human-Resources Temporal Knowledge Graph. We first formalize a schema in which temporal relations are represented as sets of time intervals. On top of this schema, a large language model (LLM) pipeline extracts entities, relations, and temporal expressions, augmented by self-verification and external knowledge injection to enforce schema compliance, resolve ambiguities, and automatically repair outputs. Context-aware prompting and confidence-based escalation further improve robustness. Evaluated on a corpus of 2000 Chinese resumes, our method outperforms strong baselines, and ablations confirm the necessity and synergy of each component; notably, temporal extraction attains an F1 of 0.9876. The proposed framework provides a reusable path and engineering foundation for downstream HR tasks—such as profiling, relational reasoning, and position matching—supporting more reliable, time-aware decision-making in complex organizations. Full article
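The schema's central idea — temporal relations represented as sets of validity intervals — can be sketched as follows; the entity and relation names are hypothetical illustrations, not the paper's actual schema:

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Interval:
    start: str  # ISO-style date strings compare correctly as plain strings
    end: str

@dataclass
class TemporalRelation:
    head: str   # e.g. a person entity
    name: str   # e.g. "works_at" (illustrative relation label)
    tail: str   # e.g. an organization entity
    intervals: frozenset = frozenset()

    def holds_at(self, t: str) -> bool:
        """The relation is valid at time t if t falls inside any interval."""
        return any(iv.start <= t <= iv.end for iv in self.intervals)
```

A person–organization edge valid from 2019-03 to 2021-06 then answers point-in-time queries directly, which is the kind of time-aware reasoning the downstream HR tasks rely on.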
35 pages, 904 KB  
Article
Clustering-Guided Automatic Generation of Algorithms for the Multidimensional Knapsack Problem
by Cristian Inzulza, Caio Bezares, Franco Cornejo and Victor Parada
Mach. Learn. Knowl. Extr. 2025, 7(4), 144; https://doi.org/10.3390/make7040144 (registering DOI) - 12 Nov 2025
Abstract
We propose a hybrid framework that integrates instance clustering with Automatic Generation of Algorithms (AGA) to produce specialized algorithms for classes of Multidimensional Knapsack Problem (MKP) instances. This approach is highly relevant given the latest trends in AI, where Large Language Models (LLMs) are actively being used to automate and refine algorithm design through evolutionary frameworks. Our method utilizes a feature-based representation of 328 MKP instances and evaluates K-means, HDBSCAN, and random clustering to produce 11 clusters per method. For each cluster, a master optimization problem was solved using Genetic Programming, evolving algorithms encoded as syntax trees. Fitness was measured as relative error against known optima, a similar objective to those being tackled in LLM-driven optimization. Experimental and statistical analyses demonstrate that clustering-guided AGA significantly reduces average relative error and accelerates convergence compared with AGA trained on randomly grouped instances. K-means produced the most consistent cluster-specialization. Cross-cluster evaluation reveals a trade-off between specialization and generalization. The results demonstrate that clustering prior to AGA is a practical preprocessing step for designing automated algorithms in NP-hard combinatorial problems, paving the way for advanced methodologies that incorporate AI techniques. Full article
27 pages, 1589 KB  
Systematic Review
Can Large Language Models Foster Critical Thinking, Teamwork, and Problem-Solving Skills in Higher Education?: A Literature Review
by Rafael Martínez-Peláez, Luis J. Mena, Homero Toral-Cruz, Alberto Ochoa-Brust, Apolinar González Potes, Víctor Flores, Rodolfo Ostos, Julio C. Ramírez Pacheco, Ramón A. Félix and Vanessa G. Félix
Systems 2025, 13(11), 1013; https://doi.org/10.3390/systems13111013 - 12 Nov 2025
Abstract
Over the last two years, with the rapid development of artificial intelligence, Large Language Models (LLMs) have obtained significant attention from the academic sector, making their application in higher education attractive for students, managers, faculty, and stakeholders. We conducted a Systematic Literature Review on the adoption of LLMs in the higher education system to address persistent issues and promote critical thinking, teamwork, and problem-solving skills. Following the PRISMA 2020 protocol, a systematic search was conducted in the Web of Science Core Collection for studies published between 2023 and 2024. After a systematic search and filtering of 203 studies, we included 22 articles for further analysis. The findings show that LLMs can transform traditional teaching through active learning, align curricula with real-world demands, provide personalized feedback in large classes, and enhance assessment practices focused on applied problem-solving. Their effects are transversal, influencing multiple dimensions of higher education systems. Consequently, LLMs have the potential to improve educational equity, strengthen workforce readiness, and foster innovation across disciplines and institutions. This systematic review is registered in PROSPERO (2025 CRD420251165731). Full article
32 pages, 2954 KB  
Review
From Traditional Machine Learning to Fine-Tuning Large Language Models: A Review for Sensors-Based Soil Moisture Forecasting
by Md Babul Islam, Antonio Guerrieri, Raffaele Gravina, Declan T. Delaney and Giancarlo Fortino
Sensors 2025, 25(22), 6903; https://doi.org/10.3390/s25226903 - 12 Nov 2025
Abstract
Smart Agriculture (SA) combines cutting edge technologies such as the Internet of Things (IoT), Artificial Intelligence (AI), and real-time sensing systems with traditional farming practices to enhance productivity, optimize resource use, and support environmental sustainability. A key aspect of SA is the continuous monitoring of field conditions, particularly Soil Moisture (SM), which plays a crucial role in crop growth and water management. Accurate forecasting of SM allows farmers to make timely irrigation decisions, improve field management, and conserve water. To support this, recent studies have increasingly adopted soil sensors, local weather data, and AI-based data-driven models for SM forecasting. In the literature, most existing review articles lack a structured framework and often overlook recent advancements, including privacy-preserving Federated Learning (FL), Transfer Learning (TL), and the integration of Large Language Models (LLMs). To address this gap, this paper proposes a novel taxonomy for SM forecasting and presents a comprehensive review of existing approaches, including traditional machine learning, deep learning, and hybrid models. Using the PRISMA methodology, we reviewed over 189 papers and selected 68 peer-reviewed studies published between 2017 and 2025. These studies are analyzed based on sensor types, input features, AI techniques, data durations, and evaluation metrics. Six guiding research questions were developed to shape the review and inform the taxonomy. Finally, this work identifies promising research directions, such as the application of TinyML for edge deployment, explainable AI for improved transparency, and privacy-aware model training. This review aims to provide researchers and practitioners with valuable insights for building accurate, scalable, and trustworthy SM forecasting systems to advance SA. Full article
(This article belongs to the Special Issue Feature Papers in the Internet of Things Section 2025)

25 pages, 4855 KB  
Article
Improved Flood Management and Risk Communication Through Large Language Models
by Divas Karimanzira, Thomas Rauschenbach, Tobias Hellmund and Linda Ritzau
Algorithms 2025, 18(11), 713; https://doi.org/10.3390/a18110713 - 12 Nov 2025
Abstract
In light of urbanization, climate change, and the escalation of extreme weather events, flood management is becoming more and more important. Improving community resilience and reducing flood risks require prompt decision-making and effective communication. This study investigates how flood management systems can incorporate Large Language Models (LLMs), especially those that use Retrieval-Augmented Generation (RAG) architectures. We suggest a multimodal framework that uses a Flood Knowledge Graph to aggregate data from various sources, such as social media, hydrological, and meteorological inputs. Although LLMs have the potential to be transformative, we also address important drawbacks like governance issues, hallucination risks, and a lack of physical modeling capabilities. When compared to text-only LLMs, the RAG system significantly improves the reliability of flood-related decision support by reducing factual inconsistency rates by more than 75%. Our suggested architecture includes expert validation and security layers to guarantee dependable, useful results, like flood-constrained evacuation route planning. In areas that are vulnerable to flooding, this strategy seeks to strengthen warning systems, enhance information sharing, and build resilient communities. Full article
(This article belongs to the Special Issue Artificial Intelligence Algorithms in Sustainability)
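The retrieve-then-generate pattern at the heart of a RAG pipeline can be illustrated with a toy term-overlap retriever — a deliberately simplified sketch (real systems use dense embeddings and a vector store; the prompt wording is illustrative):

```python
def retrieve(query, docs, k=2):
    """Rank documents by how many terms they share with the query."""
    q = set(query.lower().split())
    return sorted(docs, key=lambda d: len(q & set(d.lower().split())), reverse=True)[:k]

def build_prompt(query, docs):
    """Assemble a grounded prompt so the LLM answers from retrieved context only."""
    context = "\n".join(f"- {d}" for d in retrieve(query, docs))
    return (f"Context:\n{context}\n\n"
            f"Question: {query}\n"
            f"Answer using only the context above; say 'unknown' if it is not covered.")
```

Constraining the model to retrieved context is what drives the reduction in factual inconsistency the abstract reports, compared with letting a text-only LLM answer from its parametric memory.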

10 pages, 857 KB  
Article
Estimating the Utility of Using Structured and Unstructured Data for Extracting Incidents of External Hospitalizations from Patient Documents
by Michael Davenport, Robert Hall, Saraswathi Kappala, Trevor Michelson, Robert Mitchell, David Winski, Cynthia Hau, Sarah Leatherman and Frank Meng
Information 2025, 16(11), 978; https://doi.org/10.3390/info16110978 - 12 Nov 2025
Abstract
Patients within the US Department of Veterans Affairs (VA) healthcare system have the option of receiving care at facilities external to the VA network. This work presents a method for identifying external hospitalizations among the VA’s patient population by utilizing data stored in patient records. The process of extracting this information is complicated by the fact that indicators of external hospitalizations come from two sources: well-defined structured data and free-form unstructured text. Though natural language processing (NLP) leveraging Large Language Models (LLMs) has advanced capabilities to automate information extraction from free text, deploying these systems remains complex and costly. Using structured data is low-cost, but its utility must be determined in order to optimally allocate resources. We describe a method for estimating the utility of using structured and unstructured data and show that if specific conditions are met, the level of effort to perform this estimate can be greatly reduced. For external hospitalizations in the VA, our analysis showed that 44.4% of cases identified using unstructured data could not be found using structured data alone. Full article
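The headline figure — the share of unstructured-data cases that structured data alone misses — reduces to set arithmetic over case identifiers; a minimal sketch with made-up IDs, not VA data:

```python
def unstructured_only_fraction(structured_ids, unstructured_ids):
    """Fraction of unstructured-data hits with no structured-data counterpart."""
    s, u = set(structured_ids), set(unstructured_ids)
    return len(u - s) / len(u)
```

With four unstructured hits of which three lack a structured counterpart, the fraction is 0.75; the study reports 44.4% for VA external hospitalizations.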
15 pages, 930 KB  
Article
Performance Evaluation Metrics for Empathetic LLMs
by Yuna Hong, Bonhwa Ku and Hanseok Ko
Information 2025, 16(11), 977; https://doi.org/10.3390/info16110977 - 11 Nov 2025
Abstract
With the rapid advancement of large language models (LLMs), recent systems have demonstrated increasing capability in understanding and expressing human emotions. However, no objective and standardized metric currently exists to evaluate how empathetic an LLM’s response is. To address this gap, we propose a novel evaluation framework that measures both sentiment-level and emotion-level alignment between a user query and a model-generated response. The proposed metric consists of two components. The sentiment component evaluates overall affective polarity through Sentlink and the naturalness of emotional expression via NEmpathySort. The emotion component measures fine-grained emotional correspondence using Emosight. Additionally, a semantic component, based on RAGAS, assesses the contextual relevance and coherence of the response. Experimental results demonstrate that our metric effectively captures both the intensity and nuance of empathy in LLM-generated responses, providing a solid foundation for the development of emotionally intelligent conversational AI. Full article
35 pages, 2963 KB  
Article
Explainable Artificial Intelligence Framework for Predicting Treatment Outcomes in Age-Related Macular Degeneration
by Mini Han Wang
Sensors 2025, 25(22), 6879; https://doi.org/10.3390/s25226879 - 11 Nov 2025
Abstract
Age-related macular degeneration (AMD) is a leading cause of irreversible blindness, yet current tools for forecasting treatment outcomes remain limited by either the opacity of deep learning or the rigidity of rule-based systems. To address this gap, we propose a hybrid neuro-symbolic and large language model (LLM) framework that combines mechanistic disease knowledge with multimodal ophthalmic data for explainable AMD treatment prognosis. In a pilot cohort of ten surgically managed AMD patients (six men, four women; mean age 67.8 ± 6.3 years), we collected 30 structured clinical documents and 100 paired imaging series (optical coherence tomography, fundus fluorescein angiography, scanning laser ophthalmoscopy, and ocular/superficial B-scan ultrasonography). Texts were semantically annotated and mapped to standardized ontologies, while images underwent rigorous DICOM-based quality control, lesion segmentation, and quantitative biomarker extraction. A domain-specific ophthalmic knowledge graph encoded causal disease and treatment relationships, enabling neuro-symbolic reasoning to constrain and guide neural feature learning. An LLM fine-tuned on ophthalmology literature and electronic health records ingested structured biomarkers and longitudinal clinical narratives through multimodal clinical-profile prompts, producing natural-language risk explanations with explicit evidence citations. On an independent test set, the hybrid model achieved AUROC 0.94 ± 0.03, AUPRC 0.92 ± 0.04, and a Brier score of 0.07, significantly outperforming purely neural and classical Cox regression baselines (p ≤ 0.01). Explainability metrics showed that >85% of predictions were supported by high-confidence knowledge-graph rules, and >90% of generated narratives accurately cited key biomarkers. 
A detailed case study demonstrated real-time, individualized risk stratification—for example, predicting an >70% probability of requiring three or more anti-VEGF injections within 12 months and a ~45% risk of chronic macular edema if therapy lapsed—with predictions matching the observed clinical course. These results highlight the framework’s ability to integrate multimodal evidence, provide transparent causal reasoning, and support personalized treatment planning. While limited by single-center scope and short-term follow-up, this work establishes a scalable, privacy-aware, and regulator-ready template for explainable, next-generation decision support in AMD management, with potential for expansion to larger, device-diverse cohorts and other complex retinal diseases. Full article
(This article belongs to the Special Issue Sensing Functional Imaging Biomarkers and Artificial Intelligence)
19 pages, 1763 KB  
Article
Research on the Automatic Generation of Information Requirements for Emergency Response to Unexpected Events
by Yao Li, Chang Guo, Zhenhai Lu, Chao Zhang, Wei Gao, Jiaqi Liu and Jungang Yang
Appl. Sci. 2025, 15(22), 11953; https://doi.org/10.3390/app152211953 - 11 Nov 2025
Abstract
When responding to unexpected emergency events, timely, scientific, and correct decision-making is critical, and the generation of information requirements is an essential prerequisite. Taking earthquakes as a representative type of unexpected event, this paper constructs a large-language-model-driven system for automatically generating the information requirements of earthquake emergency response. The research examines how different departments interact during an earthquake emergency response, how information flows among them, and how the information-requirement process operates. The system is designed from three perspectives: building a knowledge base, designing and developing prompts, and designing the system architecture. During the experiments, four Large Language Models (LLMs) served as backbone architectures: chatGLM (GLM-4.6), Spark (SparkX1.5), ERNIE Bot (4.5 Turbo), and DeepSeek (V3.2). Following the designed system process, information requirements are generated from real-world cases and then compared with the information requirements gathered by experts. In the comparison, a "keyword weighted matching + text structure feature fusion" method was used to calculate semantic similarity. True positives, false positives, and false negatives are used to identify differences and to compute precision, recall, and F1-score. The experimental results show that all four LLMs achieved a precision and recall of over 90% in earthquake information extraction, with F1-scores all exceeding 85%, verifying the feasibility of the analytical method adopted in this research. A comparative analysis found that chatGLM performed best, with an F1-score of 93.2%. 
Finally, Python scripts automate these processes and produce complete comparison charts for visual inspection and verification of the test results. We also used Protégé to build the information-requirements ontology, making it easy to present and inspect. This research is particularly useful for emergency management departments, earthquake emergency response teams, and practitioners developing intelligent emergency information systems or automated information-requirement generation with technologies such as LLMs. It provides practical support for optimizing rapid decision-making in earthquake emergency response. Full article
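The evaluation described above reduces to the standard formulas over true positives, false positives, and false negatives — a minimal sketch (the counts in the example are illustrative):

```python
def precision_recall_f1(tp, fp, fn):
    """Precision = TP/(TP+FP); Recall = TP/(TP+FN); F1 is their harmonic mean."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1
```

For instance, 8 true positives with 2 false positives and 2 false negatives gives precision, recall, and F1 all equal to 0.8.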
855 KB  
Proceeding Paper
Supporting Rule-Based Control with a Natural Language Model
by Martin Kernács and Olivér Hornyák
Eng. Proc. 2025, 113(1), 56; https://doi.org/10.3390/engproc2025113056 - 10 Nov 2025
Abstract
The usage of Artificial Intelligence (AI) in control loops and rule-based frameworks is a novel approach in automation and decision-making processes. Large Language Models (LLMs) are redefining conventional rule-based systems by introducing intuitive natural language interfaces, drastically changing the creation of rules, and minimizing operational complexity. Unlike static controllers, AI-enhanced systems can autonomously evolve with real-time environmental changes, achieving optimal performance without manual intervention. By allowing non-experts to modify rules through natural language commands, LLMs can transform control-system management. These advancements not only improve adaptability and operational efficiency but also reduce downtime through proactive error detection and self-correction mechanisms. AI-powered systems allow operations to be refined, accelerating response speeds and increasing reliability. The synergy between rule-based logic and AI-driven intelligence provides a new approach for autonomous systems, improving their capability for context-specific decision-making. In this paper, an approach is presented to control a storage system by natural language commands. A comparison of the Hungarian and English language interpretations is discussed. Full article
27 pages, 3096 KB  
Article
EnergAI: A Large Language Model-Driven Generative Design Method for Early-Stage Building Energy Optimization
by Jing Zhong, Peilin Li, Ran Luo, Jun Yin, Yizhen Ding, Junjie Bai, Chuxiang Hong, Xiang Deng, Xintong Ma and Shuai Lu
Energies 2025, 18(22), 5921; https://doi.org/10.3390/en18225921 - 10 Nov 2025
Abstract
The early stage of architectural design plays a decisive role in determining building energy performance, yet conventional evaluation is typically deferred to later phases, restricting timely and data-informed feedback. This paper proposes EnergAI, a generative design framework that incorporates energy optimization objectives directly into the scheme generation process through large language models (e.g., GPT-4o, DeepSeek-V3.1-Think, Qwen-Max, and Gemini-2.5 pro). A dedicated dataset, LowEnergy-FormNet, comprising 2160 cases with site parameters, massing descriptors, and simulation outputs, was constructed to model site, form, and energy relationships. The framework encodes building massing into a parametric vector representation and employs hierarchical prompt strategies to establish a closed-loop compatibility with ClimateStudio. Experimental evaluations demonstrate that geometry-oriented and fuzzy-goal prompts achieve average annual reductions of approximately 16–17% in energy use intensity and 3–4% in energy cost compared with human designs, while performance-oriented structured prompts deliver the most reliable improvements, eliminating high-energy outliers and yielding an average EUI-saving rate above 50%. In cross-model comparisons under an identical toolchain, GPT-4o delivered the strongest and most stable optimization, achieving 63.3% mean EUI savings, nearly 13% higher than DeepSeek-V3.1-Think, Qwen-Max, and Gemini-2.5 baselines. These results demonstrate the feasibility and indicate the potential robustness of embedding performance constraints at the generation stage, providing a feasible approach to support proactive, data-informed early design. Full article
(This article belongs to the Special Issue Challenges and Research Trends of Integrated Zero-Carbon Power Plant)
20 pages, 1296 KB  
Article
Learning Path Recommendation Enhanced by Knowledge Tracing and Large Language Model
by Yunxuan Lin and Zhengyang Wu
Electronics 2025, 14(22), 4385; https://doi.org/10.3390/electronics14224385 - 10 Nov 2025
Abstract
With the development of large language model (LLM) technology, AI-assisted education systems are gradually being widely used. Learning Path Recommendation (LPR) is an important task in personalized instructional scenarios. AI-assisted LPR is gaining traction for its ability to generate learning content based on a student’s personalized needs. However, the native-LLM has the problem of hallucination, which may lead to the inability to generate learning content; in addition, the evaluation results of the LLM on students’ knowledge status are usually conservative and have a large margin of error. To address these issues, this work proposes a novel approach for LPR enhanced by knowledge tracing (KT) and LLM. Our method operates in a “generate-and-retrieve” manner: the LLM acts as a pedagogical planner that generates contextual reference exercises based on the student’s needs. Subsequently, a retrieval mechanism constructs the concrete learning path by retrieving the top-N most semantically similar exercises from an established exercise bank, ensuring the recommendations are both pedagogically sound and practically available. The KT plays the role of an evaluator in the iterative process. Rather than generating semantic instructions directly, it provides a quantitative and structured performance metric. Specifically, given a candidate learning path generated by the LLM, the KT model simulates the student’s knowledge state after completing the path and computes a knowledge promotion score. This score quantitatively measures the effectiveness of the proposed path for the current student, thereby guiding the refinement of subsequent recommendations. This iterative interaction between the KT and the LLM continuously refines the candidate learning items until an optimal learning path is generated. Experimental validations on public datasets demonstrate that our model surpasses baseline methods. Full article
(This article belongs to the Special Issue Data Mining and Recommender Systems)
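The "generate-and-retrieve" step — selecting the top-N bank exercises most semantically similar to an LLM-generated reference — can be sketched with cosine similarity over embedding vectors (toy vectors here; real systems use a sentence encoder and an approximate-nearest-neighbor index):

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))

def top_n(query_vec, bank, n=3):
    """bank: list of (exercise_id, embedding); returns the n most similar exercise ids."""
    ranked = sorted(bank, key=lambda item: cosine(query_vec, item[1]), reverse=True)
    return [eid for eid, _ in ranked[:n]]
```

Retrieving from an established exercise bank rather than emitting LLM-generated exercises directly is what keeps the recommended path free of hallucinated content.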
31 pages, 2192 KB  
Article
AgentReport: A Multi-Agent LLM Approach for Automated and Reproducible Bug Report Generation
by Seojin Choi and Geunseok Yang
Appl. Sci. 2025, 15(22), 11931; https://doi.org/10.3390/app152211931 - 10 Nov 2025
Abstract
Bug reports in open-source projects are often incomplete or low in quality, which reduces maintenance efficiency. To address this issue, we propose AgentReport, a multi-agent pipeline based on large language models (LLMs). AgentReport integrates QLoRA-4bit lightweight fine-tuning, CTQRS (Completeness, Traceability, Quantifiability, Reproducibility, Specificity) structured prompting, Chain-of-Thought reasoning, and one-shot exemplar within seven modules: Data, Prompt, Fine-tuning, Generation, Evaluation, Reporting, and Controller. Using 3966 summary–report pairs from Bugzilla, AgentReport achieved 80.5% in CTQRS, 84.6% in ROUGE-1 Recall, 56.8% in ROUGE-1 F1, and 86.4% in Sentence-BERT (SBERT). Compared with the baseline (77.0% CTQRS, 61.0% ROUGE-1 Recall, 85.0% SBERT), AgentReport improved CTQRS by 3.5 percentage points, Recall by 23.6 points, and SBERT by 1.4 points. The inclusion of F1 complemented Recall-only evaluation, offering a balanced framework that covers structural completeness (CTQRS), lexical coverage and precision (ROUGE-1 Recall/F1), and semantic consistency (SBERT). This modular design enables consistent experimentation and flexible scaling, providing practical evidence that multi-agent LLM pipelines can generate higher-quality bug reports for software maintenance. Full article
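The ROUGE-1 Recall and F1 figures measure unigram overlap between a generated report and its reference — a simplified sketch using whitespace tokenization (production evaluations typically use the rouge-score package's tokenization and stemming):

```python
from collections import Counter

def rouge1(candidate, reference):
    """Return (recall, f1) from clipped unigram overlap between two strings."""
    c, r = Counter(candidate.split()), Counter(reference.split())
    overlap = sum((c & r).values())  # clipped counts of shared unigrams
    recall = overlap / sum(r.values())
    precision = overlap / sum(c.values())
    f1 = 0.0 if overlap == 0 else 2 * precision * recall / (precision + recall)
    return recall, f1
```

Reporting F1 alongside Recall, as the abstract argues, penalizes padded outputs: a candidate that copies the reference plus filler keeps high recall but loses precision, and F1 drops accordingly.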