Search Results (1,449)

Search Parameters:
Keywords = large language models (LLMs)

20 pages, 1978 KB  
Article
StressSpeak: A Speech-Driven Framework for Real-Time Personalized Stress Detection and Adaptive Psychological Support
by Laraib Umer, Javaid Iqbal, Yasar Ayaz, Hassan Imam, Adil Ahmad and Umer Asgher
Diagnostics 2025, 15(22), 2871; https://doi.org/10.3390/diagnostics15222871 (registering DOI) - 12 Nov 2025
Abstract
Background: Stress is a critical determinant of mental health, yet conventional monitoring approaches often rely on subjective self-reports or physiological signals that lack real-time responsiveness. Recent advances in large language models (LLMs) offer opportunities for speech-driven, adaptive stress detection, but existing systems are limited to retrospective text analysis, monolingual settings, or detection-only outputs. Methods: We developed a real-time, speech-driven stress detection framework that integrates audio recording, speech-to-text conversion, and linguistic analysis using transformer-based LLMs. The system provides multimodal outputs, delivering recommendations in both text and synthesized speech. Nine LLM variants were evaluated on five benchmark datasets under zero-shot and few-shot learning conditions. Performance was assessed using accuracy, precision, recall, F1-score, and misclassification trends (false-negatives and false-positives). Real-time feasibility was analyzed through latency modeling, and user-centered validation was conducted across domains. Results: Few-shot fine-tuning improved model performance across all datasets, with Large Language Model Meta AI (LLaMA) and Robustly Optimized BERT Pretraining Approach (RoBERTa) achieving the highest F1-scores and reduced false-negatives, particularly for suicide risk detection. Latency analysis revealed a trade-off between responsiveness and accuracy, with delays ranging from ~2 s for smaller models to ~7.6 s for LLaMA-7B on 30 s audio inputs. Multilingual input support and multimodal output enhanced inclusivity. User feedback confirmed strong usability, accessibility, and adoption potential in real-world settings. Conclusions: This study demonstrates that real-time, LLM-powered stress detection is both technically robust and practically feasible. By combining speech-based input, multimodal feedback, and user-centered validation, the framework advances beyond traditional detection-only models toward scalable, inclusive, and deployment-ready digital mental health solutions. Full article
(This article belongs to the Section Machine Learning and Artificial Intelligence in Diagnostics)
16 pages, 309 KB  
Article
Large Language Models as Coders of Pragmatic Competence in Healthy Aging: Preliminary Results on Reliability, Limits, and Implications for Human-Centered AI
by Arianna Boldi, Ilaria Gabbatore and Francesca M. Bosco
Electronics 2025, 14(22), 4411; https://doi.org/10.3390/electronics14224411 (registering DOI) - 12 Nov 2025
Abstract
Pragmatics concerns how people use language and other expressive means, such as nonverbal and paralinguistic cues, to convey intended meaning in the context. Difficulties in pragmatics are common across distinct clinical conditions, motivating validated assessments such as the Assessment Battery for Communication (ABaCo); whether Large Language Models (LLMs) can serve as reliable coders remains uncertain. In this exploratory study, we used Generative Pre-trained Transformer (GPT)-4o as a rater on 2025 item × dimension units drawn from the responses given by 10 healthy older adults (M = 69.8) to selected ABaCo items. Expert human coders served as the reference standard to compare GPT-4o scores. Agreement metrics included exact agreement, Cohen’s κ, and a discrepancy audit by pragmatic act. Agreement was 89.1% with κ = 0.491. Errors were non-random across acts (χ2(12) = 69.4, p < 0.001). After Benjamini–Hochberg False Discovery Rate correction across 26 cells, only two categories remained significant: false positives concentrated in Command and false negatives in Deceit. Missing prosodic and gestural cues likely exacerbate command-specific failures. In conclusion, in text-only settings, GPT-4o can serve as a supervised second coder for healthy-aging assessments of pragmatic competence, under human oversight. Safe clinical deployment requires population-specific validation and multimodal inputs that recover nonverbal cues. Full article
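The agreement metrics reported above — exact agreement and Cohen's κ — can be computed as a short sketch, assuming the two raters' scores are given as equal-length lists of categorical labels (the example labels are illustrative, not the study's data):

```python
from collections import Counter

def exact_agreement(a, b):
    """Fraction of items on which the two raters assign the same score."""
    return sum(x == y for x, y in zip(a, b)) / len(a)

def cohens_kappa(a, b):
    """Cohen's kappa: agreement corrected for chance agreement."""
    n = len(a)
    po = exact_agreement(a, b)  # observed agreement
    ca, cb = Counter(a), Counter(b)
    # expected agreement if raters labeled independently at their marginal rates
    pe = sum(ca[k] * cb[k] for k in set(a) | set(b)) / (n * n)
    return (po - pe) / (1 - pe)
```

With balanced but uncorrelated labels, κ drops to 0 even when raw agreement is 50%, which is why κ = 0.491 alongside 89.1% agreement signals only moderate chance-corrected reliability.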
20 pages, 2260 KB  
Article
Construction of a Person–Job Temporal Knowledge Graph Using Large Language Models
by Zhongshan Zhang, Junzhi Wang, Bo Li, Xiang Lin and Mingyu Liu
Big Data Cogn. Comput. 2025, 9(11), 287; https://doi.org/10.3390/bdcc9110287 (registering DOI) - 12 Nov 2025
Abstract
Person–job data are multi-source, heterogeneous, and strongly temporal, making knowledge modeling and analysis challenging. We present an automated approach for constructing a Human-Resources Temporal Knowledge Graph. We first formalize a schema in which temporal relations are represented as sets of time intervals. On top of this schema, a large language model (LLM) pipeline extracts entities, relations, and temporal expressions, augmented by self-verification and external knowledge injection to enforce schema compliance, resolve ambiguities, and automatically repair outputs. Context-aware prompting and confidence-based escalation further improve robustness. Evaluated on a corpus of 2000 Chinese resumes, our method outperforms strong baselines, and ablations confirm the necessity and synergy of each component; notably, temporal extraction attains an F1 of 0.9876. The proposed framework provides a reusable path and engineering foundation for downstream HR tasks—such as profiling, relational reasoning, and position matching—supporting more reliable, time-aware decision-making in complex organizations. Full article
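The schema's central idea — temporal relations represented as sets of validity intervals — can be sketched as follows; the entity and relation names are hypothetical illustrations, not the paper's actual schema:

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Interval:
    start: str  # ISO-style date strings compare correctly as plain strings
    end: str

@dataclass
class TemporalRelation:
    head: str   # e.g. a person entity
    name: str   # e.g. "works_at" (illustrative relation label)
    tail: str   # e.g. an organization entity
    intervals: frozenset = frozenset()

    def holds_at(self, t: str) -> bool:
        """The relation is valid at time t if t falls inside any interval."""
        return any(iv.start <= t <= iv.end for iv in self.intervals)
```

A person–organization edge valid from 2019-03 to 2021-06 then answers point-in-time queries directly, which is the kind of time-aware reasoning the downstream HR tasks rely on.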
35 pages, 904 KB  
Article
Clustering-Guided Automatic Generation of Algorithms for the Multidimensional Knapsack Problem
by Cristian Inzulza, Caio Bezares, Franco Cornejo and Victor Parada
Mach. Learn. Knowl. Extr. 2025, 7(4), 144; https://doi.org/10.3390/make7040144 (registering DOI) - 12 Nov 2025
Abstract
We propose a hybrid framework that integrates instance clustering with Automatic Generation of Algorithms (AGA) to produce specialized algorithms for classes of Multidimensional Knapsack Problem (MKP) instances. This approach is highly relevant given the latest trends in AI, where Large Language Models (LLMs) are actively being used to automate and refine algorithm design through evolutionary frameworks. Our method utilizes a feature-based representation of 328 MKP instances and evaluates K-means, HDBSCAN, and random clustering to produce 11 clusters per method. For each cluster, a master optimization problem was solved using Genetic Programming, evolving algorithms encoded as syntax trees. Fitness was measured as relative error against known optima, a similar objective to those being tackled in LLM-driven optimization. Experimental and statistical analyses demonstrate that clustering-guided AGA significantly reduces average relative error and accelerates convergence compared with AGA trained on randomly grouped instances. K-means produced the most consistent cluster-specialization. Cross-cluster evaluation reveals a trade-off between specialization and generalization. The results demonstrate that clustering prior to AGA is a practical preprocessing step for designing automated algorithms in NP-hard combinatorial problems, paving the way for advanced methodologies that incorporate AI techniques. Full article
27 pages, 1589 KB  
Systematic Review
Can Large Language Models Foster Critical Thinking, Teamwork, and Problem-Solving Skills in Higher Education?: A Literature Review
by Rafael Martínez-Peláez, Luis J. Mena, Homero Toral-Cruz, Alberto Ochoa-Brust, Apolinar González Potes, Víctor Flores, Rodolfo Ostos, Julio C. Ramírez Pacheco, Ramón A. Félix and Vanessa G. Félix
Systems 2025, 13(11), 1013; https://doi.org/10.3390/systems13111013 - 12 Nov 2025
Abstract
Over the last two years, with the rapid development of artificial intelligence, Large Language Models (LLMs) have obtained significant attention from the academic sector, making their application in higher education attractive for students, managers, faculty, and stakeholders. We conducted a Systematic Literature Review on the adoption of LLMs in the higher education system to address persistent issues and promote critical thinking, teamwork, and problem-solving skills. Following the PRISMA 2020 protocol, a systematic search was conducted in the Web of Science Core Collection for studies published between 2023 and 2024. After a systematic search and filtering of 203 studies, we included 22 articles for further analysis. The findings show that LLMs can transform traditional teaching through active learning, align curricula with real-world demands, provide personalized feedback in large classes, and enhance assessment practices focused on applied problem-solving. Their effects are transversal, influencing multiple dimensions of higher education systems. Consequently, LLMs have the potential to improve educational equity, strengthen workforce readiness, and foster innovation across disciplines and institutions. This systematic review is registered in PROSPERO (2025 CRD420251165731). Full article
32 pages, 2954 KB  
Review
From Traditional Machine Learning to Fine-Tuning Large Language Models: A Review for Sensors-Based Soil Moisture Forecasting
by Md Babul Islam, Antonio Guerrieri, Raffaele Gravina, Declan T. Delaney and Giancarlo Fortino
Sensors 2025, 25(22), 6903; https://doi.org/10.3390/s25226903 - 12 Nov 2025
Abstract
Smart Agriculture (SA) combines cutting edge technologies such as the Internet of Things (IoT), Artificial Intelligence (AI), and real-time sensing systems with traditional farming practices to enhance productivity, optimize resource use, and support environmental sustainability. A key aspect of SA is the continuous monitoring of field conditions, particularly Soil Moisture (SM), which plays a crucial role in crop growth and water management. Accurate forecasting of SM allows farmers to make timely irrigation decisions, improve field management, and conserve water. To support this, recent studies have increasingly adopted soil sensors, local weather data, and AI-based data-driven models for SM forecasting. In the literature, most existing review articles lack a structured framework and often overlook recent advancements, including privacy-preserving Federated Learning (FL), Transfer Learning (TL), and the integration of Large Language Models (LLMs). To address this gap, this paper proposes a novel taxonomy for SM forecasting and presents a comprehensive review of existing approaches, including traditional machine learning, deep learning, and hybrid models. Using the PRISMA methodology, we reviewed over 189 papers and selected 68 peer-reviewed studies published between 2017 and 2025. These studies are analyzed based on sensor types, input features, AI techniques, data durations, and evaluation metrics. Six guiding research questions were developed to shape the review and inform the taxonomy. Finally, this work identifies promising research directions, such as the application of TinyML for edge deployment, explainable AI for improved transparency, and privacy-aware model training. This review aims to provide researchers and practitioners with valuable insights for building accurate, scalable, and trustworthy SM forecasting systems to advance SA. Full article
(This article belongs to the Special Issue Feature Papers in the Internet of Things Section 2025)

25 pages, 4855 KB  
Article
Improved Flood Management and Risk Communication Through Large Language Models
by Divas Karimanzira, Thomas Rauschenbach, Tobias Hellmund and Linda Ritzau
Algorithms 2025, 18(11), 713; https://doi.org/10.3390/a18110713 - 12 Nov 2025
Abstract
In light of urbanization, climate change, and the escalation of extreme weather events, flood management is becoming more and more important. Improving community resilience and reducing flood risks require prompt decision-making and effective communication. This study investigates how flood management systems can incorporate Large Language Models (LLMs), especially those that use Retrieval-Augmented Generation (RAG) architectures. We suggest a multimodal framework that uses a Flood Knowledge Graph to aggregate data from various sources, such as social media, hydrological, and meteorological inputs. Although LLMs have the potential to be transformative, we also address important drawbacks like governance issues, hallucination risks, and a lack of physical modeling capabilities. When compared to text-only LLMs, the RAG system significantly improves the reliability of flood-related decision support by reducing factual inconsistency rates by more than 75%. Our suggested architecture includes expert validation and security layers to guarantee dependable, useful results, like flood-constrained evacuation route planning. In areas that are vulnerable to flooding, this strategy seeks to strengthen warning systems, enhance information sharing, and build resilient communities. Full article
(This article belongs to the Special Issue Artificial Intelligence Algorithms in Sustainability)
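The retrieve-then-generate pattern at the heart of a RAG pipeline can be illustrated with a toy term-overlap retriever — a deliberately simplified sketch (real systems use dense embeddings and a vector store; the prompt wording is illustrative):

```python
def retrieve(query, docs, k=2):
    """Rank documents by how many terms they share with the query."""
    q = set(query.lower().split())
    return sorted(docs, key=lambda d: len(q & set(d.lower().split())), reverse=True)[:k]

def build_prompt(query, docs):
    """Assemble a grounded prompt so the LLM answers from retrieved context only."""
    context = "\n".join(f"- {d}" for d in retrieve(query, docs))
    return (f"Context:\n{context}\n\n"
            f"Question: {query}\n"
            f"Answer using only the context above; say 'unknown' if it is not covered.")
```

Constraining the model to retrieved context is what drives the reduction in factual inconsistency the abstract reports, compared with letting a text-only LLM answer from its parametric memory.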

10 pages, 857 KB  
Article
Estimating the Utility of Using Structured and Unstructured Data for Extracting Incidents of External Hospitalizations from Patient Documents
by Michael Davenport, Robert Hall, Saraswathi Kappala, Trevor Michelson, Robert Mitchell, David Winski, Cynthia Hau, Sarah Leatherman and Frank Meng
Information 2025, 16(11), 978; https://doi.org/10.3390/info16110978 - 12 Nov 2025
Abstract
Patients within the US Department of Veterans Affairs (VA) healthcare system have the option of receiving care at facilities external to the VA network. This work presents a method for identifying external hospitalizations among the VA’s patient population by utilizing data stored in patient records. The process of extracting this information is complicated by the fact that indicators of external hospitalizations come from two sources: well-defined structured data and free-form unstructured text. Though natural language processing (NLP) leveraging Large Language Models (LLMs) has advanced capabilities to automate information extraction from free text, deploying these systems remains complex and costly. Using structured data is low-cost, but its utility must be determined in order to optimally allocate resources. We describe a method for estimating the utility of using structured and unstructured data and show that if specific conditions are met, the level of effort to perform this estimate can be greatly reduced. For external hospitalizations in the VA, our analysis showed that 44.4% of cases identified using unstructured data could not be found using structured data alone. Full article
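The headline figure — the share of unstructured-data cases that structured data alone misses — reduces to set arithmetic over case identifiers; a minimal sketch with made-up IDs, not VA data:

```python
def unstructured_only_fraction(structured_ids, unstructured_ids):
    """Fraction of unstructured-data hits with no structured-data counterpart."""
    s, u = set(structured_ids), set(unstructured_ids)
    return len(u - s) / len(u)
```

With four unstructured hits of which three lack a structured counterpart, the fraction is 0.75; the study reports 44.4% for VA external hospitalizations.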
15 pages, 930 KB  
Article
Performance Evaluation Metrics for Empathetic LLMs
by Yuna Hong, Bonhwa Ku and Hanseok Ko
Information 2025, 16(11), 977; https://doi.org/10.3390/info16110977 - 11 Nov 2025
Abstract
With the rapid advancement of large language models (LLMs), recent systems have demonstrated increasing capability in understanding and expressing human emotions. However, no objective and standardized metric currently exists to evaluate how empathetic an LLM’s response is. To address this gap, we propose a novel evaluation framework that measures both sentiment-level and emotion-level alignment between a user query and a model-generated response. The proposed metric consists of two components. The sentiment component evaluates overall affective polarity through Sentlink and the naturalness of emotional expression via NEmpathySort. The emotion component measures fine-grained emotional correspondence using Emosight. Additionally, a semantic component, based on RAGAS, assesses the contextual relevance and coherence of the response. Experimental results demonstrate that our metric effectively captures both the intensity and nuance of empathy in LLM-generated responses, providing a solid foundation for the development of emotionally intelligent conversational AI. Full article
35 pages, 2963 KB  
Article
Explainable Artificial Intelligence Framework for Predicting Treatment Outcomes in Age-Related Macular Degeneration
by Mini Han Wang
Sensors 2025, 25(22), 6879; https://doi.org/10.3390/s25226879 - 11 Nov 2025
Abstract
Age-related macular degeneration (AMD) is a leading cause of irreversible blindness, yet current tools for forecasting treatment outcomes remain limited by either the opacity of deep learning or the rigidity of rule-based systems. To address this gap, we propose a hybrid neuro-symbolic and large language model (LLM) framework that combines mechanistic disease knowledge with multimodal ophthalmic data for explainable AMD treatment prognosis. In a pilot cohort of ten surgically managed AMD patients (six men, four women; mean age 67.8 ± 6.3 years), we collected 30 structured clinical documents and 100 paired imaging series (optical coherence tomography, fundus fluorescein angiography, scanning laser ophthalmoscopy, and ocular/superficial B-scan ultrasonography). Texts were semantically annotated and mapped to standardized ontologies, while images underwent rigorous DICOM-based quality control, lesion segmentation, and quantitative biomarker extraction. A domain-specific ophthalmic knowledge graph encoded causal disease and treatment relationships, enabling neuro-symbolic reasoning to constrain and guide neural feature learning. An LLM fine-tuned on ophthalmology literature and electronic health records ingested structured biomarkers and longitudinal clinical narratives through multimodal clinical-profile prompts, producing natural-language risk explanations with explicit evidence citations. On an independent test set, the hybrid model achieved AUROC 0.94 ± 0.03, AUPRC 0.92 ± 0.04, and a Brier score of 0.07, significantly outperforming purely neural and classical Cox regression baselines (p ≤ 0.01). Explainability metrics showed that >85% of predictions were supported by high-confidence knowledge-graph rules, and >90% of generated narratives accurately cited key biomarkers. 
A detailed case study demonstrated real-time, individualized risk stratification—for example, predicting an >70% probability of requiring three or more anti-VEGF injections within 12 months and a ~45% risk of chronic macular edema if therapy lapsed—with predictions matching the observed clinical course. These results highlight the framework’s ability to integrate multimodal evidence, provide transparent causal reasoning, and support personalized treatment planning. While limited by single-center scope and short-term follow-up, this work establishes a scalable, privacy-aware, and regulator-ready template for explainable, next-generation decision support in AMD management, with potential for expansion to larger, device-diverse cohorts and other complex retinal diseases. Full article
(This article belongs to the Special Issue Sensing Functional Imaging Biomarkers and Artificial Intelligence)
19 pages, 1763 KB  
Article
Research on the Automatic Generation of Information Requirements for Emergency Response to Unexpected Events
by Yao Li, Chang Guo, Zhenhai Lu, Chao Zhang, Wei Gao, Jiaqi Liu and Jungang Yang
Appl. Sci. 2025, 15(22), 11953; https://doi.org/10.3390/app152211953 - 11 Nov 2025
Abstract
When responding to unexpected emergency events, timely, scientific, and correct decision-making is critical, and the generation of information requirements is an essential prerequisite. Taking earthquakes as a representative type of unexpected event, this paper constructs a large-language-model-driven system for automatically generating the information requirements of earthquake emergency response. The research examines how different departments interact during an earthquake emergency response, how information flows among them, and how the information-requirement process operates. The system is designed from three perspectives: building a knowledge base, designing and developing prompts, and designing the system architecture. During the experiments, four Large Language Models (LLMs) served as backbone architectures: chatGLM (GLM-4.6), Spark (SparkX1.5), ERNIE Bot (4.5 Turbo), and DeepSeek (V3.2). Following the designed system process, information requirements are generated from real-world cases and then compared with the information requirements gathered by experts. In the comparison, a "keyword weighted matching + text structure feature fusion" method was used to calculate semantic similarity. True positives, false positives, and false negatives are used to identify differences and to compute precision, recall, and F1-score. The experimental results show that all four LLMs achieved a precision and recall of over 90% in earthquake information extraction, with F1-scores all exceeding 85%, verifying the feasibility of the analytical method adopted in this research. A comparative analysis found that chatGLM performed best, with an F1-score of 93.2%. 
Finally, Python scripts automate these processes and produce complete comparison charts for visual inspection and verification of the test results. We also used Protégé to build the information-requirements ontology, making it easy to present and inspect. This research is particularly useful for emergency management departments, earthquake emergency response teams, and practitioners developing intelligent emergency information systems or automated information-requirement generation with technologies such as LLMs. It provides practical support for optimizing rapid decision-making in earthquake emergency response. Full article
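The evaluation described above reduces to the standard formulas over true positives, false positives, and false negatives — a minimal sketch (the counts in the example are illustrative):

```python
def precision_recall_f1(tp, fp, fn):
    """Precision = TP/(TP+FP); Recall = TP/(TP+FN); F1 is their harmonic mean."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1
```

For instance, 8 true positives with 2 false positives and 2 false negatives gives precision, recall, and F1 all equal to 0.8.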
855 KB  
Proceeding Paper
Supporting Rule-Based Control with a Natural Language Model
by Martin Kernács and Olivér Hornyák
Eng. Proc. 2025, 113(1), 56; https://doi.org/10.3390/engproc2025113056 - 10 Nov 2025
Abstract
The usage of Artificial Intelligence (AI) in control loops and rule-based frameworks is a novel approach in automation and decision-making processes. Large Language Models (LLMs) are redefining conventional rule-based systems by introducing intuitive natural language interfaces, drastically changing the creation of rules, and minimizing operational complexity. Unlike static controllers, AI-enhanced systems can autonomously evolve with real-time environmental changes, achieving optimal performance without manual intervention. By allowing non-experts to modify rules through natural language commands, LLMs can transform control-system management. These advancements not only improve adaptability and operational efficiency but also reduce downtime through proactive error detection and self-correction mechanisms. AI-powered systems allow operations to be refined, accelerating response speeds and increasing reliability. The synergy between rule-based logic and AI-driven intelligence provides a new approach for autonomous systems, improving their capability for context-specific decision-making. In this paper, an approach is presented to control a storage system by natural language commands. A comparison of the Hungarian and English language interpretations is discussed. Full article
27 pages, 3096 KB  
Article
EnergAI: A Large Language Model-Driven Generative Design Method for Early-Stage Building Energy Optimization
by Jing Zhong, Peilin Li, Ran Luo, Jun Yin, Yizhen Ding, Junjie Bai, Chuxiang Hong, Xiang Deng, Xintong Ma and Shuai Lu
Energies 2025, 18(22), 5921; https://doi.org/10.3390/en18225921 - 10 Nov 2025
Abstract
The early stage of architectural design plays a decisive role in determining building energy performance, yet conventional evaluation is typically deferred to later phases, restricting timely and data-informed feedback. This paper proposes EnergAI, a generative design framework that incorporates energy optimization objectives directly into the scheme generation process through large language models (e.g., GPT-4o, DeepSeek-V3.1-Think, Qwen-Max, and Gemini-2.5 pro). A dedicated dataset, LowEnergy-FormNet, comprising 2160 cases with site parameters, massing descriptors, and simulation outputs, was constructed to model site, form, and energy relationships. The framework encodes building massing into a parametric vector representation and employs hierarchical prompt strategies to establish a closed-loop compatibility with ClimateStudio. Experimental evaluations demonstrate that geometry-oriented and fuzzy-goal prompts achieve average annual reductions of approximately 16–17% in energy use intensity and 3–4% in energy cost compared with human designs, while performance-oriented structured prompts deliver the most reliable improvements, eliminating high-energy outliers and yielding an average EUI-saving rate above 50%. In cross-model comparisons under an identical toolchain, GPT-4o delivered the strongest and most stable optimization, achieving 63.3% mean EUI savings, nearly 13% higher than DeepSeek-V3.1-Think, Qwen-Max, and Gemini-2.5 baselines. These results demonstrate the feasibility and indicate the potential robustness of embedding performance constraints at the generation stage, providing a feasible approach to support proactive, data-informed early design. Full article
(This article belongs to the Special Issue Challenges and Research Trends of Integrated Zero-Carbon Power Plant)
20 pages, 1296 KB  
Article
Learning Path Recommendation Enhanced by Knowledge Tracing and Large Language Model
by Yunxuan Lin and Zhengyang Wu
Electronics 2025, 14(22), 4385; https://doi.org/10.3390/electronics14224385 - 10 Nov 2025
Abstract
With the development of large language model (LLM) technology, AI-assisted education systems are gradually being widely used. Learning Path Recommendation (LPR) is an important task in personalized instructional scenarios. AI-assisted LPR is gaining traction for its ability to generate learning content based on a student’s personalized needs. However, the native-LLM has the problem of hallucination, which may lead to the inability to generate learning content; in addition, the evaluation results of the LLM on students’ knowledge status are usually conservative and have a large margin of error. To address these issues, this work proposes a novel approach for LPR enhanced by knowledge tracing (KT) and LLM. Our method operates in a “generate-and-retrieve” manner: the LLM acts as a pedagogical planner that generates contextual reference exercises based on the student’s needs. Subsequently, a retrieval mechanism constructs the concrete learning path by retrieving the top-N most semantically similar exercises from an established exercise bank, ensuring the recommendations are both pedagogically sound and practically available. The KT plays the role of an evaluator in the iterative process. Rather than generating semantic instructions directly, it provides a quantitative and structured performance metric. Specifically, given a candidate learning path generated by the LLM, the KT model simulates the student’s knowledge state after completing the path and computes a knowledge promotion score. This score quantitatively measures the effectiveness of the proposed path for the current student, thereby guiding the refinement of subsequent recommendations. This iterative interaction between the KT and the LLM continuously refines the candidate learning items until an optimal learning path is generated. Experimental validations on public datasets demonstrate that our model surpasses baseline methods. Full article
(This article belongs to the Special Issue Data Mining and Recommender Systems)
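The "generate-and-retrieve" step — selecting the top-N bank exercises most semantically similar to an LLM-generated reference — can be sketched with cosine similarity over embedding vectors (toy vectors here; real systems use a sentence encoder and an approximate-nearest-neighbor index):

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))

def top_n(query_vec, bank, n=3):
    """bank: list of (exercise_id, embedding); returns the n most similar exercise ids."""
    ranked = sorted(bank, key=lambda item: cosine(query_vec, item[1]), reverse=True)
    return [eid for eid, _ in ranked[:n]]
```

Retrieving from an established exercise bank rather than emitting LLM-generated exercises directly is what keeps the recommended path free of hallucinated content.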
31 pages, 2192 KB  
Article
AgentReport: A Multi-Agent LLM Approach for Automated and Reproducible Bug Report Generation
by Seojin Choi and Geunseok Yang
Appl. Sci. 2025, 15(22), 11931; https://doi.org/10.3390/app152211931 - 10 Nov 2025
Abstract
Bug reports in open-source projects are often incomplete or low in quality, which reduces maintenance efficiency. To address this issue, we propose AgentReport, a multi-agent pipeline based on large language models (LLMs). AgentReport integrates QLoRA-4bit lightweight fine-tuning, CTQRS (Completeness, Traceability, Quantifiability, Reproducibility, Specificity) structured prompting, Chain-of-Thought reasoning, and one-shot exemplar within seven modules: Data, Prompt, Fine-tuning, Generation, Evaluation, Reporting, and Controller. Using 3966 summary–report pairs from Bugzilla, AgentReport achieved 80.5% in CTQRS, 84.6% in ROUGE-1 Recall, 56.8% in ROUGE-1 F1, and 86.4% in Sentence-BERT (SBERT). Compared with the baseline (77.0% CTQRS, 61.0% ROUGE-1 Recall, 85.0% SBERT), AgentReport improved CTQRS by 3.5 percentage points, Recall by 23.6 points, and SBERT by 1.4 points. The inclusion of F1 complemented Recall-only evaluation, offering a balanced framework that covers structural completeness (CTQRS), lexical coverage and precision (ROUGE-1 Recall/F1), and semantic consistency (SBERT). This modular design enables consistent experimentation and flexible scaling, providing practical evidence that multi-agent LLM pipelines can generate higher-quality bug reports for software maintenance. Full article
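The ROUGE-1 Recall and F1 figures measure unigram overlap between a generated report and its reference — a simplified sketch using whitespace tokenization (production evaluations typically use the rouge-score package's tokenization and stemming):

```python
from collections import Counter

def rouge1(candidate, reference):
    """Return (recall, f1) from clipped unigram overlap between two strings."""
    c, r = Counter(candidate.split()), Counter(reference.split())
    overlap = sum((c & r).values())  # clipped counts of shared unigrams
    recall = overlap / sum(r.values())
    precision = overlap / sum(c.values())
    f1 = 0.0 if overlap == 0 else 2 * precision * recall / (precision + recall)
    return recall, f1
```

Reporting F1 alongside Recall, as the abstract argues, penalizes padded outputs: a candidate that copies the reference plus filler keeps high recall but loses precision, and F1 drops accordingly.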