Journal Description
AI is an international, peer-reviewed, open access journal on artificial intelligence (AI), including broad aspects of cognition and reasoning, perception and planning, machine learning, intelligent robotics, and applications of AI, published monthly online by MDPI.
- Open Access: free for readers, with article processing charges (APC) paid by authors or their institutions.
- High Visibility: indexed within ESCI (Web of Science), Scopus, EBSCO, and other databases.
- Journal Rank: JCR - Q1 (Computer Science, Interdisciplinary Applications) / CiteScore - Q2 (Artificial Intelligence)
- Rapid Publication: manuscripts are peer-reviewed and a first decision is provided to authors approximately 20.7 days after submission; acceptance to publication takes 3.9 days (median values for papers published in this journal in the first half of 2025).
- Recognition of Reviewers: APC discount vouchers, optional signed peer review, and reviewer names published annually in the journal.
Impact Factor: 5.0 (2024); 5-Year Impact Factor: 4.6 (2024)
Latest Articles
Intelligent Decision-Making Analytics Model Based on MAML and Actor–Critic Algorithms
AI 2025, 6(9), 231; https://doi.org/10.3390/ai6090231 - 14 Sep 2025
Abstract
Traditional Reinforcement Learning (RL) struggles in dynamic decision-making due to data dependence, limited generalization, and imbalanced subjective/objective factors. This paper proposes an intelligent model combining the Model-Agnostic Meta-Learning (MAML) framework with the Actor–Critic algorithm to address these limitations. The model integrates the AHP-CRITIC weighting method to quantify strategic weights from both subjective expert experience and objective data, achieving balanced decision rationality. The MAML mechanism enables rapid generalization with minimal samples in dynamic environments via cross-task parameter optimization, drastically reducing retraining costs upon environmental changes. Evaluated on enterprise indicator anomaly decision-making, the model achieves significantly higher task reward values than traditional Actor–Critic, PG, and DQN using only 10–20 samples. It improves time efficiency by up to 97.23%. A proposed Balanced Performance Index confirms superior stability and adaptability. Currently integrated into an enterprise platform, the model provides efficient support for dynamic, complex scenarios. This research offers an innovative solution for intelligent decision-making under data scarcity and subjective-objective conflicts, demonstrating both theoretical value and practical potential.
Full article
(This article belongs to the Section AI Systems: Theory and Applications)
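As a rough illustration of the meta-learning mechanism described in the abstract, the sketch below shows a generic MAML inner/outer loop on a toy task family; the AHP-CRITIC weighting, the Actor–Critic policy, and the enterprise decision tasks from the paper are not reproduced, and all hyperparameters are hypothetical.
```python
# Minimal MAML-style meta-learning sketch (illustrative only; not the paper's model).
import torch

def init_params():
    # A tiny two-layer network stored as explicit tensors, so per-task adapted
    # copies of the weights can be taken in the inner loop.
    return [(torch.randn(4, 32) * 0.1).requires_grad_(),
            torch.zeros(32, requires_grad=True),
            (torch.randn(32, 1) * 0.1).requires_grad_(),
            torch.zeros(1, requires_grad=True)]

def forward(params, X):
    W1, b1, W2, b2 = params
    return (torch.relu(X @ W1 + b1) @ W2 + b2).squeeze(-1)

def sample_task():
    # Hypothetical task family: each task is a random linear target.
    w = torch.randn(4)
    X_s, X_q = torch.randn(16, 4), torch.randn(16, 4)
    return (X_s, X_s @ w), (X_q, X_q @ w)

params = init_params()
meta_opt = torch.optim.Adam(params, lr=1e-3)
inner_lr, mse = 0.05, torch.nn.functional.mse_loss

for step in range(200):
    meta_loss = 0.0
    for _ in range(4):                        # small batch of tasks per meta-step
        (Xs, ys), (Xq, yq) = sample_task()
        # Inner loop: one adaptation step on the task's support set.
        grads = torch.autograd.grad(mse(forward(params, Xs), ys),
                                    params, create_graph=True)
        adapted = [p - inner_lr * g for p, g in zip(params, grads)]
        # Outer objective: loss of the adapted parameters on the query set.
        meta_loss = meta_loss + mse(forward(adapted, Xq), yq)
    meta_opt.zero_grad()
    meta_loss.backward()
    meta_opt.step()
```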
Open Access Article
Toward Reliable Models for Distinguishing Epileptic High-Frequency Oscillations (HFOs) from Non-HFO Events Using LSTM and Pre-Trained OWL-ViT Vision–Language Framework
by Sahbi Chaibi and Abdennaceur Kachouri
AI 2025, 6(9), 230; https://doi.org/10.3390/ai6090230 - 14 Sep 2025
Abstract
Background: Over the past two decades, high-frequency oscillations (HFOs) between 80 and 500 Hz have emerged as valuable biomarkers for delineating and tracking epileptogenic brain networks. However, inspecting HFO events in lengthy EEG recordings remains a time-consuming visual process and mainly relies on experienced clinicians. Extensive recent research has emphasized the value of introducing deep learning (DL) and generative AI (GenAI) methods to automatically identify epileptic HFOs in iEEG signals. Owing to the ongoing issue of the noticeable incidence of spurious or false HFOs, a key question remains: which model is better able to distinguish epileptic HFOs from non-HFO events, such as artifacts and background noise? Methods: In this regard, our study addresses two main objectives: (i) proposing a novel HFO classification approach using a prompt engineering framework with OWL-ViT, a state-of-the-art large vision–language model designed for multimodal image understanding guided by optimized natural language prompts; and (ii) comparing a range of existing deep learning and generative models, including our proposed one. Main results: Notably, our quantitative and qualitative analysis demonstrated that the LSTM model achieved the highest classification accuracy of 99.16% among the time-series methods considered, while our proposed method consistently performed best among the different approaches based on time–frequency representation, achieving an accuracy of 99.07%. Conclusions and significance: The present study highlights the effectiveness of LSTM and prompted OWL-ViT models in distinguishing genuine HFOs from spurious non-HFO oscillations with respect to the gold-standard benchmark. These advancements constitute a promising step toward more reliable and efficient diagnostic tools for epilepsy.
Full article
(This article belongs to the Section Medical & Healthcare AI)
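The time-series branch of the comparison can be pictured with a minimal LSTM classifier over fixed-length signal windows, as sketched below; the layer sizes, window length, and preprocessing are assumptions, not the authors' configuration.
```python
# Minimal LSTM classifier sketch for fixed-length signal windows (hypothetical
# hyperparameters; not the authors' exact architecture or data pipeline).
import torch
import torch.nn as nn

class HFOClassifier(nn.Module):
    def __init__(self, hidden=64):
        super().__init__()
        self.lstm = nn.LSTM(input_size=1, hidden_size=hidden, batch_first=True)
        self.head = nn.Linear(hidden, 2)   # HFO vs. non-HFO (artifact/background)

    def forward(self, x):                  # x: (batch, time, 1)
        _, (h_n, _) = self.lstm(x)
        return self.head(h_n[-1])          # logits from the last hidden state

# Toy usage with random data standing in for band-pass-filtered iEEG windows.
model = HFOClassifier()
windows = torch.randn(8, 512, 1)           # e.g., 8 windows of 512 samples
probs = torch.softmax(model(windows), dim=-1)
```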
Open Access Article
GLNet-YOLO: Multimodal Feature Fusion for Pedestrian Detection
by Yi Zhang, Qing Zhao, Xurui Xie, Yang Shen, Jinhe Ran, Shu Gui, Haiyan Zhang, Xiuhe Li and Zhen Zhang
AI 2025, 6(9), 229; https://doi.org/10.3390/ai6090229 - 12 Sep 2025
Abstract
In the field of modern computer vision, pedestrian detection technology holds significant importance in applications such as intelligent surveillance, autonomous driving, and robot navigation. However, single-modal images struggle to achieve high-precision detection in complex environments. To address this, this study proposes a GLNet-YOLO framework based on cross-modal deep feature fusion, aiming to improve pedestrian detection performance in complex environments by fusing feature information from visible light and infrared images. By extending the YOLOv11 architecture, the framework adopts a dual-branch network structure to process visible light and infrared modal inputs, respectively, and introduces the FM module to realize global feature fusion and enhancement, as well as the DMR module to accomplish local feature separation and interaction. Experimental results show that on the LLVIP dataset, compared to the single-modal YOLOv11 baseline, our fused model improves the mAP@50 by 9.2% over the visible-light-only model and 0.7% over the infrared-only model. This significantly improves the detection accuracy under low-light and complex background conditions and enhances the robustness of the algorithm, and its effectiveness is further verified on the KAIST dataset.
Full article
(This article belongs to the Special Issue Deep Learning Technologies and Their Applications in Image Processing, Computer Vision, and Computational Intelligence)
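A generic dual-branch fusion of visible and infrared feature maps, in the spirit of the framework above, might look like the sketch below; the paper's FM and DMR modules are replaced here by simple concatenation and a 1x1 convolution, so this is only a stand-in for the described architecture.
```python
# Generic dual-branch (visible + infrared) feature fusion sketch, not GLNet-YOLO itself.
import torch
import torch.nn as nn

class DualBranchFusion(nn.Module):
    def __init__(self, channels=64):
        super().__init__()
        def branch(cin):
            return nn.Sequential(nn.Conv2d(cin, channels, 3, stride=2, padding=1),
                                 nn.BatchNorm2d(channels), nn.SiLU())
        self.rgb_branch = branch(3)   # visible-light input
        self.ir_branch = branch(1)    # infrared input
        self.fuse = nn.Conv2d(2 * channels, channels, kernel_size=1)

    def forward(self, rgb, ir):
        f = torch.cat([self.rgb_branch(rgb), self.ir_branch(ir)], dim=1)
        return self.fuse(f)           # fused features would feed a detection head

rgb = torch.randn(2, 3, 256, 256)
ir = torch.randn(2, 1, 256, 256)
features = DualBranchFusion()(rgb, ir)   # -> (2, 64, 128, 128)
```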
Open Access Article
Beyond DOM: Unlocking Web Page Structure from Source Code with Neural Networks
by Irfan Prazina, Damir Pozderac and Vensada Okanović
AI 2025, 6(9), 228; https://doi.org/10.3390/ai6090228 - 12 Sep 2025
Abstract
We introduce a code-only approach for modeling web page layouts directly from their source code (HTML and CSS only), bypassing rendering. Our method employs a neural architecture with specialized encoders for style rules, CSS selectors, and HTML attributes. These encodings are then aggregated in another neural network that integrates hierarchical context (sibling and ancestor information) to form rich representational vectors for each web page’s element. Using these vectors, our model predicts eight spatial relationships between pairs of elements, focusing on edge-based proximity in a multilabel classification setup. For scalable training, labels are automatically derived from the Document Object Model (DOM) data for each web page, but the model operates independently of the DOM during inference. During inference, the model does not use bounding boxes or any information found in the DOM; instead, it relies solely on the source code as input. This approach facilitates structure-aware visual analysis in a lightweight and fully code-based way. Our model demonstrates alignment with human judgment in the evaluation of web page similarity, suggesting that code-only layout modeling offers a promising direction for scalable, interpretable, and efficient web interface analysis. The evaluation metrics show our method yields similar performance despite relying on less information.
Full article
Open Access Review
Artificial Intelligence in Medical Education: A Narrative Review on Implementation, Evaluation, and Methodological Challenges
by Annalisa Roveta, Luigi Mario Castello, Costanza Massarino, Alessia Francese, Francesca Ugo and Antonio Maconi
AI 2025, 6(9), 227; https://doi.org/10.3390/ai6090227 - 11 Sep 2025
Abstract
Artificial Intelligence (AI) is rapidly transforming medical education by enabling adaptive tutoring, interactive simulation, diagnostic enhancement, and competency-based assessment. This narrative review explores how AI has influenced learning processes in undergraduate and postgraduate medical training, focusing on methodological rigor, educational impact, and implementation challenges. The literature reveals promising results: large language models can generate didactic content and foster academic writing; AI-driven simulations enhance decision-making, procedural skills, and interprofessional communication; and deep learning systems improve diagnostic accuracy in visually intensive tasks such as radiology and histology. Despite promising findings, the existing literature is methodologically heterogeneous. A minority of studies use controlled designs, while the majority focus on short-term effects or are confined to small, simulated cohorts. Critical limitations include algorithmic opacity, generalizability concerns, ethical risks (e.g., GDPR compliance, data bias), and infrastructural barriers, especially in low-resource contexts. Additionally, the unregulated use of AI may undermine critical thinking, foster cognitive outsourcing, and compromise pedagogical depth if not properly supervised. In conclusion, AI holds substantial potential to enhance medical education, but its integration requires methodological robustness, human oversight, and ethical safeguards. Future research should prioritize multicenter validation, longitudinal evaluation, and AI literacy for learners and educators to ensure responsible and sustainable adoption.
Full article
(This article belongs to the Special Issue Exploring the Use of Artificial Intelligence in Education)
Open Access Systematic Review
Retrieval-Augmented Generation (RAG) in Healthcare: A Comprehensive Review
by Fnu Neha, Deepshikha Bhati and Deepak Kumar Shukla
AI 2025, 6(9), 226; https://doi.org/10.3390/ai6090226 - 11 Sep 2025
Abstract
Retrieval-Augmented Generation (RAG) enhances large language models (LLMs) by integrating external knowledge retrieval to improve factual consistency and reduce hallucinations. Despite growing interest, its use in healthcare remains fragmented. This paper presents a Systematic Literature Review (SLR) following PRISMA guidelines, synthesizing 30 peer-reviewed studies on RAG in clinical domains and focusing on three of its most prevalent and promising applications: diagnostic support, electronic health record (EHR) summarization, and medical question answering. We synthesize the existing architectural variants (naïve, advanced, and modular) and examine their deployment across these applications. Persistent challenges are identified, including retrieval noise (irrelevant or low-quality retrieved information), domain shift (performance degradation when models are applied to data distributions different from their training set), generation latency, and limited explainability. Evaluation strategies are compared using both standard metrics and clinical-specific metrics (FactScore, RadGraph-F1, and MED-F1), the latter being particularly critical for ensuring factual accuracy, medical validity, and clinical relevance. This synthesis offers a domain-focused perspective to guide researchers, healthcare providers, and policymakers in developing reliable, interpretable, and clinically aligned AI systems, laying the groundwork for future innovation in RAG-based healthcare solutions.
Full article
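The retrieve-then-generate pattern surveyed above can be pictured with a minimal sketch: TF-IDF retrieval over a toy document store feeding a placeholder generation call. The documents, the generate stub, and the prompt format are all hypothetical.
```python
# Minimal retrieval-augmented generation loop (illustrative only; the clinical
# corpora and LLMs reviewed in the paper are not reproduced here).
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

documents = [
    "Metformin is a first-line therapy for type 2 diabetes.",
    "ACE inhibitors are commonly used to treat hypertension.",
    "Warfarin dosing requires regular INR monitoring.",
]

vectorizer = TfidfVectorizer()
doc_matrix = vectorizer.fit_transform(documents)

def retrieve(query, k=2):
    # Rank documents by cosine similarity to the query in TF-IDF space.
    scores = cosine_similarity(vectorizer.transform([query]), doc_matrix)[0]
    return [documents[i] for i in scores.argsort()[::-1][:k]]

def generate(prompt):
    # Placeholder standing in for any LLM call (hosted or local).
    return f"[LLM answer conditioned on a prompt of {len(prompt)} characters]"

query = "How should warfarin therapy be monitored?"
context = "\n".join(retrieve(query))
answer = generate(f"Context:\n{context}\n\nQuestion: {query}\nAnswer:")
print(answer)
```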
Open Access Systematic Review
Advances and Optimization Trends in Photovoltaic Systems: A Systematic Review
by Luis Angel Iturralde Carrera, Gendry Alfonso-Francia, Carlos D. Constantino-Robles, Juan Terven, Edgar A. Chávez-Urbiola and Juvenal Rodríguez-Reséndiz
AI 2025, 6(9), 225; https://doi.org/10.3390/ai6090225 - 10 Sep 2025
Abstract
This article presents a systematic review of optimization methods applied to enhance the performance of photovoltaic (PV) systems, with a focus on critical challenges such as system design and spatial layout, maximum power point tracking (MPPT), energy forecasting, fault diagnosis, and energy management. The emphasis is on the integration of classical and algorithmic approaches. Following the Preferred Reporting Items for Systematic Reviews and Meta-Analyses guidelines (PRISMA) methodology, 314 relevant publications from 2020 to 2025 were analyzed to identify current trends, methodological advances, and practical applications in the optimization of PV performance. The principal novelty of this review lies in its integrative critical analysis, which systematically contrasts the applicability, performance, and limitations of deterministic classical methods with emerging stochastic metaheuristic and data-driven artificial intelligence (AI) techniques, highlighting the growing dominance of hybrid models that synergize their strengths. Traditional techniques such as analytical modeling, numerical simulation, linear and dynamic programming, and gradient-based methods are examined in terms of their efficiency and scope. In parallel, the study evaluates the growing adoption of metaheuristic algorithms, including particle swarm optimization, genetic algorithms, and ant colony optimization, as well as machine learning (ML) and deep learning (DL) models applied to tasks such as MPPT, spatial layout optimization, energy forecasting, and fault diagnosis. A key contribution of this review is the identification of hybrid methodologies that combine metaheuristics with ML/DL models, demonstrating superior results in energy yield, robustness, and adaptability under dynamic conditions. The analysis highlights both the strengths and limitations of each paradigm, emphasizing challenges related to data availability, computational cost, and model interpretability. Finally, the study proposes future research directions focused on explainable AI, real-time control via edge computing, and the development of standardized benchmarks for performance evaluation. The findings contribute to a deeper understanding of current capabilities and opportunities in PV system optimization, offering a strategic framework for advancing intelligent and sustainable solar energy technologies.
Full article
(This article belongs to the Special Issue The Application of Machine Learning and AI Technology Towards the Sustainable Development Goals)
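As one concrete example of the metaheuristics discussed for MPPT, the sketch below runs a basic particle swarm optimization over a toy power–voltage curve; the curve and the PSO coefficients are illustrative assumptions rather than a real module model.
```python
# Basic particle swarm optimization (PSO) sketch for maximum power point
# tracking on a hypothetical P-V curve (illustrative only).
import numpy as np

def pv_power(v):
    # Toy unimodal P-V curve peaking near 30 V (stand-in for a real module model).
    return np.maximum(0.0, 150.0 - (v - 30.0) ** 2 / 5.0)

rng = np.random.default_rng(0)
n, w, c1, c2 = 10, 0.7, 1.5, 1.5            # swarm size and PSO coefficients
pos = rng.uniform(0.0, 40.0, n)             # candidate operating voltages
vel = np.zeros(n)
pbest = pos.copy()
pbest_val = pv_power(pbest)
gbest = pbest[np.argmax(pbest_val)]

for _ in range(50):
    r1, r2 = rng.random(n), rng.random(n)
    vel = w * vel + c1 * r1 * (pbest - pos) + c2 * r2 * (gbest - pos)
    pos = np.clip(pos + vel, 0.0, 40.0)
    val = pv_power(pos)
    improved = val > pbest_val
    pbest[improved], pbest_val[improved] = pos[improved], val[improved]
    gbest = pbest[np.argmax(pbest_val)]

print(f"Estimated MPP voltage: {gbest:.2f} V, power: {pv_power(gbest):.1f} W")
```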
Open Access Article
A Markerless Vision-Based Physical Frailty Assessment System for the Older Adults
by Muhammad Huzaifa, Wajiha Ali, Khawaja Fahad Iqbal, Ishtiaq Ahmad, Yasar Ayaz, Hira Taimur, Yoshihisa Shirayama and Motoyuki Yuasa
AI 2025, 6(9), 224; https://doi.org/10.3390/ai6090224 - 10 Sep 2025
Abstract
The geriatric syndrome known as frailty is characterized by diminished physiological reserves and heightened susceptibility to unfavorable health consequences. As the world’s population ages, it is crucial to detect frailty early and accurately in order to reduce hazards, including falls, hospitalization, and death. In particular, functional tests are frequently used to evaluate physical frailty. However, current evaluation techniques are limited in their scalability and are prone to inconsistency due to their heavy reliance on subjective interpretation and manual observation. In this paper, we provide a completely automated, impartial, and comprehensive system that employs computer vision techniques to assess physical frailty tests. Machine learning models have been specifically designed to analyze each clinical test. In order to extract significant features, our system analyzes depth and joint coordinate data for important physical performance tests such as the Walking Speed Test, Timed Up and Go (TUG) Test, Functional Reach Test, Seated Forward Bend Test, Standing on One Leg Test, and Grip Strength Test. In contrast to current systems, which lack real-time analysis and standardization, the proposed system offers consistent measurements, intelligent decision-making, and real-time feedback. Strong model accuracy and conformity to clinical benchmarks are demonstrated by the experimental outcomes. By eliminating observer dependency and improving accessibility, the proposed system can be considered a scalable and useful tool for frailty screening in clinical and remote care settings.
Full article
(This article belongs to the Special Issue Multimodal Artificial Intelligence in Healthcare)
Open Access Article
Transformer Models Enhance Explainable Risk Categorization of Incidents Compared to TF-IDF Baselines
by Carlos Ramon Hölzing, Patrick Meybohm, Oliver Happel, Peter Kranke and Charlotte Meynhardt
AI 2025, 6(9), 223; https://doi.org/10.3390/ai6090223 - 9 Sep 2025
Abstract
Background: Critical Incident Reporting Systems (CIRS) play a key role in improving patient safety but face limitations due to the unstructured nature of narrative data. Systematic analysis of such data to identify latent risk patterns remains challenging. While artificial intelligence (AI) shows promise in healthcare, its application to CIRS analysis is still underexplored. Methods: This study presents a transformer-based approach to classify incident reports into predefined risk categories and support clinical risk managers in identifying safety hazards. We compared a traditional TF-IDF/logistic regression model with a transformer-based German BERT (GBERT) model using 617 anonymized CIRS reports. Reports were categorized manually into four classes: Organization, Treatment, Documentation, and Consent/Communication. Models were evaluated using stratified 5-fold cross-validation. Interpretability was ensured via Shapley Additive Explanations (SHAP). Results: GBERT outperformed the baseline across all metrics, achieving a macro-averaged F1 of 0.44 and a weighted F1 of 0.75, versus 0.35 and 0.71 for the baseline. SHAP analysis revealed clinically plausible feature attributions. Conclusions: Transformer-based models such as GBERT improve classification of incident report data and enable interpretable, systematic risk stratification. These findings highlight the potential of explainable AI to enhance learning from critical incidents.
Full article
(This article belongs to the Special Issue Adversarial Learning and Its Applications in Healthcare)
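The classical baseline used in this study (TF-IDF features with logistic regression, evaluated by stratified 5-fold cross-validation on macro- and weighted-F1) can be sketched as follows; the toy texts and labels below merely stand in for the anonymized CIRS reports.
```python
# TF-IDF + logistic regression baseline with stratified 5-fold cross-validation
# (toy data; illustrative of the baseline described in the abstract).
from sklearn.pipeline import make_pipeline
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_validate, StratifiedKFold

texts = ["medication mix-up on the ward", "missing consent form before surgery",
         "incomplete discharge documentation", "handover delayed between shifts"] * 10
labels = ["Treatment", "Consent/Communication", "Documentation", "Organization"] * 10

pipeline = make_pipeline(TfidfVectorizer(ngram_range=(1, 2)),
                         LogisticRegression(max_iter=1000, class_weight="balanced"))

cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=42)
scores = cross_validate(pipeline, texts, labels, cv=cv,
                        scoring=["f1_macro", "f1_weighted"])
print(scores["test_f1_macro"].mean(), scores["test_f1_weighted"].mean())
```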
Open Access Article
Dual-Stream Former: A Dual-Branch Transformer Architecture for Visual Speech Recognition
by Sanghun Jeon, Jieun Lee and Yong-Ju Lee
AI 2025, 6(9), 222; https://doi.org/10.3390/ai6090222 - 9 Sep 2025
Abstract
This study proposes Dual-Stream Former, a novel architecture that integrates a Video Swin Transformer and Conformer designed to address the challenges of visual speech recognition (VSR). The model captures spatiotemporal dependencies, achieving a state-of-the-art character error rate (CER) of 3.46%, surpassing traditional convolutional neural network (CNN)-based models, such as 3D-CNN + DenseNet-121 (CER: 5.31%), and transformer-based alternatives, such as vision transformers (CER: 4.05%). The Video Swin Transformer captures multiscale spatial representations with high computational efficiency, whereas the Conformer back-end enhances temporal modeling across diverse phoneme categories. Evaluation of a high-resolution dataset comprising 740,000 utterances across 185 classes highlighted the effectiveness of the model in addressing visually confusing phonemes, such as diphthongs (/ai/, /au/) and labio-dental sounds (/f/, /v/). Dual-Stream Former achieved phoneme recognition error rates of 10.39% for diphthongs and 9.25% for labiodental sounds, surpassing those of CNN-based architectures by more than 6%. Although the model’s large parameter count (168.6 M) poses resource challenges, its hierarchical design ensures scalability. Future work will explore lightweight adaptations and multimodal extensions to increase deployment feasibility. These findings underscore the transformative potential of Dual-Stream Former for advancing VSR applications such as silent communication and assistive technologies by achieving unparalleled precision and robustness in diverse settings.
Full article
Open Access Article
Optimizing NFL Draft Selections with Machine Learning Classification
by Akshaj Enaganti and George Pappas
AI 2025, 6(9), 221; https://doi.org/10.3390/ai6090221 - 9 Sep 2025
Abstract
The National Football League draft is one of the most important events in the creation of a successful franchise in professional American football. Selecting players as part of the draft process, however, is difficult, as a multitude of factors affect decisions to opt for one player over another; a few of these include collegiate statistics, team need and fit, and physical potential. In this paper, we utilize a machine learning approach, with various types of models, to optimize the NFL draft and, in turn, enhance team performances. We compare the selections made by the system to the real athletes selected, and assess which of the picks would have been more impactful for the respective franchise. The specific investigation allows for further research by altering the weighting of specific factors and their significance in this decision-making process to land on the ideal player based on what a specific team desires. Using artificial intelligence in this process can produce more consistent results than high-risk traditional methods. Our approach extends beyond a basic Random Forest classifier by simulating complete draft scenarios with player attributes and team needs weighted. This allows comparison of different draft strategies (best-player-available vs. need-based) and demonstrates improved prediction accuracy over conventional methods.
Full article
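A bare-bones version of the Random Forest component mentioned in the abstract is sketched below; the prospect features, the "hit" label, and the synthetic data are hypothetical, and the paper's full draft simulation with weighted team needs is not reproduced.
```python
# Random Forest sketch for draft-selection classification on synthetic data
# (hypothetical features; not the paper's dataset or draft simulator).
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score

rng = np.random.default_rng(0)
# Hypothetical prospect features: college production, combine metrics, positional need, etc.
X = rng.normal(size=(500, 5))
y = (X[:, 0] + 0.5 * X[:, 2] + rng.normal(scale=0.5, size=500) > 0).astype(int)  # 1 = "hit"

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=0)
model = RandomForestClassifier(n_estimators=200, random_state=0).fit(X_train, y_train)
print("held-out accuracy:", accuracy_score(y_test, model.predict(X_test)))
print("feature importances:", model.feature_importances_)
```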
Open Access Article
Self-Emotion-Mediated Exploration in Artificial Intelligence Mirrors: Findings from Cognitive Psychology
by Gustavo Assuncao, Miguel Castelo-Branco and Paulo Menezes
AI 2025, 6(9), 220; https://doi.org/10.3390/ai6090220 - 9 Sep 2025
Abstract
Background: Exploration of the physical environment is an indispensable precursor to information acquisition and knowledge consolidation for living organisms. Yet, current artificial intelligence models lack these autonomy capabilities during training, hindering their adaptability. This work proposes a learning framework for artificial agents to obtain an intrinsic exploratory drive, based on epistemic and achievement emotions triggered during data observation. Methods: This study proposes a dual-module reinforcement framework, where data analysis scores dictate pride or surprise, in accordance with psychological studies on humans. A correlation between these states and exploration is then optimized for agents to meet their learning goals. Results: Causal relationships between states and exploration are demonstrated by the majority of agents. A mean increase is noted for surprise, with a mean decrease for pride. The resulting correlations mirror previously reported human behavior. Conclusions: These findings lead to the conclusion that bio-inspiration for AI development can be of great use. It can confer benefits typically found in living beings, such as autonomy. Further, the work empirically shows how AI methodologies can corroborate human behavioral findings, showcasing major interdisciplinary importance. Ramifications are discussed.
Full article
Open Access Article
MST-DGCN: Multi-Scale Temporal–Dynamic Graph Convolutional with Orthogonal Gate for Imbalanced Multi-Label ECG Arrhythmia Classification
by Jie Chen, Mingfeng Jiang, Xiaoyu He, Yang Li, Jucheng Zhang, Juan Li, Yongquan Wu and Wei Ke
AI 2025, 6(9), 219; https://doi.org/10.3390/ai6090219 - 8 Sep 2025
Abstract
Multi-label arrhythmia classification from 12-lead ECG signals is a challenging problem, involving spatiotemporal feature extraction, feature fusion, and class imbalance. To address these issues, a multi-scale temporal–dynamic graph convolutional method with orthogonal gates, termed MST-DGCN, is proposed for ECG arrhythmia classification. In this method, a temporal–dynamic graph convolution with dynamic adjacency matrices is used to learn spatiotemporal patterns jointly, and an orthogonal gated fusion mechanism is used to eliminate redundancy, so as to strengthen the complementarity and independence of features by adjusting their significance dynamically. Moreover, a multi-instance learning strategy is proposed to alleviate class imbalance by adjusting the proportion of minority arrhythmia samples through adaptive label allocation. After validation on the St Petersburg INCART dataset under stringent inter-patient settings, the experimental results show that the proposed MST-DGCN method achieves the best classification performance with an F1-score of 73.66% (+6.2% over prior baseline methods), with concurrent improvements in AUC (70.92%) and mAP (85.24%), while maintaining computational efficiency.
Full article
(This article belongs to the Special Issue Artificial Intelligence in Biomedical Engineering: Challenges and Developments)
Open Access Article
Conv-ScaleNet: A Multiscale Convolutional Model for Federated Human Activity Recognition
by Xian Wu Ting, Ying Han Pang, Zheng You Lim, Shih Yin Ooi and Fu San Hiew
AI 2025, 6(9), 218; https://doi.org/10.3390/ai6090218 - 8 Sep 2025
Abstract
Background: Artificial Intelligence (AI) techniques have been extensively deployed in sensor-based Human Activity Recognition (HAR) systems. Recent advances in deep learning, especially Convolutional Neural Networks (CNNs), have advanced HAR by enabling automatic feature extraction from raw sensor data. However, these models often struggle to capture multiscale patterns in human activity, limiting recognition accuracy. Additionally, traditional centralized learning approaches raise data privacy concerns, as personal sensor data must be transmitted to a central server, increasing the risk of privacy breaches. Methods: To address these challenges, this paper introduces Conv-ScaleNet, a CNN-based model designed for multiscale feature learning and compatibility with federated learning (FL) environments. Conv-ScaleNet integrates a Pyramid Pooling Module to extract both fine-grained and coarse-grained features and employs sequential Global Average Pooling layers to progressively capture abstract global representations from inertial sensor data. The model supports federated learning by training locally on user devices, sharing only model updates rather than raw data, thus preserving user privacy. Results: Experimental results demonstrate that the proposed Conv-ScaleNet achieves approximately 98% and 96% F1-scores on the WISDM and UCI-HAR datasets, respectively, confirming its competitiveness in FL environments for activity recognition. Conclusions: The proposed Conv-ScaleNet model addresses key limitations of existing HAR systems by combining multiscale feature learning with privacy-preserving training. Its strong performance, data protection capability, and adaptability to decentralized environments make it a robust and scalable solution for real-world HAR applications.
Full article
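The federated training loop described above follows the usual pattern of local updates plus server-side weight averaging (FedAvg); the sketch below illustrates that pattern with a small stand-in network rather than the Conv-ScaleNet architecture itself.
```python
# FedAvg-style aggregation sketch: clients train local copies and only model
# weights are averaged on the server (illustrative; not Conv-ScaleNet).
import copy
import torch
import torch.nn as nn

def local_update(model, data, targets, epochs=1, lr=0.01):
    model = copy.deepcopy(model)                 # train a local copy only
    opt = torch.optim.SGD(model.parameters(), lr=lr)
    loss_fn = nn.CrossEntropyLoss()
    for _ in range(epochs):
        opt.zero_grad()
        loss_fn(model(data), targets).backward()
        opt.step()
    return model.state_dict()

def fed_avg(state_dicts):
    # Element-wise average of the clients' weight tensors.
    avg = copy.deepcopy(state_dicts[0])
    for key in avg:
        avg[key] = torch.stack([sd[key].float() for sd in state_dicts]).mean(dim=0)
    return avg

global_model = nn.Sequential(nn.Linear(9, 64), nn.ReLU(), nn.Linear(64, 6))  # 6 activities
clients = [(torch.randn(32, 9), torch.randint(0, 6, (32,))) for _ in range(3)]  # toy client data

for rnd in range(5):                             # communication rounds
    updates = [local_update(global_model, x, y) for x, y in clients]
    global_model.load_state_dict(fed_avg(updates))
```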
Open Access Article
Unplugged Activities for Teaching Decision Trees to Secondary Students—A Case Study Analysis Using the SOLO Taxonomy
by Konstantinos Karapanos, Vassilis Komis, Georgios Fesakis, Konstantinos Lavidas, Stavroula Prantsoudi and Stamatios Papadakis
AI 2025, 6(9), 217; https://doi.org/10.3390/ai6090217 - 5 Sep 2025
Abstract
The integration of Artificial Intelligence (AI) technologies in students’ lives necessitates the systematic incorporation of foundational AI literacy into educational curricula. Students are challenged to develop conceptual understanding of computational frameworks such as Machine Learning (ML) algorithms and Decision Trees (DTs). In this context, unplugged (i.e., computer-free) pedagogical approaches have emerged as complementary to traditional coding-based instruction in AI education. This study examines the pedagogical effectiveness of an instructional intervention employing unplugged activities to facilitate conceptual understanding of DT algorithms among 47 9th-grade students within a Computer Science (CS) curriculum in Greece. The study employed a quasi-experimental design, utilizing the Structure of Observed Learning Outcomes (SOLO) taxonomy as the theoretical framework for assessing cognitive development and conceptual mastery of DT principles. Quantitative analysis of pre- and post-intervention assessments demonstrated statistically significant improvements in student performance across all evaluated SOLO taxonomy levels. The findings provide empirical support for the hypothesis that unplugged pedagogical interventions constitute an effective and efficient approach for introducing AI concepts to secondary education students. Based on these outcomes, the authors recommend the systematic implementation of developmentally appropriate unplugged instructional interventions for DTs and broader AI concepts across all educational levels, to optimize AI literacy acquisition.
Full article
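For readers unfamiliar with the underlying algorithm, a decision tree of the kind students construct by hand in the unplugged activities can be reproduced in a few lines; the animal-classification attributes below are a hypothetical example, not the study's classroom material.
```python
# Tiny decision tree example with yes/no attribute splits (illustrative only).
from sklearn.tree import DecisionTreeClassifier, export_text

# Attributes: [has_fur, lays_eggs, can_fly]; label: 1 = mammal, 0 = not a mammal
X = [[1, 0, 0], [1, 0, 1], [0, 1, 1], [0, 1, 0], [1, 0, 0], [0, 1, 1]]
y = [1, 1, 0, 0, 1, 0]

tree = DecisionTreeClassifier(max_depth=2).fit(X, y)
print(export_text(tree, feature_names=["has_fur", "lays_eggs", "can_fly"]))
```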
Open Access Review
Large Language Models in Cybersecurity: A Survey of Applications, Vulnerabilities, and Defense Techniques
by Niveen O. Jaffal, Mohammed Alkhanafseh and David Mohaisen
AI 2025, 6(9), 216; https://doi.org/10.3390/ai6090216 - 5 Sep 2025
Abstract
Large Language Models (LLMs) are transforming cybersecurity by enabling intelligent, adaptive, and automated approaches to threat detection, vulnerability assessment, and incident response. With their advanced language understanding and contextual reasoning, LLMs surpass traditional methods in tackling challenges across domains such as the Internet of Things (IoT), blockchain, and hardware security. This survey provides a comprehensive overview of LLM applications in cybersecurity, focusing on two core areas: (1) the integration of LLMs into key cybersecurity domains, and (2) the vulnerabilities of LLMs themselves, along with mitigation strategies. By synthesizing recent advancements and identifying key limitations, this work offers practical insights and strategic recommendations for leveraging LLMs to build secure, scalable, and future-ready cyber defense systems.
Full article
Open Access Review
Long Short-Term Memory Networks: A Comprehensive Survey
by Moez Krichen and Alaeddine Mihoub
AI 2025, 6(9), 215; https://doi.org/10.3390/ai6090215 - 5 Sep 2025
Abstract
Long Short-Term Memory (LSTM) networks have revolutionized the field of deep learning, particularly in applications that require the modeling of sequential data. Originally designed to overcome the limitations of traditional recurrent neural networks (RNNs), LSTMs effectively capture long-range dependencies in sequences, making them suitable for a wide array of tasks. This survey aims to provide a comprehensive overview of LSTM architectures, detailing their unique components, such as cell states and gating mechanisms, which facilitate the retention and modulation of information over time. We delve into the various applications of LSTMs across multiple domains, including the following: natural language processing (NLP), where they are employed for language modeling, machine translation, and sentiment analysis; time series analysis, where they play a critical role in forecasting tasks; and speech recognition, significantly enhancing the accuracy of automated systems. By examining these applications, we illustrate the versatility and robustness of LSTMs in handling complex data types. Additionally, we explore several notable variants and improvements of the standard LSTM architecture, such as Bidirectional LSTMs, which enhance context understanding, and Stacked LSTMs, which increase model capacity. We also discuss the integration of Attention Mechanisms with LSTMs, which have further advanced their performance in various tasks. Despite their strengths, LSTMs face several challenges, including high Computational Complexity, extensive Data Requirements, and difficulties in training, which can hinder their practical implementation. This survey addresses these limitations and provides insights into ongoing research aimed at mitigating these issues. In conclusion, we highlight recent advances in LSTM research and propose potential future directions that could lead to enhanced performance and broader applicability of LSTM networks. This survey serves as a foundational resource for researchers and practitioners seeking to understand the current landscape of LSTM technology and its future trajectory.
Full article
(This article belongs to the Special Issue The Application of Machine Learning and AI Technology Towards the Sustainable Development Goals)
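For reference, the cell states and gating mechanisms discussed in the survey follow the standard LSTM update, with forget, input, and output gates controlling the cell state:
```latex
% Standard LSTM cell equations (forget, input, and output gates with cell state).
\begin{aligned}
f_t &= \sigma(W_f x_t + U_f h_{t-1} + b_f) \\
i_t &= \sigma(W_i x_t + U_i h_{t-1} + b_i) \\
o_t &= \sigma(W_o x_t + U_o h_{t-1} + b_o) \\
\tilde{c}_t &= \tanh(W_c x_t + U_c h_{t-1} + b_c) \\
c_t &= f_t \odot c_{t-1} + i_t \odot \tilde{c}_t \\
h_t &= o_t \odot \tanh(c_t)
\end{aligned}
```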
Open Access Article
From Detection to Decision: Transforming Cybersecurity with Deep Learning and Visual Analytics
by Saurabh Chavan and George Pappas
AI 2025, 6(9), 214; https://doi.org/10.3390/ai6090214 - 4 Sep 2025
Abstract
Objectives: The persistent evolution of software vulnerabilities—spanning novel zero-day exploits to logic-level flaws—continues to challenge conventional cybersecurity mechanisms. Static rule-based scanners and opaque deep learning models often lack the precision and contextual understanding required for both accurate detection and analyst interpretability. This paper presents a hybrid framework for real-time vulnerability detection that improves both robustness and explainability. Methods: The framework integrates semantic encoding via Bidirectional Encoder Representations from Transformers (BERTs), structural analysis using Deep Graph Convolutional Neural Networks (DGCNNs), and lightweight prioritization through Kernel Extreme Learning Machines (KELMs). The architecture incorporates Minimum Intermediate Representation (MIR) learning to reduce false positives and fuses multi-modal data (source code, execution traces, textual metadata) for robust, scalable performance. Explainable Artificial Intelligence (XAI) visualizations—combining SHAP-based attributions and CVSS-aligned pair plots—serve as an analyst-facing interpretability layer. The framework is evaluated on benchmark datasets, including VulnDetect and the NIST Software Reference Library (NSRL, version 2024.12.1, used strictly as a benign baseline for false positive estimation). Results: Evaluated with precision, recall, AUPRC, MCC, and calibration (ECE/Brier score), the framework demonstrated improved robustness and fewer false positives than the baselines. An internal interpretability validation was conducted to align SHAP/GNNExplainer outputs with known vulnerability features; formal usability testing with practitioners is left as future work. Conclusions: Designed with DevSecOps integration in mind, the framework is packaged in containerized modules (Docker/Kubernetes) and outputs SIEM-compatible alerts, enabling potential compatibility with Splunk, GitLab CI/CD, and similar tools. While full enterprise deployment was not performed, these deployment-oriented design choices support scalability and practical adoption.
Full article
Open Access Review
A Survey of Traditional and Emerging Deep Learning Techniques for Non-Intrusive Load Monitoring
by Annysha Huzzat, Ahmed S. Khwaja, Ali A. Alnoman, Bhagawat Adhikari, Alagan Anpalagan and Isaac Woungang
AI 2025, 6(9), 213; https://doi.org/10.3390/ai6090213 - 3 Sep 2025
Abstract
To cope with the increasing global demand for energy and the significant energy wastage caused by the use of different home appliances, smart load monitoring is considered a promising solution to promote proper activation and scheduling of devices and reduce electricity bills. Instead of installing a sensing device on each electric appliance, non-intrusive load monitoring (NILM) enables the monitoring of each individual device using the total power reading of the home smart meter. However, high-accuracy load monitoring requires efficient artificial intelligence (AI) and deep learning (DL) approaches. To that end, this paper thoroughly reviews traditional AI and DL approaches, as well as emerging AI models proposed for NILM. Unlike existing surveys that are usually limited to a specific approach or a subset of approaches, this review presents a comprehensive survey of an ensemble of topics and models, including deep learning, generative AI (GAI), emerging attention-enhanced GAI, and hybrid AI approaches. Another distinctive feature of this work compared to existing surveys is that it also reviews actual cases of NILM system design and implementation, covering a wide range of technical enablers including hardware, software, and AI models. Furthermore, a range of future research directions and challenges is discussed, such as the heterogeneity of energy sources, data uncertainty, privacy and safety, cost and complexity reduction, and the need for a standardized comparison.
Full article
(This article belongs to the Section AI Systems: Theory and Applications)
Open Access Article
Post-Heuristic Cancer Segmentation Refinement over MRI Images and Deep Learning Models
by Panagiotis Christakakis and Eftychios Protopapadakis
AI 2025, 6(9), 212; https://doi.org/10.3390/ai6090212 - 2 Sep 2025
Abstract
Lately, deep learning methods have greatly improved the accuracy of brain-tumor segmentation, yet slice-wise inconsistencies still limit reliable use in clinical practice. While volume-aware 3D convolutional networks achieve high accuracy, their memory footprint and inference time may limit clinical adoption. This study proposes a resource-conscious pipeline for lower-grade-glioma delineation in axial FLAIR MRI that combines a 2D Attention U-Net with a guided post-processing refinement step. Two segmentation backbones, a vanilla U-Net and an Attention U-Net, are trained on 110 TCGA-LGG axial FLAIR patient volumes under various loss functions and activation functions. The Attention U-Net, optimized with Dice loss, delivers the strongest baseline, achieving a mean Intersection-over-Union (mIoU) of 0.857. To mitigate slice-wise inconsistencies inherent to 2D models, a White-Area Overlap (WAO) voting mechanism quantifies the tumor footprint shared by neighboring slices. The WAO curve is smoothed with a Gaussian filter to locate its peak, after which a percentile-based heuristic selectively relabels the most ambiguous softmax pixels. Cohort-level analysis shows that removing merely 0.1–0.3% of ambiguous low-confidence pixels lifts the post-processing mIoU above the baseline while improving segmentation for two-thirds of patients. The proposed refinement strategy holds great potential for further improvement, offering a practical route for integrating deep learning segmentation into routine clinical workflows with minimal computational overhead.
Full article
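One way to read the described refinement step is sketched below: a per-slice White-Area Overlap (WAO) curve is computed from neighbouring predicted masks, smoothed with a Gaussian filter to locate its peak, and a small percentile of the least confident softmax pixels is relabeled. The neighbourhood definition and relabeling rule here are simplifying assumptions, not the authors' exact procedure.
```python
# Rough interpretation of a WAO-guided refinement step (assumptions noted above).
import numpy as np
from scipy.ndimage import gaussian_filter1d

def refine(softmax_fg, threshold=0.5, drop_fraction=0.002):
    """softmax_fg: (num_slices, H, W) foreground probabilities for one patient."""
    masks = softmax_fg > threshold
    # WAO: overlap of each slice's predicted tumour area with its neighbours.
    wao = np.array([np.logical_and(masks[i], masks[max(i - 1, 0)]).sum() +
                    np.logical_and(masks[i], masks[min(i + 1, len(masks) - 1)]).sum()
                    for i in range(len(masks))], dtype=float)
    peak = int(np.argmax(gaussian_filter1d(wao, sigma=2)))   # smoothed WAO peak
    # Relabel only the most ambiguous pixels (probabilities closest to 0.5);
    # here they are conservatively flipped to background as an illustration.
    ambiguity = np.abs(softmax_fg - 0.5)
    cutoff = np.quantile(ambiguity, drop_fraction)            # ~0.2% of pixels
    refined = masks.copy()
    refined[ambiguity <= cutoff] = False
    return refined, peak

refined, peak_slice = refine(np.random.rand(24, 128, 128))
```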
News
3 September 2025
Join Us at the MDPI at the University of Toronto Career Fair, 23 September 2025, Toronto, ON, Canada
1 September 2025
MDPI INSIGHTS: The CEO’s Letter #26 – CUJS, Head of Ethics, Open Peer Review, AIS 2025, Reviewer Recognition
Topics
Topic in AI, Data, Economies, Mathematics, Risks
Advanced Techniques and Modeling in Business and Economics
Topic Editors: José Manuel Santos-Jaén, Ana León-Gomez, María del Carmen Valls Martínez
Deadline: 30 September 2025
Topic in AI, Energies, Entropy, Sustainability
Game Theory and Artificial Intelligence Methods in Sustainable and Renewable Energy Power Systems
Topic Editors: Lefeng Cheng, Pei Zhang, Anbo Meng
Deadline: 31 October 2025
Topic in AI, Algorithms, Diagnostics, Emergency Care and Medicine
Trends of Artificial Intelligence in Emergency and Critical Care Medicine
Topic Editors: Zhongheng Zhang, Yucai Hong, Wei Shao
Deadline: 30 November 2025
Topic in AI, Drones, Electronics, Mathematics, Sensors
AI and Data-Driven Advancements in Industry 4.0, 2nd Edition
Topic Editors: Teng Huang, Yan Pang, Qiong Wang, Jianjun Li, Jin Liu, Jia Wang
Deadline: 15 December 2025
Special Issues
Special Issue in AI
AI and the Evolution of Work: Redefining Project Management across Disciplines
Guest Editor: Jose Berengueres
Deadline: 30 September 2025
Special Issue in AI
Artificial Intelligence for Network Management
Guest Editors: Stephen Ojo, Agbotiname Lucky Imoize, Lateef Adesola Akinyemi
Deadline: 30 September 2025
Special Issue in AI
Development and Design of Autonomous Robot
Guest Editors: Tayab Din Memon, Kamran Shaukat, Sufyan Ali Memon
Deadline: 24 October 2025
Special Issue in AI
Adversarial Learning and Its Applications in Healthcare
Guest Editors: Min Xian, Aleksandar Vakanski
Deadline: 27 October 2025