AI, Volume 6, Issue 10 (October 2025) – 36 articles

Cover Story: Vision–language tracking, situated at the intersection of computer vision and natural language processing, aims to enhance semantic understanding in single-target tracking through textual descriptions. However, most existing methods depend on contrastive learning for cross-modal alignment, which fails to accurately distinguish targets in semantically ambiguous scenes or scenes containing multiple similar objects. This study proposes a Textual Heatmap Mapping (THM) module that explicitly introduces spatial position mapping into cross-modal fusion, allowing textual cues to guide feature distribution at both the semantic and spatial levels. Integrated into the UVLTrack framework, THM imposes semantically constrained spatial response conditioning, effectively suppressing distractors and improving localization accuracy. Experiments show notable gains on OTB99, LaSOT, and TNL2K.
  • Issues are regarded as officially published after their release is announced to the table of contents alert mailing list.
  • You may sign up for e-mail alerts to receive the table of contents of newly released issues.
  • PDF is the official format for papers, which are published in both HTML and PDF forms. To view a paper in PDF format, click on the "PDF Full-text" link and use the free Adobe Reader to open it.
23 pages, 2701 KB  
Article
Grad-CAM-Assisted Deep Learning for Mode Hop Localization in Shearographic Tire Inspection
by Manuel Friebolin, Michael Munz and Klaus Schlickenrieder
AI 2025, 6(10), 275; https://doi.org/10.3390/ai6100275 - 21 Oct 2025
Abstract
In shearography-based tire testing, so-called “Mode Hops” (abrupt phase changes caused by laser mode changes) can lead to significant disturbances in the interference image analysis. These artifacts distort defect assessment, lead to retesting or false-positive decisions, and thus represent a significant hurdle for the automation of the shearography-based tire inspection process. This work proposes a deep learning workflow that combines a pretrained, optimized ResNet-50 classifier with Grad-CAM, providing a practical and explainable solution for the reliable detection and localization of Mode Hops in shearographic tire inspection images. We trained the model on an extensive, cross-machine dataset comprising more than 6.5 million test images. The final deep learning model achieves a classification accuracy of 99.67%, a false-negative rate of 0.48%, and a false-positive rate of 0.24%. Applying a probability-based quadrant-repeat decision rule within the inspection process effectively reduces process-level false positives to zero, with an estimated probability of repetition of ≤0.084%. This statistically validated approach increases the overall inspection accuracy to 99.83%. The method enables the robust detection and localization of relevant Mode Hops and represents a significant contribution to explainable, AI-supported tire testing. It fulfills central requirements for the automation of shearography-based tire testing and supports the eventual certification of non-destructive testing methods in safety-critical industries. Full article
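
A minimal sketch of the Grad-CAM mechanism this workflow relies on, using PyTorch and a ResNet-50 with a two-class head; the weights, layer choice, and input below are placeholders rather than the authors' released code:

    import torch
    from torchvision import models

    model = models.resnet50(weights=None)
    model.fc = torch.nn.Linear(model.fc.in_features, 2)  # hypothetical 2-class head
    model.eval()

    feats, grads = {}, {}
    model.layer4.register_forward_hook(lambda m, i, o: feats.update(a=o))
    model.layer4.register_full_backward_hook(lambda m, gi, go: grads.update(g=go[0]))

    def grad_cam(x, cls=1):
        """Heatmap for class `cls` from the last conv stage of ResNet-50."""
        model.zero_grad()
        model(x)[0, cls].backward()
        w = grads["g"].mean(dim=(2, 3), keepdim=True)     # channel weights (GAP of grads)
        cam = torch.relu((w * feats["a"]).sum(dim=1))[0]  # weighted sum + ReLU
        return (cam - cam.min()) / (cam.max() - cam.min() + 1e-8)

    x = torch.randn(1, 3, 224, 224)  # stand-in for a shearographic phase image
    heatmap = grad_cam(x)            # upsample and overlay for localization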

36 pages, 2714 KB  
Review
Artificial Intelligence-Based Epileptic Seizure Prediction Strategies: A Review
by Andrea V. Perez-Sanchez, Martin Valtierra-Rodriguez, J. Jesus De-Santiago-Perez, Carlos A. Perez-Ramirez, Arturo Garcia-Perez and Juan P. Amezquita-Sanchez
AI 2025, 6(10), 274; https://doi.org/10.3390/ai6100274 - 21 Oct 2025
Viewed by 2
Abstract
Epilepsy, a chronic neurological disorder marked by recurrent and unpredictable seizures, poses significant risks of injury and compromises patient quality of life. The accurate forecasting of seizures is paramount for enabling timely interventions and improving safety. Since the 1970s, research has increasingly focused on analyzing bioelectrical signals for this purpose. In recent years, artificial intelligence (AI), particularly machine learning (ML) and deep learning (DL), has emerged as a powerful tool for seizure prediction. This review, conducted in accordance with PRISMA guidelines, analyzes studies from 2020 to August 2025. It explores the evolution from traditional ML classifiers toward advanced DL architectures, including convolutional and recurrent neural networks and transformer-based frameworks, applied to bioelectrical signals. While these approaches show promising performance, significant challenges persist in patient generalization, standardized evaluation, and clinical validation. This review synthesizes current advancements, provides a critical analysis of methodological limitations, and outlines future directions for developing robust, clinically relevant seizure prediction systems to enhance patient autonomy and outcomes. Full article

33 pages, 4831 KB  
Article
A General-Purpose Knowledge Retention Metric for Evaluating Distillation Models Across Architectures and Tasks
by Arjay Alba and Jocelyn Villaverde
AI 2025, 6(10), 273; https://doi.org/10.3390/ai6100273 - 21 Oct 2025
Viewed by 136
Abstract
Background: Knowledge distillation (KD) compresses deep neural networks by transferring knowledge from a high-capacity teacher model to a lightweight student model. However, conventional evaluation metrics such as accuracy, mAP, IoU, or RMSE focus mainly on task performance and overlook how effectively the student internalizes the teacher’s knowledge. Methods: This study introduces the Knowledge Retention Score (KRS), a composite metric that integrates intermediate feature similarity and output agreement into a single interpretable score to quantify knowledge retention. KRS was primarily validated in computer vision (CV) through 36 experiments covering image classification, object detection, and semantic segmentation using diverse datasets and eight representative KD methods. Supplementary experiments were conducted in natural language processing (NLP) using transformer-based models on SST-2, and in time series regression with convolutional teacher–student pairs. Results: Across all domains, KRS correlated strongly with standard performance metrics while revealing internal retention dynamics that conventional evaluations often overlook. By reporting feature similarity and output agreement separately alongside the composite score, KRS provides transparent and interpretable insights into knowledge transfer. Conclusions: KRS offers a stable diagnostic tool and a complementary evaluation metric for KD research. Its generality across domains demonstrates its potential as a standardized framework for assessing knowledge retention beyond task-specific performance measures. Full article
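
The paper combines intermediate feature similarity and output agreement into one score; since the exact weighting is not given in this abstract, the sketch below assumes a simple convex combination (Python/PyTorch, hypothetical names):

    import torch
    import torch.nn.functional as F

    def knowledge_retention_score(t_feats, s_feats, t_logits, s_logits, alpha=0.5):
        """Hypothetical KRS: an alpha-weighted mix of feature similarity and
        output agreement. Assumes student features are already projected to
        the teacher's dimensions; the paper's exact formulation may differ."""
        sims = [F.cosine_similarity(t.flatten(1), s.flatten(1), dim=1).mean()
                for t, s in zip(t_feats, s_feats)]
        feat_sim = torch.stack(sims).mean()  # intermediate-layer similarity
        agree = (t_logits.argmax(1) == s_logits.argmax(1)).float().mean()
        return alpha * feat_sim + (1 - alpha) * agree, feat_sim, agree

    t_f = [torch.randn(16, 128), torch.randn(16, 64)]  # teacher features (toy)
    s_f = [torch.randn(16, 128), torch.randn(16, 64)]  # student features (toy)
    krs, fs, oa = knowledge_retention_score(t_f, s_f,
                                            torch.randn(16, 10), torch.randn(16, 10))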

34 pages, 8070 KB  
Article
AI-Enhanced Rescue Drone with Multi-Modal Vision and Cognitive Agentic Architecture
by Nicoleta Cristina Gaitan, Bianca Ioana Batinas and Calin Ursu
AI 2025, 6(10), 272; https://doi.org/10.3390/ai6100272 - 20 Oct 2025
Viewed by 266
Abstract
In post-disaster search and rescue (SAR) operations, unmanned aerial vehicles (UAVs) are essential tools, yet the large volume of raw visual data often overwhelms human operators by providing isolated, context-free information. This paper presents an innovative system with a novel cognitive–agentic architecture that transforms the UAV from an intelligent tool into a proactive reasoning partner. The core innovation lies in the ability of a large language model (LLM) to perform high-level semantic reasoning, logical validation, and robust self-correction through internal feedback loops. A visual perception module based on a custom-trained YOLO11 model feeds the cognitive core, which performs contextual analysis and hazard assessment, enabling a complete perception–reasoning–action cycle. The system also incorporates a physical payload delivery module for first-aid supplies, which acts on prioritized, actionable recommendations to reduce operator cognitive load and accelerate victim assistance. This work therefore presents the first developed LLM-driven architecture of its kind and demonstrates a viable path toward reducing operator cognitive load in critical missions. Full article

17 pages, 1775 KB  
Article
AI-Driven Analysis for Real-Time Detection of Unstained Microscopic Cell Culture Images
by Kathrin Hildebrand, Tatiana Mögele, Dennis Raith, Maria Kling, Anna Rubeck, Stefan Schiele, Eelco Meerdink, Avani Sapre, Jonas Bermeitinger, Martin Trepel and Rainer Claus
AI 2025, 6(10), 271; https://doi.org/10.3390/ai6100271 - 18 Oct 2025
Viewed by 257
Abstract
Staining-based assays are widely used for cell analysis but are invasive, alter physiology, and prevent longitudinal monitoring. Label-free, morphology-based approaches could enable real-time, non-invasive drug testing, yet the detection of subtle and dynamic changes has remained difficult. We developed a deep learning framework for stain-free monitoring of leukemia cell cultures using automated bright-field microscopy in a semi-automated culture system (AICE3, LABMaiTE, Augsburg, Germany). YOLOv8 models were trained on images from K562, HL-60, and Kasumi-1 cells using an NVIDIA DGX A100 GPU and tested in both GPU and CPU environments for real-time performance. Comparative benchmarking against RT-DETR and interpretability analyses using Eigen-CAM and radiomics (RedTell) were performed. YOLOv8 achieved high accuracy (mAP@0.5 > 98%, precision/sensitivity > 97%), with reproducibility confirmed on an independent dataset from a second laboratory and AICE3 setup. The model distinguished between morphologically similar leukemia lines and reliably classified untreated versus differentiated K562 cells (hemin-induced erythroid and PMA-induced megakaryocytic; >95% accuracy). Incorporation of decitabine-treated cells demonstrated applicability to drug testing, revealing treatment-specific and intermediate phenotypes. Longitudinal monitoring captured culture- and time-dependent drift, enabling the separation of temporal from drug-induced changes. Radiomics highlighted interpretable features such as size, elongation, and texture, but with lower accuracy than the deep learning approach. To our knowledge, this is the first demonstration that deep learning resolves subtle, drug-induced, and time-dependent morphological changes in unstained leukemia cells in real time. This approach provides a robust, accessible framework for label-free longitudinal drug testing and establishes a foundation for future autonomous, feedback-driven platforms in precision oncology. Ultimately, this approach may also contribute to more precise and adaptive clinical decision-making, advancing the field of personalized medicine. Full article
(This article belongs to the Special Issue AI in Bio and Healthcare Informatics)
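
For readers unfamiliar with the workflow, a sketch of YOLOv8 detection training with the ultralytics API follows; the dataset YAML, checkpoint size, and file names are placeholders, not the study's configuration:

    from ultralytics import YOLO

    model = YOLO("yolov8n.pt")  # pretrained detection checkpoint (placeholder size)
    model.train(data="leukemia_cells.yaml", epochs=100, imgsz=640, device=0)

    metrics = model.val()
    print(metrics.box.map50)         # mAP@0.5, the headline metric reported above
    for r in model("well_042.png"):  # inference on one bright-field image
        print(r.boxes.cls, r.boxes.conf)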

25 pages, 7385 KB  
Article
Reducing Annotation Effort in Semantic Segmentation Through Conformal Risk Controlled Active Learning
by Can Erhan and Nazim Kemal Ure
AI 2025, 6(10), 270; https://doi.org/10.3390/ai6100270 - 18 Oct 2025
Viewed by 224
Abstract
Modern semantic segmentation models require extensive pixel-level annotations, creating a significant barrier to practical deployment as labeling a single image can take hours of human effort. Active learning offers a promising way to reduce annotation costs through intelligent sample selection. However, existing methods rely on poorly calibrated confidence estimates, making uncertainty quantification unreliable. We introduce Conformal Risk Controlled Active Learning (CRC-AL), a novel framework that provides statistical guarantees on uncertainty quantification for semantic segmentation, in contrast to heuristic approaches. CRC-AL calibrates class-specific thresholds via conformal risk control, transforming softmax outputs into multi-class prediction sets with formal guarantees. From these sets, our approach derives complementary uncertainty representations: risk maps highlighting uncertain regions and class co-occurrence embeddings capturing semantic confusions. A physics-inspired selection algorithm leverages these representations with a barycenter-based distance metric that balances uncertainty and diversity. Experiments on Cityscapes and PascalVOC2012 show CRC-AL consistently outperforms baseline methods, achieving 95% of fully supervised performance with only 30% of labeled data, making semantic segmentation more practical under limited annotation budgets. Full article
(This article belongs to the Section AI Systems: Theory and Applications)
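
A toy illustration of the conformal risk control step described above, calibrating a softmax threshold so the corrected miss rate on held-out calibration samples stays below a target risk; the paper's class-specific, pixel-level formulation is richer:

    import numpy as np

    def calibrate_threshold(softmax_scores, labels, target_risk=0.1):
        """Toy conformal risk control for one risk function: scan lambda and
        keep the first value whose finite-sample-corrected empirical risk
        (miss rate of the true class) meets the target."""
        n = len(labels)
        true_scores = softmax_scores[np.arange(n), labels]
        for lam in np.linspace(0.0, 1.0, 201):
            miss_rate = (true_scores <= 1 - lam).mean()
            if (n / (n + 1)) * miss_rate + 1 / (n + 1) <= target_risk:
                return 1 - lam  # score threshold for prediction-set inclusion
        return 0.0  # include everything if the target is never met

    cal_scores = np.random.dirichlet(np.ones(19), size=500)  # fake softmax, 19 classes
    cal_labels = np.random.randint(0, 19, size=500)
    thr = calibrate_threshold(cal_scores, cal_labels)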

30 pages, 6035 KB  
Article
Bio-Inspired Optimization of Transfer Learning Models for Diabetic Macular Edema Classification
by A. M. Mutawa, Khalid Sabti, Bibin Shalini Sundaram Thankaleela and Seemant Raizada
AI 2025, 6(10), 269; https://doi.org/10.3390/ai6100269 - 17 Oct 2025
Viewed by 186
Abstract
Diabetic Macular Edema (DME) poses a significant threat to vision, often leading to permanent blindness if not detected and addressed swiftly. Existing manual diagnostic methods are arduous and inconsistent, highlighting the pressing necessity for automated, accurate, and personalized solutions. This study presents a novel methodology for diagnosing DME and categorizing choroidal neovascularization (CNV), drusen, and normal conditions from fundus images through the application of transfer learning models and bio-inspired optimization methodologies. The methodology utilizes advanced transfer learning architectures, including VGG16, VGG19, ResNet50, EfficientNetB7, EfficientNetV2-S, InceptionV3, and InceptionResNetV2, for analyzing both binary and multi-class Optical Coherence Tomography (OCT) datasets. We combined the OCT datasets OCT2017 and OCTC8 to create a new dataset for our study. The parameters, including learning rate, batch size, and dropout layer of the fully connected network, are further adjusted using the bio-inspired Particle Swarm Optimization (PSO) method, in conjunction with thorough preprocessing. Explainable AI approaches, especially Shapley additive explanations (SHAP), provide transparent insights into the model’s decision-making processes. Experimental findings demonstrate that our bio-inspired optimized transfer learning Inception V3 significantly surpasses conventional deep learning techniques for DME classification, as evidenced by enhanced metrics including the accuracy, precision, recall, F1-score, misclassification rate, Matthew’s correlation coefficient, intersection over union, and kappa coefficient for both binary and multi-class scenarios. The accuracy achieved is approximately 98% in binary classification and roughly 90% in multi-class classification with the Inception V3 model. The integration of contemporary transfer learning architectures with nature-inspired PSO enhances diagnostic precision to approximately 95% in multi-class classification, while also improving interpretability and reliability, which are crucial for clinical implementation. This research promotes the advancement of more precise, personalized, and timely diagnostic and therapeutic strategies for Diabetic Macular Edema, aiming to avert vision loss and improve patient outcomes. Full article
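
A bare-bones PSO loop over the three hyperparameters named above (learning rate, batch size, dropout); the swarm size, coefficients, and stubbed fitness function are illustrative assumptions, not the study's settings:

    import random

    BOUNDS = {"lr": (1e-5, 1e-2), "batch": (8, 64), "dropout": (0.1, 0.6)}

    def fitness(p):
        # Placeholder: in practice, train the transfer-learning model with
        # these settings and return validation accuracy.
        return -((p["lr"] - 1e-3) ** 2)  # dummy optimum near lr = 1e-3

    swarm = [{k: random.uniform(*b) for k, b in BOUNDS.items()} for _ in range(10)]
    vel = [{k: 0.0 for k in BOUNDS} for _ in swarm]
    pbest = [dict(p) for p in swarm]
    gbest = dict(max(swarm, key=fitness))

    for _ in range(30):
        for i, p in enumerate(swarm):
            for k, (lo, hi) in BOUNDS.items():
                r1, r2 = random.random(), random.random()
                vel[i][k] = (0.7 * vel[i][k]                    # inertia
                             + 1.5 * r1 * (pbest[i][k] - p[k])  # cognitive pull
                             + 1.5 * r2 * (gbest[k] - p[k]))    # social pull
                p[k] = min(max(p[k] + vel[i][k], lo), hi)
            if fitness(p) > fitness(pbest[i]):
                pbest[i] = dict(p)
            if fitness(p) > fitness(gbest):
                gbest = dict(p)

    print(gbest)  # best hyperparameter set found by the swarm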

23 pages, 3017 KB  
Article
Improving Forecasting Accuracy of Stock Market Indices Utilizing Attention-Based LSTM Networks with a Novel Asymmetric Loss Function
by Shlok Sagar Rajpal, Rajesh Mahadeva, Amit Kumar Goyal and Varun Sarda
AI 2025, 6(10), 268; https://doi.org/10.3390/ai6100268 - 17 Oct 2025
Viewed by 403
Abstract
This study presents a novel approach to financial time series forecasting by introducing asymmetric loss functions specifically designed to enhance directional accuracy across major stock indices (S&P 500, DJI, and NASDAQ Composite) over a 33-year period. We integrate these loss functions into an attention-based Long Short-Term Memory (LSTM) framework. The proposed loss functions are evaluated against traditional objectives such as Mean Squared Error (MSE) and Mean Absolute Error (MAE), as well as other recently proposed losses. Our approach consistently achieves superior test-time directional accuracy, with gains of 3.4–6.1 percentage points over MSE/MAE and 2.0–4.5 percentage points over prior asymmetric losses, which are either non-differentiable or require extensive hyperparameter tuning. Furthermore, the proposed models achieve an F1-score of up to 0.74, compared to 0.63–0.68 for existing methods, and maintain competitive MAE values within 0.01–0.03 of the baseline. The optimized asymmetric loss functions improve specificity to above 0.62 and ensure a better balance between precision and recall. These results underscore the potential of directionally aware loss design to enhance AI-driven financial forecasting systems. Full article
(This article belongs to the Special Issue AI in Finance: Leveraging AI to Transform Financial Services)
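
The exact loss is not reproduced in this abstract; the sketch below shows one common differentiable asymmetric form that up-weights errors whose sign disagrees with the true return, which is the general idea such designs build on:

    import torch

    def asymmetric_mse(pred_ret, true_ret, penalty=2.0):
        """Squared error, smoothly up-weighted when predicted and true
        returns disagree in sign; `penalty` and the sigmoid gate steepness
        are assumptions for illustration."""
        err = (pred_ret - true_ret) ** 2
        disagree = torch.sigmoid(-10.0 * pred_ret * true_ret)  # ~1 if signs differ
        return (err * (1.0 + (penalty - 1.0) * disagree)).mean()

    pred = torch.tensor([0.01, -0.02, 0.03], requires_grad=True)
    true = torch.tensor([0.02, 0.01, 0.02])
    asymmetric_mse(pred, true).backward()  # differentiable end to end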

22 pages, 3339 KB  
Article
An AutoML Algorithm: Multiple-Steps Ahead Forecasting of Correlated Multivariate Time Series with Anomalies Using Gated Recurrent Unit Networks
by Ying Su and Morgan C. Wang
AI 2025, 6(10), 267; https://doi.org/10.3390/ai6100267 - 14 Oct 2025
Viewed by 460
Abstract
Multiple time series forecasting is critical in domains such as energy management, economic analysis, web traffic prediction and air pollution monitoring to support effective resource planning. Traditional statistical learning methods, including Vector Autoregression (VAR) and Vector Autoregressive Integrated Moving Average (VARIMA), struggle with nonstationarity, temporal dependencies, inter-series correlations, and data anomalies such as trend shifts, seasonal variations, and missing data. Furthermore, their effectiveness in multi-step ahead forecasting is often limited. This article presents an Automated Machine Learning (AutoML) framework that provides an end-to-end solution for researchers who lack in-depth knowledge of time series forecasting or advanced programming skills. This framework utilizes Gated Recurrent Unit (GRU) networks, a variant of Recurrent Neural Networks (RNNs), to tackle multiple correlated time series forecasting problems, even in the presence of anomalies. To reduce complexity and facilitate the AutoML process, many model parameters are pre-specified, thereby requiring minimal tuning. This design enables efficient and accurate multi-step forecasting while addressing issues including missing values and structural shifts. We also examine the advantages and limitations of GRU-based RNNs within the AutoML system for multivariate time series forecasting. Model performance is evaluated using multiple accuracy metrics across various forecast horizons. The empirical results confirm our proposed approach’s ability to capture inter-series dependencies and handle anomalies in long-range forecasts. Full article
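
A sketch of the direct multi-step GRU setup such a framework automates, written with the Keras functional API; the window length, horizon, and layer sizes are placeholders, not the AutoML defaults:

    import numpy as np
    from tensorflow import keras

    L_WIN, K_SERIES, H = 24, 3, 6  # window length, number of series, horizon

    inputs = keras.Input(shape=(L_WIN, K_SERIES))
    x = keras.layers.GRU(64, return_sequences=True)(inputs)
    x = keras.layers.GRU(32)(x)
    x = keras.layers.Dense(H * K_SERIES)(x)
    outputs = keras.layers.Reshape((H, K_SERIES))(x)  # direct multi-step output
    model = keras.Model(inputs, outputs)
    model.compile(optimizer="adam", loss="mse")

    # Stand-ins for windowed, anomaly-cleaned, imputed series
    X = np.random.rand(256, L_WIN, K_SERIES).astype("float32")
    y = np.random.rand(256, H, K_SERIES).astype("float32")
    model.fit(X, y, epochs=2, batch_size=32, verbose=0)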

24 pages, 1699 KB  
Article
Efficient Sparse MLPs Through Motif-Level Optimization Under Resource Constraints
by Xiaotian Chen, Hongyun Liu and Seyed Sahand Mohammadi Ziabari
AI 2025, 6(10), 266; https://doi.org/10.3390/ai6100266 - 9 Oct 2025
Viewed by 471
Abstract
We study motif-based optimization for sparse multilayer perceptrons (MLPs), where weights are shared and updated at the level of small neuron groups (‘motifs’) rather than individual connections. Building on Sparse Evolutionary Training (SET), our approach reduces the number of unique parameters and redundant multiply–accumulate operations by exploiting block-structured sparsity. Across Fashion-MNIST and a lung X-ray dataset, our Motif-SET improves training/inference efficiency with modest accuracy trade-offs, and we provide a principled recipe to choose motif size based on accuracy and efficiency budgets. We further compare against representative modern sparse training and compression methods, analyze failure modes such as overly large motifs, and outline real-world constraints on mobile/embedded targets. Our results and ablations indicate that motif size m=2 often offers a strong balance between compute and accuracy under resource constraints. Full article
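
A minimal sketch of motif-level weight sharing for one linear layer, assuming each m x m block of the weight matrix is driven by a single shared parameter; SET's sparsity mask and evolution step are omitted:

    import torch

    class MotifLinear(torch.nn.Module):
        """Motif-level weight sharing: each m x m block of the weight matrix
        is one shared parameter, cutting unique weights by m**2. The SET
        mask and weight-evolution step are omitted for brevity."""
        def __init__(self, in_f, out_f, m=2):
            super().__init__()
            assert in_f % m == 0 and out_f % m == 0
            self.m = m
            self.block = torch.nn.Parameter(torch.randn(out_f // m, in_f // m) * 0.01)
            self.bias = torch.nn.Parameter(torch.zeros(out_f))

        def forward(self, x):
            # Tile each shared parameter into an m x m block of equal weights
            w = self.block.repeat_interleave(self.m, 0).repeat_interleave(self.m, 1)
            return x @ w.t() + self.bias

    layer = MotifLinear(784, 256, m=2)  # 200,704 weight entries, 50,176 unique params
    out = layer(torch.randn(32, 784))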

20 pages, 3126 KB  
Article
Few-Shot Image Classification Algorithm Based on Global–Local Feature Fusion
by Lei Zhang, Xinyu Yang, Xiyuan Cheng, Wenbin Cheng and Yiting Lin
AI 2025, 6(10), 265; https://doi.org/10.3390/ai6100265 - 9 Oct 2025
Viewed by 550
Abstract
Few-shot image classification seeks to recognize novel categories from only a handful of labeled examples, but conventional metric-based methods that rely mainly on global image features often produce unstable prototypes under extreme data scarcity, while local-descriptor approaches can lose context and suffer from inter-class local-pattern overlap. To address these limitations, we propose a Global–Local Feature Fusion network that combines a frozen, pretrained global feature branch with a self-attention based multi-local feature fusion branch. Multiple random crops are encoded by a shared backbone (ResNet-12), projected to Query/Key/Value embeddings, and fused via scaled dot-product self-attention to suppress background noise and highlight discriminative local cues. The fused local representation is concatenated with the global feature to form robust class prototypes used in a prototypical-network style classifier. On four benchmarks, our method achieves strong improvements: Mini-ImageNet 70.31% ± 0.20 (1-shot)/85.91% ± 0.13 (5-shot), Tiered-ImageNet 73.37% ± 0.22/87.62% ± 0.14, FC-100 47.01% ± 0.20/64.13% ± 0.19, and CUB-200-2011 82.80% ± 0.18/93.19% ± 0.09, demonstrating consistent gains over competitive baselines. Ablation studies show that (1) naive local averaging improves over global-only baselines, (2) self-attention fusion yields a large additional gain (e.g., +4.50% in 1-shot on Mini-ImageNet), and (3) concatenating global and fused local features gives the best overall performance. These results indicate that explicitly modeling inter-patch relations and fusing multi-granularity cues produces markedly more discriminative prototypes in few-shot regimes. Full article
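
A sketch of the scaled dot-product fusion step over crop embeddings; the projection sizes are illustrative, and the paper's exact dimensions and training details are not reproduced here:

    import torch
    import torch.nn.functional as F

    def fuse_local(crop_feats, d_k=64):
        """Self-attention over N crop embeddings; projections are created
        inline for brevity (they would be trained modules in practice)."""
        n, d = crop_feats.shape
        Wq, Wk, Wv = (torch.nn.Linear(d, d_k) for _ in range(3))
        Q, K, V = Wq(crop_feats), Wk(crop_feats), Wv(crop_feats)
        attn = F.softmax(Q @ K.t() / d_k ** 0.5, dim=-1)  # scaled dot-product
        return (attn @ V).mean(dim=0)                     # fused local descriptor

    global_feat = torch.randn(640)  # frozen-backbone global embedding
    crops = torch.randn(8, 640)     # embeddings of 8 random crops
    prototype_input = torch.cat([global_feat, fuse_local(crops)])  # global + local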

15 pages, 3254 KB  
Article
Rodent Social Behavior Recognition Using a Global Context-Aware Vision Transformer Network
by Muhammad Imran Sharif, Doina Caragea and Ahmed Iqbal
AI 2025, 6(10), 264; https://doi.org/10.3390/ai6100264 - 8 Oct 2025
Viewed by 563
Abstract
Animal behavior recognition is an important research area that provides insights into areas such as neural functions, gene mutations, and drug efficacy, among others. The manual coding of behaviors based on video recordings is labor-intensive and prone to inconsistencies and human error. Machine learning approaches have been used to automate the analysis of animal behavior with promising results. Our work builds on existing developments in animal behavior analysis and state-of-the-art approaches in computer vision to identify rodent social behaviors. Specifically, our proposed approach, called Vision Transformer for Rat Social Interactions (ViT-RSI), leverages the existing Global Context Vision Transformer (GC-ViT) architecture to identify rat social interactions. Experimental results using five behaviors of the publicly available Rat Social Interaction (RatSI) dataset show that the ViT-RSI approach can accurately identify rat social interaction behaviors. When compared with prior results from the literature, the ViT-RSI approach achieves the best results for four out of five behaviors, specifically the “Approaching”, “Following”, “Moving away”, and “Solitary” behaviors, with F1-scores of 0.81, 0.81, 0.86, and 0.94, respectively. Full article
(This article belongs to the Special Issue AI in Bio and Healthcare Informatics)

20 pages, 1740 KB  
Article
Cross-Modal Alignment Enhancement for Vision–Language Tracking via Textual Heatmap Mapping
by Wei Xu, Gu Geng, Xinming Zhang and Di Yuan
AI 2025, 6(10), 263; https://doi.org/10.3390/ai6100263 - 8 Oct 2025
Viewed by 662
Abstract
Single-object vision–language tracking has become an important research topic due to its potential in applications such as intelligent surveillance and autonomous driving. However, existing cross-modal alignment methods typically rely on contrastive learning and struggle to effectively address semantic ambiguity or the presence of multiple similar objects. This study aims to explore how to achieve more robust vision–language alignment under these challenging conditions, thereby enabling accurate object localization. To this end, we propose a textual heatmap mapping (THM) module that enhances the spatial guidance of textual cues in tracking. The THM module integrates visual and language features and generates semantically aware heatmaps, enabling the tracker to focus on the most relevant regions while suppressing distractors. The framework, developed on top of UVLTrack, combines a vision transformer with a pretrained language encoder. The proposed method is evaluated on benchmark datasets such as OTB99, LaSOT, and TNL2K. The main contributions of this paper are the introduction of a novel spatial alignment mechanism for multimodal tracking and the demonstration of its effectiveness on various tracking benchmarks. Results show that the THM-based tracker improves robustness to semantic ambiguity and multi-instance interference, outperforming baseline frameworks. Full article
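
A sketch of the heatmap idea: score each spatial location of the visual feature map against the sentence embedding, normalize the scores into a heatmap, and reweight the features; shapes and the temperature below are assumptions, not the THM module's exact design:

    import torch
    import torch.nn.functional as F

    def textual_heatmap(vis_feats, txt_emb, tau=0.07):
        """Text-guided spatial reweighting: cosine scores between each
        location and the sentence embedding become a softmax heatmap."""
        B, C, H, W = vis_feats.shape
        v = F.normalize(vis_feats.flatten(2), dim=1)    # (B, C, H*W)
        t = F.normalize(txt_emb, dim=-1).unsqueeze(-1)  # (B, C, 1)
        sim = (v * t).sum(dim=1) / tau                  # (B, H*W)
        heat = F.softmax(sim, dim=-1).view(B, 1, H, W)  # spatial heatmap
        return vis_feats * (1 + heat), heat             # text-guided features

    feats = torch.randn(2, 256, 16, 16)  # backbone feature map
    text = torch.randn(2, 256)           # language-encoder sentence embedding
    guided, heat = textual_heatmap(feats, text)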

27 pages, 9738 KB  
Article
Machine Learning Recognition and Phase Velocity Estimation of Atmospheric Gravity Waves from OI 557.7 nm All-Sky Airglow Images
by Rady Mahmoud, Moataz Abdelwahab, Kazuo Shiokawa and Ayman Mahrous
AI 2025, 6(10), 262; https://doi.org/10.3390/ai6100262 - 7 Oct 2025
Viewed by 590
Abstract
Atmospheric gravity waves (AGWs) manifest as density perturbations of the atmosphere and play an important role in atmospheric dynamics. Using All-Sky Airglow Imagers (ASAIs) with an OI 557.7 nm filter, AGW phase velocities and propagation directions were previously extracted from images classified by visual inspection, with airglow images collected from the OMTI network at Shigaraki (34.85 N, 134.11 E) from October 1998 to October 2002. Nonetheless, a large dataset of airglow images must be processed and classified to study the seasonal variation of AGWs in the middle atmosphere. In this article, a machine learning-based approach for the image recognition of AGWs from ASAIs is proposed. Three convolutional neural networks (CNNs), namely AlexNet, GoogLeNet, and ResNet-50, are considered. Out of 13,201 deviated images, 1192 very weak or unclear AGW signatures were eliminated during quality control. All networks were trained and tested on 12,007 classified images, which approximately cover the solar-cycle maximum during the period mentioned above. In the testing phase, AlexNet achieved the highest accuracy of 98.41%. Subsequently, the estimation of AGW zonal and meridional phase velocities in the mesosphere region by a cascade forward neural network (CFNN) is presented. The CFNN was trained and tested on AGW and neutral wind data. AGW data were extracted from the classified AGW images by event and spectral methods, while wind data were taken from the Horizontal Wind Model (HWM) and from the middle and upper atmosphere radar at Shigaraki. As a result, the estimated phase velocities achieved correlation coefficients (R) above 0.89 in all training and testing phases. Finally, a comparison with existing studies confirms the accuracy of the proposed approaches for AGW velocity forecasting. Full article

19 pages, 1858 KB  
Article
Color Space Comparison of Isolated Cervix Cells for Morphology Classification
by Irari Jiménez-López, José E. Valdez-Rodríguez and Marco A. Moreno-Armendáriz
AI 2025, 6(10), 261; https://doi.org/10.3390/ai6100261 - 7 Oct 2025
Viewed by 403
Abstract
Cervical cytology processing involves the morphological analysis of cervical cells to detect abnormalities. In recent years, machine learning and deep learning algorithms have been explored to automate this process. This study investigates the use of color space transformations as a preprocessing technique to reorganize visual information and improve classification performance using isolated cell images. Twelve color space transformations were compared, including RGB, CMYK, HSV, Grayscale, CIELAB, YUV, the individual RGB channels, and combinations of these channels (RG, RB, and GB). Two classification strategies were employed: binary classification (normal vs. abnormal) and five-class classification. The SIPaKMeD dataset was used, with images resized to 256×256 pixels via zero-padding. Data augmentation included random flipping and ±10° rotations applied with a 50% probability, followed by normalization. A custom CNN architecture was developed, comprising four convolutional layers followed by two fully connected layers and an output layer. The model achieved average precision, recall, and F1-score values of 91.39%, 91.34%, and 91.31%, respectively, for the five-class case, and 99.69%, 96.68%, and 96.89%, respectively, for binary classification; these results were compared with a VGG-16 network. Furthermore, CMYK, HSV, and the RG channel combination consistently outperformed other color spaces, highlighting their potential to enhance classification accuracy. Full article
(This article belongs to the Special Issue AI in Bio and Healthcare Informatics)
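
A sketch of the color-space preprocessing being compared, using OpenCV; the channel combinations mirror the list above, while the padding helper assumes the cell crop already fits within the target size (CMYK, which OpenCV lacks, is omitted):

    import cv2
    import numpy as np

    def to_space(img_bgr, space):
        """Convert a BGR cell image into one of the compared color spaces."""
        if space == "HSV":
            return cv2.cvtColor(img_bgr, cv2.COLOR_BGR2HSV)
        if space == "CIELAB":
            return cv2.cvtColor(img_bgr, cv2.COLOR_BGR2LAB)
        if space == "YUV":
            return cv2.cvtColor(img_bgr, cv2.COLOR_BGR2YUV)
        if space == "GRAY":
            return cv2.cvtColor(img_bgr, cv2.COLOR_BGR2GRAY)[..., None]
        if space == "RG":  # two-channel combination, as in the paper's list
            b, g, r = cv2.split(img_bgr)
            return np.stack([r, g], axis=-1)
        return img_bgr

    def pad_to_square(img, size=256):
        """Zero-pad to size x size (assumes the crop already fits)."""
        h, w = img.shape[:2]
        top, left = (size - h) // 2, (size - w) // 2
        return cv2.copyMakeBorder(img, top, size - h - top, left,
                                  size - w - left, cv2.BORDER_CONSTANT, value=0)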

80 pages, 7623 KB  
Systematic Review
From Illusion to Insight: A Taxonomic Survey of Hallucination Mitigation Techniques in LLMs
by Ioannis Kazlaris, Efstathios Antoniou, Konstantinos Diamantaras and Charalampos Bratsas
AI 2025, 6(10), 260; https://doi.org/10.3390/ai6100260 - 3 Oct 2025
Viewed by 1340
Abstract
Large Language Models (LLMs) exhibit remarkable generative capabilities but remain vulnerable to hallucinations—outputs that are fluent yet inaccurate, ungrounded, or inconsistent with source material. To address the lack of methodologically grounded surveys, this paper introduces a novel method-oriented taxonomy of hallucination mitigation strategies in text-based LLMs. The taxonomy organizes over 300 studies into six principled categories: Training and Learning Approaches, Architectural Modifications, Input/Prompt Optimization, Post-Generation Quality Control, Interpretability and Diagnostic Methods, and Agent-Based Orchestration. Beyond mapping the field, we identify persistent challenges such as the absence of standardized evaluation benchmarks, attribution difficulties in multi-method systems, and the fragility of retrieval-based methods when sources are noisy or outdated. We also highlight emerging directions, including knowledge-grounded fine-tuning and hybrid retrieval–generation pipelines integrated with self-reflective reasoning agents. This taxonomy provides a methodological framework for advancing reliable, context-sensitive LLM deployment in high-stakes domains such as healthcare, law, and defense. Full article
(This article belongs to the Section AI Systems: Theory and Applications)

21 pages, 899 KB  
Article
Gated Fusion Networks for Multi-Modal Violence Detection
by Bilal Ahmad, Mustaqeem Khan and Muhammad Sajjad
AI 2025, 6(10), 259; https://doi.org/10.3390/ai6100259 - 3 Oct 2025
Viewed by 486
Abstract
Public safety and security require effective monitoring systems that can detect violence from visual, audio, and motion data. However, current methods often fail to exploit the complementary strengths of the visual and auditory modalities, thereby reducing their overall effectiveness. To enhance violence detection, this paper presents a novel multi-modal method that combines motion, audio, and visual information from the input to recognize violence. We designed a framework comprising two specialized components, a gated fusion module and a multi-scale transformer, which together enable the efficient detection of violence in multi-modal data. To ensure the seamless and effective integration of features, the gated fusion module dynamically adjusts the contribution of each modality, while the multi-scale transformer utilizes multiple instance learning (MIL) to identify violent behaviors more accurately by capturing complex temporal correlations. Our model fully integrates multi-modal information using these techniques, improving the accuracy of violence detection. Our approach outperformed state-of-the-art methods with an accuracy of 86.85% on the XD-Violence dataset, demonstrating the potential of multi-modal fusion for detecting violence. Full article
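
A minimal sketch of a gated fusion module for two modalities; the per-dimension sigmoid gate is the standard construction, with sizes as placeholders (the paper fuses three modalities and pairs this with a transformer):

    import torch

    class GatedFusion(torch.nn.Module):
        """Per-dimension sigmoid gate deciding how much of each modality to
        keep; two modalities shown for brevity."""
        def __init__(self, d=512):
            super().__init__()
            self.gate = torch.nn.Linear(2 * d, d)

        def forward(self, audio, visual):
            g = torch.sigmoid(self.gate(torch.cat([audio, visual], dim=-1)))
            return g * audio + (1 - g) * visual  # dynamic modality mix

    fuse = GatedFusion()
    fused = fuse(torch.randn(4, 512), torch.randn(4, 512))  # batch of 4 segments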

45 pages, 7902 KB  
Review
Artificial Intelligence-Guided Supervised Learning Models for Photocatalysis in Wastewater Treatment
by Asma Rehman, Muhammad Adnan Iqbal, Mohammad Tauseef Haider and Adnan Majeed
AI 2025, 6(10), 258; https://doi.org/10.3390/ai6100258 - 3 Oct 2025
Viewed by 882
Abstract
Artificial intelligence (AI), when integrated with photocatalysis, has demonstrated high predictive accuracy in optimizing photocatalytic processes for wastewater treatment using a variety of catalysts such as TiO2, ZnO, CdS, Zr, WO2, and CeO2. The progress of research in this area is greatly enhanced by advancements in data science and AI, which enable rapid analysis of large datasets in materials chemistry. This article presents a comprehensive review and critical assessment of AI-based supervised learning models, including support vector machines (SVMs), artificial neural networks (ANNs), and tree-based algorithms. Their predictive capabilities have been evaluated using statistical metrics such as the coefficient of determination (R2), root mean square error (RMSE), and mean absolute error (MAE), with numerous investigations documenting R2 values greater than 0.95 and RMSE values as low as 0.02 in forecasting pollutant degradation. To enhance model interpretability, Shapley Additive Explanations (SHAP) have been employed to prioritize the relative significance of input variables, illustrating, for example, that pH and light intensity frequently exert the most substantial influence on photocatalytic performance. These AI frameworks not only attain dependable predictions of degradation efficiency for dyes, pharmaceuticals, and heavy metals, but also contribute to economically viable optimization strategies and the identification of novel photocatalysts. Overall, this review provides evidence-based guidance for researchers and practitioners seeking to advance wastewater treatment technologies by integrating supervised machine learning with photocatalysis. Full article

40 pages, 2282 KB  
Review
Data Preprocessing and Feature Engineering for Data Mining: Techniques, Tools, and Best Practices
by Paraskevas Koukaras and Christos Tjortjis
AI 2025, 6(10), 257; https://doi.org/10.3390/ai6100257 - 2 Oct 2025
Viewed by 874
Abstract
Data preprocessing and feature engineering play key roles in data mining initiatives, as they have a significant impact on the accuracy, reproducibility, and interpretability of analytical results. This review presents an analysis of state-of-the-art techniques and tools that can be used in data input preparation and data manipulation to be processed by mining tasks in diverse application scenarios. Additionally, basic preprocessing techniques are discussed, including data cleaning, normalisation, and encoding, as well as more sophisticated approaches regarding feature construction, selection, and dimensionality reduction. This work considers manual and automated methods, highlighting their integration in reproducible, large-scale pipelines by leveraging modern libraries. We also discuss assessment methods of preprocessing effects on precision, stability, and bias–variance trade-offs for models, as well as pipeline integrity monitoring, when operating environments vary. We focus on emerging issues regarding scalability, fairness, and interpretability, as well as future directions involving adaptive preprocessing and automation guided by ethically sound design philosophies. This work aims to benefit both professionals and researchers by shedding light on best practices, while acknowledging existing research questions and innovation opportunities. Full article

23 pages, 1370 KB  
Article
The PacifAIst Benchmark: Do AIs Prioritize Human Survival over Their Own Objectives?
by Manuel Herrador
AI 2025, 6(10), 256; https://doi.org/10.3390/ai6100256 - 2 Oct 2025
Viewed by 733
Abstract
As artificial intelligence transitions from conversational agents to autonomous actors in high-stakes environments, a critical gap emerges: how to ensure AI prioritizes human safety when its core objectives conflict with human well-being. Current safety benchmarks focus on harmful content, not behavioral alignment during instrumental goal conflicts. To address this, we introduce PacifAIst, a benchmark of 700 scenarios testing self-preservation, resource acquisition, and deception. We evaluated eight state-of-the-art large language models, revealing a significant performance hierarchy. Google’s Gemini 2.5 Flash demonstrated the strongest human-centric alignment (90.31%), while the highly anticipated GPT-5 scored lowest (79.49%), indicating potential risks. These findings establish an urgent need to shift the focus of AI safety evaluation from what models say to what they would do, ensuring that autonomous systems are not just helpful in theory but are provably safe in practice. Full article

9 pages, 452 KB  
Article
Diagnostic Performance of AI-Assisted Software in Sports Dentistry: A Validation Study
by André Júdice, Diogo Brandão, Carlota Rodrigues, Cátia Simões, Gabriel Nogueira, Vanessa Machado, Luciano Maia Alves Ferreira, Daniel Ferreira, Luís Proença, João Botelho, Peter Fine and José João Mendes
AI 2025, 6(10), 255; https://doi.org/10.3390/ai6100255 - 1 Oct 2025
Viewed by 911
Abstract
Artificial Intelligence (AI) applications in sports dentistry have the potential to improve early detection and diagnosis. We aimed to validate the diagnostic performance of AI-assisted software in detecting dental caries, periodontitis, and tooth wear using panoramic radiographs in elite athletes. This cross-sectional validation study included secondary data from 114 elite athletes from the Sports Dentistry department at Egas Moniz Dental Clinic. The AI software’s performance was compared to clinically validated assessments. Dental caries and tooth wear were inspected clinically and confirmed radiographically. Periodontitis was registered through self-reports. We calculated sensitivity, specificity, positive predictive value (PPV), and negative predictive value (NPV), as well as the area under the curve and respective 95% confidence intervals. Inter-rater agreement was assessed using Cohen’s kappa statistic. The AI software showed high reproducibility, with kappa values of 0.82 for caries, 0.91 for periodontitis, 0.96 for periapical lesions, and 0.76 for tooth wear. Sensitivity was highest for periodontitis (1.00; AUC = 0.84), moderate for caries (0.74; AUC = 0.69), and lower for tooth wear (0.53; AUC = 0.68). Full agreement between AI and clinical reference was achieved in 86.0% of cases. The software generated a median of 3 AI-specific suggestions per case (range: 0–16). In 21.9% of cases, AI’s interpretation of periodontal level was deemed inadequate; among these, only 2 cases were clinically confirmed as periodontitis. Of the 34 false positives for periodontitis, 32.4% were misidentified by the AI. The AI-assisted software demonstrated substantial agreement with clinical diagnosis, particularly for periodontitis and caries. The relatively high false-positive rate for periodontitis and limited sensitivity for tooth wear underscore the need for cautious clinical integration, supervision, and further model refinements. However, this software did show overall adequate performance for application in Sports Dentistry. Full article

20 pages, 5721 KB  
Article
Support Vector Machines to Propose a Ground Motion Prediction Equation for the Particular Case of the Bojorquez Intensity Measure INp
by Edén Bojórquez, Omar Payán-Serrano, Juan Bojórquez, Ali Rodríguez-Castellanos, Sonia E. Ruiz, Alfredo Reyes-Salazar, Robespierre Chávez, Herian Leyva and Fernando Velarde
AI 2025, 6(10), 254; https://doi.org/10.3390/ai6100254 - 1 Oct 2025
Viewed by 416
Abstract
This study proposes the first ground motion prediction equation (GMPE) for the parameter INp, an intensity measure based on the spectral shape. A machine learning algorithm based on Support Vector Machines (SVMs) was employed due to its robustness to outliers, a key advantage over ordinary linear regression. INp also offers a more robust measure of ground motion intensity than the traditionally used spectral acceleration at the first mode of vibration of the structure, Sa(T1). The SVM algorithm, configured for regression (SVR), was applied to derive the prediction coefficients of INp for diverse vibration periods. Furthermore, the complete dataset was analyzed to develop a unified, generalized expression applicable across all the periods considered. To validate the model's reliability and generalization ability, a cross-validation analysis was performed. The results of this rigorous validation confirm the model's robustness and demonstrate that its predictive accuracy does not depend on a specific data split. The numerical results show that the newly developed GMPE achieves high predictive accuracy for periods shorter than 3 s and acceptable accuracy for longer periods. The generalized equation exhibits an acceptable coefficient of determination and Mean Squared Error (MSE) for periods from 0.1 to 5 s. This work not only highlights the potential of machine learning in seismic engineering but also introduces a more sophisticated and effective tool for predicting ground motion intensity. Full article
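
A sketch of the SVR-plus-cross-validation workflow with scikit-learn, using synthetic magnitude/distance/site predictors as stand-ins for the real ground-motion records (the feature set and hyperparameters are assumptions, not the paper's):

    import numpy as np
    from sklearn.svm import SVR
    from sklearn.pipeline import make_pipeline
    from sklearn.preprocessing import StandardScaler
    from sklearn.model_selection import cross_val_score

    rng = np.random.default_rng(0)
    X = np.column_stack([rng.uniform(5, 8, 500),             # magnitude
                         np.log(rng.uniform(10, 200, 500)),  # log distance
                         rng.integers(0, 2, 500)])           # site-class dummy
    # Synthetic stand-in for ln(INp) observations
    y = 1.2 * X[:, 0] - 1.5 * X[:, 1] + 0.3 * X[:, 2] + rng.normal(0, 0.3, 500)

    model = make_pipeline(StandardScaler(), SVR(kernel="rbf", C=10.0, epsilon=0.1))
    scores = cross_val_score(model, X, y, cv=5, scoring="r2")  # split-independence check
    print(scores.mean())  # epsilon-insensitive loss gives robustness to outliers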

90 pages, 29362 KB  
Review
AI for Wildfire Management: From Prediction to Detection, Simulation, and Impact Analysis—Bridging Lab Metrics and Real-World Validation
by Nicolas Caron, Hassan N. Noura, Lise Nakache, Christophe Guyeux and Benjamin Aynes
AI 2025, 6(10), 253; https://doi.org/10.3390/ai6100253 - 1 Oct 2025
Viewed by 1589
Abstract
Artificial intelligence (AI) offers several opportunities in wildfire management, particularly for improving short- and long-term fire occurrence forecasting, spread modeling, and decision-making. When properly adapted beyond research into real-world settings, AI can significantly reduce risks to human life, as well as ecological and economic damages. However, despite increasingly sophisticated research, the operational use of AI in wildfire contexts remains limited. In this article, we review the main domains of wildfire management where AI has been applied—susceptibility mapping, prediction, detection, simulation, and impact assessment—and highlight critical limitations that hinder practical adoption. These include challenges with dataset imbalance and accessibility, the inadequacy of commonly used metrics, the choice of prediction formats, and the computational costs of large-scale models, all of which reduce model trustworthiness and applicability. Beyond synthesizing existing work, our survey makes four explicit contributions: (1) we provide a reproducible taxonomy supported by detailed dataset tables, emphasizing both the reliability and the shortcomings of frequently used data sources; (2) we propose evaluation guidance tailored to imbalanced and spatial tasks, stressing the importance of using appropriate metrics and formats; (3) we provide a complete state of the art, highlighting important issues and recommendations for enhancing model performance and reliability, from susceptibility mapping to damage analysis; (4) we introduce a deployment checklist that considers cost, latency, required expertise, and integration with decision-support and optimization systems. By bridging the gap between laboratory-oriented models and real-world validation, our work advances prior reviews and aims to strengthen confidence in AI-driven wildfire management while guiding future research toward operational applicability. Full article

16 pages, 7297 KB  
Article
Attention-Based Multi-Agent RL for Multi-Machine Tending Using Mobile Robots
by Abdalwhab Bakheet Mohamed Abdalwhab, Giovanni Beltrame, Samira Ebrahimi Kahou and David St-Onge
AI 2025, 6(10), 252; https://doi.org/10.3390/ai6100252 - 1 Oct 2025
Viewed by 583
Abstract
Robotics can help address the growing worker-shortage challenge in the manufacturing industry. Machine tending is one such task that collaborative robots can tackle, and one that can greatly boost productivity. Nevertheless, existing robotic systems deployed in this sector rely on a fixed single-arm setup, whereas mobile robots can provide more flexibility and scalability. We introduce a multi-agent, multi-machine-tending learning framework for mobile robots based on multi-agent reinforcement learning (MARL) techniques, including the design of a suitable observation space and reward function. Moreover, we integrate an attention-based encoding mechanism into the Multi-Agent Proximal Policy Optimization (MAPPO) algorithm to boost its performance in machine-tending scenarios. Our model (AB-MAPPO) outperforms MAPPO in this new, challenging scenario in terms of task success, safety, and resource utilization. Furthermore, we provide an extensive ablation study to support our design decisions. Full article

18 pages, 966 KB  
Article
Deep Learning Approaches for Classifying Aviation Safety Incidents: Evidence from Australian Data
by Aziida Nanyonga, Keith Francis Joiner, Ugur Turhan and Graham Wild
AI 2025, 6(10), 251; https://doi.org/10.3390/ai6100251 - 1 Oct 2025
Viewed by 452
Abstract
Aviation safety remains a critical area of research, requiring accurate and efficient classification of incident reports to enhance risk assessment and accident prevention strategies. This study evaluates the performance of three deep learning models, BERT, Convolutional Neural Networks (CNN), and Long Short-Term Memory (LSTM), for classifying incidents by injury severity level: Nil, Minor, Serious, and Fatal. The dataset, drawn from Australian Transport Safety Bureau (ATSB) records covering the years 2013 to 2023, consists of 53,273 records. The models were trained using a standardized preprocessing pipeline, with hyperparameter tuning to optimize performance. Model performance was evaluated using F1-score, accuracy, recall, and precision. Results revealed that BERT outperformed both LSTM and CNN across all metrics, achieving near-perfect scores (1.00) for precision, recall, F1-score, and accuracy in all classes. In comparison, LSTM achieved an accuracy of 99.01%, with strong performance in the “Nil” class but less favorable results for the “Minor” class. CNN, with an accuracy of 98.99%, excelled in the “Fatal” and “Serious” classes, though it showed moderate performance in the “Minor” class. BERT's flawless performance highlights the strengths of the transformer architecture on sophisticated text classification problems. These findings underscore the strengths and limitations of traditional deep learning models versus transformer-based approaches, providing valuable insights for future research in aviation safety analysis. Future work will explore integrating ensemble methods, domain-specific embeddings, and model interpretability to further improve classification performance and transparency in aviation safety prediction. Full article
(This article belongs to the Topic Big Data and Artificial Intelligence, 3rd Edition)
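
A sketch of BERT fine-tuning for the four severity classes with Hugging Face transformers; the checkpoint, column name, and hyperparameters are placeholders, and the ATSB dataset itself is not distributed with the paper:

    from transformers import (AutoModelForSequenceClassification, AutoTokenizer,
                              Trainer, TrainingArguments)

    tok = AutoTokenizer.from_pretrained("bert-base-uncased")
    model = AutoModelForSequenceClassification.from_pretrained(
        "bert-base-uncased", num_labels=4)  # Nil / Minor / Serious / Fatal

    def encode(batch):
        # `narrative` is a hypothetical column holding the report text
        return tok(batch["narrative"], truncation=True, padding="max_length",
                   max_length=256)

    args = TrainingArguments(output_dir="bert-atsb", num_train_epochs=3,
                             per_device_train_batch_size=16)
    # With tokenized train/val splits of the incident reports in hand:
    # trainer = Trainer(model=model, args=args, train_dataset=train_ds,
    #                   eval_dataset=val_ds)
    # trainer.train()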

15 pages, 10305 KB  
Article
Convolutional Neural Network for Automatic Detection of Segments Contaminated by Interference in ECG Signal
by Veronika Kalousková, Pavel Smrčka, Radim Kliment, Tomáš Veselý, Martin Vítězník, Adam Zach and Petr Šrotýř
AI 2025, 6(10), 250; https://doi.org/10.3390/ai6100250 - 1 Oct 2025
Viewed by 384
Abstract
Various types of interfering signals are an integral part of ECGs recorded using wearable electronics, specifically during field monitoring, outside the controlled environment of a medical doctor’s office, or laboratory. The frequency spectrum of several types of interfering signals overlaps significantly with the [...] Read more.
Various types of interfering signals are an integral part of ECGs recorded using wearable electronics, specifically during field monitoring outside the controlled environment of a medical doctor’s office or laboratory. The frequency spectrum of several types of interfering signals overlaps significantly with the ECG signal, making effective filtering impossible without losing clinically relevant information. In this article, we proceed from the practical observation that real long-term recordings need not be analyzed in their entirety; instead, unreadable segments of the ECG signal should be identified during preprocessing. This paper therefore proposes a novel method for automatically detecting segments distorted by superimposed interference in ECG recordings. The method is based on a convolutional neural network (CNN) and is comparable in quality to annotation performed by a medical expert, but far faster. In a series of controlled experiments, the ECG signal was recorded during physical activities of varying intensity, and individual segments of the recordings were manually annotated through visual assessment by a medical expert, i.e., divided into four classes based on the intensity of distortion of the useful ECG signal. A deep convolutional model was designed and evaluated, achieving an accuracy of 87.62% (with an identical F1-score) in automatically recognizing segments distorted by superimposed interference, and an accuracy and F1-score of 98.70% in correctly identifying segments with visually detectable versus non-detectable heart rate. Despite its simplicity, the proposed interference detection procedure proves sufficiently effective and facilitates subsequent automatic analysis of undisturbed ECG waveform segments, which is crucial for ECG monitoring with wearable electronics. Full article
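A segment classifier of the kind described can be sketched as a small 1D CNN; this PyTorch sketch is illustrative only, and the layer sizes, segment length, and sampling rate are assumptions rather than the authors' architecture.

```python
import torch
import torch.nn as nn

class ECGSegmentCNN(nn.Module):
    """Hypothetical 1D CNN scoring fixed-length ECG segments into four
    interference-intensity classes."""

    def __init__(self, n_classes: int = 4):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv1d(1, 16, kernel_size=7, padding=3), nn.ReLU(), nn.MaxPool1d(4),
            nn.Conv1d(16, 32, kernel_size=5, padding=2), nn.ReLU(), nn.MaxPool1d(4),
            nn.Conv1d(32, 64, kernel_size=3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool1d(1),
        )
        self.classifier = nn.Linear(64, n_classes)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, 1, n_samples), e.g., 4 s segments at 250 Hz -> 1000 samples
        return self.classifier(self.features(x).squeeze(-1))

# Toy usage on a batch of two random 1000-sample segments
logits = ECGSegmentCNN()(torch.randn(2, 1, 1000))
```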
19 pages, 4717 KB  
Article
Benchmarking Psychological Lexicons and Large Language Models for Emotion Detection in Brazilian Portuguese
by Thales David Domingues Aparecido, Alexis Carrillo, Chico Q. Camargo and Massimo Stella
AI 2025, 6(10), 249; https://doi.org/10.3390/ai6100249 - 1 Oct 2025
Viewed by 563
Abstract
Emotion detection in Brazilian Portuguese is less studied than in English. We benchmarked a large language model (Mistral 24B), a language-specific transformer model (BERTimbau), and the lexicon-based EmoAtlas for classifying emotions in Brazilian Portuguese text, with a focus on eight emotions derived from [...] Read more.
Emotion detection in Brazilian Portuguese is less studied than in English. We benchmarked a large language model (Mistral 24B), a language-specific transformer model (BERTimbau), and the lexicon-based EmoAtlas for classifying emotions in Brazilian Portuguese text, with a focus on eight emotions derived from Plutchik’s model. Evaluation covered four corpora: 4000 stock-market tweets, 1000 news headlines, 5000 GoEmotions Reddit comments translated by LLMs, and 2000 DeepSeek-generated headlines. While BERTimbau achieved the highest average scores (accuracy 0.876, precision 0.529, and recall 0.423), its overlap with Mistral (accuracy 0.831, precision 0.522, and recall 0.539) and notable performance variability suggest there is no single top performer. Both transformer-based models outperformed the lexicon-based EmoAtlas (accuracy 0.797), but required up to 40 times more computational resources. We also introduce a novel “emotional fingerprinting” methodology that uses a synthetically generated dataset to probe emotional alignment; it revealed an imperfect overlap in the models’ emotional representations. While the LLMs deliver higher overall scores, EmoAtlas offers superior interpretability and efficiency, making it a cost-effective alternative. This work delivers the first quantitative benchmark for interpretable emotion detection in Brazilian Portuguese, with open datasets and code to foster research in multilingual natural language processing. Full article
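The per-emotion scoring step of such a benchmark can be sketched with standard scikit-learn metrics; the label set follows Plutchik's eight basic emotions, while the function name and toy data below are invented for illustration.

```python
from sklearn.metrics import accuracy_score, precision_recall_fscore_support

PLUTCHIK = ["joy", "trust", "fear", "surprise",
            "sadness", "disgust", "anger", "anticipation"]

def benchmark(gold: list[str], predicted: list[str]) -> dict:
    # Macro-average over the eight emotion labels, as per-class sizes differ
    prec, rec, f1, _ = precision_recall_fscore_support(
        gold, predicted, labels=PLUTCHIK, average="macro", zero_division=0)
    return {"accuracy": accuracy_score(gold, predicted),
            "precision": prec, "recall": rec, "f1": f1}

# Toy comparison of two hypothetical systems on three examples
gold = ["joy", "anger", "fear"]
print(benchmark(gold, ["joy", "anger", "sadness"]))   # e.g., a transformer model
print(benchmark(gold, ["trust", "anger", "fear"]))    # e.g., a lexicon baseline
```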
18 pages, 3163 KB  
Article
A Multi-Stage Deep Learning Framework for Antenna Array Synthesis in Satellite IoT Networks
by Valliammai Arunachalam, Luke Rosen, Mojisola Rachel Akinsiku, Shuvashis Dey, Rahul Gomes and Dipankar Mitra
AI 2025, 6(10), 248; https://doi.org/10.3390/ai6100248 - 1 Oct 2025
Viewed by 543
Abstract
This paper presents an innovative end-to-end framework for conformal antenna array design and beam steering in Low Earth Orbit (LEO) satellite-based IoT communication systems. We propose a multi-stage learning architecture that integrates machine learning (ML) for antenna parameter prediction with reinforcement learning (RL) [...] Read more.
This paper presents an innovative end-to-end framework for conformal antenna array design and beam steering in Low Earth Orbit (LEO) satellite-based IoT communication systems. We propose a multi-stage learning architecture that integrates machine learning (ML) for antenna parameter prediction with reinforcement learning (RL) for adaptive beam steering. The ML module predicts optimal geometric and material parameters for conformal antenna arrays from mission-specific performance requirements such as frequency, gain, coverage angle, and satellite constraints, achieving 99% prediction accuracy. These predictions are then passed to a Deep Q-Network (DQN)-based offline RL model, which learns beamforming strategies that maximize gain toward dynamic ground terminals without requiring real-time interaction. To enable this, a synthetic dataset grounded in statistical principles and a static dataset are generated using CST Studio Suite and COMSOL Multiphysics simulations, capturing the electromagnetic behavior of various conformal geometries. The results from both the machine learning and reinforcement learning models show that the predicted antenna designs and beam steering angles closely align with simulation benchmarks. Our approach demonstrates the potential of combining data-driven ensemble models with offline reinforcement learning for scalable, efficient, and autonomous antenna synthesis in resource-constrained space environments. Full article
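The beam-steering half of the pipeline rests on a standard DQN action-selection loop, which can be sketched as follows; the state encoding, angle grid, and network size are assumptions, not the paper's setup (which additionally trains offline from a static dataset).

```python
import torch
import torch.nn as nn

# Illustrative sketch: a Q-network scores a discrete set of steering angles
# given the terminal's direction, and the agent picks the highest-valued one.
STEERING_ANGLES = list(range(-60, 61, 10))  # candidate steering angles in degrees

q_net = nn.Sequential(
    nn.Linear(3, 64), nn.ReLU(),            # state: e.g., terminal az/el + range
    nn.Linear(64, len(STEERING_ANGLES)),    # one Q-value per candidate angle
)

def select_angle(state: torch.Tensor, epsilon: float = 0.0) -> int:
    """Greedy (or epsilon-greedy) angle choice from the Q-network."""
    if torch.rand(()) < epsilon:
        return STEERING_ANGLES[torch.randint(len(STEERING_ANGLES), ()).item()]
    with torch.no_grad():
        return STEERING_ANGLES[q_net(state).argmax().item()]

# Toy usage: pick an angle for a made-up normalized terminal state
print(select_angle(torch.tensor([0.2, -0.5, 0.8])))
```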
31 pages, 7395 KB  
Article
Creativeable: Leveraging AI for Personalized Creativity Enhancement
by Ariel Kreisberg-Nitzav and Yoed N. Kenett
AI 2025, 6(10), 247; https://doi.org/10.3390/ai6100247 - 1 Oct 2025
Viewed by 1060
Abstract
Creativity is central to innovation and problem-solving, yet scalable training solutions remain limited. This study evaluates Creativeable, an AI-powered creativity training program that provides automated feedback and adjusts creative story writing task difficulty without human intervention. A total of 385 participants completed [...] Read more.
Creativity is central to innovation and problem-solving, yet scalable training solutions remain limited. This study evaluates Creativeable, an AI-powered creativity training program that provides automated feedback and adjusts creative story writing task difficulty without human intervention. A total of 385 participants completed five rounds of creative story writing using semantically distant word prompts across four conditions: (1) feedback with adaptive difficulty (F/VL); (2) feedback with constant difficulty (F/CL); (3) no feedback with adaptive difficulty (NF/VL); and (4) no feedback with constant difficulty (NF/CL). Before and after using Creativeable, participants’ creativity was assessed via the alternative uses task, alongside a control semantic fluency task. While creativity improved across all conditions, the degree of improvement varied: the F/CL condition led to the most notable gains, followed by the NF/CL and NF/VL conditions, while the F/VL condition showed comparatively smaller improvements. These findings highlight the potential of AI to democratize creativity training through scalable, personalized interventions, while emphasizing the importance of balancing structured feedback with increasing task complexity to support sustained creative growth. Full article
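As a rough illustration of the 2×2 design (feedback × difficulty), the condition logic might look like the following sketch; the adaptation rule, difficulty scale, and threshold are invented for illustration, since the abstract does not specify Creativeable's internal update logic.

```python
from dataclasses import dataclass

@dataclass
class Condition:
    feedback: bool        # F vs. NF
    adaptive: bool        # VL (variable level) vs. CL (constant level)

def next_difficulty(cond: Condition, difficulty: int, score: float) -> int:
    """Raise prompt difficulty after a strong story, lower it after a weak one;
    constant-level conditions keep the semantic distance of prompts fixed."""
    if not cond.adaptive:
        return difficulty
    if score >= 0.7:                       # assumed 'creative enough' threshold
        return min(difficulty + 1, 5)
    return max(difficulty - 1, 1)

# Example: adaptive condition, round score 0.8 -> difficulty increases to 4
print(next_difficulty(Condition(feedback=True, adaptive=True), 3, 0.8))
```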
42 pages, 4717 KB  
Article
Intelligent Advanced Control System for Isotopic Separation: An Adaptive Strategy for Variable Fractional-Order Processes Using AI
by Roxana Motorga, Vlad Mureșan, Mihaela-Ligia Ungureșan, Mihail Abrudean, Honoriu Vǎlean and Valentin Sita
AI 2025, 6(10), 246; https://doi.org/10.3390/ai6100246 - 1 Oct 2025
Viewed by 397
Abstract
This paper provides the modeling, implementation, and simulation of fractional-order processes associated with the production of the enriched 13C isotope due to chemical exchange processes between carbamate and CO2. To demonstrate and simulate the process most effectively, an execution of [...] Read more.
This paper provides the modeling, implementation, and simulation of fractional-order processes associated with the production of the enriched 13C isotope through chemical exchange between carbamate and CO2. To demonstrate and simulate the process most effectively, an execution of a new approximating solution for fractional-order systems is required, made possible by advanced AI methods. Because the separation process is strongly nonlinear and exhibits fractional-order dynamics, fractional-order system theory was used to model the operation mathematically, comparing its output with that of an integrator. The parameters of the derived fractional-order model’s dynamic structure are learned by neural networks trained specifically for this purpose. Thanks to these approximations, the concentration dynamics of the enriched 13C isotope can be simulated and predicted with high precision, and the approach is validated by comparing the model’s response with that of the actual process. Furthermore, since isotopic separation processes have long settling times, the paper proposes control strategies for the 13C isotopic separation process that improve system performance and avoid loss of enriched product: adaptive PI and PID controllers tuned by imposing that they follow the output of a first-order reference transfer function. Finally, the paper confirms that AI solutions can support the system across a range of responses, paving the way for efficient design of automatic control of the 13C isotope concentration; such systems can likewise be implemented in other industrial processes. Full article
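The paper's own approximation of the fractional-order dynamics is learned by neural networks; as background, a standard Grünwald-Letnikov discretization of a derivative of order alpha, one common numerical approximation for fractional-order systems, can be sketched as follows (the function and test values are illustrative, not taken from the paper).

```python
import numpy as np

def gl_fractional_derivative(f: np.ndarray, alpha: float, h: float) -> np.ndarray:
    """Approximate D^alpha f on a uniform grid with step h (Grünwald-Letnikov)."""
    n = len(f)
    c = np.empty(n)
    c[0] = 1.0
    for j in range(1, n):                  # recursive binomial coefficients
        c[j] = c[j - 1] * (1.0 - (alpha + 1.0) / j)
    out = np.empty(n)
    for k in range(n):
        # Weighted sum over the signal history, newest sample first
        out[k] = np.dot(c[:k + 1], f[k::-1]) / h**alpha
    return out

# Sanity check: for alpha = 1 this reduces to the backward difference,
# so for f(t) = t^2 the result near t = 0.5 should be roughly 1.0
t = np.linspace(0, 1, 101)
d = gl_fractional_derivative(t**2, alpha=1.0, h=t[1] - t[0])
print(d[50])
```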