Mach. Learn. Knowl. Extr., Volume 7, Issue 3 (September 2025) – 33 articles

  • Issues are regarded as officially published after their release is announced to the table of contents alert mailing list.
  • You may sign up for e-mail alerts to receive the tables of contents of newly released issues.
  • PDF is the official format for papers, which are published in both HTML and PDF forms. To view a paper in PDF format, click on the "PDF Full-text" link and use the free Adobe Reader to open it.
30 pages, 2240 KB  
Article
Distributionally Robust Bayesian Optimization via Sinkhorn-Based Wasserstein Barycenter
by Iman Seyedi, Antonio Candelieri and Francesco Archetti
Mach. Learn. Knowl. Extr. 2025, 7(3), 90; https://doi.org/10.3390/make7030090 - 28 Aug 2025
Abstract
This paper introduces a novel framework for Distributionally Robust Bayesian Optimization (DRBO) with continuous context that integrates optimal transport theory and entropic regularization. We propose the Sampling from the Wasserstein Barycenter Bayesian Optimization (SWBBO) method to deal with uncertainty about the context, that is, the unknown stochastic component affecting the observations of the black-box objective function. This approach captures the geometric structure of the underlying distributional uncertainty and enables robust acquisition strategies without incurring excessive computational costs. The method incorporates adaptive robustness scheduling, Lipschitz regularization, and efficient barycenter construction to balance exploration and exploitation. Theoretical analysis establishes convergence guarantees for the robust Bayesian Optimization acquisition function. Empirical evaluations on standard global optimization problems and real-life-inspired benchmarks demonstrate that SWBBO consistently achieves faster convergence, low final regret, and greater stability than other recently proposed methods for DRBO with continuous context. Indeed, SWBBO outperforms all of them in terms of both optimization performance and robustness under repeated evaluations. Full article
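As a hedged illustration of the core building block, the sketch below (our construction, not the authors' code) computes an entropic-regularized Wasserstein barycenter of a few empirical context distributions with the POT library; the grid size, cost matrix, and regularization value are illustrative assumptions.

```python
# Illustrative sketch only: Sinkhorn (entropic) Wasserstein barycenter of
# three empirical context distributions, via POT (pip install pot).
# SWBBO itself builds a robust acquisition function on top of such objects.
import numpy as np
import ot  # POT: Python Optimal Transport

rng = np.random.default_rng(0)
grid = np.linspace(0.0, 1.0, 50)          # discretized context space (assumed)
M = ot.dist(grid.reshape(-1, 1))          # squared-Euclidean ground cost
M /= M.max()                              # normalize for numerical stability

def hist(center):
    # a noisy empirical distribution concentrated around `center`
    h = np.exp(-((grid - center) ** 2) / 0.01) + 0.01 * rng.random(grid.size)
    return h / h.sum()

A = np.stack([hist(0.3), hist(0.45), hist(0.6)], axis=1)  # histograms as columns

# Entropic barycenter; `reg` controls the Sinkhorn smoothing strength.
bary = ot.bregman.barycenter(A, M, reg=1e-2, weights=np.ones(3) / 3)
print(bary.shape, round(bary.sum(), 6))   # (50,) and ~1.0: a valid distribution
```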
22 pages, 3476 KB  
Article
AlzheimerRAG: Multimodal Retrieval-Augmented Generation for Clinical Use Cases
by Aritra Kumar Lahiri and Qinmin Vivian Hu
Mach. Learn. Knowl. Extr. 2025, 7(3), 89; https://doi.org/10.3390/make7030089 - 27 Aug 2025
Abstract
Recent advancements in generative AI have fostered the development of highly adept Large Language Models (LLMs) that integrate diverse data types to empower decision-making. Among these, multimodal retrieval-augmented generation (RAG) applications are promising because they combine the strengths of information retrieval and generative models, enhancing their utility across various domains, including clinical use cases. This paper introduces AlzheimerRAG, a multimodal RAG application for clinical use cases, primarily focusing on Alzheimer’s disease case studies from PubMed articles. This application incorporates cross-modal attention fusion techniques to integrate textual and visual data processing by efficiently indexing and accessing vast amounts of biomedical literature. Our experimental results, compared to benchmarks such as BioASQ and PubMedQA, yield improved performance in the retrieval and synthesis of domain-specific information. We also present a case study using our multimodal RAG in various Alzheimer’s clinical scenarios. We infer that AlzheimerRAG can generate responses with accuracy non-inferior to humans and with low rates of hallucination. Full article
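The cross-modal attention fusion named in the abstract can be sketched as text tokens attending over image-patch embeddings. The module below is a generic, hedged illustration in PyTorch; the dimensions and the residual/normalization choices are our assumptions, not the AlzheimerRAG implementation.

```python
# Hedged sketch of cross-modal attention fusion: text queries attend to
# visual patch embeddings. All sizes here are placeholders.
import torch
import torch.nn as nn

class CrossModalFusion(nn.Module):
    def __init__(self, dim=768, heads=8):
        super().__init__()
        self.attn = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.norm = nn.LayerNorm(dim)

    def forward(self, text_tokens, image_patches):
        # Text tokens query the visual patches; the residual connection
        # keeps the original textual signal intact.
        fused, _ = self.attn(query=text_tokens, key=image_patches,
                             value=image_patches)
        return self.norm(text_tokens + fused)

text = torch.randn(2, 128, 768)    # (batch, text tokens, dim)
image = torch.randn(2, 196, 768)   # (batch, image patches, dim)
print(CrossModalFusion()(text, image).shape)  # torch.Size([2, 128, 768])
```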
29 pages, 6541 KB  
Article
A Novel Spatio-Temporal Graph Convolutional Network with Attention Mechanism for PM2.5 Concentration Prediction
by Xin Guan, Xinyue Mo and Huan Li
Mach. Learn. Knowl. Extr. 2025, 7(3), 88; https://doi.org/10.3390/make7030088 - 27 Aug 2025
Abstract
Accurate and high-resolution spatio-temporal prediction of PM2.5 concentrations remains a significant challenge for air pollution early warning and prevention. Advanced artificial intelligence (AI) technologies, however, offer promising solutions to this problem. A spatio-temporal prediction model built upon a seq2seq architecture is designed in this study. The model employs an improved graph convolutional neural network to capture spatially dependent features, integrates time-series information through a gated recurrent unit, and incorporates an attention mechanism to achieve PM2.5 concentration prediction. Benefiting from high-resolution satellite remote sensing data, regional, multi-step, and high-resolution prediction of PM2.5 concentration in Beijing has been performed. To validate the model's performance, ablation experiments are conducted, and the model is compared with other advanced prediction models. The experimental results show that our proposed Spatio-Temporal Graph Convolutional Network with Attention Mechanism (STGCA) outperforms comparison models in multi-step forecasting, achieving a root mean squared error (RMSE), mean absolute error (MAE), and mean absolute percentage error (MAPE) of 4.21, 3.11, and 11.41% for the first step, respectively. The model also shows significant improvements for subsequent steps, with RMSE, MAE, and MAPE values of 5.08, 3.69, and 13.34% for the second step and 6.54, 4.61, and 16.62% for the third step, respectively. Additionally, STGCA achieves index of agreement (IA) values of 0.98, 0.97, and 0.95, as well as Theil's inequality coefficient (TIC) values of 0.06, 0.08, and 0.10, proving its superiority. These results demonstrate that the proposed model offers an efficient technical approach for smart air pollution forecasting and warning in the future. Full article
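To make the architecture concrete, here is a minimal, hedged sketch of the abstract's two spatio-temporal ingredients (a graph convolution for station dependence, then a GRU over time) in PyTorch; the shapes, adjacency normalization, and layer sizes are illustrative assumptions rather than the STGCA design.

```python
# Toy spatio-temporal block: neighborhood mixing via a normalized adjacency,
# then a GRU over the time axis. Not the paper's model; an assumed skeleton.
import torch
import torch.nn as nn

class GCNGRUStep(nn.Module):
    def __init__(self, in_dim, hidden):
        super().__init__()
        self.w = nn.Linear(in_dim, hidden)
        self.gru = nn.GRU(hidden, hidden, batch_first=True)

    def forward(self, x, a_norm):
        # x: (batch, time, nodes, features); a_norm: normalized adjacency.
        b, t, n, f = x.shape
        h = torch.einsum("ij,btjf->btif", a_norm, x)   # spatial mixing
        h = torch.relu(self.w(h))                      # (b, t, n, hidden)
        h = h.permute(0, 2, 1, 3).reshape(b * n, t, -1)
        out, _ = self.gru(h)                           # temporal encoding
        return out[:, -1].reshape(b, n, -1)            # last-step state

a = torch.eye(10)                  # stand-in normalized adjacency (10 stations)
x = torch.randn(4, 12, 10, 3)      # 12 past steps, 10 stations, 3 features
print(GCNGRUStep(3, 32)(x, a).shape)  # torch.Size([4, 10, 32])
```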
33 pages, 2122 KB  
Article
AML4S: An AutoML Pipeline for Data Streams
by Eleftherios Kalaitzidis, Themistoklis Diamantopoulos, Athanasios Michailoudis and Andreas L. Symeonidis
Mach. Learn. Knowl. Extr. 2025, 7(3), 87; https://doi.org/10.3390/make7030087 - 26 Aug 2025
Abstract
The data landscape has changed, as more and more information is produced in the form of continuous data streams instead of stationary datasets. In this context, several online machine learning techniques have been proposed with the aim of automatically adapting to changes in data distributions, known as drifts. Though effective in certain scenarios, contemporary techniques do not generalize well to different types of data, while they also require manual parameter tuning, thus significantly hindering their applicability. Moreover, current methods do not thoroughly address drifts, as they mostly focus on concept drifts (distribution shifts on the target variable) and not on data drifts (changes in feature distributions). To confront these challenges, in this paper, we propose an AutoML Pipeline for Streams (AML4S), which automates the choice of preprocessing techniques, the choice of machine learning models, and the tuning of hyperparameters. Our pipeline further includes a drift detection mechanism that identifies different types of drifts, therefore continuously adapting the underlying models. We assess our pipeline on several real and synthetic data streams, including a data stream that we crafted to focus on data drifts. Our results indicate that AML4S produces robust pipelines and outperforms existing online learning or AutoML algorithms. Full article
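A minimal sketch of the adapt-on-drift loop that such pipelines are built around, using the river library's ADWIN detector and a Hoeffding tree (API names as in recent river releases; AML4S itself also automates preprocessing, model selection, and hyperparameter tuning):

```python
# Illustrative only: detect drift in the running error signal and reset the
# model. The synthetic stream flips its concept halfway through.
import random
from river import drift, tree

def stream(n=2000):
    for i in range(n):
        x = {"f": random.random()}
        y = int(x["f"] > 0.5) if i < n // 2 else int(x["f"] <= 0.5)
        yield x, y

model = tree.HoeffdingTreeClassifier()
detector = drift.ADWIN()
for i, (x, y) in enumerate(stream()):
    y_pred = model.predict_one(x)
    model.learn_one(x, y)
    detector.update(0 if y_pred == y else 1)    # running error signal
    if detector.drift_detected:
        print(f"drift detected near sample {i}; resetting model")
        model = tree.HoeffdingTreeClassifier()  # naive adaptation: reset
```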
27 pages, 33734 KB  
Article
Full Domain Analysis in Fluid Dynamics
by Alexander Hagg, Adam Gaier, Dominik Wilde, Alexander Asteroth, Holger Foysi and Dirk Reith
Mach. Learn. Knowl. Extr. 2025, 7(3), 86; https://doi.org/10.3390/make7030086 - 18 Aug 2025
Abstract
Novel techniques in evolutionary optimization, simulation, and machine learning enable a broad analysis of domains like fluid dynamics, in which computation is expensive and flow behavior is complex. This paper introduces the concept of full domain analysis, defined as the ability to efficiently determine the full space of solutions in a problem domain and analyze the behavior of those solutions in an accessible and interactive manner. The goal of full domain analysis is to deepen our understanding of domains by generating many examples of flow, their diversification, optimization, and analysis. We define a formal model for full domain analysis, its current state of the art, and the requirements of its sub-components. Finally, an example is given to show what can be learned by using full domain analysis. Full domain analysis, rooted in optimization and machine learning, can be a valuable tool in understanding complex systems in computational physics and beyond. Full article
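The idea of efficiently covering a solution space while keeping diverse, high-quality examples is characteristic of quality-diversity search. Purely as an assumed illustration of that flavor of method (not the paper's implementation, whose fitness and descriptors come from fluid-dynamics simulations), a MAP-Elites-style archive can be sketched as:

```python
# Toy quality-diversity loop: keep the best solution found in each behavior
# niche. Fitness, descriptor, and mutation are stand-ins.
import random

def fitness(x):            # quality of a candidate design (toy)
    return -(x[0] ** 2 + x[1] ** 2)

def descriptor(x):         # coarse behavior bin, e.g. flow-regime features
    return (round(x[0], 1), round(x[1], 1))

archive = {}               # one elite per behavior niche
pop = [(random.uniform(-1, 1), random.uniform(-1, 1)) for _ in range(50)]
for _ in range(2000):
    parent = random.choice(pop if not archive else list(archive.values()))
    child = tuple(v + random.gauss(0, 0.1) for v in parent)
    key = descriptor(child)
    if key not in archive or fitness(child) > fitness(archive[key]):
        archive[key] = child   # elitism within each niche

print(f"{len(archive)} niches illuminated")
```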
45 pages, 59922 KB  
Article
Machine Learning Applied to Professional Football: Performance Improvement and Results Prediction
by Diego Moya, Christian Tipantuña, Génesis Villa, Xavier Calderón-Hinojosa, Belén Rivadeneira and Robin Álvarez
Mach. Learn. Knowl. Extr. 2025, 7(3), 85; https://doi.org/10.3390/make7030085 - 14 Aug 2025
Abstract
This paper examines the integration of machine learning (ML) techniques in professional football, focusing on two key areas: (i) player and team performance, and (ii) match outcome prediction. Using a systematic methodology, this study reviews 172 papers from a five-year observation period (2019–2024) to identify relevant applications, focusing on the analysis of game actions (free kicks, passes, and penalties), individual and collective performance, and player position. A predominance of supervised learning, deep learning, and hybrid models (which integrate several ML techniques) is observed in the ML categories. Among the most widely used algorithms are decision trees, extreme gradient boosting, and artificial neural networks, which focus on optimizing sports performance and predicting outcomes. This paper discusses challenges such as the limited availability of public datasets due to access and cost restrictions, the restricted use of advanced visualization tools, and the poor integration of data acquisition devices, such as sensors. However, it also highlights the role of ML in addressing these challenges, thereby representing future research opportunities. Furthermore, this paper includes two illustrative case studies: (i) predicting the date on which Cristiano Ronaldo will reach 1000 career goals, and (ii) predicting the outcome of penalty kicks; these examples demonstrate the practical potential of ML for performance monitoring and tactical decision-making in real-world football environments. Full article
24 pages, 3617 KB  
Article
A Comparison Between Unimodal and Multimodal Segmentation Models for Deep Brain Structures from T1- and T2-Weighted MRI
by Nicola Altini, Erica Lasaracina, Francesca Galeone, Michela Prunella, Vladimiro Suglia, Leonarda Carnimeo, Vito Triggiani, Daniele Ranieri, Gioacchino Brunetti and Vitoantonio Bevilacqua
Mach. Learn. Knowl. Extr. 2025, 7(3), 84; https://doi.org/10.3390/make7030084 - 13 Aug 2025
Abstract
Accurate segmentation of deep brain structures is critical for preoperative planning in neurosurgical procedures such as Deep Brain Stimulation (DBS). Previous research has showcased successful pipelines for segmentation from T1-weighted (T1w) Magnetic Resonance Imaging (MRI) data. Nevertheless, the role of T2-weighted (T2w) MRI data has been underexploited so far. This study proposes and evaluates a fully automated deep learning pipeline based on nnU-Net for the segmentation of eight clinically relevant deep brain structures. A heterogeneous dataset was prepared by gathering 325 paired T1w and T2w MRI scans from eight publicly available sources, annotated by means of an atlas-based registration approach. Three 3D nnU-Net models—unimodal T1w, unimodal T2w, and multimodal (encompassing both T1w and T2w)—were trained and compared using 5-fold cross-validation and a separate test set. The outcomes show that the multimodal model consistently outperforms the T2w unimodal model and achieves performance comparable to the T1w unimodal model. On our dataset, all proposed models significantly exceed the performance of the state-of-the-art DBSegment tool. These findings underscore the value of multimodal MRI in enhancing deep brain segmentation and offer a robust framework for accurate delineation of subcortical targets in both research and clinical settings. Full article
(This article belongs to the Special Issue Deep Learning in Image Analysis and Pattern Recognition, 2nd Edition)
15 pages, 2175 KB  
Article
Thrifty World Models for Applying Machine Learning in the Design of Complex Biosocial–Technical Systems
by Stephen Fox and Vitor Fortes Rey
Mach. Learn. Knowl. Extr. 2025, 7(3), 83; https://doi.org/10.3390/make7030083 - 13 Aug 2025
Abstract
Interactions between human behavior, legal regulations, and monitoring technology in road traffic systems provide an everyday example of complex biosocial–technical systems. In this paper, a study is reported that investigated the potential for a thrifty world model to predict consequences from choices about road traffic system design. Colloquially, the term thrifty means economical. In physics, the term thrifty is related to the principle of least action. Predictions were made with algebraic machine learning, which combines predefined embeddings with ongoing learning from data. The thrifty world model comprises three categories that encompass a total of only eight system design choice options. Results indicate that the thrifty world model is sufficient to encompass biosocial–technical complexity in predictions of where and when it is most likely that accidents will occur. Overall, it is argued that thrifty world models can provide a practical alternative to large photo-realistic world models, which can contribute to explainable artificial intelligence (AI) and to frugal AI. Full article
(This article belongs to the Special Issue Advances in Machine and Deep Learning)
30 pages, 2261 KB  
Article
Multilayer Perceptron Mapping of Subjective Time Duration onto Mental Imagery Vividness and Underlying Brain Dynamics: A Neural Cognitive Modeling Approach
by Matthew Sheculski and Amedeo D’Angiulli
Mach. Learn. Knowl. Extr. 2025, 7(3), 82; https://doi.org/10.3390/make7030082 - 13 Aug 2025
Abstract
According to a recent experimental phenomenology–information processing theory, the sensory strength, or vividness, of visual mental images self-reported by human observers reflects the intensive variation in subjective time duration during the process of generation of said mental imagery. The primary objective of this study was to test the hypothesis that a biologically plausible essential multilayer perceptron (MLP) architecture can validly map the phenomenological categories of subjective time duration onto levels of subjectively self-reported vividness. A secondary objective was to explore whether this type of neural network cognitive modeling approach can give insight into plausible underlying large-scale brain dynamics. To achieve these objectives, vividness self-reports and reaction times from a previously collected database were reanalyzed using multilayered perceptron network models. The input layer consisted of six levels representing vividness self-reports and a reaction time cofactor. A single hidden layer consisted of three nodes representing the salience, task positive, and default mode networks. The output layer consisted of five levels representing Vittorio Benussi’s subjective time categories. Across different models of networks, Benussi’s subjective time categories (Level 1 = very brief, 2 = brief, 3 = present, 4 = long, 5 = very long) were predicted by visual imagery vividness level 1 (=no image) to 5 (=very vivid) with over 90% success in classification accuracy, precision, recall, and F1-score. This accuracy level was maintained after 5-fold cross validation. Linear regressions, Welch’s t-test for independent coefficients, and Pearson’s correlation analysis were applied to the resulting hidden node weight vectors, obtaining evidence for strong correlation and anticorrelation between nodes. This study successfully mapped Benussi’s five levels of subjective time categories onto the activation patterns of a simple MLP, providing a novel computational framework for experimental phenomenology. Our results revealed structured, complex dynamics between the task positive network (TPN), the default mode network (DMN), and the salience network (SN), suggesting that the neural mechanisms underlying temporal consciousness involve flexible network interactions beyond the traditional triple network model. Full article
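The reported topology is small enough to reproduce directly: six inputs (vividness self-reports plus a reaction-time cofactor), one hidden layer of three nodes (the SN/TPN/DMN stand-ins), and five output categories. The sketch below uses scikit-learn with random placeholder data, since the study's self-report database is not public.

```python
# Sketch of the abstract's 6-3-5 MLP topology with 5-fold cross-validation.
# Data here is random noise; accuracies will be near chance by construction.
import numpy as np
from sklearn.neural_network import MLPClassifier
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(42)
X = rng.random((300, 6))            # vividness features + RT cofactor
y = rng.integers(1, 6, size=300)    # Benussi time categories 1..5

mlp = MLPClassifier(hidden_layer_sizes=(3,), max_iter=2000, random_state=0)
print(cross_val_score(mlp, X, y, cv=5).mean())  # 5-fold CV, as in the paper
```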
25 pages, 24334 KB  
Article
Unsupervised Knowledge Extraction of Distinctive Landmarks from Earth Imagery Using Deep Feature Outliers for Robust UAV Geo-Localization
by Zakhar Ostrovskyi, Oleksander Barmak, Pavlo Radiuk and Iurii Krak
Mach. Learn. Knowl. Extr. 2025, 7(3), 81; https://doi.org/10.3390/make7030081 - 13 Aug 2025
Abstract
Vision-based navigation is a common solution for the critical challenge of GPS-denied Unmanned Aerial Vehicle (UAV) operation, but a research gap remains in the autonomous discovery of robust landmarks from aerial survey imagery needed for such systems. In this work, we propose a framework to fill this gap by identifying visually distinctive urban buildings from aerial survey imagery and curating them into a landmark database for GPS-free UAV localization. The proposed framework constructs semantically rich embeddings using intermediate layers from a pre-trained YOLOv11n-seg segmentation network. This novel technique requires no additional training. An unsupervised landmark selection strategy, based on the Isolation Forest algorithm, then identifies objects with statistically unique embeddings. Experimental validation on the VPAIR aerial-to-aerial benchmark shows that the proposed max-pooled embeddings, assembled from selected layers, significantly improve retrieval performance. The top-1 retrieval accuracy for landmarks more than doubled compared to typical buildings (0.53 vs. 0.31), and a Recall@5 of 0.70 is achieved for landmarks. Overall, this study demonstrates that unsupervised outlier selection in a carefully constructed embedding space yields a highly discriminative, computation-friendly set of landmarks suitable for real-time, robust UAV navigation. Full article
(This article belongs to the Special Issue Deep Learning in Image Analysis and Pattern Recognition, 2nd Edition)
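The unsupervised selection step reduces to flagging embedding outliers. A hedged sketch with scikit-learn's Isolation Forest follows; the embeddings here are random placeholders (the paper derives them from max-pooled intermediate layers of a pre-trained YOLOv11n-seg network), and the contamination rate is our assumption.

```python
# Flag buildings whose embeddings are statistical outliers; these become
# distinctive-landmark candidates.
import numpy as np
from sklearn.ensemble import IsolationForest

rng = np.random.default_rng(7)
embeddings = rng.normal(size=(500, 256))      # one embedding per building

iso = IsolationForest(contamination=0.05, random_state=0)
labels = iso.fit_predict(embeddings)          # -1 = outlier, 1 = inlier
landmark_ids = np.where(labels == -1)[0]
print(len(landmark_ids), "candidate landmarks")
```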
20 pages, 5008 KB  
Article
Harnessing Large-Scale University Registrar Data for Predictive Insights: A Data-Driven Approach to Forecasting Undergraduate Student Success with Convolutional Autoencoders
by Mohammad Erfan Shoorangiz and Michal Brylinski
Mach. Learn. Knowl. Extr. 2025, 7(3), 80; https://doi.org/10.3390/make7030080 - 8 Aug 2025
Abstract
Predicting undergraduate student success is critical for informing timely interventions and improving outcomes in higher education. This study leverages over a decade of historical data from Louisiana State University (LSU) to forecast graduation outcomes using advanced machine learning techniques, with a focus on convolutional autoencoders (CAEs). We detail the data processing and transformation steps, including feature selection and imputation, to construct a robust dataset. The CAE effectively extracts meaningful latent features, validated through low-dimensional t-SNE visualizations that reveal clear clusters based on class labels, differentiating students likely to graduate from those at risk. A two-year gap strategy is introduced to ensure rigorous evaluation and simulate real-world conditions by predicting outcomes on unseen future data. Our results demonstrate the promise of CAE-derived embeddings for dimensionality reduction and computational efficiency, with competitive performance in downstream classification tasks. While models trained on embeddings showed slightly reduced performance compared to raw input data, with accuracies of 83% and 85%, respectively, their compactness and computational efficiency highlight their potential for large-scale analyses. The study emphasizes the importance of rigorous preprocessing, feature engineering, and evaluation protocols. By combining these approaches, we provide actionable insights and adaptive modeling strategies to support robust and generalizable predictive systems, enabling educators and administrators to enhance student success initiatives in dynamic educational environments. Full article
(This article belongs to the Section Learning)
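As a hedged sketch of the central component, the PyTorch convolutional autoencoder below compresses a 16x16 single-channel arrangement of engineered features into a low-dimensional embedding and reconstructs it; the input shape and every layer size are our assumptions, not LSU's model.

```python
# Toy convolutional autoencoder: the latent vector z is the embedding used
# for downstream classification and t-SNE-style visualization.
import torch
import torch.nn as nn

class CAE(nn.Module):
    def __init__(self, latent=32):
        super().__init__()
        self.enc = nn.Sequential(
            nn.Conv2d(1, 8, 3, stride=2, padding=1), nn.ReLU(),   # 16 -> 8
            nn.Conv2d(8, 16, 3, stride=2, padding=1), nn.ReLU(),  # 8 -> 4
            nn.Flatten(), nn.Linear(16 * 4 * 4, latent))
        self.dec = nn.Sequential(
            nn.Linear(latent, 16 * 4 * 4), nn.ReLU(),
            nn.Unflatten(1, (16, 4, 4)),
            nn.ConvTranspose2d(16, 8, 2, stride=2), nn.ReLU(),    # 4 -> 8
            nn.ConvTranspose2d(8, 1, 2, stride=2), nn.Sigmoid())  # 8 -> 16

    def forward(self, x):
        z = self.enc(x)
        return self.dec(z), z

x = torch.rand(64, 1, 16, 16)
recon, z = CAE()(x)
print(recon.shape, z.shape)   # reconstruction + 32-d embeddings
```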
18 pages, 973 KB  
Article
Machine Learning-Based Vulnerability Detection in Rust Code Using LLVM IR and Transformer Model
by Young Lee, Syeda Jannatul Boshra, Jeong Yang, Zechun Cao and Gongbo Liang
Mach. Learn. Knowl. Extr. 2025, 7(3), 79; https://doi.org/10.3390/make7030079 - 6 Aug 2025
Abstract
Rust’s growing popularity in high-integrity systems requires automated vulnerability detection in order to maintain its strong safety guarantees. Although Rust’s ownership model and compile-time checks prevent many errors, unexpected bugs may occasionally pass analysis, underlining the necessity for automated detection of safe and unsafe code. This paper presents Rust-IR-BERT, a machine learning approach to detect security vulnerabilities in Rust code by analyzing its compiled LLVM intermediate representation (IR) instead of the raw source code. This approach is novel in employing LLVM IR’s language-neutral, semantically rich representation of the program, facilitating robust detection by capturing core data- and control-flow semantics and reducing language-specific syntactic noise. Our method leverages GraphCodeBERT, a pretrained graph-based transformer model that encodes structural code semantics via data-flow information, followed by CatBoost, a gradient boosting classifier capable of handling complex feature interactions, to classify code as vulnerable or safe. The model was evaluated using a carefully curated dataset of over 2300 real-world Rust code samples (vulnerable and non-vulnerable snippets) from the RustSec and OSV advisory databases, compiled to LLVM IR and labeled with corresponding Common Vulnerabilities and Exposures (CVE) identifiers to ensure comprehensive and realistic coverage. Rust-IR-BERT achieved an overall accuracy of 98.11%, with a recall of 99.31% for safe code and 93.67% for vulnerable code. Despite these promising results, this study acknowledges potential limitations, such as its focus on known CVEs. Looking ahead, practical deployment could take the form of a Cargo plugin or pre-commit hook that automatically generates and scans LLVM IR artifacts, enabling developers to catch vulnerabilities early in the development cycle. Full article
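The two-stage idea (embed compiled IR with GraphCodeBERT, then classify with CatBoost) can be sketched as below. The mean pooling, toy IR strings, and hyperparameters are our assumptions; the paper's preprocessing and training pipeline is more involved.

```python
# Hedged sketch: GraphCodeBERT embeddings of LLVM IR text + CatBoost.
import numpy as np
import torch
from transformers import AutoTokenizer, AutoModel
from catboost import CatBoostClassifier

tok = AutoTokenizer.from_pretrained("microsoft/graphcodebert-base")
enc = AutoModel.from_pretrained("microsoft/graphcodebert-base")

def embed(ir_text: str) -> np.ndarray:
    """Mean-pooled GraphCodeBERT embedding of one IR snippet (our pooling choice)."""
    batch = tok(ir_text, truncation=True, max_length=512, return_tensors="pt")
    with torch.no_grad():
        out = enc(**batch).last_hidden_state
    return out.mean(dim=1).squeeze(0).numpy()

# Toy stand-ins for compiled-IR snippets; real inputs come from rustc/LLVM.
ir_safe = "define i32 @add(i32 %a, i32 %b) { %s = add i32 %a, %b ret i32 %s }"
ir_vuln = "define i8* @raw(i8* %p, i64 %n) { %q = getelementptr i8, i8* %p, i64 %n ret i8* %q }"

X = np.stack([embed(ir_safe), embed(ir_vuln)])
y = [0, 1]  # 0 = safe, 1 = vulnerable (labels come from advisories in the paper)
clf = CatBoostClassifier(iterations=50, verbose=False).fit(X, y)
print(clf.predict(X))
```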
40 pages, 2515 KB  
Article
AE-DTNN: Autoencoder–Dense–Transformer Neural Network Model for Efficient Anomaly-Based Intrusion Detection Systems
by Hesham Kamal and Maggie Mashaly
Mach. Learn. Knowl. Extr. 2025, 7(3), 78; https://doi.org/10.3390/make7030078 - 6 Aug 2025
Abstract
In this study, we introduce an enhanced hybrid Autoencoder–Dense–Transformer Neural Network (AE-DTNN) model for developing an effective intrusion detection system (IDS) aimed at improving the performance and robustness of threat detection strategies within a rapidly changing and increasingly complex network landscape. The Autoencoder component restructures network traffic data, while a stack of Dense layers performs feature extraction to generate more meaningful representations. The Transformer network then facilitates highly precise and comprehensive classification. Our strategy incorporates adaptive synthetic sampling (ADASYN) for both binary and multi-class classification tasks, complemented by the edited nearest neighbors (ENN) technique and the use of class weights to mitigate class imbalance issues. In experiments conducted on the NF-BoT-IoT-v2 dataset, the AE-DTNN-based IDS achieved outstanding performance, with 99.98% accuracy in binary classification and 98.30% in multi-class classification. On the NSL-KDD dataset, the model reached 98.57% accuracy for binary classification and 97.50% for multi-class classification. Additionally, the model attained 99.92% and 99.78% accuracy in binary and multi-class classification, respectively, on the CSE-CIC-IDS2018 dataset. These results demonstrate the exceptional effectiveness of the proposed model in contrast to conventional approaches, highlighting its strong potential to detect a broad range of network intrusions with high reliability. Full article
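The rebalancing step named in the abstract, ADASYN oversampling followed by edited-nearest-neighbours cleaning, is available in imbalanced-learn; a minimal sketch on synthetic data:

```python
# ADASYN oversamples the minority class adaptively; ENN then removes noisy
# samples near the class boundary. Dataset and parameters are placeholders.
from collections import Counter
from imblearn.over_sampling import ADASYN
from imblearn.under_sampling import EditedNearestNeighbours
from sklearn.datasets import make_classification

X, y = make_classification(n_samples=2000, weights=[0.95, 0.05],
                           random_state=0)
print("before:", Counter(y))

X_res, y_res = ADASYN(random_state=0).fit_resample(X, y)
X_res, y_res = EditedNearestNeighbours().fit_resample(X_res, y_res)
print("after: ", Counter(y_res))
```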
24 pages, 1993 KB  
Article
Evaluating Prompt Injection Attacks with LSTM-Based Generative Adversarial Networks: A Lightweight Alternative to Large Language Models
by Sharaf Rashid, Edson Bollis, Lucas Pellicer, Darian Rabbani, Rafael Palacios, Aneesh Gupta and Amar Gupta
Mach. Learn. Knowl. Extr. 2025, 7(3), 77; https://doi.org/10.3390/make7030077 - 6 Aug 2025
Abstract
Generative Adversarial Networks (GANs) using Long Short-Term Memory (LSTM) provide a computationally cheaper approach for text generation compared to large language models (LLMs). The low hardware barrier of training GANs poses a threat because it means more bad actors may use them to mass-produce prompt attack messages against LLM systems. Thus, to better understand the threat of GANs being used for prompt attack generation, we train two well-known GAN architectures, SeqGAN and RelGAN, on prompt attack messages. For each architecture, we evaluate generated prompt attack messages, comparing results with each other, with generated attacks from another computationally cheap approach, a 1-billion-parameter Llama 3.2 small language model (SLM), and with messages from the original dataset. This evaluation suggests that GAN architectures like SeqGAN and RelGAN have the potential to be used in conjunction with SLMs to readily generate malicious prompts that pose new threats against LLM-based systems such as chatbots. Analyzing the effectiveness of state-of-the-art defenses against prompt attacks, we also find that GAN-generated attacks can deceive most of these defenses with varying levels of success, with the exception of Meta’s PromptGuard. Further, we suggest an improvement to prompt attack defenses based on an analysis of the language quality of the prompts, which we found to be the weakest point of GAN-generated messages. Full article
22 pages, 1969 KB  
Article
Significance of Time-Series Consistency in Evaluating Machine Learning Models for Gap-Filling Multi-Level Very Tall Tower Data
by Changhyoun Park
Mach. Learn. Knowl. Extr. 2025, 7(3), 76; https://doi.org/10.3390/make7030076 - 3 Aug 2025
Abstract
Machine learning modeling is a valuable tool for gap-filling or prediction, and its performance is typically evaluated using standard metrics. To enable more precise assessments for time-series data, this study emphasizes the importance of considering time-series consistency, which can be evaluated through amplitude—specifically, the interquartile range and the lower bound of the band in gap-filled time series. To test this hypothesis, a gap-filling technique was applied using long-term (~6 years) high-frequency flux and meteorological data collected at four different levels (1.5, 60, 140, and 300 m above sea level) on a ~300 m tall flux tower. This study focused on turbulent kinetic energy among several variables, which is important for estimating sensible and latent heat fluxes and net ecosystem exchange. Five ensemble machine learning algorithms were selected and trained on three different datasets. Among several modeling scenarios, the stacking model with a dataset combined with derivative data produced the best metrics for predicting turbulent kinetic energy. Although the metrics before and after gap-filling reported fewer differences among the scenarios, large distortions were found in the consistency of the time series in terms of amplitude. These findings underscore the importance of evaluating time-series consistency alongside traditional metrics, not only to accurately assess modeling performance but also to ensure reliability in downstream applications such as forecasting, climate modeling, and energy estimation. Full article
(This article belongs to the Section Data)
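An assumed minimal form of the amplitude check the abstract advocates: compare the interquartile range and the lower band of a gap-filled series against the observed series, flagging large mismatches as time-series-consistency distortions. The column choices and the 1.5-IQR band rule are our assumptions.

```python
# Compare amplitude statistics of observed vs gap-filled series; a large
# IQR or lower-band mismatch signals distortion that standard error metrics
# can miss.
import numpy as np
import pandas as pd

def amplitude_stats(s: pd.Series):
    q1, q3 = s.quantile([0.25, 0.75])
    return {"IQR": q3 - q1, "lower_band": q1 - 1.5 * (q3 - q1)}

observed = pd.Series(np.random.default_rng(1).gamma(2.0, 0.5, 5000))
gap_filled = observed * 0.8 + 0.1    # stand-in for a model's gap-filled output

for name, s in [("observed", observed), ("gap-filled", gap_filled)]:
    print(name, amplitude_stats(s))
```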
65 pages, 8546 KB  
Review
Quantum Machine Learning and Deep Learning: Fundamentals, Algorithms, Techniques, and Real-World Applications
by Maria Revythi and Georgia Koukiou
Mach. Learn. Knowl. Extr. 2025, 7(3), 75; https://doi.org/10.3390/make7030075 - 1 Aug 2025
Abstract
Quantum computing, with its foundational principles of superposition and entanglement, has the potential to provide significant quantum advantages, addressing challenges that classical computing may struggle to overcome. As data generation continues to grow exponentially and technological advancements accelerate, classical machine learning algorithms increasingly face difficulties in solving complex real-world problems. The integration of classical machine learning with quantum information processing has led to the emergence of quantum machine learning, a promising interdisciplinary field. This work provides the reader with a bottom-up view of quantum circuits starting from quantum data representation, quantum gates, the fundamental quantum algorithms, and more complex quantum processes. Thoroughly studying the mathematics behind them is a powerful tool to guide scientists entering this domain and exploring their connection to quantum machine learning. Quantum algorithms such as Shor’s algorithm, Grover’s algorithm, and the Harrow–Hassidim–Lloyd (HHL) algorithm are discussed in detail. Furthermore, real-world implementations of quantum machine learning and quantum deep learning are presented in fields such as healthcare, bioinformatics and finance. These implementations aim to enhance time efficiency and reduce algorithmic complexity through the development of more effective quantum algorithms. Therefore, a comprehensive understanding of the fundamentals of these algorithms is crucial. Full article
(This article belongs to the Section Learning)
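As one worked example of the quantum speedups the review surveys (a standard textbook result, not specific to this paper): Grover's search over N items with M marked solutions succeeds with high probability after roughly

```latex
k \approx \left\lfloor \frac{\pi}{4}\sqrt{\frac{N}{M}} \right\rfloor
\quad \text{Grover iterations, versus } O(N/M) \text{ classical queries;}
\quad \text{e.g. } N = 2^{20},\; M = 1 \;\Rightarrow\; k \approx 804.
```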
24 pages, 3121 KB  
Article
SG-RAG MOT: SubGraph Retrieval Augmented Generation with Merging and Ordering Triplets for Knowledge Graph Multi-Hop Question Answering
by Ahmmad O. M. Saleh, Gokhan Tur and Yucel Saygin
Mach. Learn. Knowl. Extr. 2025, 7(3), 74; https://doi.org/10.3390/make7030074 - 1 Aug 2025
Abstract
Large language models (LLMs) often tend to hallucinate, especially in domain-specific tasks and tasks that require reasoning. Previously, we introduced SubGraph Retrieval Augmented Generation (SG-RAG) as a novel Graph RAG method for multi-hop question answering. SG-RAG leverages Cypher queries to search a given knowledge graph and retrieve the subgraph necessary to answer the question. The results from our previous work showed the higher performance of our method compared to the traditional Retrieval Augmented Generation (RAG). In this work, we further enhanced SG-RAG by proposing an additional step called Merging and Ordering Triplets (MOT). The new MOT step seeks to decrease the redundancy in the retrieved triplets by applying hierarchical merging to the retrieved subgraphs. Moreover, it provides an ordering among the triplets using the Breadth-First Search (BFS) traversal algorithm. We conducted experiments on the MetaQA benchmark, which was proposed for multi-hop question-answering in the movies domain. Our experiments showed that SG-RAG MOT provided more accurate answers than Chain-of-Thought and Graph Chain-of-Thought. We also found that merging (up to a certain point) highly overlapping subgraphs and defining an order among the triplets helped the LLM to generate more precise answers. Full article
(This article belongs to the Special Issue Knowledge Graphs and Large Language Models)
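The ordering half of MOT can be sketched with networkx: merge retrieved triplets into one graph and serialize them by breadth-first search from the question's anchor entity. The movie triplets and the choice of anchor below are illustrative; the paper's merging heuristics are richer than simple edge deduplication.

```python
# Merge triplets into a directed graph, then order them by BFS from the
# entity mentioned in the question.
import networkx as nx

triplets = [("Inception", "directed_by", "Christopher Nolan"),
            ("Christopher Nolan", "directed", "Interstellar"),
            ("Interstellar", "starred", "Matthew McConaughey")]

g = nx.DiGraph()
for h, r, t in triplets:
    g.add_edge(h, t, relation=r)   # duplicate edges merge implicitly

ordered = [(u, g[u][v]["relation"], v)
           for u, v in nx.bfs_edges(g, "Inception")]   # anchor entity
print(ordered)  # triplets serialized near-to-far from the question entity
```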
19 pages, 6095 KB  
Article
MERA: Medical Electronic Records Assistant
by Ahmed Ibrahim, Abdullah Khalili, Maryam Arabi, Aamenah Sattar, Abdullah Hosseini and Ahmed Serag
Mach. Learn. Knowl. Extr. 2025, 7(3), 73; https://doi.org/10.3390/make7030073 - 30 Jul 2025
Abstract
The increasing complexity and scale of electronic health records (EHRs) demand advanced tools for efficient data retrieval, summarization, and comparative analysis in clinical practice. MERA (Medical Electronic Records Assistant) is a Retrieval-Augmented Generation (RAG)-based AI system that addresses these needs by integrating domain-specific retrieval with large language models (LLMs) to deliver robust question answering, similarity search, and report summarization functionalities. MERA is designed to overcome key limitations of conventional LLMs in healthcare, such as hallucinations, outdated knowledge, and limited explainability. To ensure both privacy compliance and model robustness, we constructed a large synthetic dataset using state-of-the-art LLMs, including Mistral v0.3, Qwen 2.5, and Llama 3, and further validated MERA on de-identified real-world EHRs from the MIMIC-IV-Note dataset. Comprehensive evaluation demonstrates MERA’s high accuracy in medical question answering (correctness: 0.91; relevance: 0.98; groundedness: 0.89; retrieval relevance: 0.92), strong summarization performance (ROUGE-1 F1-score: 0.70; Jaccard similarity: 0.73), and effective similarity search (METEOR: 0.7–1.0 across diagnoses), with consistent results on real EHRs. The similarity search module empowers clinicians to efficiently identify and compare analogous patient cases, supporting differential diagnosis and personalized treatment planning. By generating concise, contextually relevant, and explainable insights, MERA reduces clinician workload and enhances decision-making. To our knowledge, this is the first system to integrate clinical question answering, summarization, and similarity search within a unified RAG-based framework. Full article
(This article belongs to the Special Issue Advances in Machine and Deep Learning)
23 pages, 8532 KB  
Article
VisRep: Towards an Automated, Reflective AI System for Documenting Visualisation Design Processes
by Aron E. Owen and Jonathan C. Roberts
Mach. Learn. Knowl. Extr. 2025, 7(3), 72; https://doi.org/10.3390/make7030072 - 25 Jul 2025
Abstract
VisRep (Visualisation Report) is an AI-powered system for capturing and structuring the early stages of the visualisation design process. It addresses a critical gap in predesign: the lack of tools that can naturally record, organise, and transform raw ideation, spoken thoughts, sketches, and evolving concepts into polished, shareable outputs. Users engage in talk-aloud sessions through a terminal-style interface supported by intelligent transcription and eleven structured questions that frame intent, audience, and output goals. These inputs are then processed by a large language model (LLM) guided by markdown-based output templates for reports, posters, and slides. The system aligns free-form ideas with structured communication using prompt engineering to ensure clarity, coherence, and visual consistency. VisRep not only automates the generation of professional deliverables but also enhances reflective practice by bridging spontaneous ideation and structured documentation. This paper introduces VisRep’s methodology, interface design, and AI-driven workflow, demonstrating how it improves the fidelity and transparency of the visualisation design process across academic, professional, and creative domains. Full article
(This article belongs to the Section Visualization)
26 pages, 1276 KB  
Systematic Review
Harnessing Language Models for Studying the Ancient Greek Language: A Systematic Review
by Diamanto Tzanoulinou, Loukas Triantafyllopoulos and Vassilios S. Verykios
Mach. Learn. Knowl. Extr. 2025, 7(3), 71; https://doi.org/10.3390/make7030071 - 24 Jul 2025
Abstract
Applying language models (LMs) and generative artificial intelligence (GenAI) to the study of Ancient Greek offers promising opportunities. However, it faces substantial challenges due to the language’s morphological complexity and lack of annotated resources. Despite growing interest, no systematic overview of existing research currently exists. To address this gap, a systematic literature review was conducted following the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) 2020 methodology. Twenty-seven peer-reviewed studies were identified and analyzed, focusing on application areas such as machine translation, morphological analysis, named entity recognition (NER), and emotion detection. The review reveals six key findings, highlighting both the technical advances and persistent limitations, particularly the scarcity of large, domain-specific corpora and the need for better integration into educational contexts. Future developments should focus on building richer resources and tailoring models to the unique features of Ancient Greek, thereby fully realizing the potential of these technologies in both research and teaching. Full article
26 pages, 2658 KB  
Article
An Efficient and Accurate Random Forest Node-Splitting Algorithm Based on Dynamic Bayesian Methods
by Jun He, Zhanqi Li and Linzi Yin
Mach. Learn. Knowl. Extr. 2025, 7(3), 70; https://doi.org/10.3390/make7030070 - 21 Jul 2025
Abstract
Random Forests are powerful machine learning models widely applied in classification and regression tasks due to their robust predictive performance. Nevertheless, traditional Random Forests face computational challenges during tree construction, particularly on high-dimensional data or resource-constrained devices. In this paper, a novel node-splitting algorithm, BayesSplit, is proposed to accelerate decision tree construction via a Bayesian-based impurity estimation framework. BayesSplit treats impurity reduction as a Bernoulli event with Beta-conjugate priors for each split point and incorporates two main strategies. First, Dynamic Posterior Parameter Refinement updates the Beta parameters based on observed impurity reductions in batch iterations. Second, Posterior-Derived Confidence Bounding establishes statistical confidence intervals, efficiently filtering out suboptimal splits. Theoretical analysis demonstrates that BayesSplit converges to optimal splits with high probability, and experimental results show up to a 95% reduction in training time compared to baselines while matching or exceeding their generalization performance. Compared to the state-of-the-art MABSplit, BayesSplit achieves similar accuracy on classification tasks and reduces regression training time by 20–70% with lower MSEs. Furthermore, BayesSplit enhances feature importance stability by up to 40%, making it particularly suitable for deployment in computationally constrained environments. Full article
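A hedged sketch of the two ingredients the abstract names: Beta-conjugate posterior updates for "this split point reduced impurity" events, and posterior confidence bounds used to discard dominated split points. The prior, batch logic, and pruning rule below are our assumptions, not BayesSplit itself.

```python
# Beta(1,1) prior per split candidate; each batch observation of whether the
# split improved impurity updates the posterior, and credible intervals
# prune candidates that cannot beat the current best.
import numpy as np
from scipy.stats import beta

class SplitCandidate:
    def __init__(self):
        self.a, self.b = 1.0, 1.0            # Beta(1, 1) prior

    def update(self, improved: bool):        # Bernoulli impurity-gain event
        self.a += improved
        self.b += not improved

    def bounds(self, level=0.95):            # posterior credible interval
        lo = beta.ppf((1 - level) / 2, self.a, self.b)
        hi = beta.ppf(1 - (1 - level) / 2, self.a, self.b)
        return lo, hi

cands = [SplitCandidate() for _ in range(3)]
rng = np.random.default_rng(0)
for p, c in zip([0.8, 0.5, 0.2], cands):     # toy true success rates
    for _ in range(200):
        c.update(bool(rng.random() < p))

best_lo = max(c.bounds()[0] for c in cands)
keep = [c for c in cands if c.bounds()[1] >= best_lo]  # prune dominated splits
print(len(keep), "candidate(s) survive the confidence filter")
```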
23 pages, 4997 KB  
Article
Prediction of Bearing Layer Depth Using Machine Learning Algorithms and Evaluation of Their Performance
by Yuxin Cong, Arisa Katsuumi and Shinya Inazumi
Mach. Learn. Knowl. Extr. 2025, 7(3), 69; https://doi.org/10.3390/make7030069 - 21 Jul 2025
Abstract
In earthquake-prone areas such as Tokyo, accurate estimation of bearing stratum depth is crucial for foundation design, liquefaction assessment, and urban disaster mitigation. However, traditional methods such as the standard penetration test (SPT), while reliable, are labor-intensive and have limited spatial distribution. In this study, 942 geological survey records from the Tokyo metropolitan area were used to evaluate the performance of three machine learning algorithms, random forest (RF), artificial neural network (ANN), and support vector machine (SVM), in predicting bearing stratum depth. The main input variables included geographic coordinates, elevation, and stratigraphic category. The results showed that the RF model performed well in terms of multiple evaluation indicators and had significantly better prediction accuracy than ANN and SVM. In addition, data density analysis showed that the prediction error was significantly reduced in high-density areas. The results demonstrate the robustness and adaptability of the RF method in foundation soil layer identification, emphasizing the importance of comprehensive input variables and spatial coverage. The proposed method can be used for large-scale, data-driven bearing stratum prediction and has the potential to be integrated into geological risk management systems and smart city platforms. Full article
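The best-performing setup reduces to a random forest over coordinates, elevation, and a stratigraphic code. A runnable sketch on synthetic stand-in data follows (the paper uses 942 real survey records, which are not reproduced here; column names and the depth-elevation relation are our assumptions).

```python
# Random-forest regression of bearing-layer depth from location features.
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import train_test_split
from sklearn.metrics import mean_absolute_error

rng = np.random.default_rng(0)
n = 200
df = pd.DataFrame({
    "lon": rng.uniform(139.6, 139.9, n),   # geographic coordinates
    "lat": rng.uniform(35.5, 35.8, n),
    "elev_m": rng.uniform(0, 60, n),       # elevation
    "strat_code": rng.integers(0, 5, n),   # stratigraphic category
})
df["bearing_depth_m"] = 30 - 0.4 * df["elev_m"] + rng.normal(0, 2, n)

X, y = df.drop(columns="bearing_depth_m"), df["bearing_depth_m"]
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.2, random_state=0)
rf = RandomForestRegressor(n_estimators=300, random_state=0).fit(X_tr, y_tr)
print("MAE (m):", mean_absolute_error(y_te, rf.predict(X_te)))
```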
22 pages, 875 KB  
Article
Towards Robust Synthetic Data Generation for Simplification of Text in French
by Nikos Tsourakis
Mach. Learn. Knowl. Extr. 2025, 7(3), 68; https://doi.org/10.3390/make7030068 - 19 Jul 2025
Abstract
We present a pipeline for synthetic simplification of text in French that combines large language models with structured semantic guidance. Our approach enhances data generation by integrating contextual knowledge from Wikipedia and Vikidia articles and injecting symbolic control through lightweight knowledge graphs. To construct document-level representations, we implement a progressive summarization process that incrementally builds running summaries and extracts key ideas. Simplifications are generated iteratively and assessed using semantic comparisons between input and output graphs, enabling targeted regeneration when critical information is lost. Our system is implemented using LangChain’s orchestration framework, allowing modular and extensible coordination of LLM components. Evaluation shows that context-aware prompting and semantic feedback improve simplification quality across successive iterations. Full article
(This article belongs to the Special Issue Knowledge Graphs and Large Language Models)
24 pages, 2667 KB  
Article
Transformer-Driven Fault Detection in Self-Healing Networks: A Novel Attention-Based Framework for Adaptive Network Recovery
by Parul Dubey, Pushkar Dubey and Pitshou N. Bokoro
Mach. Learn. Knowl. Extr. 2025, 7(3), 67; https://doi.org/10.3390/make7030067 - 16 Jul 2025
Abstract
Fault detection and remaining useful life (RUL) prediction are critical tasks in self-healing network (SHN) environments and industrial cyber–physical systems. These domains demand intelligent systems capable of handling dynamic, high-dimensional sensor data. However, existing optimization-based approaches often struggle with imbalanced datasets, noisy signals, and delayed convergence, limiting their effectiveness in real-time applications. This study utilizes two benchmark datasets—EFCD and SFDD—which represent electrical and sensor fault scenarios, respectively. These datasets pose challenges due to class imbalance and complex temporal dependencies. To address this, we propose a novel hybrid framework combining Attention-Augmented Convolutional Neural Networks (AACNN) with transformer encoders, enhanced through Enhanced Ensemble-SMOTE for balancing the minority class. The model captures spatial features and long-range temporal patterns and learns effectively from imbalanced data streams. The novelty lies in the integration of attention mechanisms and adaptive oversampling in a unified fault-prediction architecture. Model evaluation is based on multiple performance metrics, including accuracy, F1-score, MCC, RMSE, and score*. The results show that the proposed model outperforms state-of-the-art approaches, achieving up to 97.14% accuracy and a score* of 0.419, with faster convergence and improved generalization across both datasets. Full article
16 pages, 2355 KB  
Article
Generalising Stock Detection in Retail Cabinets with Minimal Data Using a DenseNet and Vision Transformer Ensemble
by Babak Rahi, Deniz Sagmanli, Felix Oppong, Direnc Pekaslan and Isaac Triguero
Mach. Learn. Knowl. Extr. 2025, 7(3), 66; https://doi.org/10.3390/make7030066 - 16 Jul 2025
Abstract
Generalising deep-learning models to perform well on unseen data domains with minimal retraining remains a significant challenge in computer vision. Even when the target task—such as quantifying the number of elements in an image—stays the same, data quality, shape, or form variations can deviate from the training conditions, often necessitating manual intervention. As a real-world industry problem, we aim to automate stock level estimation in retail cabinets. As technology advances, new cabinet models with varying shapes emerge alongside new camera types. This evolving scenario poses a substantial obstacle to deploying long-term, scalable solutions. To surmount the challenge of generalising to new cabinet models and cameras with minimal amounts of sample images, this research introduces a new solution. This paper proposes a novel ensemble model that combines DenseNet-201 and Vision Transformer (ViT-B/8) architectures to achieve generalisation in stock-level classification. The novelty aspect of our solution comes from the fact that we combine a transformer with a DenseNet model in order to capture both the local, hierarchical details and the long-range dependencies within the images, improving generalisation accuracy with less data. Key contributions include (i) a novel DenseNet-201 + ViT-B/8 feature-level fusion, (ii) an adaptation workflow that needs only two images per class, (iii) a balanced layer-unfreezing schedule, (iv) a publicly described domain-shift benchmark, and (v) a 47 pp accuracy gain over four standard few-shot baselines. Our approach leverages fine-tuning techniques to adapt two pre-trained models to the new retail cabinets (i.e., standing or horizontal) and camera types using only two images per class. Experimental results demonstrate that our method achieves high accuracy rates of 91% on new cabinets with the same camera and 89% on new cabinets with different cameras, significantly outperforming standard few-shot learning methods. Full article
(This article belongs to the Section Data)
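A hedged sketch of the feature-level fusion: DenseNet-201 features (torchvision) concatenated with pooled ViT-B/8 features (timm) into a small classification head. The freezing schedule, input size, and head width are our assumptions, not the paper's exact adaptation workflow.

```python
# DenseNet-201 + ViT-B/8 feature-level fusion with a linear head.
import torch
import torch.nn as nn
import timm
from torchvision import models

class FusionClassifier(nn.Module):
    def __init__(self, n_classes=5):
        super().__init__()
        self.cnn = models.densenet201(weights="DEFAULT").features  # local cues
        self.vit = timm.create_model("vit_base_patch8_224",
                                     pretrained=True, num_classes=0)  # global cues
        self.head = nn.Linear(1920 + 768, n_classes)

    def forward(self, x):  # x: (batch, 3, 224, 224)
        c = self.cnn(x).mean(dim=(2, 3))   # global-average-pool -> 1920-d
        v = self.vit(x)                    # pooled ViT features  -> 768-d
        return self.head(torch.cat([c, v], dim=1))

print(FusionClassifier()(torch.randn(1, 3, 224, 224)).shape)  # [1, 5]
```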
24 pages, 8216 KB  
Article
Application of Dueling Double Deep Q-Network for Dynamic Traffic Signal Optimization: A Case Study in Danang City, Vietnam
by Tho Cao Phan, Viet Dinh Le and Teron Nguyen
Mach. Learn. Knowl. Extr. 2025, 7(3), 65; https://doi.org/10.3390/make7030065 - 14 Jul 2025
Abstract
This study investigates the application of the Dueling Double Deep Q-Network (3DQN) algorithm to optimize traffic signal control at a major urban intersection in Danang City, Vietnam. The objective is to enhance signal timing efficiency in response to mixed traffic flow and real-world traffic dynamics. A simulation environment was developed using the Simulation of Urban Mobility (SUMO) software version 1.11, incorporating both a fixed-time signal controller and two 3DQN models trained with 1 million (1M-Step) and 5 million (5M-Step) iterations. The models were evaluated using randomized traffic demand scenarios ranging from 50% to 150% of baseline traffic volumes. The results demonstrate that the 3DQN models outperform the fixed-time controller, significantly reducing vehicle delays, with the 5M-Step model achieving average waiting times of under five minutes. To further assess the model’s responsiveness to real-time conditions, traffic flow data were collected using YOLOv8 for object detection and SORT for vehicle tracking from live camera feeds, and integrated into the SUMO-3DQN simulation. The findings highlight the robustness and adaptability of the 3DQN approach, particularly under peak traffic conditions, underscoring its potential for deployment in intelligent urban traffic management systems. Full article
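The dueling value/advantage decomposition that gives 3DQN its name, together with the double-Q target (the online network selects the action, the target network evaluates it), can be sketched in PyTorch as follows; the state and action sizes are placeholders for the SUMO intersection observations and signal phases.

```python
# Dueling head: Q(s,a) = V(s) + A(s,a) - mean_a A(s,a), plus the double-DQN
# target computation. Sizes are illustrative.
import torch
import torch.nn as nn

class DuelingQNet(nn.Module):
    def __init__(self, state_dim, n_actions):
        super().__init__()
        self.body = nn.Sequential(nn.Linear(state_dim, 128), nn.ReLU())
        self.value = nn.Linear(128, 1)          # V(s)
        self.adv = nn.Linear(128, n_actions)    # A(s, a)

    def forward(self, s):
        h = self.body(s)
        a = self.adv(h)
        # Subtracting the mean advantage keeps V and A identifiable.
        return self.value(h) + a - a.mean(dim=1, keepdim=True)

q, q_target = DuelingQNet(16, 4), DuelingQNet(16, 4)
s_next = torch.randn(32, 16)
# Double-DQN target: online net picks the action, target net evaluates it.
best = q(s_next).argmax(dim=1, keepdim=True)
target_q = q_target(s_next).gather(1, best)
print(target_q.shape)  # torch.Size([32, 1])
```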
22 pages, 2583 KB  
Article
Helmet Detection in Underground Coal Mines via Dynamic Background Perception with Limited Valid Samples
by Guangfu Wang, Dazhi Sun, Hao Li, Jian Cheng, Pengpeng Yan and Heping Li
Mach. Learn. Knowl. Extr. 2025, 7(3), 64; https://doi.org/10.3390/make7030064 - 9 Jul 2025
Viewed by 484
Abstract
The underground coal mine environment is complex and dynamic, making the application of visual algorithms for object detection a crucial component of underground safety management as well as a key factor in ensuring the safe operation of workers. We look at this in [...] Read more.
The underground coal mine environment is complex and dynamic, making the application of visual algorithms for object detection a crucial component of underground safety management as well as a key factor in ensuring the safe operation of workers. We look at this in the context of helmet-wearing detection in underground mines, where over 25% of the targets are small objects. To address challenges such as the lack of effective samples for unworn helmets, significant background interference, and the difficulty of detecting small helmet targets, this paper proposes a novel underground helmet-wearing detection algorithm that combines dynamic background awareness with a limited number of valid samples to improve detection accuracy for underground workers. The algorithm begins by analyzing the distribution of visual surveillance data and spatial biases in underground environments. Using data augmentation, it then expands the training set by introducing positive and negative helmet-wearing samples from ordinary scenes. Building on YOLOv10, it incorporates a background awareness module with region masks to reduce the adverse effects of complex underground backgrounds on helmet-wearing detection, and adds a convolution and attention fusion module in the detection head that enlarges the receptive field to improve the perception of small helmet-wearing objects. By analyzing the aspect-ratio distribution of the helmet-wearing data, the algorithm also refines the aspect-ratio constraints in the loss function, further enhancing detection accuracy and enabling precise detection of helmet wearing in underground coal mines. Experimental results demonstrate that the proposed algorithm can detect small helmet-wearing objects in complex underground scenes with a 14% reduction in background false detection rates, achieving accuracy, recall, and average precision of 94.4%, 89%, and 95.4%, respectively. Compared to mainstream object detection algorithms, it improves detection accuracy by 6.7%, 5.1%, and 11.8% over YOLOv9, YOLOv10, and RT-DETR, respectively. The proposed algorithm can be applied to real-time helmet-wearing detection in underground coal mines, providing safety alerts for standardized worker operations and raising the level of underground security intelligence. Full article
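The abstract notes that the loss function's aspect-ratio constraints were improved but does not give the exact formulation. For reference, the sketch below implements a standard CIoU loss, whose final term penalises disagreement between predicted and ground-truth width/height ratios; the paper's modification would adjust a penalty of this kind. All shapes and the epsilon value are illustrative.

```python
import math
import torch

def ciou_loss(pred: torch.Tensor, target: torch.Tensor, eps: float = 1e-7) -> torch.Tensor:
    """Standard CIoU loss for (x1, y1, x2, y2) boxes of shape (N, 4).
    A reference for aspect-ratio-aware box losses, not the paper's exact loss."""
    # Intersection over union
    ix1 = torch.max(pred[:, 0], target[:, 0]); iy1 = torch.max(pred[:, 1], target[:, 1])
    ix2 = torch.min(pred[:, 2], target[:, 2]); iy2 = torch.min(pred[:, 3], target[:, 3])
    inter = (ix2 - ix1).clamp(0) * (iy2 - iy1).clamp(0)
    area_p = (pred[:, 2] - pred[:, 0]) * (pred[:, 3] - pred[:, 1])
    area_t = (target[:, 2] - target[:, 0]) * (target[:, 3] - target[:, 1])
    iou = inter / (area_p + area_t - inter + eps)
    # Squared centre distance over the enclosing-box diagonal
    rho2 = ((pred[:, 0] + pred[:, 2] - target[:, 0] - target[:, 2]) ** 2
            + (pred[:, 1] + pred[:, 3] - target[:, 1] - target[:, 3]) ** 2) / 4
    ex1 = torch.min(pred[:, 0], target[:, 0]); ey1 = torch.min(pred[:, 1], target[:, 1])
    ex2 = torch.max(pred[:, 2], target[:, 2]); ey2 = torch.max(pred[:, 3], target[:, 3])
    c2 = (ex2 - ex1) ** 2 + (ey2 - ey1) ** 2 + eps
    # Aspect-ratio consistency penalty
    wp = (pred[:, 2] - pred[:, 0]).clamp(min=eps); hp = (pred[:, 3] - pred[:, 1]).clamp(min=eps)
    wt = (target[:, 2] - target[:, 0]).clamp(min=eps); ht = (target[:, 3] - target[:, 1]).clamp(min=eps)
    v = (4 / math.pi ** 2) * (torch.atan(wt / ht) - torch.atan(wp / hp)) ** 2
    alpha = v / (1 - iou + v + eps)
    return (1 - iou + rho2 / c2 + alpha * v).mean()
```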
19 pages, 1926 KB  
Article
A Novel Approach to Company Bankruptcy Prediction Using Convolutional Neural Networks and Generative Adversarial Networks
by Alessia D’Ercole and Gianluigi Me
Mach. Learn. Knowl. Extr. 2025, 7(3), 63; https://doi.org/10.3390/make7030063 - 7 Jul 2025
Viewed by 795
Abstract
Predicting company bankruptcy is a critical task in financial risk assessment. This study introduces a novel approach using Convolutional Neural Networks (CNNs) and Generative Adversarial Networks (GANs) to enhance bankruptcy prediction accuracy. By transforming financial statements into grayscale images and leveraging synthetic data [...] Read more.
Predicting company bankruptcy is a critical task in financial risk assessment. This study introduces a novel approach using Convolutional Neural Networks (CNNs) and Generative Adversarial Networks (GANs) to enhance bankruptcy prediction accuracy. By transforming financial statements into grayscale images and leveraging synthetic data generation, we analyze a dataset of 6249 companies, including 3256 active and 2993 bankrupt firms. Our methodology addresses dataset limitations through GAN-based data augmentation: CNNs are employed for their ability to extract hierarchical patterns from financial statement images, providing a new approach to financial analysis, while GANs help mitigate dataset imbalance by generating realistic synthetic data for training. The synthetic financial data closely mimics real-world patterns, expanding the training dataset and potentially improving classifier performance. The CNN model is trained on a combination of real and synthetic data, with strict separation between training/validation and testing. Full article
(This article belongs to the Section Network)
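The abstract's key preprocessing step, transforming financial statements into grayscale images, is not specified in detail. The sketch below shows one plausible encoding, assuming min-max scaling of a flat vector of statement figures into an 8-bit square grid; the 16x16 size, zero padding, and row-major layout are our choices rather than the paper's.

```python
import numpy as np

def statement_to_image(features: np.ndarray, side: int = 16) -> np.ndarray:
    """Encode a 1-D vector of financial-statement figures as a grayscale image.
    Illustrative scheme only: min-max scale to 0-255, pad with zeros, reshape."""
    lo, hi = float(features.min()), float(features.max())
    scaled = np.zeros_like(features, dtype=np.float64) if hi == lo \
        else (features - lo) / (hi - lo)
    pixels = np.zeros(side * side, dtype=np.uint8)
    n = min(features.size, side * side)
    pixels[:n] = (scaled[:n] * 255).astype(np.uint8)
    return pixels.reshape(side, side)

# Example: 40 hypothetical balance-sheet and income-statement figures -> 16x16 image
img = statement_to_image(np.random.default_rng(0).normal(size=40))
```

A GAN trained on such images (or on the underlying vectors) can then supply synthetic minority-class samples to rebalance the training set, as the abstract describes.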
22 pages, 568 KB  
Review
A Review of Methods for Unobtrusive Measurement of Work-Related Well-Being
by Zoja Anžur, Klara Žinkovič, Junoš Lukan, Pietro Barbiero, Gašper Slapničar, Mohan Li, Martin Gjoreski, Maike E. Debus, Sebastijan Trojer, Mitja Luštrek and Marc Langheinrich
Mach. Learn. Knowl. Extr. 2025, 7(3), 62; https://doi.org/10.3390/make7030062 - 1 Jul 2025
Viewed by 1120
Abstract
Work-related well-being is an important research topic, as it is linked to various aspects of individuals’ lives, including job performance. To measure it effectively, unobtrusive sensors are desirable to minimize the burden on employees. Because there is a lack of consensus on the [...] Read more.
Work-related well-being is an important research topic, as it is linked to various aspects of individuals’ lives, including job performance. To measure it effectively, unobtrusive sensors are desirable to minimize the burden on employees. Because there is a lack of consensus on the definition of well-being and its dimensions in the psychological literature, our work begins by proposing a conceptualization of well-being based on the refined definition of health provided by the World Health Organization. We then review the existing literature on the unobtrusive measurement of well-being, focusing on affect, engagement, fatigue, stress, sleep deprivation, physical comfort, and social interactions. Our initial search returned a total of 644 studies, of which we reviewed 35, revealing a variety of behavioral markers such as facial expressions, posture, eye movements, and speech; the most commonly captured are body movement, facial expressions, and posture. The most commonly used sensing devices were red, green, and blue (RGB) cameras, followed by microphones and smartphones. Our work serves as an investigation into unobtrusive measurement methods applicable to the workplace context, aiming to foster a more employee-centric approach to the measurement of well-being and to emphasize its affective component. Full article
(This article belongs to the Special Issue Sustainable Applications for Machine Learning)
13 pages, 2983 KB  
Article
AI-Driven Intelligent Financial Forecasting: A Comparative Study of Advanced Deep Learning Models for Long-Term Stock Market Prediction
by Sira Yongchareon
Mach. Learn. Knowl. Extr. 2025, 7(3), 61; https://doi.org/10.3390/make7030061 - 1 Jul 2025
Viewed by 1697
Abstract
The integration of artificial intelligence (AI) and advanced deep learning techniques is reshaping intelligent financial forecasting and decision-support systems. This study presents a comprehensive comparative analysis of advanced deep learning models, including state-of-the-art transformer architectures and established non-transformer approaches, for long-term stock market [...] Read more.
The integration of artificial intelligence (AI) and advanced deep learning techniques is reshaping intelligent financial forecasting and decision-support systems. This study presents a comprehensive comparative analysis of advanced deep learning models, including state-of-the-art transformer architectures and established non-transformer approaches, for long-term stock market index prediction. Utilizing historical data from major global indices (S&P 500, NASDAQ, and Hang Seng), we evaluate ten models across multiple forecasting horizons. A dual-metric evaluation framework is employed, combining traditional predictive accuracy metrics with critical financial performance indicators such as returns, volatility, maximum drawdown, and the Sharpe ratio. Statistical validation through the Mann–Whitney U test ensures robust differentiation in model performance. The results highlight that model effectiveness varies significantly with forecasting horizon and market conditions: transformer-based models such as PatchTST excel in short-term forecasts, while simpler architectures demonstrate greater stability over extended periods. This research offers actionable insights for developing AI-driven intelligent financial forecasting systems, enhancing risk-aware investment strategies and supporting practical applications in FinTech and smart financial analytics. Full article
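Two of the financial indicators in the dual-metric framework, the Sharpe ratio and maximum drawdown, can be computed directly from a return series. The sketch below is generic: the annualisation factor of 252 trading days, the zero risk-free rate, and the synthetic return series are illustrative assumptions, not values from the paper.

```python
import numpy as np

def sharpe_ratio(returns: np.ndarray, risk_free: float = 0.0,
                 periods_per_year: int = 252) -> float:
    """Annualised Sharpe ratio of per-period simple returns."""
    excess = returns - risk_free / periods_per_year
    return float(np.sqrt(periods_per_year) * excess.mean() / (excess.std(ddof=1) + 1e-12))

def max_drawdown(returns: np.ndarray) -> float:
    """Maximum peak-to-trough decline of the cumulative wealth curve (a negative number)."""
    wealth = np.cumprod(1.0 + returns)
    peaks = np.maximum.accumulate(wealth)
    return float(((wealth - peaks) / peaks).min())

# Example: one year of hypothetical daily strategy returns
r = np.random.default_rng(1).normal(5e-4, 0.01, size=252)
print(sharpe_ratio(r), max_drawdown(r))
```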