Journal Description
Machine Learning and Knowledge Extraction is an international, peer-reviewed, open access journal on machine learning and applications. It publishes original research articles, reviews, tutorials, research ideas, short notes and Special Issues that focus on machine learning and applications. Please see our video on YouTube explaining the MAKE journal concept. The journal is published quarterly online by MDPI.
- Open Access: free for readers, with article processing charges (APC) paid by authors or their institutions.
- High Visibility: indexed within Scopus, ESCI (Web of Science), dblp, and other databases.
- Rapid Publication: manuscripts are peer-reviewed and a first decision is provided to authors approximately 25.5 days after submission; the time from acceptance to publication is 3.4 days (median values for papers published in this journal in the first half of 2025).
- Journal Rank: JCR - Q1 (Engineering, Electrical and Electronic) / CiteScore - Q1 (Engineering (miscellaneous))
- Recognition of Reviewers: reviewers who provide timely, thorough peer-review reports receive vouchers entitling them to a discount on the APC of their next publication in any MDPI journal, in appreciation of the work done.
Impact Factor: 6.0 (2024); 5-Year Impact Factor: 5.7 (2024)
Latest Articles
Transfer Learning for Generalized Safety Risk Detection in Industrial Video Operations
Mach. Learn. Knowl. Extr. 2025, 7(4), 111; https://doi.org/10.3390/make7040111 - 30 Sep 2025
Abstract
This paper proposes a transfer learning-based approach to enhance video-driven safety risk detection in industrial environments, addressing the critical challenge of limited generalization across diverse operational scenarios. Conventional deep learning models trained on specific operational contexts often fail when applied to new environments with different lighting, camera angles, or machinery configurations, exhibiting a significant drop in performance (e.g., F1-score declining below 0.85). To overcome this issue, an incremental feature transfer learning strategy is introduced, enabling efficient adaptation of risk detection models using only small amounts of data from new scenarios. This approach leverages prior knowledge from pre-trained models to reduce the reliance on large labeled datasets, which is particularly valuable in industrial settings where rare but critical safety risk events are difficult to capture. Additionally, training efficiency is improved compared with a conventional approach, supporting deployment on resource-constrained edge devices. The strategy involves incremental retraining using video segments with average durations of approximately 2.5 to 25 min (corresponding to 5–50% of new-scenario data), enabling scalable generalization across multiple forklift-related risk activities. Interpretability is enhanced through SHAP-based analysis, which reveals a redistribution of feature relevance toward critical components, thereby improving model transparency and reducing annotation demands. Experimental results confirm that the transfer learning strategy significantly improves detection accuracy, robustness, and adaptability, making it a practical and scalable solution for safety monitoring in dynamic industrial environments.
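As a purely illustrative sketch of the incremental transfer-learning idea summarized above (not the authors' implementation), one might freeze a pre-trained backbone and fine-tune only a new classification head on the small share of new-scenario data; the backbone choice and training loop below are assumptions.

```python
# Minimal sketch, assuming a torchvision image backbone stands in for the
# paper's video-based risk-detection model (an assumption, not the authors' setup).
import torch
import torch.nn as nn
from torchvision import models

def build_transfer_model(num_classes: int) -> nn.Module:
    model = models.resnet18(weights=models.ResNet18_Weights.DEFAULT)
    for p in model.parameters():
        p.requires_grad = False                 # keep prior knowledge frozen
    model.fc = nn.Linear(model.fc.in_features, num_classes)  # new risk-class head
    return model

def incremental_finetune(model, new_scenario_loader, epochs=3, lr=1e-3):
    # Only the new head is trained, so a 5-50% slice of new-scenario clips
    # can be enough for adaptation in this simplified setting.
    opt = torch.optim.Adam((p for p in model.parameters() if p.requires_grad), lr=lr)
    loss_fn = nn.CrossEntropyLoss()
    model.train()
    for _ in range(epochs):
        for frames, labels in new_scenario_loader:
            opt.zero_grad()
            loss = loss_fn(model(frames), labels)
            loss.backward()
            opt.step()
    return model
```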
Full article
Open Access Article
Attention-Guided Differentiable Channel Pruning for Efficient Deep Networks
by
Anouar Chahbouni, Khaoula El Manaa, Yassine Abouch, Imane El Manaa, Badre Bossoufi, Mohammed El Ghzaoui and Rachid El Alami
Mach. Learn. Knowl. Extr. 2025, 7(4), 110; https://doi.org/10.3390/make7040110 - 29 Sep 2025
Abstract
Deploying deep learning (DL) models in real-world environments remains a major challenge, particularly under resource-constrained conditions where achieving both high accuracy and compact architectures is essential. Conventional pruning methods, while effective, often suffer from high computational overhead, accuracy degradation, or disruption of the end-to-end training process, limiting their practicality for embedded and real-time applications. We present Dynamic Attention-Guided Pruning (DAGP), a Dynamic Attention-Guided Soft Channel Pruning framework that overcomes these limitations by embedding learnable, differentiable pruning masks directly within convolutional neural networks (CNNs). These masks act as implicit attention mechanisms, adaptively suppressing non-informative channels during training. A progressively scheduled L1 regularization, activated after a warm-up phase, enables gradual sparsity while preserving early learning capacity. Unlike prior methods, DAGP is retraining-free, introduces minimal architectural overhead, and supports optional hard pruning for deployment efficiency. Joint optimization of classification and sparsity objectives ensures stable convergence and task-adaptive channel selection. Experiments on CIFAR-10 (VGG16, ResNet56) and PlantVillage (custom CNN) achieve up to 98.82% FLOPs reduction with accuracy gains over baselines. Real-world validation on an enhanced PlantDoc dataset for agricultural monitoring achieves 60 ms inference with only 2.00 MB RAM on a Raspberry Pi 4, confirming efficiency under field conditions. These results illustrate DAGP’s potential to scale beyond agriculture to diverse edge-intelligent systems requiring lightweight, accurate, and deployable models.
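A minimal sketch of the core mechanism described above, a learnable soft channel mask with an L1 penalty that is switched on only after a warm-up phase; the module and schedule are illustrative assumptions, not the DAGP code.

```python
# Illustrative sketch of a differentiable soft channel mask with warm-up-scheduled
# L1 sparsity, in the spirit of the idea described above (not the authors' code).
import torch
import torch.nn as nn

class SoftChannelMask(nn.Module):
    def __init__(self, num_channels: int):
        super().__init__()
        self.logits = nn.Parameter(torch.zeros(num_channels))  # learnable gate logits

    def forward(self, x):                      # x: (N, C, H, W)
        gates = torch.sigmoid(self.logits)     # soft, differentiable channel gates
        return x * gates.view(1, -1, 1, 1)

    def l1_penalty(self):
        return torch.sigmoid(self.logits).sum()

def sparsity_weight(epoch: int, warmup_epochs: int = 10, target: float = 1e-4) -> float:
    # L1 term is zero during warm-up, then ramped up linearly to its target weight.
    if epoch < warmup_epochs:
        return 0.0
    return target * min(1.0, (epoch - warmup_epochs) / warmup_epochs)

# Usage inside a training step (task loss plus scheduled sparsity penalty):
# loss = criterion(model(x), y) + sparsity_weight(epoch) * sum(m.l1_penalty() for m in masks)
```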
Full article
Open Access Article
Enhancing Soundscape Characterization and Pattern Analysis Using Low-Dimensional Deep Embeddings on a Large-Scale Dataset
by
Daniel Alexis Nieto Mora, Leonardo Duque-Muñoz and Juan David Martínez Vargas
Mach. Learn. Knowl. Extr. 2025, 7(4), 109; https://doi.org/10.3390/make7040109 - 24 Sep 2025
Abstract
Soundscape monitoring has become an increasingly important tool for studying ecological processes and supporting habitat conservation. While many recent advances focus on identifying species through supervised learning, there is growing interest in understanding the soundscape as a whole while considering patterns that extend beyond individual vocalizations. This broader view requires unsupervised approaches capable of capturing meaningful structures related to temporal dynamics, frequency content, spatial distribution, and ecological variability. In this study, we present a fully unsupervised framework for analyzing large-scale soundscape data using deep learning. We applied a convolutional autoencoder (Soundscape-Net) to extract acoustic representations from over 60,000 recordings collected across a grid-based sampling design in the Rey Zamuro Reserve in Colombia. These features were initially compared with other audio characterization methods, showing superior performance in multiclass classification, with accuracies of 0.85 for habitat cover identification and 0.89 for time-of-day classification across 13 days. For the unsupervised study, optimized dimensionality reduction methods (Uniform Manifold Approximation and Projection and Pairwise Controlled Manifold Approximation and Projection) were applied to project the learned features, achieving trustworthiness scores above 0.96. Subsequently, clustering was performed using KMeans and Density-Based Spatial Clustering of Applications with Noise (DBSCAN), with evaluations based on metrics such as the silhouette, where scores above 0.45 were obtained, thus supporting the robustness of the discovered latent acoustic structures. To interpret and validate the resulting clusters, we combined multiple strategies: spatial mapping through interpolation, analysis of acoustic index variance to understand the cluster structure, and graph-based connectivity analysis to identify ecological relationships between the recording sites. Our results demonstrate that this approach can uncover both local and broad-scale patterns in the soundscape, providing a flexible and interpretable pathway for unsupervised ecological monitoring.
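The unsupervised part of the pipeline (embeddings, low-dimensional projection, clustering, silhouette check) can be sketched as follows; the random array stands in for Soundscape-Net features, and all parameter values are assumptions rather than the study's settings.

```python
# Minimal sketch using umap-learn and scikit-learn; random data stands in for
# the learned Soundscape-Net embeddings described above.
import numpy as np
import umap                                    # pip install umap-learn
from sklearn.cluster import KMeans, DBSCAN
from sklearn.metrics import silhouette_score

rng = np.random.default_rng(0)
embeddings = rng.normal(size=(2000, 128))      # stand-in for (n_recordings, d) features

# Project the learned features to a low-dimensional space.
projected = umap.UMAP(n_components=2, n_neighbors=30, min_dist=0.1,
                      random_state=0).fit_transform(embeddings)

# Cluster the projection and check cluster quality with the silhouette score.
labels_km = KMeans(n_clusters=6, n_init=10, random_state=0).fit_predict(projected)
labels_db = DBSCAN(eps=0.5, min_samples=10).fit_predict(projected)
print("KMeans silhouette:", silhouette_score(projected, labels_km))
```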
Full article
Open Access Article
Learning to Balance Mixed Adversarial Attacks for Robust Reinforcement Learning
by
Mustafa Erdem and Nazım Kemal Üre
Mach. Learn. Knowl. Extr. 2025, 7(4), 108; https://doi.org/10.3390/make7040108 - 24 Sep 2025
Abstract
Reinforcement learning agents are highly susceptible to adversarial attacks that can severely compromise their performance. Although adversarial training is a common countermeasure, most existing research focuses on defending against single-type attacks targeting either observations or actions. This narrow focus overlooks the complexity of real-world mixed attacks, where an agent’s perceptions and resulting actions are perturbed simultaneously. To systematically study these threats, we introduce the Action and State-Adversarial Markov Decision Process (ASA-MDP), which models the interaction as a zero-sum game between the agent and an adversary attacking both states and actions. Using this framework, we show that agents trained conventionally or against single-type attacks remain highly vulnerable to mixed perturbations. Moreover, we identify a key challenge in this setting: a naive mixed-type adversary often fails to effectively balance its perturbations across modalities during training, limiting the agent’s robustness. To address this, we propose the Action and State-Adversarial Proximal Policy Optimization (ASA-PPO) algorithm, which enables the adversary to learn a balanced strategy, distributing its attack budget across both state and action spaces. This, in turn, enhances the robustness of the trained agent against a wide range of adversarial scenarios. Comprehensive experiments across diverse environments demonstrate that policies trained with ASA-PPO substantially outperform baselines—including standard PPO and single-type adversarial methods—under action-only, observation-only, and, most notably, mixed-attack conditions.
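To make the mixed-attack setting concrete, a toy environment wrapper that perturbs both observations and actions under a shared budget is sketched below; uniform random noise stands in for the learned adversary of ASA-PPO, and the budget split is an assumed parameter.

```python
# Minimal sketch with Gymnasium, assuming continuous observation and action spaces;
# random noise is a stand-in for a learned state/action adversary.
import numpy as np
import gymnasium as gym

class MixedAttackWrapper(gym.Wrapper):
    def __init__(self, env, total_budget=0.1, state_share=0.5):
        super().__init__(env)
        self.state_eps = total_budget * state_share           # observation budget
        self.action_eps = total_budget * (1.0 - state_share)  # action budget

    def step(self, action):
        noisy_action = action + np.random.uniform(
            -self.action_eps, self.action_eps, size=np.shape(action))
        obs, reward, terminated, truncated, info = self.env.step(noisy_action)
        noisy_obs = obs + np.random.uniform(
            -self.state_eps, self.state_eps, size=np.shape(obs))
        return noisy_obs, reward, terminated, truncated, info

# Example: env = MixedAttackWrapper(gym.make("Pendulum-v1"), total_budget=0.2)
```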
Full article
Open Access Article
Saliency-Guided Local Semantic Mixing for Long-Tailed Image Classification
by
Jiahui Lv, Jun Lei, Jun Zhang, Chao Chen and Shuohao Li
Mach. Learn. Knowl. Extr. 2025, 7(3), 107; https://doi.org/10.3390/make7030107 - 22 Sep 2025
Abstract
In real-world visual recognition tasks, long-tailed distributions pose a widespread challenge, with extreme class imbalance severely limiting the representational learning capability of deep models. In practice, due to this imbalance, deep models often exhibit poor generalization performance on tail classes. To address this issue, data augmentation through the synthesis of new tail-class samples has become an effective method. One popular approach is CutMix, which explicitly mixes images from tail and other classes, constructing labels based on the ratio of the regions cropped from both images. However, region-based labels completely ignore the inherent semantic information of the augmented samples. To overcome this problem, we propose a saliency-guided local semantic mixing (LSM) method, which uses differentiable block decoupling and semantic-aware local mixing techniques. This method integrates head-class backgrounds while preserving the key discriminative features of tail classes and dynamically assigns labels to effectively augment tail-class samples. This results in efficient balancing of long-tailed data distributions and significant improvements in classification performance. The experimental validation shows that this method demonstrates significant advantages across three long-tailed benchmark datasets, improving classification accuracy by 5.0%, 7.3%, and 6.1%, respectively. Notably, the LSM framework is highly compatible, seamlessly integrating with existing classification models and providing significant performance gains, validating its broad applicability.
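For contrast with the proposed semantic-aware labels, the region-ratio labeling of the CutMix baseline discussed above can be sketched as follows; this is illustrative only, and the parameter choices are assumptions.

```python
# Illustrative CutMix baseline: the mixed label is set purely by the area ratio of
# the pasted region, the limitation that LSM's semantic-aware labels aim to fix.
import numpy as np

def cutmix(img_a, label_a, img_b, label_b, num_classes, rng=np.random):
    h, w, _ = img_a.shape
    lam = rng.beta(1.0, 1.0)                      # target area share kept from img_a
    cut_h, cut_w = int(h * np.sqrt(1 - lam)), int(w * np.sqrt(1 - lam))
    cy, cx = rng.randint(h), rng.randint(w)
    y1, y2 = np.clip(cy - cut_h // 2, 0, h), np.clip(cy + cut_h // 2, 0, h)
    x1, x2 = np.clip(cx - cut_w // 2, 0, w), np.clip(cx + cut_w // 2, 0, w)

    mixed = img_a.copy()
    mixed[y1:y2, x1:x2] = img_b[y1:y2, x1:x2]     # paste a region from img_b

    # Region-based label: ignores what the pasted pixels actually show.
    lam_region = 1.0 - ((y2 - y1) * (x2 - x1)) / (h * w)
    target = np.zeros(num_classes)
    target[label_a] += lam_region
    target[label_b] += 1.0 - lam_region
    return mixed, target

# Example: mixed, target = cutmix(np.zeros((32, 32, 3)), 0, np.ones((32, 32, 3)), 1, num_classes=2)
```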
Full article
Open Access Article
Bayesian Learning Strategies for Reducing Uncertainty of Decision-Making in Case of Missing Values
by
Vitaly Schetinin and Livija Jakaite
Mach. Learn. Knowl. Extr. 2025, 7(3), 106; https://doi.org/10.3390/make7030106 - 22 Sep 2025
Abstract
Background: Liquidity crises pose significant risks to financial stability, and missing data in predictive models increase the uncertainty in decision-making. This study aims to develop a robust Bayesian Model Averaging (BMA) framework using decision trees (DTs) to enhance liquidity crisis prediction under missing data conditions, offering reliable probabilistic estimates and insights into uncertainty. Methods: We propose a BMA framework over DTs, employing Reversible Jump Markov Chain Monte Carlo (RJ MCMC) sampling with a sweeping strategy to mitigate overfitting. Three preprocessing techniques for missing data were evaluated: Cont (treating variables as continuous with missing values labeled by a constant), ContCat (converting variables with missing values to categorical), and Ext (extending features with binary missing-value indicators). Results: The Ext method achieved 100% accuracy on a synthetic dataset and 92.2% on a real-world dataset of 20,000 companies (11% in crisis), outperforming baselines (AUC PRC 0.817 vs. 0.803, p < 0.05). The framework provided interpretable uncertainty estimates and identified key financial indicators driving crisis predictions. Conclusions: The BMA-DT framework with the Ext technique offers a scalable, interpretable solution for handling missing data, improving prediction accuracy and uncertainty estimation in liquidity crisis forecasting, with potential applications in finance, healthcare, and environmental modeling.
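The Ext preprocessing idea, extending each incomplete feature with a binary missing-value indicator and filling the gap with a constant, can be sketched as follows; this is an illustrative simplification, not the authors' implementation.

```python
# Illustrative "Ext"-style preprocessing: each feature with missing values gets a
# companion binary indicator column, and the gaps are filled with a constant.
import pandas as pd

def extend_with_missing_indicators(df: pd.DataFrame, fill_value=0.0) -> pd.DataFrame:
    out = df.copy()
    for col in df.columns:
        if df[col].isna().any():
            out[f"{col}_missing"] = df[col].isna().astype(int)  # 1 where value was absent
            out[col] = df[col].fillna(fill_value)
    return out

# Example (hypothetical table of company financials):
# X_ext = extend_with_missing_indicators(company_financials)
```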
Full article
(This article belongs to the Section Learning)
Open Access Systematic Review
Customer Churn Prediction: A Systematic Review of Recent Advances, Trends, and Challenges in Machine Learning and Deep Learning
by
Mehdi Imani, Majid Joudaki, Ali Beikmohammadi and Hamid Reza Arabnia
Mach. Learn. Knowl. Extr. 2025, 7(3), 105; https://doi.org/10.3390/make7030105 - 21 Sep 2025
Abstract
Background: Customer churn significantly impacts business revenues. Machine Learning (ML) and Deep Learning (DL) methods are increasingly adopted to predict churn, yet a systematic synthesis of recent advancements is lacking. Objectives: This systematic review evaluates ML and DL approaches for churn prediction, identifying trends, challenges, and research gaps from 2020 to 2024. Data Sources: Six databases (Springer, IEEE, Elsevier, MDPI, ACM, Wiley) were searched via Lens.org for studies published between January 2020 and December 2024. Study Eligibility Criteria: Peer-reviewed original studies applying ML/DL techniques for churn prediction were included. Reviews, preprints, and non-peer-reviewed works were excluded. Methods: Screening followed PRISMA 2020 guidelines. A two-phase strategy identified 240 studies for bibliometric analysis and 61 for detailed qualitative synthesis. Results: Ensemble methods (e.g., XGBoost, LightGBM) remain dominant in ML, while DL approaches (e.g., LSTM, CNN) are increasingly applied to complex data. Challenges include class imbalance, interpretability, concept drift, and limited use of profit-oriented metrics. Explainable AI and adaptive learning show potential but limited real-world adoption. Limitations: No formal risk of bias or certainty assessments were conducted. Study heterogeneity prevented meta-analysis. Conclusions: ML and DL methods have matured as key tools for churn prediction, yet gaps remain in interpretability, real-world deployment, and business-aligned evaluation. Systematic Review Registration: Registered retrospectively in OSF.
Full article
Open Access Article
Screening Smarter, Not Harder: Budget Allocation Strategies for Technology-Assisted Reviews (TARs) in Empirical Medicine
by
Giorgio Maria Di Nunzio
Mach. Learn. Knowl. Extr. 2025, 7(3), 104; https://doi.org/10.3390/make7030104 - 20 Sep 2025
Abstract
In the technology-assisted review (TAR) area, most research has focused on ranking effectiveness and active learning strategies within individual topics, often assuming unconstrained review effort. However, real-world applications such as legal discovery or medical systematic reviews are frequently subject to global screening budgets. In this paper, we revisit the CLEF eHealth TAR shared tasks (2017–2019) through the lens of budget-aware evaluation. We first reproduce and verify the official participant results, organizing them into a unified dataset for comparative analysis. Then, we introduce and assess four intuitive budget allocation strategies—even, proportional, inverse proportional, and threshold-capped greedy—to explore how review effort can be efficiently distributed across topics. To evaluate systems under resource constraints, we propose two cost-aware metrics: relevant found per cost unit (RFCU) and utility gain at budget (UG@B). These complement traditional recall by explicitly modeling efficiency and trade-offs between true and false positives. Our results show that different allocation strategies optimize different metrics: even and inverse proportional allocation favor recall, while proportional and capped strategies better maximize RFCU. UG@B remains relatively stable across strategies, reflecting its balanced formulation. A correlation analysis reveals that RFCU and UG@B offer distinct perspectives from recall, with varying alignment across years. Together, these findings underscore the importance of aligning evaluation metrics and allocation strategies with screening goals. We release all data and code to support reproducibility and future research on cost-sensitive TAR.
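A minimal sketch of the budget-aware ideas above; the RFCU formula and the allocation rules are simplified assumptions rather than the paper's exact definitions.

```python
# Illustrative sketch: a simplified relevant-found-per-cost-unit metric and two of
# the allocation strategies named above (even and proportional).
def rfcu(relevant_found: int, documents_screened: int) -> float:
    # Relevant documents retrieved per document screened (assumed cost unit).
    return relevant_found / max(documents_screened, 1)

def allocate_budget(total_budget: int, topic_sizes: dict, strategy: str = "even") -> dict:
    if strategy == "even":
        share = total_budget // len(topic_sizes)
        return {topic: share for topic in topic_sizes}
    if strategy == "proportional":
        total = sum(topic_sizes.values())
        return {topic: int(total_budget * size / total)
                for topic, size in topic_sizes.items()}
    raise ValueError(f"unknown strategy: {strategy}")

# Example with hypothetical topic pool sizes:
print(allocate_budget(10_000, {"topic_A": 1200, "topic_B": 5400, "topic_C": 300},
                      strategy="proportional"))
```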
Full article
(This article belongs to the Topic The Use of Big Data in Public Health Research and Practice)
Open Access Article
Leveraging LLMs for Automated Extraction and Structuring of Educational Concepts and Relationships
by
Tianyuan Yang, Baofeng Ren, Chenghao Gu, Tianjia He, Boxuan Ma and Shin’ichi Konomi
Mach. Learn. Knowl. Extr. 2025, 7(3), 103; https://doi.org/10.3390/make7030103 - 19 Sep 2025
Abstract
Students must navigate large catalogs of courses and make appropriate enrollment decisions in many online learning environments. In this context, identifying key concepts and their relationships is essential for understanding course content and informing course recommendations. However, identifying and extracting concepts can be an extremely labor-intensive and time-consuming task when it has to be done manually. Traditional NLP-based methods to extract relevant concepts from courses heavily rely on resource-intensive preparation of detailed course materials, thereby failing to minimize labor. As recent advances in large language models (LLMs) offer a promising alternative for automating concept identification and relationship inference, we thoroughly investigate the potential of LLMs in automatically generating course concepts and their relations. Specifically, we systematically evaluate three LLM variants (GPT-3.5, GPT-4o-mini, and GPT-4o) across three distinct educational tasks, which are concept generation, concept extraction, and relation identification, using six systematically designed prompt configurations that range from minimal context (course title only) to rich context (course description, seed concepts, and subtitles). We systematically assess model performance through extensive automated experiments using standard metrics (Precision, Recall, F1, and Accuracy) and human evaluation by four domain experts, providing a comprehensive analysis of how prompt design and model choice influence the quality and reliability of the generated concepts and their interrelations. Our results show that GPT-3.5 achieves the highest scores on quantitative metrics, whereas GPT-4o and GPT-4o-mini often generate concepts that are more educationally meaningful despite lexical divergence from the ground truth. Nevertheless, LLM outputs still require expert revision, and performance is sensitive to prompt complexity. Overall, our experiments demonstrate the viability of LLMs as a tool for supporting educational content selection and delivery.
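The minimal-to-rich prompt configurations can be illustrated with a simple template builder; the field names and wording below are assumptions, not the prompts used in the study.

```python
# Illustrative prompt builder spanning minimal context (title only) to rich context
# (description, seed concepts, subtitles), in the spirit of the configurations above.
def build_prompt(task: str, title: str, description: str = "",
                 seed_concepts=None, subtitles: str = "") -> str:
    parts = [f"Task: {task}", f"Course title: {title}"]
    if description:
        parts.append(f"Course description: {description}")
    if seed_concepts:
        parts.append("Seed concepts: " + ", ".join(seed_concepts))
    if subtitles:
        parts.append(f"Lecture subtitles: {subtitles}")
    parts.append("Return a list of key concepts and the relations between them.")
    return "\n".join(parts)

# Minimal vs. rich context (hypothetical course):
print(build_prompt("concept generation", "Introduction to Databases"))
print(build_prompt("relation identification", "Introduction to Databases",
                   description="Relational model, SQL, and normalization.",
                   seed_concepts=["SQL", "normalization"],
                   subtitles="Today we discuss functional dependencies..."))
```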
Full article
Open Access Article
Exploiting the Feature Space Structures of KNN and OPF Algorithms for Identification of Incipient Faults in Power Transformers
by
André Gifalli, Marco Akio Ikeshoji, Danilo Sinkiti Gastaldello, Victor Hideki Saito Yamaguchi, Welson Bassi, Talita Mazon, Floriano Torres Neto, Pedro da Costa Junior and André Nunes de Souza
Mach. Learn. Knowl. Extr. 2025, 7(3), 102; https://doi.org/10.3390/make7030102 - 18 Sep 2025
Abstract
Power transformers represent critical assets within the electrical power system, and their unexpected failures may result in substantial financial losses for both utilities and consumers. Dissolved Gas Analysis (DGA) is a well-established diagnostic method extensively employed to detect incipient faults in power transformers. Although several conventional and machine learning techniques have been applied to DGA, most of them focus only on fault classification and lack the capability to provide predictive scenarios that would enable proactive maintenance planning. In this context, the present study introduces a novel approach to DGA interpretation, which highlights the trends and progression of faults by exploring the feature space through the algorithms k-Nearest Neighbors (KNN) and Optimum-Path Forest (OPF). To improve accuracy, the following strategies were implemented: statistical filtering based on normal distribution to eliminate outliers from the dataset; augmentation of gas-related features; and feature selection using optimization algorithms such as Cuckoo Search and Genetic Algorithms. The approach was validated using data from several transformers, with fault diagnoses cross-checked against inspection reports provided by the utility company. The findings indicate that the proposed method offers valuable insights into the progression, proximity, and classification of faults with satisfactory accuracy, thereby supporting its recommendation as a complementary tool for diagnosing incipient transformer faults.
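As a minimal illustration of the feature-space view above, a plain k-NN classifier over dissolved-gas concentrations is sketched below; the gas columns, fault labels, and synthetic data are stand-ins, and the outlier filtering, feature augmentation, and OPF steps are not reproduced.

```python
# Minimal sketch: k-NN on DGA gas features with synthetic stand-in data.
import numpy as np
import pandas as pd
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

gases = ["H2", "CH4", "C2H6", "C2H4", "C2H2", "CO", "CO2"]
rng = np.random.default_rng(0)
data = pd.DataFrame(rng.gamma(2.0, 50.0, size=(200, len(gases))), columns=gases)
data["fault_type"] = rng.choice(["PD", "D1", "D2", "T1", "T2"], size=200)  # stand-in labels

model = make_pipeline(StandardScaler(), KNeighborsClassifier(n_neighbors=5))
model.fit(data[gases], data["fault_type"])

# Distances to the nearest labeled samples hint at how close a transformer sits
# to known fault regions of the feature space.
scaled = model.named_steps["standardscaler"].transform(data[gases].tail(1))
distances, neighbors = model.named_steps["kneighborsclassifier"].kneighbors(scaled)
print(distances)
```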
Full article
Open Access Article
CRISP-NET: Integration of the CRISP-DM Model with Network Analysis
by
Héctor Alejandro Acuña-Cid, Eduardo Ahumada-Tello, Óscar Omar Ovalle-Osuna, Richard Evans, Julia Elena Hernández-Ríos and Miriam Alondra Zambrano-Soto
Mach. Learn. Knowl. Extr. 2025, 7(3), 101; https://doi.org/10.3390/make7030101 - 16 Sep 2025
Abstract
To carry out data analysis, it is necessary to implement a model that guides the process in an orderly and sequential manner, with the aim of maintaining control over software development and its documentation. One of the most widely used tools in the field of data analysis is the Cross-Industry Standard Process for Data Mining (CRISP-DM), which serves as a reference framework for data mining, allowing the identification of patterns and, based on them, supporting informed decision-making. Another tool used for pattern identification and the study of relationships within systems is network analysis (NA), which makes it possible to explore how different components are interconnected. The integration of these tools can be justified and developed under the principles of Situational Method Engineering (SME), which allows for the adaptation and customization of existing methods according to the specific needs of a problem or context. Through SME, it is possible to determine which components of CRISP-DM need to be adjusted to efficiently incorporate NA, ensuring that this integration aligns with the project’s objectives in a structured and effective manner. The proposed methodological process was applied in a real working group, which allowed its functionality to be validated, each phase to be documented, and concrete outputs to be generated, demonstrating its usefulness for the development of analytical projects.
Full article
(This article belongs to the Topic AI and Computational Methods for Modelling, Simulations and Optimizing of Advanced Systems: Innovations in Complexity, Second Edition)
Open Access Article
Learnable Petri Net Neural Network Using Max-Plus Algebra
by
Mohammed Sharafath Abdul Hameed, Sofiene Lassoued and Andreas Schwung
Mach. Learn. Knowl. Extr. 2025, 7(3), 100; https://doi.org/10.3390/make7030100 - 13 Sep 2025
Abstract
Interpretable decision-making algorithms are important when used in the context of production optimization. While concepts like Petri nets are inherently interpretable, they are not straightforwardly learnable. This paper presents a novel approach to transform the Petri net model into a learnable entity. This is accomplished by establishing a relationship between the Petri net description in the event domain, its representation in the max-plus algebra, and a one-layer perceptron neural network. This allows us to apply standard supervised learning methods adapted to the max-plus domain to infer the parameters of the Petri net. To this end, the feed-forward and back-propagation paths are modified to accommodate the differing mathematical operations in the context of max-plus algebra. We apply our approach to a multi-robot handling system with potentially varying processing and operation times. The results show that essential timing parameters can be inferred from data with high precision.
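The max-plus counterpart of a linear layer, in which sums become maxima and products become additions, can be written in a few lines; this is a generic illustration of the algebra, not the authors' learnable Petri net.

```python
# Illustrative max-plus "linear layer": an output is y_j = max_i (x_i + W[i, j]),
# replacing the sum/product of a standard perceptron with max/plus.
import numpy as np

def maxplus_layer(x: np.ndarray, W: np.ndarray) -> np.ndarray:
    # x: (n_inputs,) event firing times, W: (n_inputs, n_outputs) timing parameters.
    return np.max(x[:, None] + W, axis=0)

# Example: firing times of two upstream events propagated through a timing matrix
# (entries could represent processing or transport durations).
x = np.array([0.0, 3.0])
W = np.array([[2.0, 5.0],
              [1.0, 0.5]])
print(maxplus_layer(x, W))     # -> [4. 5.]
```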
Full article
Open Access Article
Dynamic Graph Analysis: A Hybrid Structural–Spatial Approach for Brain Shape Correspondence
by
Jonnatan Arias-García, Hernán Felipe García, Andrés Escobar-Mejía, David Cárdenas-Peña and Álvaro A. Orozco
Mach. Learn. Knowl. Extr. 2025, 7(3), 99; https://doi.org/10.3390/make7030099 - 10 Sep 2025
Abstract
Accurate correspondence of complex neuroanatomical surfaces under non-rigid deformations remains a formidable challenge in computational neuroimaging, owing to inter-subject topological variability, partial occlusions, and non-isometric distortions. Here, we introduce the Dynamic Graph Analyzer (DGA), a unified hybrid framework that integrates simplified structural descriptors with spatial constraints and formulates matching as a global linear assignment. Structurally, the DGA computes node-level metrics, degree weighted by betweenness centrality and local clustering coefficients, to capture essential topological patterns at a low computational cost. Spatially, it employs a two-stage scheme that combines global maximum distances and local rescaling of adjacent node separations to preserve geometric fidelity. By embedding these complementary measures into a single cost matrix solved via the Kuhn–Munkres algorithm followed by a refinement of weak correspondences, the DGA ensures a globally optimal correspondence. In benchmark evaluations on the FAUST dataset, the DGA achieved a significant reduction in the mean geodesic reconstruction error compared to spectral graph convolutional networks (GCNs)—which learn optimized spectral descriptors akin to classical approaches like heat/wave kernel signatures (HKS/WKS)—and traditional spectral methods. Additional experiments demonstrate robust performance on partial matches in TOSCA and cross-species alignments in SHREC-20, validating resilience to morphological variation and symmetry ambiguities. These results establish the DGA as a scalable and accurate approach for brain shape correspondence, with promising applications in biomarker mapping, developmental studies, and clinical morphometry.
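The final assignment step, solving a single cost matrix with the Kuhn–Munkres (Hungarian) algorithm, can be sketched as follows; how the DGA composes that cost matrix from structural and spatial terms is not reproduced here.

```python
# Illustrative sketch of global linear assignment on a precomputed cost matrix.
import numpy as np
from scipy.optimize import linear_sum_assignment

def match_nodes(cost: np.ndarray):
    # cost[i, j]: dissimilarity between node i of shape A and node j of shape B
    # (composing this from centrality, clustering, and distance terms is specific
    # to the DGA and omitted in this sketch).
    rows, cols = linear_sum_assignment(cost)
    return list(zip(rows.tolist(), cols.tolist()))

# Example with a toy 3x3 cost matrix:
toy_cost = np.array([[0.1, 0.9, 0.8],
                     [0.7, 0.2, 0.9],
                     [0.8, 0.9, 0.3]])
print(match_nodes(toy_cost))   # -> [(0, 0), (1, 1), (2, 2)]
```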
Full article
Open Access Article
MCTS-Based Policy Improvement for Reinforcement Learning
by
György Csippán, István Péter, Bálint Kővári and Tamás Bécsi
Mach. Learn. Knowl. Extr. 2025, 7(3), 98; https://doi.org/10.3390/make7030098 - 10 Sep 2025
Abstract
Curriculum Learning (CL) is a potent field in Machine Learning that provides several excellent techniques for enhancing the performance of the training process given the same data points, regardless of the training method used. In this research, we propose a novel Monte Carlo Tree Search (MCTS)-based technique that enhances model performance, demonstrating how MCTS can be applied within Curriculum Learning. The proposed approach leverages MCTS to optimize the sequence of batches during the training process. First, we demonstrate the application of our method in Reinforcement Learning, where sparse rewards often slow convergence and degrade performance. By leveraging the strategic planning and exploration capabilities of MCTS, our method systematically identifies and selects trajectories that are more informative and have a higher potential to enhance policy improvement. This MCTS-guided batch optimization focuses the learning process on valuable experiences, accelerating convergence and improving overall performance. We evaluate our approach on standard RL benchmarks, demonstrating that it outperforms conventional batch selection methods in terms of learning speed and policy effectiveness. The results highlight the potential of combining MCTS with CL to optimize batch selection, offering a promising direction for future research in efficient Reinforcement Learning.
Full article
(This article belongs to the Topic AI and Computational Methods for Modelling, Simulations and Optimizing of Advanced Systems: Innovations in Complexity, Second Edition)
Open Access Review
A Review of Large Language Models for Automated Test Case Generation
by
Arda Celik and Qusay H. Mahmoud
Mach. Learn. Knowl. Extr. 2025, 7(3), 97; https://doi.org/10.3390/make7030097 - 9 Sep 2025
Abstract
Automated test case generation aims to improve software testing by reducing the manual effort required to create test cases. Recent advancements in large language models (LLMs), with their ability to understand natural language and generate code, have identified new opportunities to enhance this process. In this review, the focus is on the use of LLMs in test case generation to identify the effectiveness of the proposed methods compared with existing tools and potential directions for future research. A literature search was conducted using online resources, filtering the studies based on the defined inclusion and exclusion criteria. This paper presents the findings from the selected studies according to the three research questions and further categorizes the findings based on the common themes. These findings highlight the opportunities and challenges associated with the use of LLMs in this domain. Although improvements were observed in metrics such as test coverage, usability, and correctness, limitations such as inconsistent performance and compilation errors were highlighted. This provides a state-of-the-art review of LLM-based test case generation, emphasizing the potential of LLMs to improve automated testing while identifying areas for further advancements.
Full article
Open Access Article
Leveraging DNA-Based Computing to Improve the Performance of Artificial Neural Networks in Smart Manufacturing
by
Angkush Kumar Ghosh and Sharifu Ura
Mach. Learn. Knowl. Extr. 2025, 7(3), 96; https://doi.org/10.3390/make7030096 - 9 Sep 2025
Abstract
Bioinspired computing methods, such as Artificial Neural Networks (ANNs), play a significant role in machine learning. This is particularly evident in smart manufacturing, where ANNs and their derivatives, like deep learning, are widely used for pattern recognition and adaptive control. However, ANNs sometimes fail to achieve the desired results, especially when working with small datasets. To address this limitation, this article presents the effectiveness of DNA-Based Computing (DBC) as a complementary approach. DBC is an innovative machine learning method rooted in the central dogma of molecular biology that deals with the genetic information of DNA/RNA to protein. In this article, two machine learning approaches are considered. In the first approach, an ANN was trained and tested using time series datasets driven by long and short windows, with features extracted from the time domain. Each long-window-driven dataset contained approximately 150 data points, while each short-window-driven dataset had approximately 10 data points. The results showed that the ANN performed well for long-window-driven datasets. However, its performance declined significantly in the case of short-window-driven datasets. In the second approach, a hybrid model was developed by integrating DBC with the ANN. In this case, the features were first extracted using DBC. The extracted features were used to train and test the ANN. This hybrid approach demonstrated robust performance for both long- and short-window-driven datasets. The ability of DBC to overcome the ANN’s limitations with short-window-driven datasets underscores its potential as a pragmatic machine learning solution for developing more effective smart manufacturing systems, such as digital twins.
Full article
Open Access Article
Machine Unlearning for Robust DNNs: Attribution-Guided Partitioning and Neuron Pruning in Noisy Environments
by
Deliang Jin, Gang Chen, Shuo Feng, Yufeng Ling and Haoran Zhu
Mach. Learn. Knowl. Extr. 2025, 7(3), 95; https://doi.org/10.3390/make7030095 - 5 Sep 2025
Abstract
Deep neural networks (DNNs) are highly effective across many domains but are sensitive to noisy or corrupted training data. Existing noise mitigation strategies often rely on strong assumptions about noise distributions or require costly retraining, limiting their scalability. Inspired by machine unlearning, we propose a novel framework that integrates attribution-guided data partitioning, neuron pruning, and targeted fine-tuning to enhance robustness. Our method uses gradient-based attribution to probabilistically identify clean samples without assuming specific noise characteristics. It then applies sensitivity-based neuron pruning to remove components most susceptible to noise, followed by fine-tuning on the retained high-quality subset. This approach jointly addresses data and model-level noise, offering a practical alternative to full retraining or explicit noise modeling. We evaluate our method on CIFAR-10 image classification and keyword spotting tasks under varying levels of label corruption. On CIFAR-10, our framework improves accuracy by up to 10% (F-FT vs. retrain) and reduces retraining time by 47% (L-FT vs. retrain), highlighting both accuracy and efficiency gains. These results highlight its effectiveness and efficiency in noisy settings, making it a scalable solution for robust generalization.
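A simplified sketch of the attribution-guided partitioning idea, scoring samples by the norm of the loss gradient at the logits and treating high-scoring samples as likely noisy; this is an assumption-laden stand-in for the paper's attribution method, and the pruning and fine-tuning stages are omitted.

```python
# Illustrative sketch: per-sample gradient-norm scores as a proxy for "likely noisy".
import torch
import torch.nn.functional as F

def attribution_scores(model, loader, device="cpu"):
    model.eval()
    scores, indices = [], []
    for batch_idx, (x, y) in enumerate(loader):
        x, y = x.to(device), y.to(device)
        logits = model(x)
        losses = F.cross_entropy(logits, y, reduction="none")
        for i in range(x.size(0)):
            # Gradient of this sample's loss w.r.t. the logits of the whole batch;
            # row i is the part that belongs to sample i.
            grad = torch.autograd.grad(losses[i], logits, retain_graph=True)[0][i]
            scores.append(grad.norm().item())
            indices.append(batch_idx * loader.batch_size + i)
    return indices, scores

# Samples with unusually large scores would be excluded before pruning and fine-tuning.
```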
Full article
Open Access Article
A Dynamic Hypergraph-Based Encoder–Decoder Risk Model for Longitudinal Predictions of Knee Osteoarthritis Progression
by
John B. Theocharis, Christos G. Chadoulos and Andreas L. Symeonidis
Mach. Learn. Knowl. Extr. 2025, 7(3), 94; https://doi.org/10.3390/make7030094 - 2 Sep 2025
Abstract
Knee osteoarthritis (KOA) is one of the most prevalent chronic musculoskeletal disorders, causing pain and functional impairment. Accurate predictions of KOA evolution are important for early interventions and preventive treatment planning. In this paper, we propose a novel dynamic hypergraph-based risk model (DyHRM) which integrates the encoder–decoder (ED) architecture with hypergraph convolutional neural networks (HGCNs). The risk model is used to generate longitudinal forecasts of KOA incidence and progression based on the knee evolution at a historical stage. DyHRM comprises two main parts, namely the dynamic hypergraph gated recurrent unit (DyHGRU) and the multi-view HGCN (MHGCN) networks. The ED-based DyHGRU follows the sequence-to-sequence learning approach. The encoder first transforms a knee sequence at the historical stage into a sequence of hidden states in a latent space. The Attention-based Context Transformer (ACT) is designed to identify important temporal trends in the encoder’s state sequence, while the decoder is used to generate sequences of KOA progression at the prediction stage. MHGCN conducts multi-view spatial HGCN convolutions of the original knee data at each step of the historical stage. The aim is to acquire more comprehensive feature representations of nodes by exploiting different hyperedges (views), including the global shape descriptors of the cartilage volume, the injury history, and the demographic risk factors. In addition to DyHRM, we also propose the HyGraphSMOTE method to confront the inherent class imbalance problem in KOA datasets, between the knee progressors (minority) and non-progressors (majority). Embedded in MHGCN, the HyGraphSMOTE algorithm tackles data balancing in a systematic way, by generating new synthetic node sequences of the minority class via interpolation. Extensive experiments are conducted using the Osteoarthritis Initiative (OAI) cohort to validate the accuracy of longitudinal predictions acquired by DyHRM under different definition criteria of KOA incidence and progression. The basic finding of the experiments is that the larger the historical depth, the higher the accuracy of the forecasts obtained. Comparative results demonstrate the efficacy of DyHRM against other state-of-the-art methods in this field.
Full article
(This article belongs to the Special Issue Advances in Machine and Deep Learning)
Open Access Article
Geometric Reasoning in the Embedding Space
by
David Mojžíšek, Jan Hůla, Jiří Janeček, David Herel and Mikoláš Janota
Mach. Learn. Knowl. Extr. 2025, 7(3), 93; https://doi.org/10.3390/make7030093 - 2 Sep 2025
Abstract
While neural networks can solve complex geometric problems, as demonstrated by systems like AlphaGeometry, we have limited understanding of how they internally represent and reason about spatial relationships. In this work, we investigate how neural networks develop internal spatial understanding by training Graph Neural Networks and Transformers to predict point positions on a discrete 2D grid from geometric constraints that describe hidden figures. We show that both models develop interpretable internal representations that mirror the geometric structure of the problems they solve. Specifically, we observe that point embeddings self-organize into 2D grid structures during training, and during inference, the models iteratively construct the hidden geometric figures within their embedding spaces. Our analysis reveals how reasoning complexity correlates with prediction accuracy, and shows that models solve constraints through an iterative refinement process, which might resemble continuous optimization. We also find that Graph Neural Networks prove more suitable than Transformers for this type of structured constraint reasoning and scale more effectively to larger problems. These findings provide initial insights into how neural networks can develop structured understanding and contribute to their interpretability.
Full article
Open Access Article
A Novel Prediction Model for Multimodal Medical Data Based on Graph Neural Networks
by
Lifeng Zhang, Teng Li, Hongyan Cui, Quan Zhang, Zijie Jiang, Jiadong Li, Roy E. Welsch and Zhongwei Jia
Mach. Learn. Knowl. Extr. 2025, 7(3), 92; https://doi.org/10.3390/make7030092 - 2 Sep 2025
Abstract
Multimodal medical data provides a broad and realistic basis for disease diagnosis. Computer-aided diagnosis (CAD) powered by artificial intelligence (AI) is becoming increasingly prominent in disease diagnosis. CAD for multimodal medical data requires addressing the issues of data fusion and prediction. Traditionally, the prediction performance of CAD models has been limited by complicated dimensionality reduction. Therefore, this paper proposes a fusion and prediction model—EPGC—for multimodal medical data based on graph neural networks. Firstly, we select features from unstructured multimodal medical data and quantify them. Then, we transform the multimodal medical data into a graph data structure by establishing each patient as a node and establishing edges based on the similarity of features between patients. Normalization of the data is also essential in this process. Finally, we build a node prediction model based on graph neural networks and perform node classification to predict patients’ diseases. The model is validated on two publicly available heart disease datasets. Compared to existing models that typically involve dimensionality reduction, classification, or the establishment of complex deep learning networks, the proposed model achieves outstanding results on the experimental datasets. This demonstrates that the fusion and diagnosis of multimodal data can be effectively achieved without dimensionality reduction or intricate deep learning networks. We take pride in exploring unstructured multimodal medical data using deep learning and hope to make breakthroughs in various fields.
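The graph-construction step, one patient per node with edges from feature similarity, can be sketched as follows; the similarity measure and threshold are assumptions, and the graph neural network itself is omitted.

```python
# Illustrative sketch: build patient-graph edges from cosine similarity of
# normalized feature vectors (random data stands in for quantified features).
import numpy as np
from sklearn.preprocessing import StandardScaler
from sklearn.metrics.pairwise import cosine_similarity

def build_patient_graph(features: np.ndarray, threshold: float = 0.8):
    normalized = StandardScaler().fit_transform(features)   # per-feature normalization
    sim = cosine_similarity(normalized)
    edges = [(i, j) for i in range(len(sim)) for j in range(i + 1, len(sim))
             if sim[i, j] >= threshold]
    return edges

# Example with stand-in data (10 patients, 12 quantified features):
rng = np.random.default_rng(0)
print(len(build_patient_graph(rng.normal(size=(10, 12)))))
```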
Full article
Topics
Topic in Cancers, IJERPH, IJGI, MAKE, Smart Cities
The Use of Big Data in Public Health Research and Practice
Topic Editors: Quynh C. Nguyen, Thu T. Nguyen; Deadline: 31 December 2025
Topic in Applied Sciences, Computers, Entropy, Information, MAKE, Systems
Opportunities and Challenges in Explainable Artificial Intelligence (XAI)
Topic Editors: Luca Longo, Mario Brcic, Sebastian Lapuschkin; Deadline: 31 January 2026
Topic in Applied Sciences, Electronics, J. Imaging, MAKE, Information, BDCC, Signals
Applications of Image and Video Processing in Medical Imaging
Topic Editors: Jyh-Cheng Chen, Kuangyu Shi; Deadline: 30 April 2026
Topic in Applied Sciences, ASI, Blockchains, Computers, MAKE, Software
Recent Advances in AI-Enhanced Software Engineering and Web Services
Topic Editors: Hai Wang, Zhe Hou; Deadline: 31 May 2026

Special Issues
Special Issue in MAKE
Advances in Explainable Artificial Intelligence (XAI): 3rd Edition
Guest Editor: Luca Longo; Deadline: 30 September 2025
Special Issue in MAKE
Deep Learning in Image Analysis and Pattern Recognition, 2nd Edition
Guest Editors: Xianzhi Wang, Guoqing Chao; Deadline: 26 November 2025
Special Issue in MAKE
Language Acquisition and Understanding
Guest Editors: Michal Ptaszynski, Rafal Rzepka, Masaharu Yoshioka; Deadline: 15 July 2026
Topical Collections
Topical Collection in MAKE
Extravaganza Feature Papers on Hot Topics in Machine Learning and Knowledge Extraction
Collection Editor: Andreas Holzinger