Search Results (1,538)

Search Parameters:
Keywords = supervised machine learning modelling

30 pages, 815 KB  
Review
Next-Generation Machine Learning in Healthcare Fraud Detection: Current Trends, Challenges, and Future Research Directions
by Kamran Razzaq and Mahmood Shah
Information 2025, 16(9), 730; https://doi.org/10.3390/info16090730 (registering DOI) - 25 Aug 2025
Abstract
The growing complexity and size of healthcare systems have rendered fraud detection increasingly challenging; however, the current literature lacks a holistic view of the latest machine learning (ML) techniques with practical implementation concerns. The present study addresses this gap by highlighting the importance of machine learning (ML) in preventing and mitigating healthcare fraud, evaluating recent advancements, investigating implementation barriers, and exploring future research dimensions. To further address the limited research on the evaluation of machine learning (ML) and hybrid approaches, this study considers a broad spectrum of ML techniques, including supervised ML, unsupervised ML, deep learning, and hybrid ML approaches such as SMOTE-ENN, explainable AI, federated learning, and ensemble learning. The study also explored their potential use in enhancing fraud detection in imbalanced and multidimensional datasets. A significant finding of the study was the identification of commonly employed datasets, such as Medicare, the List of Excluded Individuals and Entities (LEIE), and Kaggle datasets, which serve as a baseline for evaluating machine learning (ML) models. The study’s findings comprehensively identify the challenges of employing machine learning (ML) in healthcare systems, including data quality, system scalability, regulatory compliance, and resource constraints. The study provides actionable insights, such as model interpretability to enable regulatory compliance and federated learning for confidential data sharing, which is particularly relevant for policymakers, healthcare providers, and insurance companies that intend to deploy a robust, scalable, and secure fraud detection infrastructure. The study presents a comprehensive framework for enhancing real-time healthcare fraud detection through self-learning, interpretable, and safe machine learning (ML) infrastructures, integrating theoretical advancements with practical application needs. 

25 pages, 4100 KB  
Article
An Adaptive Unsupervised Learning Approach for Credit Card Fraud Detection
by John Adejoh, Nsikak Owoh, Moses Ashawa, Salaheddin Hosseinzadeh, Alireza Shahrabi and Salma Mohamed
Big Data Cogn. Comput. 2025, 9(9), 217; https://doi.org/10.3390/bdcc9090217 - 25 Aug 2025
Abstract
Credit card fraud remains a major cause of financial loss around the world. Traditional fraud detection methods that rely on supervised learning often struggle because fraudulent transactions are rare compared to legitimate ones, leading to imbalanced datasets. Additionally, the models must be retrained frequently, as fraud patterns change over time and require new labeled data for retraining. To address these challenges, this paper proposes an ensemble unsupervised learning approach for credit card fraud detection that combines Autoencoders (AEs), Self-Organizing Maps (SOMs), and Restricted Boltzmann Machines (RBMs), integrated with an Adaptive Reconstruction Threshold (ART) mechanism. The ART dynamically adjusts anomaly detection thresholds by leveraging the clustering properties of SOMs, effectively overcoming the limitations of static threshold approaches in machine learning and deep learning models. The proposed models, AE-ASOMs (Autoencoder—Adaptive Self-Organizing Maps) and RBM-ASOMs (Restricted Boltzmann Machines—Adaptive Self-Organizing Maps), were evaluated on the Kaggle Credit Card Fraud Detection and IEEE-CIS datasets. Our AE-ASOM model achieved an accuracy of 0.980 and an F1-score of 0.967, while the RBM-ASOM model achieved an accuracy of 0.975 and an F1-score of 0.955. Compared to models such as One-Class SVM and Isolation Forest, our approach demonstrates higher detection accuracy and significantly reduces false positive rates. In addition to its performance, the model offers considerable computational efficiency with a training time of 200.52 s and memory usage of 3.02 megabytes. Full article

36 pages, 590 KB  
Review
Machine Translation in the Era of Large Language Models: A Survey of Historical and Emerging Problems
by Duygu Ataman, Alexandra Birch, Nizar Habash, Marcello Federico, Philipp Koehn and Kyunghyun Cho
Information 2025, 16(9), 723; https://doi.org/10.3390/info16090723 - 25 Aug 2025
Abstract
Historically regarded as one of the most challenging tasks presented to achieve complete artificial intelligence (AI), machine translation (MT) research has seen continuous devotion over the past decade, resulting in cutting-edge architectures for the modeling of sequential information. While the majority of statistical models traditionally relied on the idea of learning from parallel translation examples, recent research exploring self-supervised and multi-task learning methods extended the capabilities of MT models, eventually allowing the creation of general-purpose large language models (LLMs). In addition to versatility in providing translations useful across languages and domains, LLMs can in principle perform any natural language processing (NLP) task given sufficient amount of task-specific examples. While LLMs now reach a point where they can both replace and augment traditional MT models, the extent of their advantages and the ways in which they leverage translation capabilities across multilingual NLP tasks remains a wide area for exploration. In this literature survey, we present an introduction to the current position of MT research with a historical look at different modeling approaches to MT, how these might be advantageous for the solution of particular problems, and which problems are solved or remain open in regard to recent developments. We also discuss the connection of MT models leading to the development of prominent LLM architectures, how they continue to support LLM performance across different tasks by providing a means for cross-lingual knowledge transfer, and the redefinition of the task with the possibilities that LLM technology brings. Full article
(This article belongs to the Special Issue Human and Machine Translation: Recent Trends and Foundations)

22 pages, 8391 KB  
Article
Combine Virtual Reality and Machine-Learning to Identify the Presence of Dyslexia: A Cross-Linguistic Approach
by Michele Materazzini, Gianluca Morciano, José Manuel Alcalde-Llergo, Enrique Yeguas-Bolívar, Giuseppe Calabrò, Andrea Zingoni and Juri Taborri
Information 2025, 16(9), 719; https://doi.org/10.3390/info16090719 - 22 Aug 2025
Viewed by 99
Abstract
This study explores the use of virtual reality (VR) and artificial intelligence (AI) to predict the presence of dyslexia in Italian and Spanish university students. In particular, the research investigates whether VR-derived data from Silent Reading (SR) tests and self-esteem assessments can differentiate between students that are affected by dyslexia and students that are not, employing machine learning (ML) algorithms. Participants completed VR-based tasks measuring reading performance and self-esteem. A preliminary statistical analysis (t-tests and Mann–Whitney tests) on these data was performed, to compare the obtained scores between individuals with and without dyslexia, revealing significant differences in completion time for the SR test, but not in accuracy, nor in self-esteem. Then, supervised ML models were trained and tested, demonstrating an ability to classify the presence/absence of dyslexia with an accuracy of 87.5% for Italian, 66.6% for Spanish, and 75.0% for the pooled group. These findings suggest that VR and ML can effectively be used as supporting tools for assessing dyslexia, particularly by capturing differences in task completion speed, but language-specific factors may influence classification accuracy. Full article
(This article belongs to the Special Issue Machine Learning and Artificial Intelligence with Applications)

13 pages, 1423 KB  
Article
Quantifying “Medical Renal Disease”: A Pediatric Pilot Study Using Ultrasound Radiomics for Differentiating Acute Kidney Injury and Chronic Kidney Disease
by Laura De Leon-Benedetti, Laith R. Sultan, Hansel J. Otero, Tatiana Morales-Tisnés, Joya Sims, Kate Fitzpatrick, Julie C. Fitzgerald, Susan Furth, Benjamin L. Laskin and Bernarda Viteri
Diagnostics 2025, 15(16), 2112; https://doi.org/10.3390/diagnostics15162112 - 21 Aug 2025
Viewed by 199
Abstract
Background: Differentiating acute kidney injury (AKI) from chronic kidney disease (CKD) in children remains a critical unmet need due to the limitations of current clinical and biochemical markers. Conventional ultrasound lacks the sensitivity to discern subtle parenchymal alterations. This study explores the application of ultrasound radiomics—a novel, non-invasive, and quantitative image analysis method—for distinguishing AKI from CKD in pediatric patients. Methods: In this retrospective cross-sectional pilot study, kidney ultrasound images were obtained from 31 pediatric subjects: 8 with oliguric AKI, 14 with CKD, and 9 healthy controls. Renal parenchyma was manually segmented, and 124 advanced texture features were extracted using the open-source ©PyFeats. Features encompassed multiple categories (e.g., GLCM, GLSZM, WP). Statistical comparisons evaluated intergroup differences. Principal Component Analysis identified the top 10 most informative features, which were used to train supervised machine learning models. Model performance used five-fold cross-validation. Results: Radiomic analysis revealed significant intergroup differences (p < 0.05). CKD cases exhibited increased echogenicity and heterogeneity, particularly in GLCM and GLSZM features, consistent with chronic fibrosis. AKI cases displayed more homogeneous texture, likely reflecting edema or acute inflammation. While echogenicity separated diseased from healthy kidneys, it lacked specificity between AKI and CKD. Among ML models, XGBoost achieved the highest macro-averaged F1 score (0.90), followed closely by SVM and Random Forest, demonstrating strong classification performance. Conclusions: Radiomics-based texture analysis of grayscale ultrasound images effectively differentiated AKI from CKD in this pilot study, offering a promising, non-invasive imaging biomarker for pediatric kidney disease. These preliminary findings justify prospective validation in larger, multicenter cohorts. Full article
(This article belongs to the Special Issue Advanced Ultrasound Techniques in Diagnosis)
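
The abstract above reports a macro-averaged F1 score across three groups (AKI, CKD, healthy controls). As a minimal sketch of what that metric computes, assuming nothing about the authors' actual pipeline, the snippet below averages per-class F1 with equal weight per class. The labels are hypothetical and are not data from the study.

```python
def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 scores (each class counts equally)."""
    labels = sorted(set(y_true) | set(y_pred))
    scores = []
    for label in labels:
        pairs = list(zip(y_true, y_pred))
        tp = sum(1 for t, p in pairs if t == label and p == label)
        fp = sum(1 for t, p in pairs if t != label and p == label)
        fn = sum(1 for t, p in pairs if t == label and p != label)
        denom = 2 * tp + fp + fn
        # F1 = 2TP / (2TP + FP + FN); defined as 0 when the class never appears.
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Hypothetical labels for illustration only.
y_true = ["AKI", "AKI", "CKD", "CKD", "CKD", "control"]
y_pred = ["AKI", "CKD", "CKD", "CKD", "CKD", "control"]
score = macro_f1(y_true, y_pred)
print(round(score, 3))
```

Because every class carries the same weight, a small group such as the 8 AKI cases influences the score as much as the larger CKD group, which is one reason macro averaging is a common choice for small, uneven cohorts.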

36 pages, 6877 KB  
Article
Machine Learning for Reservoir Quality Prediction in Chlorite-Bearing Sandstone Reservoirs
by Thomas E. Nichols, Richard H. Worden, James E. Houghton, Joshua Griffiths, Christian Brostrøm and Allard W. Martinius
Geosciences 2025, 15(8), 325; https://doi.org/10.3390/geosciences15080325 - 19 Aug 2025
Viewed by 168
Abstract
We have developed a generalisable machine learning framework for reservoir quality prediction in deeply buried clastic systems. Applied to the Lower Jurassic deltaic sandstones of the Tilje Formation (Halten Terrace, North Sea), the approach integrates sedimentological facies modelling with mineralogical and petrophysical prediction in a single workflow. Using supervised Extreme Gradient Boosting (XGBoost) models, we classify reservoir facies, predict permeability directly from standard wireline log parameters (gamma ray, neutron porosity, caliper, photoelectric effect, bulk density, compressional and shear sonic, and deep resistivity), and estimate the abundance of porosity-preserving grain-coating chlorite. Model development and evaluation employed stratified K-fold cross-validation to preserve facies proportions and mineralogical variability across folds, supporting robust performance assessment and testing generalisability across a geologically heterogeneous dataset. Core description, point count petrography, and core plug analyses were used for ground truthing. The models distinguish chlorite-associated facies with up to 80% accuracy and estimate permeability with a mean absolute error of 0.782 log(mD), improving substantially on conventional regression-based approaches. The models also enable, for the first time using wireline logs, prediction of grain-coating chlorite abundance with a mean absolute error of 1.79% (range 0–16%). The framework takes advantage of diagnostic petrophysical responses associated with chlorite and high porosity, yielding geologically consistent and interpretable results. It addresses persistent challenges in characterising thinly bedded, heterogeneous intervals beyond the resolution of traditional methods and is transferable to other clastic reservoirs, including those considered for carbon storage and geothermal applications. The workflow supports cost-effective, high-confidence subsurface characterisation and contributes a flexible methodology for future work at the interface of geoscience and machine learning.
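
The abstract above relies on stratified K-fold cross-validation to keep facies proportions similar in every fold. The following is a rough sketch of that idea only: `stratified_kfold` is a hypothetical helper written for illustration, not the authors' code, and the facies labels are invented.

```python
import random
from collections import defaultdict


def stratified_kfold(labels, k, seed=0):
    """Return k lists of sample indices with each class spread evenly across folds."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)
    folds = [[] for _ in range(k)]
    for label in sorted(by_class):
        indices = by_class[label]
        rng.shuffle(indices)
        # Deal the shuffled indices of this class round-robin into the folds.
        for pos, idx in enumerate(indices):
            folds[pos % k].append(idx)
    return folds


# Hypothetical facies labels: 10 chlorite-associated samples, 20 others.
labels = ["chlorite"] * 10 + ["non_chlorite"] * 20
folds = stratified_kfold(labels, k=5)
for fold in folds:
    n_chlorite = sum(1 for i in fold if labels[i] == "chlorite")
    print(len(fold), n_chlorite)
```

Each fold then serves once as the held-out test set while the remaining folds are used for training, so every performance estimate is made on data with roughly the same class mix as the full dataset.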

25 pages, 1872 KB  
Article
Food Safety Risk Prediction and Regulatory Policy Enlightenment Based on Machine Learning
by Daqing Wu, Hangqi Cai and Tianhao Li
Systems 2025, 13(8), 715; https://doi.org/10.3390/systems13080715 - 19 Aug 2025
Viewed by 203
Abstract
This paper focuses on the challenges in food safety governance in megacities, taking Shanghai as the research object. Aiming at the pain points in food sampling inspections, it proposes a risk prediction and regulatory optimization scheme combining text mining and machine learning. First, the paper uses the LDA method to conduct in-depth mining on over 78,000 pieces of food sampling data across 34 categories in Shanghai, so as to identify core risk themes. Second, it applies SMOTE oversampling to the sampling data with an extremely low unqualified rate (0.5%). Finally, a machine learning prediction model for food safety risks is constructed, and predictions are made based on this model. The research findings are as follows: ① Food risks in Shanghai show significant characteristics in terms of time, category, and pollution causes. ② Supply chain links, regulatory intensity, and consumption scenarios are among the core influencing factors. ③ The traditional “full coverage” model is inefficient, and resources need to be tilted toward high-risk categories. ④ Public attention (e.g., the “You Order, We Inspect” initiative) can drive regulatory responses to improve the qualified rate. Based on these findings, this paper suggests that relevant authorities should ① classify three levels of risks for categories, increase inspection frequency for high-risk products in summer, adjust sampling intensity for different business entities, and establish a dynamic hierarchical regulatory mechanism; ② tackle source governance, reduce environmental pollution, upgrade process supervision, and strengthen whole-chain risk prevention and control; and ③ promote public participation, strengthen the enterprise responsibility system, and deepen the social co-governance pattern. This study effectively addresses the risk early warning problems in food safety supervision of megacities, providing a scientific basis and practical path for optimizing the allocation of regulatory resources and improving governance efficiency.
(This article belongs to the Topic Digital Technologies in Supply Chain Risk Management)
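
The abstract above applies SMOTE oversampling to a dataset in which only about 0.5% of samples are unqualified. As a simplified sketch of the core SMOTE idea (synthetic minority points are interpolated between a minority sample and one of its nearest minority neighbours), the helper below is hypothetical and uses invented two-dimensional points; it is not the paper's implementation.

```python
import random


def smote(minority, n_new, k=2, seed=0):
    """Create n_new synthetic points by interpolating toward near minority neighbours."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # Rank the other minority points by squared Euclidean distance to base.
        others = [p for p in minority if p is not base]
        others.sort(key=lambda p: sum((a - b) ** 2 for a, b in zip(base, p)))
        neighbour = rng.choice(others[:k])
        gap = rng.random()  # position along the segment base -> neighbour
        synthetic.append(tuple(a + gap * (b - a) for a, b in zip(base, neighbour)))
    return synthetic


# Hypothetical minority-class feature vectors (e.g. unqualified samples).
minority = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
new_points = smote(minority, n_new=6)
print(len(new_points))
```

Oversampling of this kind is normally applied to the training split only, so that synthetic points never leak into the data used to evaluate the model.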

14 pages, 521 KB  
Article
A Machine Learning Approach to Predict Successful Trans-Ventricular Off-Pump Micro-Invasive Mitral Valve Repair
by Alessandro Vairo, Caterina Russo, Andrea Saglietto, Rino Andrea Cimino, Marco Pocar, Cristina Barbero, Andrea Costamagna, Gaetano Maria De Ferrari, Mauro Rinaldi and Stefano Salizzoni
J. Clin. Med. 2025, 14(16), 5863; https://doi.org/10.3390/jcm14165863 - 19 Aug 2025
Viewed by 295
Abstract
Background: The NeoChord procedure is a trans-ventricular, echo-guided, beating-heart mitral valve (MV) repair technique used to treat degenerative mitral regurgitation (MR) caused by leaflet prolapse and/or flail. Objectives: This study aimed to develop a machine learning (ML) scoring system using pre-procedural clinical and echocardiographic variables to predict the success of the NeoChord procedure—defined as less than moderate MR at follow-up. Methods: A total of 80 patients were included. Preoperative MV anatomical parameters were assessed using three-dimensional (3D) transesophageal echocardiography and analyzed with dedicated post-processing software (QLAB software, version 15.0, Philips Healthcare, Amstelveen, NL, The Netherlands). Two supervised ML models (random forest and decision tree) were trained on the dataset, with hyperparameters optimized via 10-fold cross-validation. The random forest model also provided a variable importance ranking using a filter-based method. Key predictors identified by the models included age, flail gap, early systolic mitral valve area, and indexed left atrial volume. Results: The mean and median cross-validated area under the curve of the ML models were 0.79 and 0.83 for the random forest model and 0.72 and 0.77 for the decision tree model, respectively. Conclusions: A machine learning approach integrating clinical and 3D echocardiographic parameters can effectively predict mid-term procedural success of the NeoChord technique. This method may support future preoperative patient selection, pending validation in larger cohorts. Full article
(This article belongs to the Special Issue Mitral Valve Surgery: Current Status and Future Challenges)

22 pages, 3665 KB  
Article
Comparative Study of Linear and Non-Linear ML Algorithms for Cement Mortar Strength Estimation
by Sebghatullah Jueyendah, Zeynep Yaman, Turgay Dere and Türker Fedai Çavuş
Buildings 2025, 15(16), 2932; https://doi.org/10.3390/buildings15162932 - 19 Aug 2025
Viewed by 202
Abstract
The compressive strength (Fc) of cement mortar (CM) is a key parameter in ensuring the mechanical reliability and durability of cement-based materials. Traditional testing methods are labor-intensive, time-consuming, and often lack predictive flexibility. With the increasing adoption of machine learning (ML) in civil engineering, data-driven approaches offer a rapid, cost-effective alternative for forecasting material properties. This study investigates a wide range of supervised linear and nonlinear ML regression models to predict the Fc of CM. The evaluated models include linear regression, ridge regression, lasso regression, decision trees, random forests, gradient boosting, k-nearest neighbors (KNN), and twelve neural network (NN) architectures, developed by combining different optimizers (L-BFGS, Adam, and SGD) with activation functions (tanh, relu, logistic, and identity). Model performance was assessed using the root mean squared error (RMSE), coefficient of determination (R2), and mean absolute error (MAE). Among all models, NN_tanh_lbfgs achieved the best results, with an almost perfect fit in training (R2 = 0.9999, RMSE = 0.0083, MAE = 0.0063) and excellent generalization in testing (R2 = 0.9946, RMSE = 1.5032, MAE = 1.2545). NN_logistic_lbfgs, gradient boosting, and NN_relu_lbfgs also exhibited high predictive accuracy and robustness. The SHAP analysis revealed that curing age and nano silica/cement ratio (NS/C) positively influence Fc, while porosity has the strongest negative impact. The main novelty of this study lies in the systematic tuning of neural networks via distinct optimizer–activation combinations, and the integration of SHAP for interpretability—bridging the gap between predictive performance and explainability in cementitious materials research. These results confirm the NN_tanh_lbfgs as a highly reliable model for estimating Fc in CM, offering a robust, interpretable, and scalable solution for data-driven strength prediction. Full article
(This article belongs to the Special Issue Advanced Research on Concrete Materials in Construction)
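
The abstract above compares models using RMSE, the coefficient of determination (R²), and MAE. A minimal sketch of how these three metrics are computed from predictions follows; the strength values are hypothetical and do not come from the study.

```python
import math


def regression_metrics(y_true, y_pred):
    """Return (rmse, r2, mae) for paired observed and predicted values."""
    n = len(y_true)
    errors = [t - p for t, p in zip(y_true, y_pred)]
    mae = sum(abs(e) for e in errors) / n
    ss_res = sum(e * e for e in errors)          # residual sum of squares
    rmse = math.sqrt(ss_res / n)
    mean_true = sum(y_true) / n
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)  # total sum of squares
    r2 = 1.0 - ss_res / ss_tot
    return rmse, r2, mae


# Hypothetical compressive strengths (observed vs. predicted).
y_true = [20.0, 30.0, 40.0, 50.0]
y_pred = [22.0, 29.0, 41.0, 48.0]
rmse, r2, mae = regression_metrics(y_true, y_pred)
print(round(rmse, 4), round(r2, 4), round(mae, 4))
```

Comparing these values between the training and test sets, as the abstract does, is what reveals how much of a near-perfect training fit carries over to unseen data.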

17 pages, 832 KB  
Article
Supervised Machine Learning Algorithms for Fitness-Based Cardiometabolic Risk Classification in Adolescents
by Rodrigo Yáñez-Sepúlveda, Rodrigo Olivares, Pablo Olivares, Juan Pablo Zavala-Crichton, Claudio Hinojosa-Torres, Frano Giakoni-Ramírez, Josivaldo de Souza-Lima, Matías Monsalves-Álvarez, Marcelo Tuesta, Jacqueline Páez-Herrera, Jorge Olivares-Arancibia, Tomás Reyes-Amigo, Guillermo Cortés-Roco, Juan Hurtado-Almonacid, Eduardo Guzmán-Muñoz, Nicole Aguilera-Martínez, José Francisco López-Gil and Vicente Javier Clemente-Suárez
Sports 2025, 13(8), 273; https://doi.org/10.3390/sports13080273 - 18 Aug 2025
Viewed by 242
Abstract
Background: Cardiometabolic risk in adolescents represents a growing public health concern that is closely linked to modifiable factors such as physical fitness. Traditional statistical approaches often fail to capture complex, nonlinear relationships among anthropometric and fitness-related variables. Objective: To develop and evaluate supervised machine learning algorithms, including artificial neural networks and ensemble methods, for classifying cardiometabolic risk levels among Chilean adolescents based on standardized physical fitness assessments. Methods: A cross-sectional analysis was conducted using a large representative sample of school-aged adolescents. Field-based physical fitness tests, such as cardiorespiratory fitness (in terms of estimated maximal oxygen consumption [VO2max]), muscular strength (push-ups), and explosive power (horizontal jump) testing, were used as input variables. A cardiometabolic risk index was derived using international criteria. Various supervised machine learning models were trained and compared regarding accuracy, F1 score, recall, and area under the receiver operating characteristic curve (AUC-ROC). Results: Among all the models tested, the gradient boosting classifier achieved the best overall performance, with an accuracy of 77.0%, an F1 score of 67.3%, and the highest AUC-ROC (0.601). These results indicate a strong balance between sensitivity and specificity in classifying adolescents at cardiometabolic risk. Horizontal jumps and push-ups emerged as the most influential predictive variables. Conclusions: Gradient boosting proved to be the most effective model for predicting cardiometabolic risk based on physical fitness data. This approach offers a practical, data-driven tool for early risk detection in adolescent populations and may support scalable screening efforts in educational and clinical settings. Full article
(This article belongs to the Special Issue Fostering Sport for a Healthy Life)

28 pages, 2314 KB  
Article
Identifying Key Drivers of Foodborne Diseases in Zhejiang, China: A Machine Learning Approach
by Cangyu Jin, Xiaojuan Qi, Jikai Wang, Lili Chen, Jiang Chen and Han Yin
Foods 2025, 14(16), 2857; https://doi.org/10.3390/foods14162857 - 18 Aug 2025
Viewed by 215
Abstract
Foodborne diseases represent a significant public health challenge worldwide. This study systematically analyzed the temporal dynamics, key predictors, and seasonal patterns of pathogen-specific foodborne diseases using a dataset of 56,970 cases from Zhejiang Province, China, spanning 2014 to 2023. A comprehensive set of 91 candidate variables was constructed by integrating epidemiological, environmental, socioeconomic, and agricultural data. Lasso regression was employed to identify 41 important predictors. Based on these variables, supervised machine learning models (Random Forest and XGBoost) were trained and evaluated, achieving training set classification accuracies of 86% and 87%, respectively, demonstrating robust performance. Feature importance analysis revealed that patient age, food type, climate policy, and processing methods were the most influential determinants, highlighting the combined impact of host, exposure, and environmental factors on disease risk. The results demonstrated significant shifts in the pathogen spectrum over the past decade, including a steady decline in Vibrio parahaemolyticus, an increase in Salmonella after 2016, and persistent seasonal peaks in Norovirus and Vibrio parahaemolyticus during warmer months. Seasonal ARIMA modeling and time-series decomposition further confirmed the critical role of seasonal and trend components in bacterial incidence. Overall, this study demonstrates the value of integrating machine learning and time-series analysis for pathogen-specific surveillance, risk prediction, and targeted public health interventions. Full article
(This article belongs to the Special Issue Emerging Challenges in the Management of Food Safety and Authenticity)

54 pages, 1637 KB  
Article
MICRA: A Modular Intelligent Cybersecurity Response Architecture with Machine Learning Integration
by Alessandro Carvalho Coutinho and Luciano Vieira de Araújo
J. Cybersecur. Priv. 2025, 5(3), 60; https://doi.org/10.3390/jcp5030060 - 16 Aug 2025
Viewed by 514
Abstract
The growing sophistication of cyber threats has posed significant challenges for organizations in terms of accurately detecting and responding to incidents in a coordinated manner. Despite advances in the application of machine learning and automation, many solutions still face limitations such as high false positive rates, low scalability, and difficulties in interorganizational cooperation. This study presents MICRA (Modular Intelligent Cybersecurity Response Architecture), a modular conceptual proposal that integrates dynamic data acquisition, cognitive threat analysis, multi-layer validation, adaptive response orchestration, and collaborative intelligence sharing. The architecture consists of six interoperable modules and incorporates techniques such as supervised learning, heuristic analysis, and behavioral modeling. The modules are designed for operation in diverse environments, including corporate networks, educational networks, and critical infrastructures. MICRA seeks to establish a flexible and scalable foundation for proactive cyber defense, reconciling automation, collaborative intelligence, and adaptability. This proposal aims to support future implementations and research on incident response and cyber resilience in complex operational contexts. Full article
(This article belongs to the Collection Machine Learning and Data Analytics for Cyber Security)

30 pages, 5536 KB  
Article
Explainable Artificial Intelligence for the Rapid Identification and Characterization of Ocean Microplastics
by Dimitris Kalatzis, Angeliki I. Katsafadou, Eleni I. Katsarou, Dimitrios C. Chatzopoulos and Yiannis Kiouvrekis
Microplastics 2025, 4(3), 51; https://doi.org/10.3390/microplastics4030051 - 14 Aug 2025
Viewed by 436
Abstract
Accurate identification of microplastic polymers in marine environments is essential for tracing pollution sources, understanding ecological impacts, and guiding mitigation strategies. This study presents a comprehensive, explainable-AI framework that uses Raman spectroscopy to classify pristine and weathered microplastics versus biological materials. Using a curated spectral library of 78 polymer specimens (including pristine, weathered, and biological materials), we benchmark seven supervised machine learning models (Decision Trees, Random Forest, k-Nearest Neighbours, Neural Networks, LightGBM, XGBoost and Support Vector Machines) without and with Principal Component Analysis for binary classification. Although k-Nearest Neighbours and Support Vector Machines achieved the highest single-metric accuracy (82.5%), k-NN also recorded the highest recall both with and without PCA, thereby offering the most balanced overall performance. To enhance interpretability, we employed SHapley Additive exPlanations, which revealed chemically meaningful spectral regions (notably near 700 cm⁻¹ and 1080 cm⁻¹) as critical to model predictions. Notably, models trained without Principal Component Analysis provided clearer feature attributions, suggesting improved interpretability in raw spectral space. This pipeline surpasses traditional spectral matching techniques and also delivers transparent insights into classification logic. Our findings can support scalable, real-time deployment of AI-based tools for oceanic microplastic monitoring and environmental policy development.

45 pages, 59922 KB  
Article
Machine Learning Applied to Professional Football: Performance Improvement and Results Prediction
by Diego Moya, Christian Tipantuña, Génesis Villa, Xavier Calderón-Hinojosa, Belén Rivadeneira and Robin Álvarez
Mach. Learn. Knowl. Extr. 2025, 7(3), 85; https://doi.org/10.3390/make7030085 - 14 Aug 2025
Viewed by 938
Abstract
This paper examines the integration of machine learning (ML) techniques in professional football, focusing on two key areas: (i) player and team performance, and (ii) match outcome prediction. Using a systematic methodology, this study reviews 172 papers from a five-year observation period (2019–2024) to identify relevant applications, focusing on the analysis of game actions (free kicks, passes, and penalties), individual and collective performance, and player position. A predominance of supervised learning, deep learning, and hybrid models (which integrate several ML techniques) is observed in the ML categories. Among the most widely used algorithms are decision trees, extreme gradient boosting, and artificial neural networks, which focus on optimizing sports performance and predicting outcomes. This paper discusses challenges such as the limited availability of public datasets due to access and cost restrictions, the restricted use of advanced visualization tools, and the poor integration of data acquisition devices, such as sensors. However, it also highlights the role of ML in addressing these challenges, thereby representing future research opportunities. Furthermore, this paper includes two illustrative case studies: (i) predicting the date Cristiano Ronaldo will reach 1000 goals, and (ii) an example of predicting penalty shots; these examples demonstrate the practical potential of ML for performance monitoring and tactical decision-making in real-world football environments.

11 pages, 586 KB  
Article
Fibroblast Activation Protein (FAP) as a Serum Biomarker for Fibrotic Ovarian Aging: A Clinical Validation Study Based on Translational Transcriptomic Targets
by Hyun Joo Lee, Yunju Jo, Shibo Wei, Eun Hee Yu, Sul Lee, Dongryeol Ryu and Jong Kil Joo
Int. J. Mol. Sci. 2025, 26(16), 7807; https://doi.org/10.3390/ijms26167807 - 13 Aug 2025
Viewed by 223
Abstract
Chronological age is an imprecise proxy for reproductive capacity, necessitating biomarkers that reflect the underlying pathophysiology of the ovary. Fibrotic remodeling of the ovarian stroma is a key hallmark of biological ovarian aging, yet it cannot be assessed by current clinical tools. This study aimed to identify and validate a novel serum biomarker for fibrotic ovarian aging by applying supervised machine learning (ML) to human ovarian transcriptomic data. Transcriptomic data from the Genotype-Tissue Expression (GTEx) database were analyzed using ML algorithms to identify candidate genes predictive of ovarian aging, and finally, fibroblast activation protein (FAP) and collectin-11 (COLEC11) were selected for clinical validation. In a cross-sectional study, serum levels of FAP and COLEC11, along with key hormonal indices, were measured in two nested patient cohorts, and their associations with ovarian reserve and clinical parameters were analyzed. Serum FAP levels did not correlate with age but showed a strong inverse correlation with anti-Müllerian hormone (AMH) (r = −0.61, p = 0.001), a finding accentuated in women with decreased ovarian reserve (DOR). While COLEC11 correlated with age, it failed to differentiate DOR status. FAP levels were independent of central hormonal regulation, consistent with preclinical fibrotic models. Circulating FAP reflects age-independent, fibrotic ovarian aging, offering stromal-specific information not captured by conventional hormonal markers. This study provides the first clinical validation of FAP as a biomarker for ovarian stromal aging, holding potential for improved reproductive risk assessment.
(This article belongs to the Section Molecular Biology)