Search Results (792)

Search Parameters:
Keywords = SHAP (Shapley additive explanations)

16 pages, 1600 KiB  
Article
Leveraging Neural ODEs for Population Pharmacokinetics of Dalbavancin in Sparse Clinical Data
by Tommaso Giacometti, Ettore Rocchi, Pier Giorgio Cojutti, Federico Magnani, Daniel Remondini, Federico Pea and Gastone Castellani
Entropy 2025, 27(6), 602; https://doi.org/10.3390/e27060602 - 5 Jun 2025
Abstract
This study investigates the use of Neural Ordinary Differential Equations (NODEs) as an alternative to traditional compartmental models and Nonlinear Mixed-Effects (NLME) models for drug concentration prediction in pharmacokinetics. Unlike standard models, which rely on strong assumptions and often struggle with high-dimensional covariate relationships, NODEs offer a data-driven approach, learning differential equations directly from data while integrating covariates. To evaluate their performance, NODEs were applied to a real-world Dalbavancin pharmacokinetic dataset comprising 218 patients and compared against a two-compartment model and an NLME model within a cross-validation framework, ensuring a robust evaluation. Given the challenge of limited data availability, a data augmentation strategy was employed to pre-train the NODEs. Their predictive performance was assessed both with and without covariates, while model explainability was analyzed using Shapley additive explanations (SHAP) values. Results show that, in the absence of covariates, NODEs performed comparably to state-of-the-art NLME models. However, when covariates were incorporated, NODEs demonstrated superior predictive accuracy. SHAP analyses further revealed how NODEs leverage covariates in their predictions. These results establish NODEs as a promising alternative for pharmacokinetic modeling, particularly in capturing complex covariate interactions even in sparse and small datasets, paving the way for improved drug concentration predictions and personalized treatment strategies in precision medicine.
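
As a rough illustration of the NODE approach this abstract summarizes (not the authors' code), here is a minimal sketch assuming the open-source torchdiffeq package; the state dimensions, covariates, and dosing values are hypothetical placeholders.

```python
import torch
import torch.nn as nn
from torchdiffeq import odeint  # pip install torchdiffeq


class PKODEFunc(nn.Module):
    """dy/dt = f(y, covariates): a small network replaces hand-written
    compartmental rate equations and conditions on patient covariates."""

    def __init__(self, state_dim=2, cov_dim=4, hidden=32):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(state_dim + cov_dim, hidden),
            nn.Tanh(),
            nn.Linear(hidden, state_dim),
        )
        self.covariates = None  # set per patient before solving

    def forward(self, t, y):
        return self.net(torch.cat([y, self.covariates], dim=-1))


func = PKODEFunc()
func.covariates = torch.tensor([[70.0, 1.0, 60.0, 0.9]])  # e.g. weight, sex, age, renal marker (hypothetical)
y0 = torch.tensor([[10.0, 0.0]])        # initial amounts in two compartments (hypothetical dose)
t_obs = torch.linspace(0.0, 24.0, 13)   # sparse sampling grid in hours

pred = odeint(func, y0, t_obs)          # trajectory of shape (13, 1, 2)
obs = torch.zeros(13)                   # placeholder for measured concentrations
loss = nn.functional.mse_loss(pred[:, 0, 0], obs)  # train by backpropagating through the solver
```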

23 pages, 6569 KiB  
Article
Comparative Analysis of the Impact of Built Environment and Land Use on Monthly and Annual Mean PM2.5 Levels
by Anjian Song, Zhenbao Wang, Shihao Li and Xinyi Chen
Atmosphere 2025, 16(6), 682; https://doi.org/10.3390/atmos16060682 - 5 Jun 2025
Abstract
Urban planners are progressively recognizing the significant effects of the built environment and land use on PM2.5 levels. However, in analyzing the drivers of PM2.5 levels, researchers’ reliance on annual and seasonal means may overlook monthly variations in PM2.5, potentially impeding accurate predictions during periods of high pollution. This study focuses on the area within the Sixth Ring Road of Beijing, China. It utilizes gridded monthly and annual mean PM2.5 data from 2019 as the dependent variable and selects 33 independent variables covering the built environment and land use. The Extreme Gradient Boosting (XGBoost) method is employed to reveal the driving impacts of the built environment and land use on PM2.5 levels. To enhance model accuracy and address the randomness in the division of training and testing sets, we conducted twenty comparisons for each month. We employed Shapley Additive Explanations (SHAP) and Partial Dependence Plots (PDPs) to interpret the models’ results and analyze the interactions between the explanatory variables. The results indicate that models incorporating both the built environment and land use outperformed those that considered only a single aspect. Notably, in the test set for April, the R2 value reached 0.78. The fitting accuracy for the high-pollution months of February, April, and November is higher than for the annual mean, while July shows the opposite trend. The coefficient of variation for the importance rankings of the seven key explanatory variables exceeds 30% for both monthly and annual means. Among these variables, building density exhibits the highest coefficient of variation, at 123%. Building density and parking lot density demonstrate strong explanatory power for most months and exhibit significant interactions with other variables. Land use factors such as wetlands fraction, croplands fraction, park and greenspace fraction, and forests fraction have significant driving effects during the summer and autumn months. Analyzing these drivers at monthly time scales supports more effective PM2.5 reduction and is essential for developing refined urban planning strategies that foster healthier urban environments.
(This article belongs to the Special Issue Modeling and Monitoring of Air Quality: From Data to Predictions)
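
For readers wanting a concrete starting point, a minimal sketch of this XGBoost-plus-SHAP-plus-PDP workflow follows; the synthetic features stand in for the study's 33 built-environment and land-use variables, and all names and values are illustrative.

```python
import numpy as np
import shap
import xgboost as xgb
from sklearn.inspection import PartialDependenceDisplay
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
X = rng.normal(size=(500, 5))                               # stand-in for the 33 predictors
y = 35 + 4 * X[:, 0] - 2 * X[:, 1] + rng.normal(size=500)   # stand-in for monthly mean PM2.5

# The study repeated the split twenty times to damp split randomness;
# a single split is shown here.
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.2, random_state=0)
model = xgb.XGBRegressor(n_estimators=300, max_depth=4).fit(X_tr, y_tr)

# Per-feature SHAP contributions on the held-out set.
shap_values = shap.TreeExplainer(model).shap_values(X_te)
shap.summary_plot(shap_values, X_te, show=False)

# Partial dependence of the prediction on feature 0 (e.g. building density).
PartialDependenceDisplay.from_estimator(model, X_te, features=[0])
```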

28 pages, 4113 KiB  
Article
Building Electricity Prediction Using BILSTM-RF-XGBOOST Hybrid Model with Improved Hyperparameters Based on Bayesian Algorithm
by Yuqing Liu, Binbin Li and Hejun Liang
Electronics 2025, 14(11), 2287; https://doi.org/10.3390/electronics14112287 - 4 Jun 2025
Abstract
Accurate building energy consumption prediction is essential for efficient energy management and optimization. This study uses bidirectional long short-term memory (BiLSTM) to automatically extract deep time-series features, then combines the nonlinear fitting and high-precision prediction capabilities of Random Forest (RF) and XGBoost to develop a BiLSTM-RF-XGBoost stacked hybrid model. To enhance model generalization and reduce overfitting, a Bayesian algorithm with an early stopping mechanism is used to fine-tune hyperparameters, and strict K-fold time series cross-validation (TSCV) is implemented for performance evaluation. The hybrid model achieves a high TSCV average R2 value of 0.989 during cross-validation. When evaluated on an independent test set, it yields a mean square error (MSE) of 0.00003, a root mean square error (RMSE) of 0.00548, a mean absolute error (MAE) of 0.00130, and a mean absolute percentage error (MAPE) of 0.26%. These values are significantly lower than those of the comparison models, indicating a significant improvement in predictive performance. The study offers insights into the internal decision-making of the model through SHAP (SHapley Additive exPlanations) feature importance analysis, revealing the key roles of temperature and power lag features and validating that the stacked model effectively uses the outputs of its base models as meta-features. This study contributes a novel hybrid model trained with Bayesian optimization, analyzes the influence of various feature factors, and provides an innovative technological solution for building energy consumption prediction, along with theoretical value and guidance for low-carbon building energy management.
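
A minimal sketch of strict time-series cross-validation over a stacked ensemble, in the spirit of the abstract above; the base learners are simplified stand-ins (no BiLSTM or Bayesian tuning shown), and the data are synthetic.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor, StackingRegressor
from sklearn.linear_model import Ridge
from sklearn.model_selection import TimeSeriesSplit, cross_val_score
from xgboost import XGBRegressor

rng = np.random.default_rng(1)
X = rng.normal(size=(1000, 6))                    # lagged power/temperature features (hypothetical)
y = 0.8 * X[:, 0] + rng.normal(scale=0.1, size=1000)

stack = StackingRegressor(
    estimators=[("rf", RandomForestRegressor(n_estimators=100, random_state=0)),
                ("xgb", XGBRegressor(n_estimators=200))],
    final_estimator=Ridge(),                      # meta-learner over base-model predictions
    cv=TimeSeriesSplit(n_splits=3),               # meta-features built without future leakage
)

# Every validation fold lies strictly after its training fold, so the model
# is never scored on data that precedes what it was fit on.
scores = cross_val_score(stack, X, y, cv=TimeSeriesSplit(n_splits=5), scoring="r2")
print(scores.mean())
```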

34 pages, 4567 KiB  
Article
Predictive Models with Applicable Graphical User Interface (GUI) for the Compressive Performance of Quaternary Blended Plastic-Derived Sustainable Mortar
by Aïssa Rezzoug, Ahmed A. Abdou Elabbasy, Muwaffaq Alqurashi and Ali H. AlAteah
Buildings 2025, 15(11), 1932; https://doi.org/10.3390/buildings15111932 - 3 Jun 2025
Abstract
Machine learning (ML) models in material science and construction engineering have significantly improved predictive accuracy and decision making. However, the practical implementation of these models often requires technical expertise, limiting their accessibility for engineers and practitioners. A user-friendly graphical user interface (GUI) can be an essential tool to bridge this gap. In this study, a sustainable approach to improving the compressive strength (C.S) of plastic-based mortar mixes (PMMs) by replacing cement with industrial waste materials was investigated using ML models such as support vector machine, AdaBoost regressor, and extreme gradient boosting. The significance of key mix parameters was further analyzed using SHapley Additive exPlanations (SHAP) to interpret the influence of input variables on model predictions. To enhance the usability and real-world application of these ML models, a GUI was developed to provide an accessible platform for predicting the C.S of PMMs based on input material proportions. The ML models demonstrated strong correlations with experimental results, and the insights from the SHAP analysis further support data-driven mix design strategies. The developed GUI serves as a practical and scalable decision support system, encouraging the adoption of ML-based approaches in sustainable construction engineering.
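
A toy sketch of the GUI idea this abstract describes, using Python's built-in tkinter with a stand-in regressor; the field names, unit, and model are hypothetical illustrations, not the paper's actual mix parameters.

```python
import tkinter as tk

import numpy as np
from sklearn.ensemble import GradientBoostingRegressor

# Stand-in model: in the paper this would be the trained SVM/AdaBoost/XGBoost.
rng = np.random.default_rng(0)
X_demo = rng.uniform(size=(100, 3))
model = GradientBoostingRegressor().fit(X_demo, X_demo @ np.array([30.0, -5.0, 10.0]) + 20)

root = tk.Tk()
root.title("PMM compressive strength (demo)")

entries = []
for i, label in enumerate(["Cement fraction", "Plastic fraction", "Waste fraction"]):
    tk.Label(root, text=label).grid(row=i, column=0)
    entry = tk.Entry(root)
    entry.grid(row=i, column=1)
    entries.append(entry)

result = tk.Label(root, text="C.S: ?")
result.grid(row=4, columnspan=2)


def predict():
    # Read the mix proportions and display the model's strength estimate.
    x = [[float(e.get()) for e in entries]]
    result.config(text=f"C.S: {model.predict(x)[0]:.2f} MPa")


tk.Button(root, text="Predict", command=predict).grid(row=3, columnspan=2)
root.mainloop()
```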

25 pages, 1344 KiB  
Article
Customer-Centric Decision-Making with XAI and Counterfactual Explanations for Churn Mitigation
by Simona-Vasilica Oprea and Adela Bâra
J. Theor. Appl. Electron. Commer. Res. 2025, 20(2), 129; https://doi.org/10.3390/jtaer20020129 - 3 Jun 2025
Abstract
In this paper, we propose a methodology designed to deliver actionable insights that help businesses retain customers. While Machine Learning (ML) techniques predict whether a customer is likely to churn, this alone is not enough. Explainable Artificial Intelligence (XAI) methods, such as SHapley Additive Explanations (SHAP) and Local Interpretable Model-agnostic Explanations (LIME), highlight the features influencing the prediction, but businesses need strategies to prevent churn. Counterfactual (CF) explanations bridge this gap by identifying the minimal changes in the business–customer relationship that could shift an outcome from churn to retention, offering steps to enhance customer loyalty and reduce losses to competitors. These explanations might not fully align with business constraints; however, alternative scenarios can be developed to achieve the same objective. Among the six classifiers used to detect churn cases, the Balanced Random Forest classifier was selected for its superior performance, achieving the highest recall score of 0.72. After classification, Diverse Counterfactual Explanations with ML (DiCEML) through Mixed-Integer Linear Programming (MILP) is applied to obtain the required changes in the features, as well as in the range permitted by the business itself. We further apply DiCEML to uncover potential biases within the model, calculating the disparate impact of some features.
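
A minimal sketch of counterfactual generation for churn with the open-source dice-ml and imbalanced-learn packages; dice-ml's "random" search is used here as a stand-in for the paper's MILP-based DiCEML, and all column names and ranges are hypothetical.

```python
import numpy as np
import pandas as pd
import dice_ml
from imblearn.ensemble import BalancedRandomForestClassifier

rng = np.random.default_rng(0)
df = pd.DataFrame({
    "tenure_months": rng.integers(1, 72, 500),
    "monthly_charges": rng.uniform(20, 120, 500),
    "support_calls": rng.integers(0, 10, 500),
})
df["churn"] = ((df.monthly_charges > 80) & (df.tenure_months < 24)).astype(int)

clf = BalancedRandomForestClassifier(n_estimators=200, random_state=0)
clf.fit(df.drop(columns="churn"), df.churn)

data = dice_ml.Data(dataframe=df,
                    continuous_features=["tenure_months", "monthly_charges", "support_calls"],
                    outcome_name="churn")
model = dice_ml.Model(model=clf, backend="sklearn")
explainer = dice_ml.Dice(data, model, method="random")

# Minimal changes flipping a predicted churner to "retained", restricted to
# features the business can actually act on and to a business-permitted range.
query = df.drop(columns="churn").iloc[[0]]
cfs = explainer.generate_counterfactuals(
    query, total_CFs=3, desired_class="opposite",
    features_to_vary=["monthly_charges", "support_calls"],
    permitted_range={"monthly_charges": [20, 80]},
)
cfs.visualize_as_dataframe()
```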

21 pages, 3798 KiB  
Article
Nondestructive Detection of Rice Milling Quality Using Hyperspectral Imaging with Machine and Deep Learning Regression
by Zhongjie Tang, Shanlin Ma, Hengnian Qi, Xincheng Zhang and Chu Zhang
Foods 2025, 14(11), 1977; https://doi.org/10.3390/foods14111977 - 3 Jun 2025
Abstract
The brown rice rate (BRR), milled rice rate (MRR), and head rice rate (HRR) are important indicators of rice milling quality. The simultaneous detection of these three metrics holds significant economic value for rice milling quality assessments. In this study, hyperspectral imaging was employed to estimate the rice milling quality attributes of two rice varieties (Xiushui121 and Zhehujing26). Partial Least Squares Regression (PLSR), Support Vector Regression (SVR), Convolutional Neural Networks (CNNs), and Backpropagation Neural Networks (BPNNs) were used to establish both single-task and multi-task models for the prediction of milling quality attributes. Most multi-task models demonstrated a higher prediction accuracy compared with their corresponding single-task models. Among single-task models, BPNNs outperformed the others in predicting BRR and HRR, with correlation coefficients (r) up to 0.9. SVR excelled in forecasting the MRR. In multi-task learning, BPNNs exhibited relatively better performance, with r values exceeding 0.81 for all three indicators. SHapley Additive exPlanations (SHAP) analysis was used to explore the relationship between wavelength and rice milling quality attributes. This study confirmed that this nondestructive detection method for rice milling quality using hyperspectral imaging combined with machine learning and deep learning algorithms could effectively assess rice milling quality, thus contributing to breeding and growth management in the industry.
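
A minimal sketch of the single-task versus multi-task contrast using PLSR, one of the models named above; the synthetic "spectra" and attribute targets are placeholders for the hyperspectral data.

```python
import numpy as np
from sklearn.cross_decomposition import PLSRegression
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 256))                     # 256 spectral bands (hypothetical)
W = rng.normal(size=(256, 3))
Y = X @ W + rng.normal(scale=5.0, size=(200, 3))    # columns play the role of BRR, MRR, HRR

X_tr, X_te, Y_tr, Y_te = train_test_split(X, Y, random_state=0)

# Single-task: one PLSR model per quality attribute.
single = [PLSRegression(n_components=10).fit(X_tr, Y_tr[:, i]) for i in range(3)]
print([m.score(X_te, Y_te[:, i]) for i, m in enumerate(single)])

# Multi-task: one model predicts all three attributes from shared latent
# components, the setup the abstract reports as generally more accurate.
multi = PLSRegression(n_components=10).fit(X_tr, Y_tr)
print(multi.score(X_te, Y_te))
```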

29 pages, 3274 KiB  
Article
Enhancing Heart Attack Prediction: Feature Identification from Multiparametric Cardiac Data Using Explainable AI
by Muhammad Waqar, Muhammad Bilal Shahnawaz, Sajid Saleem, Hassan Dawood, Usman Muhammad and Hussain Dawood
Algorithms 2025, 18(6), 333; https://doi.org/10.3390/a18060333 - 2 Jun 2025
Abstract
Heart attack is a leading cause of mortality, necessitating timely and precise diagnosis to improve patient outcomes. However, timely diagnosis remains a challenge due to the complex and nonlinear relationships between clinical indicators. Machine learning (ML) and deep learning (DL) models have the potential to predict cardiac conditions by identifying complex patterns within data, but their “black-box” nature restricts interpretability, making it challenging for healthcare professionals to comprehend the reasoning behind predictions. This lack of interpretability limits their clinical trust and adoption. The proposed approach addresses this limitation by integrating predictive modeling with Explainable AI (XAI) to ensure both accuracy and transparency in clinical decision-making. This study enhances heart attack prediction using the University of California, Irvine (UCI) dataset, which includes various heart analysis parameters collected through electrocardiogram (ECG) sensors, blood pressure monitors, and biochemical analyzers. Due to class imbalance, the Synthetic Minority Over-sampling Technique (SMOTE) was applied to enhance the representation of the minority class. After preprocessing, various ML algorithms were employed, among which Artificial Neural Networks (ANN) achieved the highest performance with 96.1% accuracy, 95.7% recall, and 95.7% F1-score. To enhance the interpretability of the ANN, two XAI techniques, specifically SHapley Additive Explanations (SHAP) and Local Interpretable Model-Agnostic Explanations (LIME), were utilized. This study incrementally benchmarks SMOTE, ANN, and XAI techniques such as SHAP and LIME on standardized cardiac datasets, emphasizing clinical interpretability and providing a reproducible framework for practical healthcare implementation. These techniques enable healthcare practitioners to understand the model’s decisions, identify key predictive features, and enhance clinical judgment. By bridging the gap between AI-driven performance and practical medical implementation, this work contributes to making heart attack prediction both highly accurate and interpretable, facilitating its adoption in real-world clinical settings.
(This article belongs to the Section Algorithms for Multidisciplinary Applications)
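
A minimal sketch of the SMOTE-plus-neural-network step described above, with SMOTE kept inside a pipeline so oversampling never leaks into validation folds; the MLP and synthetic data are stand-ins for the paper's ANN and the UCI dataset.

```python
from imblearn.over_sampling import SMOTE
from imblearn.pipeline import Pipeline
from sklearn.datasets import make_classification
from sklearn.model_selection import cross_val_score
from sklearn.neural_network import MLPClassifier
from sklearn.preprocessing import StandardScaler

# Imbalanced toy data: roughly 20% positives, mimicking a minority class.
X, y = make_classification(n_samples=600, n_features=13, weights=[0.8, 0.2], random_state=0)

pipe = Pipeline([
    ("scale", StandardScaler()),
    ("smote", SMOTE(random_state=0)),     # synthetic minority samples, training folds only
    ("ann", MLPClassifier(hidden_layer_sizes=(32, 16), max_iter=500, random_state=0)),
])
print(cross_val_score(pipe, X, y, scoring="f1", cv=5).mean())
```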

18 pages, 1146 KiB  
Article
Explainable Machine Learning in the Prediction of Depression
by Christina Mimikou, Christos Kokkotis, Dimitrios Tsiptsios, Konstantinos Tsamakis, Stella Savvidou, Lillian Modig, Foteini Christidi, Antonia Kaltsatou, Triantafyllos Doskas, Christoph Mueller, Aspasia Serdari, Kostas Anagnostopoulos and Gregory Tripsianis
Diagnostics 2025, 15(11), 1412; https://doi.org/10.3390/diagnostics15111412 - 2 Jun 2025
Abstract
Background: Depression constitutes a major public health issue, being one of the leading causes of the burden of disease worldwide. The risk of depression is determined by both genetic and environmental factors. While genetic factors cannot be altered, the identification of potentially reversible environmental factors is crucial in order to limit the prevalence of depression. Aim: A cross-sectional, questionnaire-based study on a sample from the multicultural region of Thrace in northeast Greece was designed to assess the potential association of depression with several sociodemographic characteristics, lifestyle, and health status. The study employed four machine learning (ML) methods to assess depression: logistic regression (LR), support vector machine (SVM), XGBoost, and neural networks (NNs). These models were compared to identify the best-performing approach. Additionally, a genetic algorithm (GA) was utilized for feature selection and SHAP (SHapley Additive exPlanations) for interpreting the contributions of each employed feature. Results: The XGBoost classifier demonstrated the highest performance on the test dataset, predicting depression with excellent accuracy (97.83%), with NNs a close second (accuracy, 97.02%). The XGBoost classifier utilized the 15 most significant risk factors identified by the GA. Additionally, the SHAP analysis revealed that anxiety, education level, alcohol consumption, and body mass index were the most influential predictors of depression. Conclusions: These findings provide valuable insights for the development of personalized public health interventions and clinical strategies, ultimately promoting improved mental well-being for individuals. Future research should expand datasets to enhance model accuracy, enabling early detection and personalized mental healthcare systems for better intervention.

27 pages, 5253 KiB  
Article
Machine Learning and SHAP-Based Analysis of Deforestation and Forest Degradation Dynamics Along the Iraq–Turkey Border
by Milat Hasan Abdullah and Yaseen T. Mustafa
Earth 2025, 6(2), 49; https://doi.org/10.3390/earth6020049 - 1 Jun 2025
Abstract
This study explores the spatiotemporal patterns and drivers of deforestation and forest degradation along the politically sensitive Iraq–Turkey border within the Duhok Governorate between 2015 and 2024. Utilizing paired remote sensing (RS) and high-end machine learning (ML) methods, forest dynamics were simulated from Sentinel-2 imagery, climate datasets, and topographic variables. Seven ML models were evaluated, and XGBoost consistently outperformed the others, yielding predictive accuracies (R2) of 0.903 (2015), 0.910 (2019), and 0.950 (2024), and a low RMSE (≤0.035). Model interpretability was further improved through the application of SHapley Additive exPlanations (SHAP) to estimate variable contributions and a Generalized Additive Model (GAM) to elucidate complex nonlinear interactions. The results showed distinct temporal shifts; climatic factors (rainfall and temperature) primarily influenced vegetation cover in 2015, whereas anthropogenic drivers such as forest fires (NBR), road construction (RI), and soil exposure (BSI) intensified by 2024, accounting for up to 12% of the observed forest loss. Forest canopy cover decreased significantly, from approximately 630 km2 in 2015 to 577 km2 in 2024, mainly due to illegal deforestation, road network expansion, and conflict-induced fires. This study highlights the effectiveness of an ML-driven RS analysis for geoinformation needs in geopolitically complex and data-scarce regions. These findings underscore the urgent need for robust, evidence-based conservation policies and demonstrate the utility of interpretable ML techniques for forest management policy optimization, providing a reproducible methodological blueprint for future ecological assessment.

18 pages, 4321 KiB  
Review
Methodological Review of Classification Trees for Risk Stratification: An Application Example in the Obesity Paradox
by Javier Trujillano, Luis Serviá, Mariona Badia, José C. E. Serrano, María Luisa Bordejé-Laguna, Carol Lorencio, Clara Vaquerizo, José Luis Flordelis-Lasierra, Itziar Martínez de Lagrán, Esther Portugal-Rodríguez and Juan Carlos López-Delgado
Nutrients 2025, 17(11), 1903; https://doi.org/10.3390/nu17111903 - 31 May 2025
Abstract
Background: Classification trees (CTs) are widely used machine learning algorithms with growing applications in clinical research, especially for risk stratification. Their ability to generate interpretable decision rules makes them attractive to healthcare professionals. This review provides an accessible yet rigorous overview of CT methodology for clinicians, highlighting their utility through a case study addressing the “obesity paradox” in critically ill patients. Methods: We describe key methodological aspects of CTs, including model development, pruning, validation, and classification types (simple, ensemble, and hybrid). Using data from the ENPIC (Evaluation of Practical Nutrition Practices in the Critical Care Patient) study, which assessed artificial nutrition in ICU (intensive care unit) patients, we applied various CT approaches—CART (classification and regression trees), CHAID (chi-square automatic interaction detection), and XGBoost (extreme gradient boosting)—and compared them with logistic regression. SHAP (SHapley Additive exPlanation) values were used to interpret ensemble models. Results: CTs allowed for identification of optimal cut-off points in continuous variables and revealed complex, non-linear interactions among predictors. Although the obesity paradox was not confirmed in the full cohort, CTs uncovered a specific subgroup in which obesity was associated with reduced mortality. The ensemble model (XGBoost) achieved the best predictive performance (highest area under the ROC curve), though at the expense of interpretability. Conclusions: CTs are valuable tools in clinical epidemiology, complementing traditional models by uncovering hidden patterns and enhancing risk stratification. While ensemble models offer superior predictive accuracy, their complexity necessitates interpretability techniques such as SHAP. CT-based approaches can guide personalized medicine but require cautious interpretation and external validation.
(This article belongs to the Special Issue Biostatistics Methods in Nutritional Research)
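
A minimal sketch of a pruned CART of the kind reviewed above, showing how cross-validated cost-complexity pruning yields explicit cut-off points on continuous predictors; the cohort data and the mortality rule are simulated, not the ENPIC data.

```python
import numpy as np
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.tree import DecisionTreeClassifier, export_text

rng = np.random.default_rng(0)
X = np.column_stack([rng.normal(27, 5, 800),        # BMI (hypothetical covariate)
                     rng.normal(65, 12, 800)])      # age (hypothetical covariate)
y = ((X[:, 1] > 70) & (X[:, 0] < 22)).astype(int)   # toy mortality rule

X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

# Pruning strength chosen by cross-validation, trading tree depth
# (interpretability) against overfitting.
grid = GridSearchCV(DecisionTreeClassifier(random_state=0),
                    {"ccp_alpha": [0.0, 0.001, 0.01, 0.05]}, cv=5)
grid.fit(X_tr, y_tr)

# The fitted tree exposes explicit cut-offs as human-readable rules.
print(export_text(grid.best_estimator_, feature_names=["bmi", "age"]))
```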

17 pages, 737 KiB  
Article
Machine Learning for Predicting the Low Risk of Postoperative Pancreatic Fistula After Pancreaticoduodenectomy: Toward a Dynamic and Personalized Postoperative Management Strategy
by Roberto Cammarata, Filippo Ruffini, Alberto Catamerò, Gennaro Melone, Gianluca Costa, Silvia Angeletti, Federico Seghetti, Vincenzo La Vaccara, Roberto Coppola, Paolo Soda, Valerio Guarrasi and Damiano Caputo
Cancers 2025, 17(11), 1846; https://doi.org/10.3390/cancers17111846 - 31 May 2025
Abstract
Background. Postoperative pancreatic fistula (POPF) remains one of the most relevant complications following pancreaticoduodenectomy (PD), significantly impacting short-term outcomes and delaying adjuvant therapies. Current predictive models offer limited accuracy, often failing to incorporate early postoperative data. This retrospective study aimed to develop and validate machine learning (ML) models to predict the absence and severity of POPF using clinical, surgical, and early postoperative variables. Methods. Data from 216 patients undergoing PD were analyzed. A total of twenty-four ML algorithms were systematically evaluated using the Matthews Correlation Coefficient (MCC) and AUC-ROC metrics. Among these, the GradientBoostingClassifier consistently outperformed all other models, demonstrating the best predictive performance, particularly in identifying patients at low risk of POPF during the early postoperative period. To enhance transparency and interpretability, a SHAP (SHapley Additive exPlanations) analysis was applied, highlighting the key role of early postoperative biomarkers in the model predictions. Results. The performance of the GradientBoostingClassifier was also directly compared to that of a traditional logistic regression model, confirming its superior predictive performance over conventional approaches. This study demonstrates that ML can effectively stratify POPF risk, potentially supporting early drain removal and optimizing postoperative management. Conclusions. While the model showed promising performance in a single-center cohort, external validation across different surgical settings will be essential to confirm its generalizability and clinical utility. The integration of ML into clinical workflows may represent a step forward in delivering personalized and dynamic care after pancreatic surgery.
(This article belongs to the Special Issue Current Clinical Studies of Pancreatic Ductal Adenocarcinoma)
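
A minimal sketch of MCC- and AUC-based model comparison as described above; the candidate set and the simulated POPF data are illustrative, not the study's twenty-four algorithms.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import make_scorer, matthews_corrcoef
from sklearn.model_selection import cross_val_score

# Simulated stand-in for the 216-patient cohort with an imbalanced outcome.
X, y = make_classification(n_samples=216, n_features=20, weights=[0.7, 0.3], random_state=0)

candidates = {
    "logreg": LogisticRegression(max_iter=1000),
    "gboost": GradientBoostingClassifier(),
}
mcc = make_scorer(matthews_corrcoef)
for name, clf in candidates.items():
    m = cross_val_score(clf, X, y, scoring=mcc, cv=5).mean()
    a = cross_val_score(clf, X, y, scoring="roc_auc", cv=5).mean()
    print(f"{name}: MCC={m:.3f}  AUC={a:.3f}")   # rank candidates on both metrics
```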

23 pages, 1370 KiB  
Article
Machine Learning-Based Identification of Phonological Biomarkers for Speech Sound Disorders in Saudi Arabic-Speaking Children
by Deema F. Turki and Ahmad F. Turki
Diagnostics 2025, 15(11), 1401; https://doi.org/10.3390/diagnostics15111401 - 31 May 2025
Abstract
Background/Objectives: This study investigates the application of machine learning (ML) techniques in diagnosing speech sound disorders (SSDs) in Saudi Arabic-speaking children, with a specific focus on phonological biomarkers, particularly Infrequent Variance (InfrVar), to improve diagnostic accuracy. SSDs are a significant concern in pediatric speech pathology, affecting an estimated 10–15% of preschool-aged children worldwide. However, accurate diagnosis remains challenging, especially in linguistically diverse populations. Traditional diagnostic tools, such as the Percentage of Consonants Correct (PCC), often fail to capture subtle phonological variations. This study explores the potential of machine learning models to enhance diagnostic accuracy by incorporating culturally relevant phonological biomarkers like InfrVar, aiming to develop a more effective diagnostic approach for SSDs in Saudi Arabic-speaking children. Methods: Data from 235 Saudi Arabic-speaking children aged 2;6 to 5;11 years were analyzed using several machine learning models: Random Forest, Support Vector Machine (SVM), XGBoost, Logistic Regression, K-Nearest Neighbors, and Naïve Bayes. The dataset was used to classify speech patterns into four categories: Atypical, Typical Development (TD), Articulation, and Delay. Phonological features such as Phonological Variance (PhonVar), InfrVar, and Percentage of Consonants Correct (PCC) were used as key variables. SHapley Additive exPlanations (SHAP) analysis was employed to interpret the contributions of individual features to model predictions. Results: The XGBoost and Random Forest models demonstrated the highest performance, with an accuracy of 91.49% and an AUC of 99.14%. SHAP analysis revealed that articulation patterns and phonological patterns were the most influential features for distinguishing between Atypical and TD categories. The K-Means clustering approach identified four distinct subgroups based on speech development patterns: TD (46.61%), Articulation (25.42%), Atypical (18.64%), and Delay (9.32%). Conclusions: Machine learning models, particularly XGBoost and Random Forest, effectively classified speech development categories in Saudi Arabic-speaking children. This study highlights the importance of incorporating culturally specific phonological biomarkers like InfrVar and PhonVar to improve diagnostic precision for SSDs. These findings lay the groundwork for the development of AI-assisted diagnostic tools tailored to diverse linguistic contexts, enhancing early intervention strategies in pediatric speech pathology.
(This article belongs to the Special Issue Artificial Intelligence for Health and Medicine)

24 pages, 3235 KiB  
Article
Alzheimer’s Disease Detection from Speech Using Shapley Additive Explanations for Feature Selection and Enhanced Interpretability
by Irati Oiza-Zapata and Ascensión Gallardo-Antolín
Electronics 2025, 14(11), 2248; https://doi.org/10.3390/electronics14112248 - 31 May 2025
Abstract
Smart cities provide an ideal framework for the integration of advanced healthcare applications, such as early Alzheimer’s Disease (AD) detection, which is essential to facilitate timely interventions and slow disease progression. In this context, speech analysis, combined with Artificial Intelligence (AI) techniques, has emerged as a promising approach for the automatic detection of AD, as vocal biomarkers can provide valuable indicators of cognitive decline. The proposed approach focuses on two key goals: minimizing computational overhead while maintaining high accuracy, and improving model interpretability for clinical usability. To enhance efficiency, the framework incorporates a data quality method that removes unreliable speech segments based on duration thresholds and applies Shapley Additive Explanations (SHAP) to select the most influential acoustic features. SHAP is also used to improve interpretability by providing global and local explanations of model decisions. The final model, based on Extreme Gradient Boosting (XGBoost), achieves an F1-Score of 0.7692 on the ADReSS dataset, showing good performance and a satisfactory level of clinical utility. This work highlights the potential of explainable AI to bridge machine learning techniques with clinically meaningful insights in the domain of AD detection from speech.
(This article belongs to the Section Computer Science & Engineering)
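
A minimal sketch of SHAP-driven feature selection as described above: rank features by mean absolute SHAP value, keep the top k, and retrain a lighter model. The feature matrix and the choice of k = 20 are hypothetical stand-ins for the acoustic feature set.

```python
import numpy as np
import shap
from sklearn.datasets import make_classification
from xgboost import XGBClassifier

# Synthetic stand-in for an acoustic feature matrix (88 features, few informative).
X, y = make_classification(n_samples=300, n_features=88, n_informative=10, random_state=0)

# Fit on all features, then score each feature's global influence.
full = XGBClassifier(n_estimators=200).fit(X, y)
shap_vals = shap.TreeExplainer(full).shap_values(X)

# Global importance = mean |SHAP| per feature; keep the 20 most influential.
importance = np.abs(shap_vals).mean(axis=0)
top = np.argsort(importance)[::-1][:20]

# Retrain a compact model on the selected features, cutting computational overhead.
compact = XGBClassifier(n_estimators=200).fit(X[:, top], y)
```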

16 pages, 1085 KiB  
Systematic Review
Explainable Artificial Intelligence in Radiological Cardiovascular Imaging—A Systematic Review
by Matteo Haupt, Martin H. Maurer and Rohit Philip Thomas
Diagnostics 2025, 15(11), 1399; https://doi.org/10.3390/diagnostics15111399 - 31 May 2025
Abstract
Background: Artificial intelligence (AI) and deep learning are increasingly applied in cardiovascular imaging. However, the “black box” nature of these models raises challenges for clinical trust and integration. Explainable Artificial Intelligence (XAI) seeks to address these concerns by providing insights into model decision-making. This systematic review synthesizes current research on the use of XAI methods in radiological cardiovascular imaging. Methods: A systematic literature search was conducted in PubMed, Scopus, and Web of Science to identify peer-reviewed original research articles published between January 2015 and March 2025. Studies were included if they applied XAI techniques—such as Gradient-Weighted Class Activation Mapping (Grad-CAM), Shapley Additive Explanations (SHAP), Local Interpretable Model-Agnostic Explanations (LIME), or saliency maps—to cardiovascular imaging modalities, including cardiac computed tomography (CT), magnetic resonance imaging (MRI), echocardiography and other ultrasound examinations, and chest X-ray (CXR). Studies focusing on nuclear medicine, structured/tabular data without imaging, or lacking concrete explainability features were excluded. Screening and data extraction followed PRISMA guidelines. Results: A total of 28 studies met the inclusion criteria. Ultrasound examinations (n = 9) and CT (n = 9) were the most common imaging modalities, followed by MRI (n = 6) and chest X-rays (n = 4). Clinical applications included disease classification (e.g., coronary artery disease and valvular heart disease) and the detection of myocardial or congenital abnormalities. Grad-CAM was the most frequently employed XAI method, followed by SHAP. Most studies used saliency-based techniques to generate visual explanations of model predictions. Conclusions: XAI holds considerable promise for improving the transparency and clinical acceptance of deep learning models in cardiovascular imaging. However, the evaluation of XAI methods remains largely qualitative, and standardization is lacking. Future research should focus on the robust, quantitative assessment of explainability, prospective clinical validation, and the development of more advanced XAI techniques beyond saliency-based methods. Strengthening the interpretability of AI models will be crucial to ensuring their safe, ethical, and effective integration into cardiovascular care.
(This article belongs to the Special Issue Latest Advances and Prospects in Cardiovascular Imaging)

20 pages, 1343 KiB  
Article
Predicting Mobile Payment Behavior Through Explainable Machine Learning and Application Usage Analysis
by Myounggu Lee, Insu Choi and Woo-Chang Kim
J. Theor. Appl. Electron. Commer. Res. 2025, 20(2), 117; https://doi.org/10.3390/jtaer20020117 - 30 May 2025
Abstract
In the increasingly competitive mobile ecosystem, understanding user behavior is essential to improve targeted sales and the effectiveness of advertising. With the widespread adoption of smartphones and the increasing variety of mobile applications, predicting user behavior has become more complex. This study presents a comprehensive framework for predicting mobile payment behavior by integrating demographic, situational, and behavioral factors, focusing on patterns in mobile application usage. To address the complexity of the data, we use a combination of machine-learning models, including extreme gradient boosting, light gradient boosting machine, and CatBoost, along with Shapley additive explanations (SHAP) to improve interpretability. An analysis of extensive panel data from Korean Android users reveals that incorporating application usage behavior in such models considerably improves the accuracy of mobile payment predictions. The study identifies key predictors of payment behavior, indicated by high Shapley values, such as using social networking services (e.g., KakaoTalk and Instagram), media applications (e.g., YouTube), and financial and membership applications (e.g., Toss and OK Cashbag). Moreover, the results of the SHAP force analysis reveal the individual session-level drivers of mobile purchases. These findings advance the literature on mobile payment prediction and offer practical insights for improving targeted marketing strategies by identifying key behavioral drivers of mobile transactions.
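
A minimal sketch of a session-level SHAP force decomposition like the one described above; XGBoost stands in for the study's gradient-boosting models, and the app-usage features are synthetic placeholders for the panel data.

```python
import numpy as np
import pandas as pd
import shap
from xgboost import XGBClassifier

rng = np.random.default_rng(0)
df = pd.DataFrame({
    "sns_minutes": rng.exponential(30, 1000),     # e.g. messaging/social app time
    "media_minutes": rng.exponential(45, 1000),   # e.g. video app time
    "finance_app_opens": rng.poisson(2, 1000),    # e.g. finance/membership app sessions
})
y = (df["finance_app_opens"] + 0.02 * df["sns_minutes"]
     + rng.normal(size=1000) > 3).astype(int)    # 1 = made a mobile payment (synthetic)

model = XGBClassifier(n_estimators=200).fit(df, y)
explainer = shap.TreeExplainer(model)
shap_values = explainer.shap_values(df)           # per-feature margin contributions

# Force plot for one user: which usage features push this session toward
# or away from a predicted payment.
shap.force_plot(explainer.expected_value, shap_values[0], df.iloc[0], matplotlib=True)
```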