Search Results (1,123)

Search Parameters:
Keywords = Shapley Additive Explanations

40 pages, 30640 KB  
Review
From Data to Diagnosis: A Novel Deep Learning Model for Early and Accurate Diabetes Prediction
by Muhammad Mohsin Zafar, Zahoor Ali Khan, Nadeem Javaid, Muhammad Aslam and Nabil Alrajeh
Healthcare 2025, 13(17), 2138; https://doi.org/10.3390/healthcare13172138 - 27 Aug 2025
Abstract
Background: Diabetes remains a major global health challenge, contributing significantly to premature mortality due to its potential progression to organ failure if not diagnosed early. Traditional diagnostic approaches are subject to human error, highlighting the need for modern computational techniques in clinical decision support systems. Although these systems have successfully integrated deep learning (DL) models, they still encounter several challenges, such as a lack of intricate pattern learning, imbalanced datasets, and poor interpretability of predictions. Methods: To address these issues, the temporal inception perceptron network (TIPNet), a novel DL model, is designed to accurately predict diabetes by capturing complex feature relationships and temporal dynamics. An adaptive synthetic oversampling strategy is utilized to reduce severe class imbalance in an extensive diabetes health indicators dataset consisting of 253,680 instances and 22 features, providing a diverse and representative sample for model evaluation. The model’s performance and generalizability are assessed using a 10-fold cross-validation technique. To enhance interpretability, explainable artificial intelligence techniques are integrated, including local interpretable model-agnostic explanations and Shapley additive explanations, providing insights into the model’s decision-making process. Results: Experimental results demonstrate that TIPNet achieves improvement scores of 3.53% in accuracy, 3.49% in F1-score, 1.14% in recall, and 5.95% in the area under the receiver operating characteristic curve. Conclusion: These findings indicate that TIPNet is a promising tool for early diabetes prediction, offering accurate and interpretable results. The integration of advanced DL modeling with oversampling strategies and explainable AI techniques positions TIPNet as a valuable resource for clinical decision support, paving the way for its future application in healthcare settings. Full article
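A minimal sketch of the class-rebalancing and validation steps this abstract describes, assuming scikit-learn and imbalanced-learn tooling; TIPNet itself is not public, so a gradient-boosting classifier stands in for the DL model and the data are synthetic:

```python
# Hypothetical sketch: adaptive synthetic oversampling (ADASYN) plus
# 10-fold cross-validation, with a stand-in classifier for TIPNet.
import numpy as np
from imblearn.over_sampling import ADASYN
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.metrics import f1_score
from sklearn.model_selection import StratifiedKFold

rng = np.random.default_rng(0)
X = rng.normal(size=(2000, 22))             # 22 features, as in the dataset
y = (rng.random(2000) < 0.14).astype(int)   # imbalanced labels

cv = StratifiedKFold(n_splits=10, shuffle=True, random_state=0)
scores = []
for train_idx, test_idx in cv.split(X, y):
    # Oversample only the training fold to avoid leaking synthetic points
    # into the held-out fold.
    X_res, y_res = ADASYN(random_state=0).fit_resample(X[train_idx], y[train_idx])
    clf = GradientBoostingClassifier().fit(X_res, y_res)
    scores.append(f1_score(y[test_idx], clf.predict(X[test_idx])))
print(f"10-fold F1: {np.mean(scores):.3f} +/- {np.std(scores):.3f}")
```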
24 pages, 1522 KB  
Article
Spatial Heterogeneity in Temperature Elasticity of Agricultural Economic Production in Xinjiang Province, China
by Shiwei Liu, Yongyu Yue, Lei Wang and Yang Yang
Sustainability 2025, 17(17), 7724; https://doi.org/10.3390/su17177724 - 27 Aug 2025
Abstract
Agricultural production is significantly impacted by climate change. Owing to Xinjiang Province's arid and warm climate, investigating the impacts of climate change on its agricultural production can help improve resilience and design adaptive responses for the agricultural sector. On the basis of county-level agricultural output data in Xinjiang from 1990–2019, we used feasible generalized least squares (FGLS), panel-corrected standard errors (PCSE), and double machine learning (DML) models to study the spatial heterogeneity in the temperature elasticity of agricultural economic production. The results revealed that the impact of temperature on agricultural economic production follows an inverted U-shape. The county-level temperature elasticities showed that regions with negative elasticities are primarily located along the main stream of the Tarim Basin and in the Turpan Basin in southern Xinjiang. SHapley Additive exPlanations (SHAP) analysis was further incorporated to elucidate the impact of different factors on the spatial heterogeneity in temperature elasticity. The results indicated that temperature is the most substantial factor influencing temperature elasticity, followed by labor and precipitation. In particular, increased precipitation in arid and hot regions could alleviate heat stress and lead to a positive predicted temperature elasticity. These findings provide a scientific basis for understanding the spatial heterogeneity in the response of agricultural economic production to climate change, and help identify priority regions for achieving Sustainable Development Goals (SDGs) 1 and 2. Full article
(This article belongs to the Special Issue Sustainability of Rural Areas and Agriculture under Uncertainties)
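The inverted-U finding implies the elasticity changes sign at a turning point. A toy sketch (not the authors' FGLS/PCSE/DML pipeline; synthetic data, statsmodels assumed) shows how a quadratic temperature term yields county-level elasticities:

```python
# Illustrative only: fit ln(output) on T and T^2, then derive the
# temperature elasticity d(ln Y)/d(ln T) = (b1 + 2*b2*T) * T.
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(1)
T = rng.uniform(5, 30, 500)                              # mean temperature (deg C)
log_output = 0.08 * T - 0.002 * T**2 + rng.normal(0, 0.05, 500)

X = sm.add_constant(np.column_stack([T, T**2]))
res = sm.OLS(log_output, X).fit()
b1, b2 = res.params[1], res.params[2]

elasticity = (b1 + 2 * b2 * T) * T   # sign flips past the turning point
print("turning point (deg C):", -b1 / (2 * b2))
print("share with negative elasticity:", np.mean(elasticity < 0))
```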
22 pages, 828 KB  
Article
Stock Price Prediction Using FinBERT-Enhanced Sentiment with SHAP Explainability and Differential Privacy
by Linyan Ruan and Haiwei Jiang
Mathematics 2025, 13(17), 2747; https://doi.org/10.3390/math13172747 - 26 Aug 2025
Abstract
Stock price forecasting remains a central challenge in financial modeling due to the non-stationarity, noise, and high dimensionality of market dynamics, as well as the growing importance of unstructured textual information. In this work, we propose a multimodal prediction framework that combines FinBERT-based financial sentiment extraction with technical and statistical indicators to forecast short-term stock price movement. Contextual sentiment signals are derived from financial news headlines using FinBERT, a domain-specific transformer model fine-tuned on annotated financial text. These signals are aggregated and fused with price- and volatility-based features, forming the input to a gradient-boosted decision tree classifier (XGBoost). To ensure interpretability, we employ SHAP (SHapley Additive exPlanations), which decomposes each prediction into additive feature attributions while satisfying game-theoretic fairness axioms. In addition, we integrate differential privacy into the training pipeline to ensure robustness against membership inference attacks and protect proprietary or client-sensitive data. Empirical evaluations across multiple S&P 500 equities from 2018–2023 demonstrate that our FinBERT-enhanced model consistently outperforms both technical-only and lexicon-based sentiment baselines in terms of AUC, F1-score, and simulated trading profitability. SHAP analysis confirms that FinBERT-derived features rank among the most influential predictors. Our findings highlight the complementary value of domain-specific NLP and privacy-preserving machine learning in financial forecasting, offering a principled, interpretable, and deployable solution for real-world quantitative finance applications. Full article
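A hedged sketch of the sentiment-feature step described above, assuming the ProsusAI/finbert checkpoint on the Hugging Face Hub and hypothetical price features; the fused vector then feeds an XGBoost classifier:

```python
# Score headlines with a FinBERT checkpoint, aggregate into a signed daily
# signal, and fuse with (hypothetical) technical features for XGBoost.
import numpy as np
import xgboost as xgb
from transformers import pipeline

finbert = pipeline("text-classification", model="ProsusAI/finbert")

headlines = ["Fed signals rate cut", "Chipmaker misses earnings badly"]
scores = finbert(headlines)
signed = {"positive": 1.0, "negative": -1.0, "neutral": 0.0}
daily_sentiment = np.mean([signed[s["label"]] * s["score"] for s in scores])

# Feature order is illustrative: [sentiment, 1d return, 5d return, volatility]
X = np.array([[daily_sentiment, 0.012, -0.004, 0.21]])
model = xgb.XGBClassifier(n_estimators=300, max_depth=4)
# model.fit(X_train, y_train)  # labels: next-day up/down moves
```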
35 pages, 7622 KB  
Article
Bayesian Optimization Meets Explainable AI: Enhanced Chronic Kidney Disease Risk Assessment
by Jianbo Huang, Long Li, Mengdi Hou and Jia Chen
Mathematics 2025, 13(17), 2726; https://doi.org/10.3390/math13172726 - 25 Aug 2025
Abstract
Chronic kidney disease (CKD) affects over 850 million individuals worldwide, yet conventional risk stratification approaches fail to capture complex disease progression patterns. Current machine learning approaches suffer from inefficient parameter optimization and limited clinical interpretability. We developed an integrated framework combining advanced Bayesian optimization with explainable artificial intelligence for enhanced CKD risk assessment. Our approach employs XGBoost ensemble learning with intelligent parameter optimization through Optuna (a Bayesian optimization framework) and comprehensive interpretability analysis using SHAP (SHapley Additive exPlanations) to explain model predictions. To address algorithmic “black-box” limitations and enhance clinical trustworthiness, we implemented four-tier risk stratification using stratified cross-validation and balanced evaluation metrics that ensure equitable performance across all patient risk categories, preventing bias toward common cases while maintaining sensitivity for high-risk patients. The optimized model achieved exceptional performance with 92.4% accuracy, 91.9% F1-score, and 97.7% ROC-AUC, significantly outperforming 16 baseline algorithms by 7.9–18.9%. Bayesian optimization reduced computational time by 74% compared to traditional grid search while maintaining robust generalization. Model interpretability analysis identified CKD stage, albumin-creatinine ratio, and estimated glomerular filtration rate as primary predictors, fully aligning with established clinical guidelines. This framework delivers superior predictive accuracy while providing transparent, clinically meaningful explanations for CKD risk stratification, addressing critical challenges in medical AI deployment: computational efficiency, algorithmic transparency, and equitable performance across diverse patient populations. Full article
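A minimal sketch of the tuning loop this abstract pairs with SHAP: Optuna's default TPE sampler searches XGBoost hyperparameters against cross-validated AUC, and the tuned model is explained with TreeExplainer. Dataset and search bounds are placeholders:

```python
import optuna
import shap
import xgboost as xgb
from sklearn.datasets import make_classification
from sklearn.model_selection import cross_val_score

X, y = make_classification(n_samples=2000, n_features=20, random_state=0)

def objective(trial):
    params = {
        "n_estimators": trial.suggest_int("n_estimators", 100, 600),
        "max_depth": trial.suggest_int("max_depth", 3, 10),
        "learning_rate": trial.suggest_float("learning_rate", 1e-3, 0.3, log=True),
        "subsample": trial.suggest_float("subsample", 0.5, 1.0),
    }
    clf = xgb.XGBClassifier(**params)
    return cross_val_score(clf, X, y, cv=5, scoring="roc_auc").mean()

study = optuna.create_study(direction="maximize")
study.optimize(objective, n_trials=50)

best = xgb.XGBClassifier(**study.best_params).fit(X, y)
shap_values = shap.TreeExplainer(best).shap_values(X)  # per-feature attributions
```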
30 pages, 3241 KB  
Article
Identifying Influence Mechanisms of Low-Carbon Travel Intention Through the Integration of Built Environment and Policy Perceptions: A Case Study in Shanghai, China
by Yingjie Sheng, Anning Ni, Lijie Liu, Linjie Gao, Yi Zhang and Yutong Zhu
Sustainability 2025, 17(17), 7647; https://doi.org/10.3390/su17177647 - 25 Aug 2025
Abstract
Promoting low-carbon travel modes is crucial for China’s transportation sector to achieve the dual carbon goals. When exploring the mechanisms behind individuals’ travel decisions, the relationships between factors such as the built environment and transportation policies are often derived from prior experience or subjective judgment, rather than being grounded in a solid theoretical foundation. In this paper, we build on and integrate the Theory of Planned Behavior (TPB) and the Technology Acceptance Model (TAM) by introducing built environment perception (BEP), encouraging policy perception (EPP), and restrictive policy perception (RPP) as either perceived ease of use (PEOU) or perceived usefulness (PU). The integration aims to explain how the latent variables in TPB and TAM jointly affect low-carbon travel intention. We conduct a traveler survey in Shanghai, China to obtain the data and employ a structural equation modeling (SEM) approach to characterize the latent mechanisms. The SEM results show that traveler attitude is the most critical variable in shaping low-carbon travel intentions. Perceived ease of use has a significant positive effect on perceived usefulness, and both constructs directly or indirectly influence attitude. As for transportation policies, encouraging policies are more effective in fostering voluntary low-carbon travel intentions than restrictive ones. Considering the heterogeneity of the traveling population, differentiated policy recommendations are proposed based on machine learning modeling and SHapley Additive exPlanations (SHAP) analysis, offering theoretical support for promoting low-carbon travel strategies. Full article
(This article belongs to the Special Issue Sustainable Transportation Systems and Travel Behaviors)
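The SEM step is specialized, but the final machine-learning-plus-SHAP step can be sketched generically: a tree model on (hypothetical) latent construct scores, with mean absolute SHAP values surfacing which perceptions pull hardest on intention:

```python
# Sketch of the post-SEM heterogeneity analysis only; construct scores and
# their effect sizes are invented for illustration.
import numpy as np
import shap
from sklearn.ensemble import GradientBoostingClassifier

rng = np.random.default_rng(2)
# Columns: attitude, PEOU, PU, encouraging-policy, restrictive-policy scores
X = rng.normal(size=(800, 5))
intent = (0.9 * X[:, 0] + 0.4 * X[:, 2] + 0.3 * X[:, 3] - 0.1 * X[:, 4]
          + rng.normal(0, 0.5, 800)) > 0

clf = GradientBoostingClassifier().fit(X, intent)
sv = shap.TreeExplainer(clf).shap_values(X)
# Mean |SHAP| per construct approximates its overall pull on intention.
print(np.abs(sv).mean(axis=0))
```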
28 pages, 7744 KB  
Article
Optimizing Random Forest with Hybrid Swarm Intelligence Algorithms for Predicting Shear Bond Strength of Cable Bolts
by Ming Xu, Yingui Qiu, Manoj Khandelwal, Mohammad Hossein Kadkhodaei and Jian Zhou
Machines 2025, 13(9), 758; https://doi.org/10.3390/machines13090758 - 24 Aug 2025
Abstract
This study combines three optimization algorithms, Tunicate Swarm Algorithm (TSA), Whale Optimization Algorithm (WOA), and Jellyfish Search Optimizer (JSO), with random forest (RF) to predict the shear bond strength of cable bolts under different cable types and grouting conditions. Based on the original dataset, a database of 860 samples was generated by introducing random noise around each data point. After establishing three hybrid models (RF-WOA, RF-JSO, RF-TSA) and training them, the obtained models were evaluated using six metrics: coefficient of determination (R²), root mean squared error (RMSE), mean absolute error (MAE), mean absolute percentage error (MAPE), variance account for (VAF), and A-20 index. The results indicate that the RF-JSO model exhibits superior performance compared to the other models. The RF-JSO model achieved excellent performance on the testing set (R² = 0.981, RMSE = 11.063, MAE = 6.457, MAPE = 9, VAF = 98.168, A-20 = 0.891). In addition, Shapley Additive exPlanations (SHAP), Partial Dependence Plot (PDP), and Local Interpretable Model-agnostic Explanations (LIME) were used to analyze the interpretability of the model, and it was found that confining pressure (Stress), elastic modulus (E), and a standard cable type (cable type_standard) contributed the most to the prediction of shear bond strength. In summary, the hybrid model proposed in this study can effectively predict the shear bond strength of cable bolts. Full article
(This article belongs to the Special Issue Key Technologies in Intelligent Mining Equipment)
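A generic population-based search standing in for TSA/WOA/JSO (their update rules differ; this keeps only the shared evaluate-and-move skeleton), tuning RF hyperparameters against cross-validated error on synthetic data:

```python
# Illustrative stand-in, not any of the three named metaheuristics.
import numpy as np
from sklearn.datasets import make_regression
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import cross_val_score

X, y = make_regression(n_samples=400, n_features=8, noise=10, random_state=0)

def fitness(pos):
    rf = RandomForestRegressor(n_estimators=int(pos[0]), max_depth=int(pos[1]),
                               random_state=0)
    return cross_val_score(rf, X, y, cv=3,
                           scoring="neg_root_mean_squared_error").mean()

rng = np.random.default_rng(0)
low, high = np.array([50, 2]), np.array([500, 20])
pop = rng.uniform(low, high, size=(8, 2))   # 8 candidate (trees, depth) pairs
best = max(pop, key=fitness)
for _ in range(10):                          # move candidates toward the best
    pop = np.clip(pop + rng.uniform(0, 1, pop.shape) * (best - pop)
                  + rng.normal(0, 5, pop.shape), low, high)
    cand = max(pop, key=fitness)
    if fitness(cand) > fitness(best):
        best = cand
print("best (n_estimators, max_depth):", best.astype(int))
```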
28 pages, 44995 KB  
Article
Constitutive Modeling of Coal Gangue Concrete with Integrated Global–Local Explainable AI and Finite Element Validation
by Xuehong Dong, Guanghong Xiong, Xiao Guan and Chenghua Zhang
Buildings 2025, 15(17), 3007; https://doi.org/10.3390/buildings15173007 - 24 Aug 2025
Abstract
Coal gangue concrete (CGC), a recycled cementitious material derived from industrial solid waste, presents both opportunities and challenges for structural applications due to its heterogeneous composition and variable mechanical behavior. This study develops an ensemble learning framework—incorporating XGBoost, LightGBM, and CatBoost—to predict four key constitutive parameters based on experimental data. The predicted parameters are subsequently incorporated into an ABAQUS finite element model to simulate the compressive–bending response of CGC columns, with simulation results aligning well with experimental observations in terms of failure mode, load development, and deformation characteristics. To enhance model interpretability, a hybrid approach is adopted, combining permutation-based global feature importance analysis with SHAP (SHapley Additive exPlanations)-derived local explanations. This joint framework captures both the overall influence of each feature and its context-dependent effects, revealing a three-stage stiffness evolution pattern—brittle, quasi-ductile, and re-brittle—governed by gangue replacement levels and consistent with micromechanical mechanisms and numerical responses. Coupled feature interactions, such as between gangue content and crush index, are shown to exacerbate stiffness loss through interfacial weakening and pore development. This integrated approach delivers both predictive accuracy and mechanistic transparency, providing a reference for developing physically interpretable, data-driven constitutive models and offering guidance for tailoring CGC toward ductile, energy-absorbing structural materials in seismic and sustainability-focused engineering. Full article
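A short sketch of the global-plus-local pairing the abstract describes: permutation importance gives a global feature ranking, SHAP values give signed per-sample effects for the same model. Data and model are stand-ins:

```python
import shap
import xgboost as xgb
from sklearn.datasets import make_regression
from sklearn.inspection import permutation_importance
from sklearn.model_selection import train_test_split

X, y = make_regression(n_samples=600, n_features=10, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

model = xgb.XGBRegressor(n_estimators=300).fit(X_tr, y_tr)

# Global: drop in held-out score when each feature is shuffled.
glob = permutation_importance(model, X_te, y_te, n_repeats=10, random_state=0)
# Local: signed per-sample contributions for the same features.
local = shap.TreeExplainer(model).shap_values(X_te)
print(glob.importances_mean.round(3))
```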
24 pages, 4754 KB  
Article
Machine Learning Prediction of Short Cervix in Mid-Pregnancy Based on Multimodal Data from the First-Trimester Screening Period: An Observational Study in a High-Risk Population
by Shengyu Wu, Jiaqi Dong, Jifan Shi, Xiaoxian Qu, Yirong Bao, Xiaoyuan Mao, Mu Lv, Xuan Chen and Hao Ying
Biomedicines 2025, 13(9), 2057; https://doi.org/10.3390/biomedicines13092057 - 23 Aug 2025
Abstract
Background: A short cervix in the second trimester significantly increases preterm birth risk, yet no reliable first-trimester prediction method exists. Current guidelines lack consensus on which women should undergo transvaginal ultrasound (TVUS) screening for cost-effective prevention. Therefore, it is vital to establish a highly accurate and economical method for use in the early stages of pregnancy to predict short cervix in mid-pregnancy. Methods: A total of 1480 pregnant women with singleton pregnancies and at least one risk factor for spontaneous preterm birth (<37 weeks) were recruited from January 2020 to December 2020 at the Shanghai First Maternity and Infant Hospital, Tongji University School of Medicine. Cervical length was assessed at 20–24 weeks of gestation, with a short cervix defined as <25 mm. Feature selection employed tree models, regularization, and recursive feature elimination (RFE). Seven machine learning models (logistic regression, linear discriminant analysis, k-nearest neighbors, support vector machine, decision tree, random forest, XGBoost) were trained to predict mid-trimester short cervix. The XGBoost model—an ensemble method leveraging sequential decision trees—was analyzed using Shapley Additive Explanation (SHAP) values to assess feature importance, revealing consistent associations between clinical predictors and outcomes that align with known clinical patterns. Results: Among 1480 participants, 376 (25.4%) developed mid-trimester short cervix. The XGBoost-based prediction model demonstrated high predictive performance in the training set (Recall = 0.838, F1 score = 0.848), test set (Recall = 0.850, F1 score = 0.910), and an independent dataset collected in January 2025 (Recall = 0.708, F1 score = 0.791), with SHAP analysis revealing pre-pregnancy BMI as the strongest predictor, followed by second-trimester pregnancy loss history, peripheral blood leukocyte count (WBC), and positive vaginal microbiological culture results (≥10⁵ CFU/mL, measured between 11+0 and 13+6 weeks). Conclusions: The XGBoost model accurately predicts mid-trimester short cervix using first-trimester clinical data, providing a 6-week window for targeted interventions before the 20–24-week gestational assessment. This early prediction could help guide timely preventive measures, potentially reducing the risk of spontaneous preterm birth (sPTB). Full article
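A hedged sketch of one of the feature-selection routes named (RFE) feeding an XGBoost classifier; the cohort proportions follow the abstract, everything else is a placeholder:

```python
import xgboost as xgb
from sklearn.datasets import make_classification
from sklearn.feature_selection import RFE
from sklearn.linear_model import LogisticRegression

X, y = make_classification(n_samples=1480, n_features=30, weights=[0.75],
                           random_state=0)  # ~25% positives, as in the cohort

# Recursively drop the weakest features down to a chosen subset.
selector = RFE(LogisticRegression(max_iter=1000), n_features_to_select=10).fit(X, y)
X_sel = selector.transform(X)

clf = xgb.XGBClassifier(n_estimators=400, max_depth=4,
                        scale_pos_weight=3.0)  # ~neg/pos ratio for imbalance
clf.fit(X_sel, y)
```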
20 pages, 5304 KB  
Article
Deep Learning with UAV Imagery for Subtropical Sphagnum Peatland Vegetation Mapping
by Zhengshun Liu and Xianyu Huang
Remote Sens. 2025, 17(17), 2920; https://doi.org/10.3390/rs17172920 - 22 Aug 2025
Abstract
Peatlands are vital for global carbon cycling, and their ecological functions are influenced by vegetation composition. Accurate vegetation mapping is crucial for peatland management and conservation, but traditional methods face limitations such as low spatial resolution and labor-intensive fieldwork. We used ultra-high-resolution UAV imagery captured across seasonal and topographic gradients and assessed the impact of phenology and topography on classification accuracy. Additionally, this study evaluated the performance of four deep learning models (ResNet, Swin Transformer, ConvNeXt, and EfficientNet) for mapping vegetation in a subtropical Sphagnum peatland. ConvNeXt achieved peak accuracy at 87% during non-growing seasons through its large-kernel feature extraction capability, while ResNet served as the optimal efficient alternative for growing-season applications. Non-growing seasons facilitated superior identification of Sphagnum and monocotyledons, whereas growing seasons enhanced dicotyledon distinction through clearer morphological features. Overall accuracy in low-lying humid areas was 12–15% lower than in elevated terrain due to severe spectral confusion among vegetation. SHapley Additive exPlanations (SHAP) of the ConvNeXt model identified key vegetation indices, the digital surface model, and select textural features as primary performance drivers. This study concludes that the combination of deep learning and UAV imagery presents a powerful tool for peatland vegetation mapping, highlighting the importance of considering phenological and topographical factors. Full article
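An assumed fine-tuning setup, with torchvision's convnext_tiny standing in for the paper's ConvNeXt variant: swap the classifier head for the vegetation classes (the class list here is illustrative) and train on UAV image tiles:

```python
import torch
import torch.nn as nn
from torchvision import models

n_classes = 5  # e.g., Sphagnum, monocots, dicots, water, bare peat (illustrative)
model = models.convnext_tiny(weights=models.ConvNeXt_Tiny_Weights.IMAGENET1K_V1)
model.classifier[2] = nn.Linear(model.classifier[2].in_features, n_classes)

opt = torch.optim.AdamW(model.parameters(), lr=1e-4)
loss_fn = nn.CrossEntropyLoss()

x = torch.randn(4, 3, 224, 224)           # a dummy batch of image tiles
loss = loss_fn(model(x), torch.randint(0, n_classes, (4,)))
loss.backward()
opt.step()
```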
14 pages, 1100 KB  
Article
Algorithmic Bias Under the EU AI Act: Compliance Risk, Capital Strain, and Pricing Distortions in Life and Health Insurance Underwriting
by Siddharth Mahajan, Rohan Agarwal and Mihir Gupta
Risks 2025, 13(9), 160; https://doi.org/10.3390/risks13090160 - 22 Aug 2025
Abstract
The EU Artificial Intelligence Act (Regulation (EU) 2024/1689) designates AI systems used in life and health insurance underwriting as high-risk systems, imposing rigorous requirements for bias testing, technical documentation, and post-deployment monitoring. Leveraging 12.4 million quote–bind–claim observations from four pan-European insurers (2019 Q1–2024 Q4), we evaluate how compliance affects premium schedules, loss ratios, and solvency positions. We estimate gradient-boosted decision tree (Extreme Gradient Boosting, XGBoost) models alongside benchmark GLMs for mortality, morbidity, and lapse risk, using Shapley Additive Explanations (SHAP) values for explainability. Protected attributes (gender, ethnicity proxy, disability, and postcode deprivation) are excluded from training but retained for audit. We measure bias via statistical parity difference, disparate impact ratio, and equalized odds gap against the 10 percent tolerance in regulatory guidance, and then apply counterfactual mitigation strategies—re-weighing, reject option classification, and adversarial debiasing. We simulate impacts on expected loss ratios, the Solvency II Standard Formula Solvency Capital Requirement (SCR), and internal model economic capital. To translate fairness breaches into compliance risk, we compute expected penalties under the Act's two-tier fine structure and supervisory detection probabilities inferred from GDPR enforcement. Under stress scenarios—full retraining, feature excision, and proxy disclosure—preliminary results show that bottom-income quintile premiums exceed fair benchmarks by 5.8 percent (life) and 7.2 percent (health). Mitigation closes 65–82 percent of these gaps but raises capital requirements by up to 4.1 percent of own funds; expected fines exceed rectification costs once detection probability surpasses 9 percent. We conclude that proactive adversarial debiasing offers insurers a capital-efficient compliance pathway and outline implications for enterprise risk management and future monitoring. Full article
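The three audit metrics named can be computed directly; a sketch with random stand-in decisions, where y_hat are model decisions and g is a binary protected-group indicator held out of training but kept for audit, checked against the 10 percent tolerance the abstract cites:

```python
import numpy as np

def statistical_parity_difference(y_hat, g):
    return y_hat[g == 1].mean() - y_hat[g == 0].mean()

def disparate_impact_ratio(y_hat, g):
    return y_hat[g == 1].mean() / y_hat[g == 0].mean()

def equalized_odds_gap(y, y_hat, g):
    tpr = lambda k: y_hat[(g == k) & (y == 1)].mean()  # true-positive rate
    fpr = lambda k: y_hat[(g == k) & (y == 0)].mean()  # false-positive rate
    return max(abs(tpr(1) - tpr(0)), abs(fpr(1) - fpr(0)))

rng = np.random.default_rng(0)
y, g = rng.integers(0, 2, 10_000), rng.integers(0, 2, 10_000)
y_hat = rng.integers(0, 2, 10_000)
print(abs(statistical_parity_difference(y_hat, g)) <= 0.10)  # within tolerance?
```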
21 pages, 16313 KB  
Article
An Interpretable Deep Learning Framework for River Water Quality Prediction—A Case Study of the Poyang Lake Basin
by Ying Yuan, Chunjin Zhou, Jingwen Wu, Fuliang Deng, Wei Liu, Mei Sun and Lanhui Li
Water 2025, 17(16), 2496; https://doi.org/10.3390/w17162496 - 21 Aug 2025
Abstract
Accurate prediction of water quality involves early identification of future pollutant concentrations and water quality indicators, which is an important prerequisite for optimizing water environment management. Although deep learning algorithms have demonstrated considerable potential in predicting water quality parameters, their broader adoption remains hindered by limited interpretability. This study proposes an interpretable deep learning framework integrating an artificial neural network (ANN) model with Shapley additive explanations (SHAP) analysis to predict spatiotemporal variations in water quality and identify key influencing factors. A case study was conducted in the Poyang Lake Basin, utilizing multi-dimensional datasets encompassing topographic, meteorological, socioeconomic, and land use variables. Results indicated that the ANN model exhibited strong predictive performance for dissolved oxygen (DO), total nitrogen (TN), total phosphorus (TP), permanganate index (CODMn), ammonia nitrogen (NH3N), and turbidity (Turb), achieving R² values ranging from 0.47 to 0.77. Incorporating land use and socioeconomic factors enhanced prediction accuracy by 37.8–246.7% compared to models using only meteorological data. SHAP analysis revealed differences in the dominant factors influencing various water quality parameters. Specifically, cropland area, forest cover, air temperature, and slope in each sub-basin were identified as the most important variables affecting water quality parameters in the case area. These findings provide scientific support for the intelligent management of the regional water environment. Full article
(This article belongs to the Section Water Quality and Contamination)
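A sketch of the ANN-plus-SHAP pairing, with scikit-learn's MLPRegressor standing in for the ANN; predictor names and effect sizes are invented for illustration:

```python
import numpy as np
import shap
from sklearn.neural_network import MLPRegressor
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(3)
# Columns (illustrative): cropland share, forest cover, air temperature, slope
X = rng.normal(size=(500, 4))
tn = 0.6 * X[:, 0] - 0.4 * X[:, 1] + 0.2 * X[:, 2] + rng.normal(0, 0.3, 500)

Xs = StandardScaler().fit_transform(X)
ann = MLPRegressor(hidden_layer_sizes=(32, 16), max_iter=2000).fit(Xs, tn)

# Model-agnostic SHAP on a small background sample (KernelExplainer is slow).
expl = shap.KernelExplainer(ann.predict, Xs[:50])
sv = expl.shap_values(Xs[:20])
print(np.abs(sv).mean(axis=0))  # mean |SHAP| per driver
```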
20 pages, 3795 KB  
Article
Leaf Area Index Estimation of Grassland Based on UAV-Borne Hyperspectral Data and Multiple Machine Learning Models in Hulun Lake Basin
by Dazhou Wu, Saru Bao, Yi Tong, Yifan Fan, Lu Lu, Songtao Liu, Wenjing Li, Mengyong Xue, Bingshuai Cao, Quan Li, Muha Cha, Qian Zhang and Nan Shan
Remote Sens. 2025, 17(16), 2914; https://doi.org/10.3390/rs17162914 - 21 Aug 2025
Abstract
Leaf area index (LAI) is a crucial parameter reflecting the crown structure of the grassland. Accurately obtaining LAI is of great significance for estimating carbon sinks in grassland ecosystems. However, spectral noise interference and pronounced spatial heterogeneity within vegetation canopies constitute significant impediments to achieving high-precision LAI retrieval. This study used a hyperspectral sensor mounted on an unmanned aerial vehicle (UAV) to estimate LAI in a typical grassland of the Hulun Lake Basin. Multiple machine learning (ML) models were constructed to reveal a relationship between hyperspectral data and grassland LAI using two input datasets, namely spectral transformations and vegetation indices (VIs), while SHAP (SHapley Additive ExPlanation) interpretability analysis was further employed to identify high-contribution features in the ML models. The analysis revealed that grassland LAI correlates well with the original spectrum at 550 nm and 750–1000 nm, with the first and second derivatives at 506–574 nm and 649–784 nm, and with vegetation indices including the triangular vegetation index (TVI), enhanced vegetation index 2 (EVI2), and soil-adjusted vegetation index (SAVI). In the models using spectral transformations and VIs, the random forest (RF) models outperformed the others (testing R² = 0.89/0.88, RMSE = 0.20/0.21, and RRMSE = 27.34%/28.98%). The prediction error of the random forest model exhibited a positive correlation with measured LAI magnitude but demonstrated an inverse relationship with quadrat-level species richness, quantified by Margalef's richness index (MRI). We also found that at the quadrat level, the spectral response curve pattern is influenced by attributes within the quadrat, such as dominant species and vegetation cover, and that LAI has a positive relationship with quadrat vegetation cover. The LAI inversion results in this study were also compared to main LAI products, showing a good correlation (r = 0.71). This study successfully established a high-fidelity inversion framework for hyperspectral-derived LAI estimation in mid-to-high latitude grasslands of the Hulun Lake Basin, supporting the spatial refinement of continental-scale carbon sink models at a regional scale. Full article
(This article belongs to the Section Ecological Remote Sensing)
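A minimal sketch of the derivative-feature route, on synthetic spectra: first and second spectral derivatives via np.gradient feed a random forest LAI regressor, reported with the same three metrics (RRMSE as RMSE over the mean of observed LAI):

```python
import numpy as np
from sklearn.datasets import make_regression  # unused; data built below
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import mean_squared_error, r2_score
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(4)
spectra = rng.random((300, 120))             # reflectance over 120 bands
lai = spectra[:, 40:60].mean(axis=1) * 3 + rng.normal(0, 0.1, 300)

d1 = np.gradient(spectra, axis=1)            # first spectral derivative
d2 = np.gradient(d1, axis=1)                 # second spectral derivative
X = np.hstack([spectra, d1, d2])

X_tr, X_te, y_tr, y_te = train_test_split(X, lai, random_state=0)
rf = RandomForestRegressor(n_estimators=500, random_state=0).fit(X_tr, y_tr)
pred = rf.predict(X_te)
rmse = np.sqrt(mean_squared_error(y_te, pred))
print(f"R2={r2_score(y_te, pred):.2f}  RMSE={rmse:.2f}  "
      f"RRMSE={100 * rmse / y_te.mean():.1f}%")
```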
28 pages, 983 KB  
Article
A Novel Explainable Deep Learning Framework for Accurate Diabetes Mellitus Prediction
by Khadija Iftikhar, Nadeem Javaid, Imran Ahmed and Nabil Alrajeh
Appl. Sci. 2025, 15(16), 9162; https://doi.org/10.3390/app15169162 - 20 Aug 2025
Abstract
Diabetes, a chronic condition caused by insufficient insulin production in the pancreas, presents significant health risks. Its increasing global prevalence necessitates the development of accurate and efficient predictive algorithms to support timely diagnosis. While recent advancements in deep learning (DL) have demonstrated potential for diabetes prediction, conventional models face limitations in handling class imbalance, capturing complex feature interactions, and providing interpretability for clinical decision-making. This paper proposes a DL framework for diabetes mellitus prediction. The framework ensures high predictive accuracy by integrating advanced preprocessing, effective class balancing, and a novel EchoceptionNet model. An analysis was conducted on a diabetes prediction dataset obtained from Kaggle, comprising nine features and 100,000 instances. The dataset is characterized by severe class imbalance, which is effectively addressed using a proximity-weighted synthetic oversampling technique, ensuring balanced class distribution. EchoceptionNet demonstrated notable performance improvements over state-of-the-art deep learning models, achieving a 4.39% increase in accuracy, 8.99% in precision, 2.19% in recall, 5.55% in F1-score, and 7.77% in area under the curve score. Model robustness and generalizability were validated through 10-fold cross-validation, demonstrating consistent performance across diverse data splits. To enhance clinical applicability, EchoceptionNet integrates explainable artificial intelligence techniques: Shapley additive explanations and local interpretable model-agnostic explanations. These methods provide transparency by identifying the critical importance of features in the model's predictions. EchoceptionNet exhibits superior predictive accuracy and ensures interpretability and reliability, making it a robust solution for accurate diabetes prediction. Full article
(This article belongs to the Special Issue Applications of Artificial Intelligence in Healthcare)
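SHAP appears in earlier entries, so here is the LIME half of the explainability pairing: a local surrogate around one patient-level prediction, with a random forest and synthetic nine-feature data standing in for EchoceptionNet and the Kaggle dataset:

```python
import numpy as np
from lime.lime_tabular import LimeTabularExplainer
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

X, y = make_classification(n_samples=1000, n_features=9, random_state=0)
clf = RandomForestClassifier(random_state=0).fit(X, y)

feature_names = [f"f{i}" for i in range(9)]  # nine features, as in the dataset
explainer = LimeTabularExplainer(X, feature_names=feature_names,
                                 class_names=["no diabetes", "diabetes"],
                                 mode="classification")
exp = explainer.explain_instance(X[0], clf.predict_proba, num_features=5)
print(exp.as_list())  # top local feature contributions for this instance
```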
24 pages, 731 KB  
Article
Textual Analysis of Sustainability Reports: Topics, Firm Value, and the Moderating Role of Assurance
by Sunita Rao, Norma Juma and Karthik Srinivasan
J. Risk Financial Manag. 2025, 18(8), 463; https://doi.org/10.3390/jrfm18080463 - 20 Aug 2025
Abstract
This study investigated how specific sustainability topics disclosed in standalone sustainability reports influence firm value and whether third-party assurance moderates this relationship. Drawing on signaling, agency, stakeholder, and legitimacy theories, we applied latent Dirichlet allocation (LDA) to extract latent topics from U.S. corporate sustainability reports. We analyzed their impact on Tobin's Q using panel regressions and supplemented our findings with discrete Bayesian networks (DBNs) and Shapley additive explanations (SHAP) to capture non-linear patterns. We identified six core topics: environmental impact, sustainable consumption, daily necessities, socio-economic impact, healthcare, and operations. The results revealed that the topics of healthcare and daily necessities have immediate and sustained positive effects on firm value, while environmental and socio-economic impact topics demonstrate lagged effects, primarily two years after disclosure. The presence of assurance, however, produces mixed outcomes: it enhances credibility in some cases, but reduces firm value in others, especially when applied to environmental and socio-economic disclosures. This suggests a dual signaling effect of assurance, potentially increasing investor scrutiny when gaps in performance are highlighted. Our findings underscore the importance of topic selection, consistency in reporting, and strategic application of assurance in ESG communications to maintain stakeholder trust and market value. Full article
(This article belongs to the Special Issue Sustainability Reporting and Corporate Governance)
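A minimal sketch of the topic-extraction step: scikit-learn LDA with six topics, as in the study; the two-report corpus is a placeholder, and the per-report topic proportions are what would enter the Tobin's Q regressions:

```python
from sklearn.decomposition import LatentDirichletAllocation
from sklearn.feature_extraction.text import CountVectorizer

reports = ["emissions water energy recycling supply chain",
           "community health safety wellbeing employees"]  # one string per report

vec = CountVectorizer(stop_words="english", min_df=1)
dtm = vec.fit_transform(reports)

lda = LatentDirichletAllocation(n_components=6, random_state=0).fit(dtm)
terms = vec.get_feature_names_out()
for k, comp in enumerate(lda.components_):
    top = [terms[i] for i in comp.argsort()[-5:][::-1]]
    print(f"topic {k}: {top}")

doc_topics = lda.transform(dtm)  # per-report topic proportions
```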
26 pages, 6361 KB  
Article
Improving the Generalization Performance of Debris-Flow Susceptibility Modeling by a Stacking Ensemble Learning-Based Negative Sample Strategy
by Jiayi Li, Jialan Zhang, Jingyuan Yu, Yongbo Chu and Haijia Wen
Water 2025, 17(16), 2460; https://doi.org/10.3390/w17162460 - 19 Aug 2025
Abstract
To address the negative sample selection bias and limited interpretability of traditional debris-flow susceptibility models, this study proposes a framework that enhances generalization by integrating negative sample screening via a stacking ensemble model with an interpretable random forest. Using Wenchuan County, Sichuan Province, as the study area, 19 influencing factors were selected, encompassing topographic, geological, environmental, and anthropogenic variables. First, a stacking ensemble—comprising logistic regression (LR), decision tree (DT), gradient boosting decision tree (GBDT), and random forest (RF)—was employed as a preliminary classifier to identify very low-susceptibility areas as reliable negative samples, achieving a balanced 1:1 ratio of positive to negative instances. Subsequently, a stacking–random forest model (Stacking-RF) was trained for susceptibility zonation, and SHAP (Shapley additive explanations) was applied to quantify each factor's contribution. The results show that: (1) the stacking ensemble achieved a test-set AUC (area under the receiver operating characteristic curve) of 0.9044, confirming its effectiveness in screening dependable negative samples; (2) the random forest model attained a test-set AUC of 0.9931, with very high-susceptibility zones—covering 15.86% of the study area—encompassing 92.3% of historical debris-flow events; (3) SHAP analysis identified the distance to a road and point-of-interest (POI) kernel density as the primary drivers of debris-flow susceptibility. The method quantified nonlinear impact thresholds, revealing significant susceptibility increases when road distance was less than 500 m or POI kernel density ranged between 50 and 200 units/km²; and (4) cross-regional validation in Qingchuan County demonstrated that the proposed model raised the capture rate for high/very high susceptibility areas from 4.55% to 53.41%, a gain of 48.86 percentage points, with a site density of 0.0469 events/km² in very high-susceptibility zones. Overall, this framework offers a high-precision and interpretable debris-flow risk management tool, highlights the substantial influence of anthropogenic factors such as roads and land development, and introduces a “negative-sample screening with cross-regional generalization” strategy to support land-use planning and disaster prevention in mountainous regions. Full article
(This article belongs to the Special Issue Intelligent Analysis, Monitoring and Assessment of Debris Flow)
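A sketch of the negative-sample screening idea on synthetic data: a stacking ensemble of the four named base learners scores all candidate cells, and only the lowest-scoring unlabeled cells become negatives (1:1 with positives, as in the abstract) for the final random forest:

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import (GradientBoostingClassifier, RandomForestClassifier,
                              StackingClassifier)
from sklearn.linear_model import LogisticRegression
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=3000, n_features=19, weights=[0.9],
                           random_state=0)  # 19 factors; few positive events

stack = StackingClassifier(
    estimators=[("lr", LogisticRegression(max_iter=1000)),
                ("dt", DecisionTreeClassifier(max_depth=6)),
                ("gbdt", GradientBoostingClassifier()),
                ("rf", RandomForestClassifier())],
    final_estimator=LogisticRegression())
stack.fit(X, y)

pos_idx = np.where(y == 1)[0]
neg_pool = np.where(y == 0)[0]
# Reliable negatives: the unlabeled cells the ensemble scores lowest.
scores = stack.predict_proba(X[neg_pool])[:, 1]
neg_idx = neg_pool[np.argsort(scores)[:len(pos_idx)]]  # 1:1 with positives

X_bal = np.vstack([X[pos_idx], X[neg_idx]])
y_bal = np.array([1] * len(pos_idx) + [0] * len(neg_idx))
final_rf = RandomForestClassifier(n_estimators=500).fit(X_bal, y_bal)
```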