MDPI - Publisher of Open Access Journals

19 pages, 4334 KB

Open Access Article

Machine Learning-Based Ground-Level NO₂ Estimation in Istanbul: A Comparative Analysis of Sentinel-5P and GEOS-CF

by Nur Yagmur Aydin

Appl. Sci. 2025, 15(20), 10997; https://doi.org/10.3390/app152010997 - 13 Oct 2025

Nitrogen dioxide (NO₂) poses severe risks to human health and the environment, especially in densely populated megacities. Ground-based air quality monitoring stations provide high-temporal-resolution data but are spatially limited, while satellite observations offer broad coverage but measure column densities rather than surface concentrations. To overcome these limitations, this study integrates ground-based observations with satellite-derived NO₂ from Sentinel-5P TROPOMI and GEOS-CF products to estimate ground-level NO₂ in Istanbul using machine learning (ML) approaches. Three ML algorithms (RF, XGB, and CB) were tested on two datasets spanning 2019–2024 at ~1 km resolution, incorporating 20 features, including topographic, meteorological, environmental, and demographic variables. Among models, CB achieved the best performance (R: 0.686, RMSE: 16.23 µg/m³, and MAE: 11.75 µg/m³ in the test dataset) with the Sentinel-5P dataset, successfully capturing spatial and seasonal variations in ground-level NO₂ both quantitatively and qualitatively. SHAP analysis revealed that regarding satellite-derived NO₂, anthropogenic indicators such as population density, road length, and digital elevation model were the most influential features, while meteorological factors contributed secondarily. Despite the lower spatial resolution of GEOS-CF data, both Sentinel-5P and GEOS-CF datasets supported reliable model outputs. This study provides the first ML-based ground-level NO₂ estimation framework for the Istanbul Metropolitan City. Full article
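The three scores quoted above (R, RMSE, MAE) can be reproduced from paired station observations and model estimates. The following is a minimal sketch, assuming that R denotes the Pearson correlation coefficient; the function name and the sample values are hypothetical and are not taken from the article.

```python
import math

def regression_metrics(observed, predicted):
    """Return (pearson_r, rmse, mae) for two equal-length sequences."""
    n = len(observed)
    if n != len(predicted) or n < 2:
        raise ValueError("need two equal-length sequences with at least 2 values")
    mean_o = sum(observed) / n
    mean_p = sum(predicted) / n
    # Pearson r: covariance divided by the product of the spreads.
    cov = sum((o - mean_o) * (p - mean_p) for o, p in zip(observed, predicted))
    var_o = sum((o - mean_o) ** 2 for o in observed)
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    r = cov / math.sqrt(var_o * var_p)
    # RMSE penalizes large errors more strongly than MAE does.
    rmse = math.sqrt(sum((o - p) ** 2 for o, p in zip(observed, predicted)) / n)
    mae = sum(abs(o - p) for o, p in zip(observed, predicted)) / n
    return r, rmse, mae

# Hypothetical NO2 values in µg/m³, for illustration only.
r, rmse, mae = regression_metrics([10, 20, 30, 40], [12, 18, 33, 39])
```

Because RMSE squares each error before averaging, it is never smaller than MAE, which is consistent with the reported pair (16.23 vs. 11.75 µg/m³).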

(This article belongs to the Special Issue Air Quality Monitoring, Analysis and Modeling)

15 pages, 8859 KB

Open Access Article

A Hybrid Estimation Model for Graphite Nodularity of Ductile Cast Iron Based on Multi-Source Feature Extraction

by Yongjian Yang, Yanhui Liu, Yuqian He, Zengren Pan and Zhiwei Li

Modelling 2025, 6(4), 126; https://doi.org/10.3390/modelling6040126 - 13 Oct 2025

Abstract

Graphite nodularity is a key indicator for evaluating the microstructure quality of ductile iron and plays a crucial role in ensuring product quality and enhancing manufacturing efficiency. Existing research often focuses on a single type of feature and fails to use multi-source information in a coordinated manner. Single-feature methods struggle to capture microstructures comprehensively, which limits the accuracy and robustness of the model. This study proposes a hybrid estimation model for the graphite nodularity of ductile cast iron based on multi-source feature extraction. A comprehensive feature engineering pipeline was established, incorporating geometric, color, and texture features extracted via Hue-Saturation-Value (HSV) color space histograms, gray level co-occurrence matrix (GLCM), Local Binary Pattern (LBP), and multi-scale Gabor filters. Dimensionality reduction was performed using Principal Component Analysis (PCA) to mitigate redundancy. An improved watershed algorithm combined with intelligent filtering was used for accurate particle segmentation. Several machine learning algorithms, including Support Vector Regression (SVR), Multi-Layer Perceptron (MLP), Random Forest (RF), Gradient Boosting Regressor (GBR), eXtreme Gradient Boosting (XGBoost) and Categorical Boosting (CatBoost), were applied to estimate graphite nodularity based on geometric features (GFs) and feature extraction. Experimental results demonstrate that the CatBoost model trained on fused features achieves high estimation accuracy and stability for geometric parameters, with R-squared (R²) exceeding 0.98. Furthermore, introducing geometric features into the fusion set enhances model generalization and suppresses overfitting. This framework offers an efficient and robust approach for intelligent analysis of metallographic images and provides valuable support for automated quality assessment in casting production. Full article

34 pages, 1960 KB

Open Access Article

Quantum-Inspired Hybrid Metaheuristic Feature Selection with SHAP for Optimized and Explainable Spam Detection

by Qusai Shambour, Mahran Al-Zyoud and Omar Almomani

Symmetry 2025, 17(10), 1716; https://doi.org/10.3390/sym17101716 - 13 Oct 2025

Abstract

The rapid growth of digital communication has intensified spam-related threats, including phishing and malware, which employ advanced evasion tactics. Traditional filtering methods struggle to keep pace, driving the need for sophisticated machine learning (ML) solutions. The effectiveness of ML models hinges on selecting high-quality input features, especially in high-dimensional datasets where irrelevant or redundant attributes impair performance and computational efficiency. Guided by principles of symmetry to achieve an optimal balance between model accuracy, complexity, and interpretability, this study proposes an Enhanced Hybrid Quantum-Inspired Firefly and Artificial Bee Colony (EHQ-FABC) algorithm for feature selection in spam detection. EHQ-FABC leverages the Firefly Algorithm’s local exploitation and the Artificial Bee Colony’s global exploration, augmented with quantum-inspired principles to maintain search space diversity and a symmetrical balance between exploration and exploitation. It eliminates redundant attributes while preserving predictive power. For interpretability, SHapley Additive exPlanations (SHAP) are employed to ensure symmetry in explanation, meaning features with equal contributions are assigned equal importance, providing a fair and consistent interpretation of the model’s decisions. Evaluated on the ISCX-URL2016 dataset, EHQ-FABC reduces features by over 76%, retaining only 17 of 72 features, while matching or outperforming filter, wrapper, embedded, and metaheuristic methods. Tested across ML classifiers such as CatBoost, XGBoost, Random Forest, Extra Trees, Decision Tree, K-Nearest Neighbors, Logistic Regression, and Multi-Layer Perceptron, EHQ-FABC achieves a peak accuracy of 99.97% with CatBoost and robust results across tree ensembles, neural, and linear models. SHAP analysis highlights features such as domain_token_count and NumberOfDotsinURL as key for spam detection, offering actionable insights for practitioners.
EHQ-FABC provides a reliable, transparent, and efficient symmetry-aware solution, advancing both accuracy and explainability in spam detection. Full article
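The symmetry property mentioned above (features with equal contributions receive equal importance) can be checked on a toy example. The sketch below computes exact Shapley values by enumerating all feature orderings; the value function and the feature names `a`, `b`, `c` are hypothetical, and this brute-force enumeration is not how the SHAP library or the article computes attributions.

```python
from itertools import permutations

def shapley_values(players, value):
    """Exact Shapley values: average marginal contribution over all orderings."""
    totals = {p: 0.0 for p in players}
    orderings = list(permutations(players))
    for order in orderings:
        coalition = frozenset()
        for p in order:
            with_p = coalition | {p}
            totals[p] += value(with_p) - value(coalition)
            coalition = with_p
    return {p: t / len(orderings) for p, t in totals.items()}

def toy_value(coalition):
    """Hypothetical model score: a and b are interchangeable, c is weaker."""
    score = 0.0
    if "a" in coalition or "b" in coalition:
        score += 6.0  # a and b carry the same redundant signal
    if "c" in coalition:
        score += 2.0
    return score

phi = shapley_values(["a", "b", "c"], toy_value)
```

Here `a` and `b` receive identical attributions because swapping them never changes the score, and the attributions sum to the score of the full feature set.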

(This article belongs to the Section Computer)

20 pages, 4096 KB

Open Access Article

Transformer Core Loosening Diagnosis Based on Fusion Feature Extraction and CPO-Optimized CatBoost

by Yuanqi Xiao, Yipeng Yin, Jiaqi Xu and Yuxin Zhang

Processes 2025, 13(10), 3247; https://doi.org/10.3390/pr13103247 (registering DOI) - 12 Oct 2025

Abstract

Transformer reliability is crucial to grid security, with core loosening a common fault. This paper proposes a transformer core loosening fault diagnosis method based on a fusion feature extraction approach and Categorical Boosting (CatBoost) optimized by the Crested Porcupine Optimizer (CPO) algorithm. Firstly, the audio signal is decomposed into six Intrinsic Mode Functions (IMF) components through Variational Mode Decomposition (VMD). This paper utilizes Gaussian membership functions to quantify the energy proportion, central frequency, and kurtosis of IMF and constructs a fuzzy entropy discrimination function. Then, the IMF noise components are removed through an adaptive threshold. Subsequently, the denoised signal undergoes a wavelet packet transform instead of a short-time Fourier transform to optimize Mel-frequency cepstral coefficients (WPT-MFCC), combining time-domain statistical features and frequency-band energy distribution to form a 24-dimensional fusion feature. Finally, the CatBoost algorithm is employed to validate the effects of different feature schemes. The CPO is introduced to optimize its iteration number, learning rate, tree depth, and random strength parameters, thereby enhancing overall performance. The CPO-optimized CatBoost model had 99.0196% fault recognition accuracy in experimental testing, 15% better than the standard CatBoost. Accuracy exceeded 90% even under extreme 0 dB noise. This method makes fault diagnosis more accurate and reliable. Full article
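The per-component descriptors named above (energy proportion, kurtosis, Gaussian membership) can be sketched in a few lines. This is only an illustration under stated assumptions: the components are taken as already decomposed, the membership `center` and `width` are hypothetical tuning values, and the article's fuzzy entropy discrimination function and adaptive threshold are not reproduced here.

```python
import math

def energy_proportions(components):
    """Share of total signal energy (sum of squares) carried by each component."""
    energies = [sum(x * x for x in comp) for comp in components]
    total = sum(energies)
    return [e / total for e in energies]

def kurtosis(samples):
    """Non-excess kurtosis: fourth central moment over squared variance."""
    n = len(samples)
    mean = sum(samples) / n
    m2 = sum((x - mean) ** 2 for x in samples) / n
    m4 = sum((x - mean) ** 4 for x in samples) / n
    return m4 / (m2 * m2)

def gaussian_membership(x, center, width):
    """Gaussian membership degree in (0, 1]; equals 1 when x == center."""
    return math.exp(-((x - center) ** 2) / (2.0 * width ** 2))

# Two hypothetical decomposed components standing in for IMFs.
shares = energy_proportions([[3, 0, -3, 0], [1, 1, -1, -1]])
```

A membership function of this form maps each descriptor onto a common 0 to 1 scale, which is what makes it possible to combine quantities with different units in a single fuzzy criterion.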

(This article belongs to the Section AI-Enabled Process Engineering)

26 pages, 5244 KB

Open Access Article

Optimizing Spatial Scales for Evaluating High-Resolution CO₂ Fossil Fuel Emissions: Multi-Source Data and Machine Learning Approach

by Yujun Fang, Rong Li and Jun Cao

Sustainability 2025, 17(20), 9009; https://doi.org/10.3390/su17209009 (registering DOI) - 11 Oct 2025

Viewed by 51

Abstract

High-resolution CO₂ fossil fuel emission data are critical for developing targeted mitigation policies. As a key approach for estimating spatial distributions of CO₂ emissions, top–down methods typically rely upon spatial proxies to disaggregate administrative-level emission to finer spatial scales. However, conventional linear regression models may fail to capture complex non-linear relationships between proxies and emissions. Furthermore, methods relying on nighttime light data are mostly inadequate in representing emissions for both industrial and rural zones. To address these limitations, this study developed a multiple proxy framework integrating nighttime light, points of interest (POIs), population, road networks, and impervious surface area data. Seven machine learning algorithms—Extra-Trees, Random Forest, XGBoost, CatBoost, Gradient Boosting Decision Trees, LightGBM, and Support Vector Regression—were comprehensively incorporated to estimate high-resolution CO₂ fossil fuel emissions. Comprehensive evaluation revealed that the multiple proxy Extra-Trees model significantly outperformed the single-proxy nighttime light linear regression model at the county scale, achieving R² = 0.96 (RMSE = 0.52 MtCO₂) in cross-validation and R² = 0.92 (RMSE = 0.54 MtCO₂) on the independent test set. Feature importance analysis identified brightness of nighttime light (40.70%) and heavy industrial density (21.11%) as the most critical spatial proxies. The proposed approach also showed strong spatial consistency with the Multi-resolution Emission Inventory for China, exhibiting correlation coefficients of 0.82–0.84. This study demonstrates that integrating local multiple proxy data with machine learning corrects spatial biases inherent in traditional top–down approaches, establishing a transferable framework for high-resolution emissions mapping. Full article
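The conventional top-down step that the abstract takes as its baseline, spreading an administrative total over finer cells in proportion to a single proxy, can be sketched as follows. The function name and the numbers are hypothetical; the article's own framework replaces this linear allocation with machine learning models over several proxies.

```python
def disaggregate(total_emission, proxy_values):
    """Split an administrative total across cells in proportion to a proxy."""
    proxy_sum = sum(proxy_values)
    if proxy_sum <= 0:
        raise ValueError("proxy values must sum to a positive number")
    return [total_emission * v / proxy_sum for v in proxy_values]

# Hypothetical: 10 MtCO2 for one county, nighttime-light brightness per cell.
cells = disaggregate(10.0, [5.0, 3.0, 2.0, 0.0])
```

The last cell shows the weakness the abstract points to: a dark cell (for example a rural or industrial zone with little lighting) is assigned zero emissions regardless of what is actually there.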

17 pages, 2165 KB

Open Access Article

Seizure Type Classification Based on Hybrid Feature Engineering and Mutual Information Analysis Using Electroencephalogram

by Yao Miao

Entropy 2025, 27(10), 1057; https://doi.org/10.3390/e27101057 - 11 Oct 2025

Viewed by 44

Abstract

Epilepsy has diverse seizure types that challenge diagnosis and treatment, requiring automated and accurate classification to improve patient outcomes. Traditional electroencephalogram (EEG)-based diagnosis relies on manual interpretation, which is subjective and inefficient, particularly for multi-class differentiation in imbalanced datasets. This study aims to develop a hybrid framework for automated multi-class seizure type classification using segment-wise EEG processing and multi-band feature engineering to enhance precision and address data challenges. EEG signals from the TUSZ dataset were segmented into 1-s windows with 0.5-s overlaps, followed by the extraction of multi-band features, including statistical measures, sample entropy, wavelet energies, Hurst exponent, and Hjorth parameters. The mutual information (MI) approach was employed to select the optimal features, and seven machine learning models (SVM, KNN, DT, RF, XGBoost, CatBoost, LightGBM) were evaluated via 10-fold stratified cross-validation with a class balancing strategy. The results showed the following: (1) XGBoost achieved the highest performance (accuracy: 0.8710, F1 score: 0.8721, AUC: 0.9797), with γ-band features dominating importance. (2) Confusion matrices indicated robust discrimination but noted overlaps in focal subtypes. This framework advances seizure type classification by integrating multi-band features and the MI method, which offers a scalable and interpretable tool for supporting clinical epilepsy diagnostics. Full article
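The segmentation step (1-s windows with 0.5-s overlap) and one of the listed feature families (Hjorth parameters) can be sketched as below. The sampling rate `fs` is an assumed input, derivatives are approximated by first differences at unit sample spacing, and the function names are hypothetical; band filtering and the other features from the abstract are omitted.

```python
import math

def sliding_windows(signal, fs, window_s=1.0, overlap_s=0.5):
    """Cut a 1-D signal into fixed-length windows with the given overlap."""
    size = int(round(window_s * fs))
    step = int(round((window_s - overlap_s) * fs))
    return [signal[i:i + size] for i in range(0, len(signal) - size + 1, step)]

def _variance(values):
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values) / len(values)

def _diff(values):
    """First difference, used here as a discrete derivative."""
    return [b - a for a, b in zip(values, values[1:])]

def hjorth(window):
    """Return Hjorth (activity, mobility, complexity) for one window."""
    d1 = _diff(window)
    d2 = _diff(d1)
    activity = _variance(window)                       # signal power
    mobility = math.sqrt(_variance(d1) / activity)     # mean-frequency proxy
    complexity = math.sqrt(_variance(d2) / _variance(d1)) / mobility
    return activity, mobility, complexity
```

With a 0.5-s overlap the windows advance by half a second, so a 3-s recording yields five windows. For a pure sinusoid the complexity is close to 1, and it grows as the waveform departs from a sine shape.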

(This article belongs to the Section Signal and Data Analysis)

27 pages, 3885 KB

Open Access Article

Experimental and Machine Learning-Based Assessment of Fatigue Crack Growth in API X60 Steel Under Hydrogen–Natural Gas Blending Conditions

by Nayem Ahmed, Ramadan Ahmed, Samin Rhythm, Andres Felipe Baena Velasquez and Catalin Teodoriu

Metals 2025, 15(10), 1125; https://doi.org/10.3390/met15101125 - 10 Oct 2025

Viewed by 207

Abstract

Hydrogen-assisted fatigue cracking presents a critical challenge to the structural integrity of legacy carbon steel natural gas pipelines being repurposed for hydrogen transport, posing a major barrier to the deployment of hydrogen infrastructure. This study systematically evaluates the fatigue crack growth (FCG) behavior of API 5L X60 pipeline steel under varying hydrogen–natural gas (H₂–NG) blending conditions to assess its suitability for long-term hydrogen service. Experiments are conducted using a custom-designed autoclave to replicate field-relevant environmental conditions. Gas mixtures range from 0% to 100% hydrogen by volume, with tests performed at a constant pressure of 6.9 MPa and a temperature of 25 °C. A fixed loading frequency of 8.8 Hz and load ratio (R) of 0.60 ± 0.1 are applied to simulate operational fatigue loading. The test matrix is designed to capture FCG behavior across a broad range of stress intensity factor values (ΔK), spanning from near-threshold to moderate levels consistent with real-world pipeline pressure fluctuations. The results demonstrate a clear correlation between increasing hydrogen concentration and elevated FCG rates. Notably, at 100% hydrogen, API X60 specimens exhibit crack propagation rates up to two orders of magnitude higher than those in 0% hydrogen (natural gas) conditions, particularly within the Paris regime. In the lower threshold region (ΔK ≈ 10 MPa·√m), the FCG rate (da/dN) increased nonlinearly with hydrogen concentration, indicating early crack activation and reduced crack initiation resistance. In the upper Paris regime (ΔK ≈ 20 MPa·√m), da/dN remained significantly elevated but exhibited signs of saturation, suggesting a potential limiting effect of hydrogen concentration on crack propagation kinetics. Fatigue life declined substantially with hydrogen addition, decreasing by ~33% at 50% H₂ and more than 55% in pure hydrogen.
To complement the experimental investigation and enable predictive capability, a modular machine learning (ML) framework was developed and validated. The framework integrates sequential models for predicting hydrogen-induced reduction of area (RA), fracture toughness (FT), and FCG rate (da/dN), using CatBoost regression algorithms. This approach allows upstream degradation effects to be propagated through nested model layers, enhancing predictive accuracy. The ML models accurately captured nonlinear trends in fatigue behavior across varying hydrogen concentrations and environmental conditions, offering a transferable tool for integrity assessment of hydrogen-compatible pipeline steels. These findings confirm that even low-to-moderate hydrogen blends significantly reduce fatigue resistance, underscoring the importance of data-driven approaches in guiding material selection and infrastructure retrofitting for future hydrogen energy systems. Full article
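The Paris regime referred to above is conventionally described by the Paris law, da/dN = C·(ΔK)^m, which is a straight line on log-log axes. The sketch below evaluates that law and recovers C and m from two points; the constants are hypothetical values chosen only to make the arithmetic visible, since the abstract reports no fitted coefficients.

```python
import math

def paris_rate(delta_k, c, m):
    """Paris-law crack growth rate da/dN = C * (delta K)^m."""
    return c * delta_k ** m

def fit_paris(point_a, point_b):
    """Recover (C, m) from two (delta_k, da_dn) points on the log-log line."""
    (k1, r1), (k2, r2) = point_a, point_b
    m = math.log(r2 / r1) / math.log(k2 / k1)  # slope in log-log space
    c = r1 / k1 ** m                           # intercept
    return c, m

# Hypothetical constants (not from the article); delta K in MPa·√m.
C_REF, M_EXP = 1e-11, 3.0
rate_10 = paris_rate(10.0, C_REF, M_EXP)
rate_20 = paris_rate(20.0, C_REF, M_EXP)
c_fit, m_fit = fit_paris((10.0, rate_10), (20.0, rate_20))
```

Under this description, a growth rate that is two orders of magnitude higher at the same ΔK corresponds to a vertical shift of the log-log line by a factor of 100 in C when the exponent m is unchanged.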

(This article belongs to the Special Issue Failure Analysis and Evaluation of Metallic Materials)

25 pages, 2608 KB

Open Access Article

Intelligent System for Student Performance Prediction: An Educational Data Mining Approach Using Metaheuristic-Optimized LightGBM with SHAP-Based Learning Analytics

by Abdalhmid Abukader, Ahmad Alzubi and Oluwatayomi Rereloluwa Adegboye

Appl. Sci. 2025, 15(20), 10875; https://doi.org/10.3390/app152010875 - 10 Oct 2025

Viewed by 88

Abstract

Educational data mining (EDM) plays a crucial role in developing intelligent early warning systems that enable timely interventions to improve student outcomes. This study presents a novel approach to student performance prediction by integrating metaheuristic hyperparameter optimization with explainable artificial intelligence for enhanced learning analytics. While Light Gradient Boosting Machine (LightGBM) demonstrates efficiency in educational prediction tasks, achieving optimal performance requires sophisticated hyperparameter tuning, particularly for complex educational datasets where accuracy, interpretability, and actionable insights are paramount. This research addressed these challenges by implementing and evaluating five nature-inspired metaheuristic algorithms: Fox Algorithm (FOX), Giant Trevally Optimizer (GTO), Particle Swarm Optimization (PSO), Sand Cat Swarm Optimization (SCSO), and Salp Swarm Algorithm (SSA) for automated hyperparameter optimization. Using rigorous experimental methodology with 5-fold cross-validation and 20 independent runs, we assessed predictive performance through comprehensive metrics including Coefficient of Determination (R²), Root Mean Squared Error (RMSE), Mean Squared Error (MSE), Relative Absolute Error (RAE), and Mean Error (ME). Results demonstrate that metaheuristic optimization significantly enhances educational prediction accuracy, with SCSO-LightGBM achieving superior performance with R² of 0.941. SHapley Additive exPlanations (SHAP) analysis provides crucial interpretability, identifying Attendance, Hours Studied, Previous Scores, and Parental Involvement as dominant predictive factors, offering evidence-based insights for educational stakeholders. The proposed SCSO-LightGBM framework establishes an intelligent, interpretable system that supports data-driven decision-making in educational environments, enabling proactive interventions to enhance student success. Full article

(This article belongs to the Special Issue Artificial Intelligence (AI) in Educational Data Mining and Learning Analytics)

26 pages, 2705 KB

Open Access Article

GIS-Based Landslide Susceptibility Mapping with a Blended Ensemble Model and Key Influencing Factors in Sentani, Papua, Indonesia

by Zulfahmi Zulfahmi, Moch Hilmi Zaenal Putra, Dwi Sarah, Adrin Tohari, Nendaryono Madiutomo, Priyo Hartanto and Retno Damayanti

Geosciences 2025, 15(10), 390; https://doi.org/10.3390/geosciences15100390 - 9 Oct 2025

Viewed by 128

Abstract

Landslides represent a recurrent hazard in tropical mountain environments, where rapid urbanization and extreme rainfall amplify disaster risk. The Sentani region of Papua, Indonesia, is highly vulnerable, as demonstrated by the catastrophic debris flows of March 2019 that caused fatalities and widespread losses. This study developed high-resolution landslide susceptibility maps for Sentani using an ensemble machine learning framework. Three base learners—Random Forest, eXtreme Gradient Boosting (XGBoost), and CatBoost—were combined through a logistic regression meta-learner. Predictor redundancy was controlled using Pearson correlation and Variance Inflation Factor/Tolerance (VIF/TOL). The landslide inventory was constructed from multitemporal satellite imagery, integrating geological, topographic, hydrological, environmental, and seismic factors. Results showed that lithology, Slope Length and Steepness Factor (LS Factor), and earthquake density consistently dominated model predictions. The ensemble achieved the most balanced predictive performance, Area Under the Curve (AUC) > 0.96, and generated susceptibility maps that aligned closely with observed landslide occurrences. SHapley Additive Explanations (SHAP) analyses provided transparent, case-specific insights into the directional influence of key factors. Collectively, the findings highlight both the robustness and interpretability of ensemble learning for landslide susceptibility mapping, offering actionable evidence to support disaster preparedness, land-use planning, and sustainable development in Papua. Full article
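The redundancy screen named above (Pearson correlation with VIF/TOL) reduces to a closed form when only two predictors are compared, because the R² of regressing one on the other equals the squared correlation. The following sketch covers only that two-predictor case; the general VIF needs a multiple regression per predictor, and the sample values are hypothetical.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    spread = math.sqrt(sum((a - mx) ** 2 for a in x) * sum((b - my) ** 2 for b in y))
    return cov / spread

def vif_two_predictors(x, y):
    """Return (VIF, tolerance) for a pair of predictors, where R^2 equals r^2."""
    r = pearson(x, y)
    tolerance = 1.0 - r * r
    return 1.0 / tolerance, tolerance

# Hypothetical values for two conditioning factors at four sample points.
vif, tol = vif_two_predictors([1, 2, 3, 4], [1, 3, 2, 4])
```

Tolerance is simply the reciprocal of VIF, so the two criteria always flag the same predictors; the cut-off applied in the article is not stated in the abstract.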

23 pages, 15077 KB

Open Access Article

Landscape Patterns and Carbon Emissions in the Yangtze River Basin: Insights from Ensemble Models and Nighttime Light Data

by Banglong Pan, Qi Wang, Zhuo Diao, Jiayi Li, Wuyiming Liu, Qianfeng Gao, Ying Shu and Juan Du

Atmosphere 2025, 16(10), 1173; https://doi.org/10.3390/atmos16101173 - 9 Oct 2025

Viewed by 126

Abstract

Land use patterns are a critical driver of changes in carbon emissions, making it essential to elucidate the relationship between regional carbon emissions and land use types. As a nationally designated economic strategic zone, the Yangtze River Basin encompasses megacities, rapidly developing medium-sized cities, and relatively underdeveloped regions. However, the mechanism underlying the interaction between landscape patterns and carbon emissions across such gradients remains inadequately understood. This study utilizes nighttime light, land use and carbon emissions datasets, employing XGBoost, CatBoost, LightGBM and a stacking ensemble model to analyze the impacts and driving factors of land use changes on carbon emissions in the Yangtze River Basin from 2002 to 2022. The results showed: (1) The stacking ensemble learning model demonstrated the best predictive performance, with a coefficient of determination (R²) of 0.80, a residual prediction deviation (RPD) of 2.22, and a root mean square error (RMSE) of 4.46. Compared with the next-best models, these performance metrics represent improvements of 19.40% in R² and 28.32% in RPD, and a 22.16% reduction in RMSE. (2) Based on SHAP feature importance and Pearson correlation analysis, the primary drivers influencing CO₂ net emissions in the Yangtze River Basin are GDP per capita (GDPpc), population density (POD), Tertiary industry share (TI), land use degree comprehensive index (LUI), dynamic degree of water-body land use (K_water), Largest patch index (LPI), and number of patches (NP). These findings indicate that changes in regional landscape patterns exert a significant effect on carbon emissions in strategic economic regions, and that stacked ensemble models can effectively simulate and interpret this relationship with high predictive accuracy, thereby providing decision support for regional low-carbon development planning. Full article
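The residual prediction deviation (RPD) quoted above is commonly defined as the standard deviation of the observations divided by the RMSE. The sketch below assumes that definition with the sample standard deviation (n − 1 denominator), which the abstract does not confirm, and adds a small arithmetic check on the reported improvements.

```python
import math

def rpd(observed, predicted):
    """Residual prediction deviation: SD of observations divided by RMSE."""
    n = len(observed)
    mean_o = sum(observed) / n
    sd = math.sqrt(sum((o - mean_o) ** 2 for o in observed) / (n - 1))
    rmse = math.sqrt(sum((o - p) ** 2 for o, p in zip(observed, predicted)) / n)
    return sd / rmse

def rpd_gain_from_rmse_reduction(reduction):
    """Relative RPD gain implied by a fractional RMSE reduction at fixed SD."""
    return 1.0 / (1.0 - reduction) - 1.0

# The abstract reports a 22.16% RMSE reduction over the next-best model.
implied_gain = rpd_gain_from_rmse_reduction(0.2216)
```

If both models are scored on the same observations, a 22.16% drop in RMSE implies an RPD gain of about 28.5%, which is close to the reported 28.32%; the small gap is plausibly due to rounding in the published figures.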

(This article belongs to the Special Issue Urban Carbon Emissions: Measurement and Modeling)

21 pages, 1160 KB

Open Access Article

Near Real-Time Ethereum Fraud Detection Using Explainable AI in Blockchain Networks

by Fatih Ertam

Appl. Sci. 2025, 15(19), 10841; https://doi.org/10.3390/app151910841 - 9 Oct 2025

Viewed by 237

Abstract

Blockchain technologies have profoundly transformed information systems by providing decentralized infrastructures that enhance transparency, security, and traceability. Ethereum, in particular, supports smart contracts and facilitates the development of decentralized finance (DeFi), non-fungible tokens (NFTs), and Web3 applications. However, its openness also enables illicit activities, including fraud and money laundering, through anonymous wallets. Identifying wallets involved in large transfers or abnormal transactional patterns is therefore critical to ecosystem security. This study proposes an AI-based framework employing XGBoost, LightGBM, and CatBoost to detect suspicious Ethereum wallets, achieving test accuracies between 95.83% and 96.46%. The system provides near real-time predictions for individual or recent wallet addresses using a pre-trained XGBoost model. To improve interpretability, SHAP (SHapley Additive exPlanations) visualizations are integrated, highlighting the contribution of each feature. The results demonstrate the effectiveness of AI-driven methods in monitoring and securing Ethereum transactions against fraudulent activities. Full article

(This article belongs to the Special Issue Artificial Intelligence on the Edge for Industry 4.0)

26 pages, 3383 KB

Open Access Article

Biomass Gasification for Waste-to-Energy Conversion: Artificial Intelligence for Generalizable Modeling and Multi-Objective Optimization of Syngas Production

by Gema Báez-Barrón, Francisco Javier López-Flores, Eusiel Rubio-Castro and José María Ponce-Ortega

Resources 2025, 14(10), 157; https://doi.org/10.3390/resources14100157 - 8 Oct 2025

Viewed by 347

Abstract

Biomass gasification, a key waste-to-energy technology, is a complex thermochemical process with many input variables influencing the yield and quality of syngas. In this study, data-driven machine learning models are developed to capture the nonlinear relationships between feedstock properties, operating conditions, and syngas composition, in order to optimize process performance. Random Forest (RF), CatBoost (Categorical Boosting), and an Artificial Neural Network (ANN) were trained to predict key syngas outputs (syngas composition and syngas yield) from process inputs. The best-performing model (ANN) was then integrated into a multi-objective optimization framework using the open-source Optimization & Machine Learning Toolkit (OMLT) in Pyomo. An optimization problem was formulated with two objectives—maximizing the hydrogen-to-carbon monoxide (H₂/CO) ratio and maximizing the syngas yield simultaneously, subject to operational constraints. The trade-off between these competing objectives was resolved by generating a Pareto frontier, which identifies optimal operating points for different priority weightings of syngas quality vs. quantity. To interpret the ML models and validate domain knowledge, SHapley Additive exPlanations (SHAP) were applied, revealing that parameters such as equivalence ratio, steam-to-biomass ratio, feedstock lower heating value, and fixed carbon content significantly influence syngas outputs. Our results highlight a clear trade-off between maximizing hydrogen content and total gas yield and pinpoint optimal conditions for balancing this trade-off. This integrated approach, combining advanced ML predictions, explainability, and rigorous multi-objective optimization, is novel for biomass gasification and provides actionable insights to improve syngas production efficiency, demonstrating the value of data-driven optimization in sustainable waste-to-energy conversion processes. Full article
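The notion of a Pareto frontier for two objectives that are both maximized can be illustrated with a simple non-dominated filter. This is only a sketch of the concept: the candidate points are hypothetical, and the article generates its frontier by solving an optimization problem with OMLT in Pyomo rather than by filtering a fixed set of points.

```python
def pareto_front(points):
    """Keep points not dominated by any other when both objectives are maximized."""
    front = []
    for i, (a1, a2) in enumerate(points):
        # A point is dominated if another is at least as good on both
        # objectives and strictly better on at least one.
        dominated = any(
            (b1 >= a1 and b2 >= a2) and (b1 > a1 or b2 > a2)
            for j, (b1, b2) in enumerate(points)
            if j != i
        )
        if not dominated:
            front.append((a1, a2))
    return sorted(front)

# Hypothetical (H2/CO ratio, syngas yield) pairs for candidate operating points.
candidates = [(1.0, 2.4), (1.5, 2.0), (2.0, 1.2), (1.2, 1.9), (1.8, 1.1)]
front = pareto_front(candidates)
```

Along the resulting front, every gain in the H₂/CO ratio comes with a loss in yield, which is the trade-off the abstract describes.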

16 pages, 2458 KB

Open Access Communication

Machine Learning and UHPLC–MS/MS-Based Discrimination of the Geographical Origin of Dendrobium officinale from Yunnan, China

by Tao Lin, Yanping Ye, Jiao Zhang, Jing Wang, Zhengxu Hu, Khine Zar Linn, Xinglian Chen, Hongcheng Liu, Zhenhuan Liu and Qinghua Yao

Foods 2025, 14(19), 3442; https://doi.org/10.3390/foods14193442 - 8 Oct 2025

Viewed by 262

Abstract

A rapid targeted screening method for 22 compounds, including flavonoids, glycosides, and phenolics, in Dendrobium officinale was developed using UHPLC–MS/MS, demonstrating good linear correlation coefficients, precision, repeatability, and stability. D. officinale from the Guangnan and Maguan regions can be effectively classified into two distinct categories using PCA. In addition, OPLS-DA discriminant analysis enables clear separation between groups, with samples forming well-defined clusters. The 22 chemical components provide valuable origin-related information for D. officinale. The compounds with VIP values of >1 included eriodictyol, vanillic acid, protocatechuic acid, gentisic acid, and naringenin. The difference in naringenin content between D. officinale from the two production areas was minimal. By contrast, eriodictyol and vanillic acid were relatively abundant in D. officinale from Guangnan, while gentisic acid and protocatechuic acid were more prevalent in D. officinale from Maguan. The pathways with higher Kyoto Encyclopedia of Genes and Genomes enrichment were primarily associated with lipid metabolism and atherosclerosis, fluid shear stress and atherosclerosis, and nonalcoholic fatty liver disease. These findings suggest that D. officinale exhibits promising lipid-balancing properties and potential cardiovascular health benefits. Seven machine learning algorithms (Random Forest, XGBoost, Support Vector Machine, k-Nearest Neighbor, Backpropagation Neural Network, Random Tree, and CatBoost) demonstrated superior accuracy and precision in distinguishing D. officinale from the Guangnan and Maguan regions. The key compounds with higher weights (vanillic acid, chrysoeriol, trigonelline, isoquercitrin, gallic acid, 4-hydroxybenzaldehyde, eriodictyol, sweroside, apigenin, and homoeriodictyol) play a crucial role in model construction and the identification of D. officinale from the Guangnan and Maguan regions. The quantification of 22 compounds using UHPLC–MS/MS, combined with PCA, OPLS-DA, and machine learning, enables effective discrimination of D. officinale from these two Yunnan production areas.

(This article belongs to the Special Issue Food Fraud as a Global Problem: Advanced Analytical Tools to Detect Species, Country of Origin and Adulterations: Second Edition)

24 pages, 1582 KB

Open AccessArticle

Future Internet Applications in Healthcare: Big Data-Driven Fraud Detection with Machine Learning

by Konstantinos P. Fourkiotis and Athanasios Tsadiras

Future Internet 2025, 17(10), 460; https://doi.org/10.3390/fi17100460 - 8 Oct 2025

Viewed by 271

Abstract

Hospital fraud detection has often relied on periodic audits that miss evolving, internet-mediated patterns in electronic claims. An artificial intelligence and machine learning pipeline is being developed that is leakage-safe, imbalance-aware, and aligned with operational capacity for large healthcare datasets. The preprocessing stack integrates four tables, engineers 13 features, and applies imputation, categorical encoding, power transformation, Boruta selection, and denoising autoencoder representations, with class balancing via SMOTE-ENN evaluated inside cross-validation folds. Eight algorithms are compared under a fraud-oriented composite productivity index that weighs recall, precision, MCC, F1, ROC-AUC, and G-Mean, with per-fold threshold calibration and explicit reporting of Type I and Type II errors. Multilayer perceptron attains the highest composite index, while CatBoost offers the strongest control of false positives with high accuracy. SMOTE-ENN provides limited gains once representations regularize class geometry. The calibrated scores support prepayment triage, postpayment audit, and provider-level profiling, linking alert volume to expected recovery and protecting investigator workload. Situated in the Future Internet context, this work targets internet-mediated claim flows and web-accessible provider registries. Governance procedures for drift monitoring, fairness assessment, and change control complete an internet-ready deployment path. The results indicate that disciplined preprocessing and evaluation, more than classifier choice alone, translate AI improvements into measurable economic value and sustainable fraud prevention in digital health ecosystems.
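
The six metrics named in the abstract can all be derived from a confusion matrix and a set of scores. The following is a minimal sketch, not the authors' implementation: the abstract does not give the weights of the composite index, so the equal weighting in `composite_index` is an assumption, as are all function names and the rescaling of MCC to the [0, 1] range.

```python
import math

def confusion_counts(y_true, y_score, threshold):
    """Count TP, FP, TN, FN for binary labels at a given score threshold."""
    tp = fp = tn = fn = 0
    for label, score in zip(y_true, y_score):
        predicted = 1 if score >= threshold else 0
        if predicted == 1 and label == 1:
            tp += 1
        elif predicted == 1 and label == 0:
            fp += 1          # Type I error (false alarm)
        elif predicted == 0 and label == 0:
            tn += 1
        else:
            fn += 1          # Type II error (missed fraud)
    return tp, fp, tn, fn

def _safe_div(numerator, denominator):
    """Return 0.0 instead of raising when the denominator is zero."""
    return numerator / denominator if denominator else 0.0

def roc_auc(y_true, y_score):
    """Pairwise (Mann-Whitney) estimate of ROC-AUC; ties count as 0.5."""
    pos = [s for y, s in zip(y_true, y_score) if y == 1]
    neg = [s for y, s in zip(y_true, y_score) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return _safe_div(wins, len(pos) * len(neg))

def fraud_metrics(y_true, y_score, threshold=0.5):
    """Compute the six metrics listed in the abstract."""
    tp, fp, tn, fn = confusion_counts(y_true, y_score, threshold)
    recall = _safe_div(tp, tp + fn)
    precision = _safe_div(tp, tp + fp)
    specificity = _safe_div(tn, tn + fp)
    f1 = _safe_div(2 * precision * recall, precision + recall)
    mcc = _safe_div(tp * tn - fp * fn,
                    math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn)))
    return {
        "recall": recall,
        "precision": precision,
        "mcc": mcc,
        "f1": f1,
        "roc_auc": roc_auc(y_true, y_score),
        "g_mean": math.sqrt(recall * specificity),
    }

def composite_index(metrics, weights=None):
    """Weighted mean of the metrics. Equal weights are an ASSUMPTION;
    the paper's actual weighting is not stated in the abstract."""
    scaled = dict(metrics)
    scaled["mcc"] = (metrics["mcc"] + 1) / 2   # map [-1, 1] onto [0, 1]
    if weights is None:
        weights = {name: 1.0 for name in scaled}
    total = sum(weights.values())
    return sum(weights[name] * scaled[name] for name in scaled) / total
```

In a per-fold calibration scheme such as the one the abstract describes, `fraud_metrics` would be evaluated over a range of thresholds on each validation fold and the threshold that maximizes the chosen index retained for that fold.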

(This article belongs to the Special Issue Information and Future Internet Security, Trust and Privacy—4th Edition)

20 pages, 4033 KB

Open AccessArticle

AI-Based Virtual Assistant for Solar Radiation Prediction and Improvement of Sustainable Energy Systems

by Tomás Gavilánez, Néstor Zamora, Josué Navarrete, Nino Vega and Gabriela Vergara

Sustainability 2025, 17(19), 8909; https://doi.org/10.3390/su17198909 - 8 Oct 2025

Viewed by 280

Abstract

Advances in machine learning have improved the ability to predict critical environmental conditions, including solar radiation levels that, while essential for life, can pose serious risks to human health. In Ecuador, due to its geographical location and altitude, UV radiation reaches extreme levels. This study presents the development of a chatbot system driven by a hybrid artificial intelligence model, combining Random Forest, CatBoost, Gradient Boosting, and a 1D Convolutional Neural Network. The model was trained with meteorological data, tuned over hyperparameter ranges (iterations: 500–1500, depth: 4–8, learning rate: 0.01–0.3), and evaluated through MAE, MSE, R², and F1-Score. The hybrid model achieved superior accuracy (MAE = 13.77 W/m², MSE = 849.96, R² = 0.98), outperforming traditional methods. A 15% error margin was observed without significantly affecting classification. The chatbot, implemented via Telegram and hosted on Heroku, provided real-time personalized alerts, demonstrating an effective, accessible, and scalable solution for health safety and environmental awareness. Furthermore, it facilitates decision-making in the efficient generation of renewable energy and supports a more sustainable energy transition. It offers a tool that strengthens the relationship between artificial intelligence and sustainability by providing a practical instrument for integrating clean energy and mitigating climate change.
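
For readers comparing the reported regression scores, the sketch below shows how MAE, MSE, and R² are conventionally computed from paired observations and predictions. It is an illustration only: the function name and sample values are hypothetical, and nothing here reproduces the authors' data or code. Assuming the MSE is expressed in (W/m²)², the reported value of 849.96 corresponds to an RMSE of about 29.15 W/m².

```python
import math

def regression_scores(y_true, y_pred):
    """Return MAE, MSE, RMSE, and R^2 for two equal-length sequences."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    n = len(y_true)
    errors = [t - p for t, p in zip(y_true, y_pred)]
    mae = sum(abs(e) for e in errors) / n
    mse = sum(e * e for e in errors) / n
    mean_true = sum(y_true) / n
    ss_res = sum(e * e for e in errors)                   # residual sum of squares
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)    # total sum of squares
    r2 = 1.0 - ss_res / ss_tot if ss_tot else float("nan")
    return {"mae": mae, "mse": mse, "rmse": math.sqrt(mse), "r2": r2}

# Hypothetical values, for illustration only.
scores = regression_scores([1.0, 2.0, 3.0, 4.0], [1.5, 2.0, 2.5, 4.0])
```

Because MSE squares the errors, it is far more sensitive to occasional large misses than MAE, which is one reason the two are usually reported together.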

(This article belongs to the Special Issue Advancing Sustainable Development Through Artificial Intelligence (AI))

Search Results (757)
