Search Results (220)

Search Parameters:
Keywords = gradient boosting (GB)

31 pages, 5852 KB  
Article
Prediction Model for the Local Bearing Capacity of Stirrup-Confined Concrete Based on the PSO-BP Neural Network
by Tianming Miao, Junwu Dai, Tao Jiang, Yongjian Ding, Ruchen Qie, Yingqi Liu and Ying Zhou
Infrastructures 2026, 11(4), 143; https://doi.org/10.3390/infrastructures11040143 - 20 Apr 2026
Viewed by 153
Abstract
Calculating the local bearing capacity of stirrup-confined concrete is an important issue in structural design. Due to the coupling effects of multiple factors, no unified calculation method is generally accepted. This research uses a backpropagation neural network improved with the particle swarm optimization algorithm (PSO-BPNN) to conduct a systematic analysis. Results from 40 stirrup-confined concrete specimens tested by the authors were combined with 92 similar test data points from the literature, and the calculation efficiency and accuracy of the PSO-BPNN model were verified. Compared with the BPNN model, the training iterations of the PSO-BPNN model were reduced by 74.23% for the same training performance. For the same number of training iterations, the mean squared error (MSE) is reduced by 33.9% and the coefficient of determination (R2) is increased by 5.5%. In addition, the PSO-BPNN model outperforms Random Forest Regression (RFR), Support Vector Regression (SVR), and Extreme Gradient Boosting (XGBoost) models in calculation stability and accuracy. Within the applicable range of the codes, the average ratios of the predicted values to the calculated values for GB50010-2010, MC2020 and ACI318-25 are 1.988, 1.719, and 5.387, respectively. The MC2020 code assigns a higher contribution to the stirrups; for some specimens the predicted values fall below the calculated values when Acor/Al is less than 1.35. The brittleness effect is not adequately considered: the predicted values of some specimens are also lower than the calculated values when reactive powder concrete (RPC) is used. The sensitivity ranking of the parameters under coupling effects is Al, Ab, fc,k, s, d, dcor, and fy,k. This differs slightly from the sensitivity ranking obtained by analyzing individual parameters, but the calculation logic is consistent. The research results can provide a theoretical basis for practical engineering.
(This article belongs to the Section Infrastructures and Structural Engineering)
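The swarm update at the heart of PSO-BP training is short enough to sketch. Below, a minimal particle swarm optimizer minimizes a quadratic stand-in for the network's training loss; a real PSO-BPNN would evaluate the network's MSE on the training set instead, and all hyperparameters here are illustrative, not the paper's.

```python
import random

random.seed(0)

def loss(w):
    # Stand-in for a BP network's training error: a quadratic bowl with
    # minimum at w = (1.0, -2.0). A real PSO-BPNN would compute the
    # network's MSE on the training set here.
    return (w[0] - 1.0) ** 2 + (w[1] + 2.0) ** 2

DIM, SWARM, ITERS = 2, 20, 200
W_INERTIA, C1, C2 = 0.7, 1.5, 1.5   # inertia, cognitive, social weights

pos = [[random.uniform(-5, 5) for _ in range(DIM)] for _ in range(SWARM)]
vel = [[0.0] * DIM for _ in range(SWARM)]
pbest = [p[:] for p in pos]                         # per-particle best
pbest_val = [loss(p) for p in pos]
gbest = pbest[pbest_val.index(min(pbest_val))][:]   # global best

for _ in range(ITERS):
    for i in range(SWARM):
        for d in range(DIM):
            r1, r2 = random.random(), random.random()
            vel[i][d] = (W_INERTIA * vel[i][d]
                         + C1 * r1 * (pbest[i][d] - pos[i][d])
                         + C2 * r2 * (gbest[d] - pos[i][d]))
            pos[i][d] += vel[i][d]
        v = loss(pos[i])
        if v < pbest_val[i]:
            pbest_val[i], pbest[i] = v, pos[i][:]
            if v < loss(gbest):
                gbest = pos[i][:]
```

After the loop, `gbest` holds the best weight vector found; in the PSO-BP scheme these weights seed (or replace part of) gradient-based BP training.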

14 pages, 2210 KB  
Article
XGBPred-ACSM: A Hybrid Descriptor-Driven XGBoost Framework for Anticancer Small Molecule Prediction
by Priya Dharshini Balaji, Subathra Selvam, Anuradha Thiagarajan, Honglae Sohn and Thirumurthy Madhavan
Pharmaceuticals 2026, 19(4), 635; https://doi.org/10.3390/ph19040635 - 17 Apr 2026
Viewed by 280
Abstract
Background/Objectives: Cancer remains one of the leading global health burdens, mainly because of the lack of specificity and off-target toxicity associated with conventional therapeutic approaches. To move toward more efficient anticancer drug discovery, we have developed an advanced machine-learning-based architecture that allows for predictive modeling of anticancer small molecules. Methods: A total of 3600 compounds with experimentally validated IC50 values were systematically processed to derive a comprehensive suite of molecular representations comprising 2D physicochemical descriptors, structural fingerprints, and hybrid descriptor sets generated via the Mordred and PaDEL frameworks. Six machine learning algorithms—Random Forest (RF), Extreme Gradient Boosting (XGB), Gradient Boosting (GB), Extra-Trees classifier (ET), Adaptive Boosting (AdaBoost), and Light Gradient Boosting Machine (LightGBM)—were trained and benchmarked via a rigorous model evaluation protocol incorporating 10-fold cross-validation along with multiple performance metrics. Ensemble voting strategies were also examined to assess potential performance gains. Results: Of all configurations, the XGB-Hybrid architecture emerged as the most robust and generalizable classifier, with an AUC of 0.88 and accuracy of 79.11% on the independent test set. To ensure interpretability and mechanistic insight, SHAP-based feature analysis was conducted, quantifying feature contributions and revealing the molecular determinants most influential in discriminating anticancer activity. Altogether, the current study establishes the XGB-Hybrid framework as a technically rigorous, interpretable, and high-performance predictive model with the ability to accelerate early-stage anticancer small molecule identification. Conclusions: The study highlights the transformational effect of machine learning in modern computational oncology and rational drug design pipelines.
(This article belongs to the Special Issue Artificial Intelligence-Assisted Drug Discovery)
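The AUC of 0.88 reported above has a simple rank interpretation: it equals the probability that a randomly chosen positive compound is scored above a randomly chosen negative one (ties counted half). A minimal pure-Python sketch of that computation (function name and toy data are illustrative, not from the paper):

```python
def roc_auc(labels, scores):
    # AUC as the Mann-Whitney statistic: the fraction of (positive,
    # negative) pairs where the positive is scored higher; ties count 0.5.
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# One of four pairs is mis-ordered, so AUC = 3/4.
auc = roc_auc([1, 0, 1, 0], [0.9, 0.1, 0.4, 0.6])  # → 0.75
```

This pairwise definition is why AUC is insensitive to the classifier's probability calibration: only the ranking of scores matters.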

26 pages, 3829 KB  
Article
Time–Frequency and Spectral Analysis of Welding Arc Sound for Automated SMAW Quality Classification
by Alejandro García Rodríguez, Christian Camilo Barriga Castellanos, Jair Eduardo Rocha-Gonzalez and Everardo Bárcenas
Sensors 2026, 26(8), 2357; https://doi.org/10.3390/s26082357 - 11 Apr 2026
Viewed by 382
Abstract
This study investigates the feasibility of acoustic signal analysis for the assessment of weld bead quality in the shielded metal arc welding (SMAW) process. The work focuses on comparing time-domain acoustic signals and time–frequency spectrogram representations for the classification of welds as accepted or rejected according to standard welding inspection criteria. Two key acoustic descriptors, the fundamental frequency (F0) and the harmonics-to-noise ratio (HNR), were extracted and analyzed to evaluate statistical differences between the two weld quality classes. Statistical tests, including Anderson–Darling, Levene, ANOVA, and Kruskal–Wallis (α = 0.05), revealed significant differences between accepted and rejected welds. Accepted welds exhibited a bimodal HNR distribution associated with transient arc instability at the beginning and end of the bead, whereas rejected welds showed more uniform acoustic behavior throughout the process. Subsequently, the acoustic data were represented using both audio signals and spectrograms and used as inputs for ten supervised machine learning models, including Support Vector Classifier (SVC), Logistic Regression (LR), k-Nearest Neighbors (KNN), Decision Tree (DT), Random Forest (RF), Extra Trees (ET), Gradient Boosting (GB), and Naïve Bayes (NB). The results demonstrate that spectrogram-based representations significantly outperform time-domain signals, achieving accuracies of 0.95–0.96, ROC-AUC values above 0.95, and false positive and false negative rates below 6%. These findings indicate that, while scalar acoustic descriptors provide statistically significant insight into weld quality, time–frequency representations combined with machine learning enable a more robust and reliable framework for automated non-destructive evaluation, particularly in manual SMAW processes under realistic operating conditions.
(This article belongs to the Section Sensor Materials)
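A spectrogram of the kind fed to these classifiers is simply a magnitude short-time Fourier transform: the signal is sliced into overlapping windowed frames and each frame is mapped to one column of FFT magnitudes. A minimal numpy sketch (window type, FFT size, and hop length are illustrative, not the authors' settings):

```python
import numpy as np

def spectrogram(signal, fs, n_fft=256, hop=128):
    # Magnitude STFT: Hann-windowed frames, one rFFT column per frame.
    win = np.hanning(n_fft)
    frames = [signal[i:i + n_fft] * win
              for i in range(0, len(signal) - n_fft + 1, hop)]
    spec = np.abs(np.fft.rfft(np.array(frames), axis=1)).T  # (freq, time)
    freqs = np.fft.rfftfreq(n_fft, 1.0 / fs)
    return freqs, spec

# Sanity check on a pure 1 kHz tone: the energy should peak at 1000 Hz.
fs = 8000
t = np.arange(fs) / fs                  # 1 s of samples
tone = np.sin(2 * np.pi * 1000 * t)
freqs, spec = spectrogram(tone, fs)
peak_hz = freqs[spec.mean(axis=1).argmax()]
```

With `n_fft=256` at 8 kHz the frequency resolution is 31.25 Hz per bin, which is the basic time/frequency trade-off the classifiers inherit from this representation.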

7 pages, 904 KB  
Proceeding Paper
Predictive Modeling of Malaria Risk Using the Nigerian Demographic and Health Survey Data
by JohnPaul C. Ugwu, Thecla O. Ayoka, Charles O. Nnadi and Wilfred O. Obonga
Eng. Proc. 2026, 124(1), 98; https://doi.org/10.3390/engproc2026124098 - 31 Mar 2026
Viewed by 321
Abstract
Malaria continues to pose a significant public health challenge in Nigeria, yet little research has utilized machine-learning techniques to forecast malaria risk. This study developed a machine-learning model that predicts malaria risk by leveraging demographic, environmental, and GPS data from the Nigerian Demographic and Health Survey (DHS) covering the years 2000 to 2020. The dataset was pre-processed and split into a training set (406 respondents) and a test set (102 respondents). Random Forest (RF), Gradient Boosting (GB) and Linear Regression (LR) algorithms were employed to assess their predictive performance. RF achieved the best accuracy, with the lowest mean squared error (MSE = 0.0053) and the highest coefficient of determination (R2 = 0.6364), and was thus recognized as the most effective model for predicting malaria risk. In the regression equation, positive coefficients (e.g., population density = 0.0141, travel time = 0.0019, minimum temperature = 0.0082, January temperature = 0.0265, and dry land surface temperature = 0.0368) indicate that higher feature values are associated with increased malaria prevalence, while negative coefficients (e.g., rainfall = −0.0122, nightlights composite = −0.03, potential evapotranspiration = −0.09 and insecticide-treated nets = −0.02) indicate that prevalence decreases as the feature increases. This study underscores the potential of the RF approach in improving early predictions of malaria risk, and its results can guide targeted interventions to control malaria in high-risk areas.
(This article belongs to the Proceedings of The 6th International Electronic Conference on Applied Sciences)
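The MSE and R2 figures used to rank the models above follow directly from their textbook definitions: MSE is the mean squared residual, and R2 is one minus the ratio of residual to total sum of squares. A small pure-Python sketch (toy data, not the study's):

```python
def mse(y_true, y_pred):
    # Mean squared error: average of squared residuals.
    n = len(y_true)
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / n

def r2(y_true, y_pred):
    # Coefficient of determination: 1 - SS_res / SS_tot.
    mean = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot
```

Note that R2 can be negative on held-out data when a model predicts worse than the mean of the targets, which is why it is reported alongside MSE rather than instead of it.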

23 pages, 2486 KB  
Article
Research on the Prediction Method for Ultimate Bearing Capacity of Circular Concrete-Filled Steel Tubular Columns Based on Random Search-Optimized CatBoost Algorithm
by Zhenyu Wang, Yunqiang Wang, Xiangyu Xu, Zihan Zhang, Yaxing Wei and Dan Luo
Materials 2026, 19(7), 1360; https://doi.org/10.3390/ma19071360 - 30 Mar 2026
Viewed by 414
Abstract
With the development of various emerging structures, concrete-filled steel tubular (CFST) columns have become critical load-bearing components in key infrastructures such as subways and underground utility tunnels. Accurately predicting their ultimate bearing capacity (Nu) is essential for guaranteeing structural safety. To address the limitations of traditional empirical formulas and code-based calculation approaches, this paper proposes a prediction model for ultimate bearing capacity based on the CatBoost algorithm optimized by Random Search. Furthermore, the marginal contribution of each key feature to the prediction results is measured through interpretability analysis. First, a database containing 438 CFST column ultimate bearing capacity test cases was established, with key parameters such as geometric dimensions and material properties as input variables. Second, the predictive performance of six machine learning algorithms—CatBoost, LightGBM, Random Forest (RF), Gradient Boosting (GB), K-Nearest Neighbors (KNN), and XGBoost—was compared. A five-fold cross-validation integrated with a Random Search strategy was employed for joint hyperparameter optimization. The results show that the optimized CatBoost model significantly outperforms other algorithms and conventional design codes, achieving a coefficient of determination (R2) as high as 0.99 and a root mean square error (RMSE) of 174.29 kN. Furthermore, the SHAP (Shapley Additive exPlanations) method was used to perform global and local interpretability analyses of the prediction model. This not only quantified the individual contribution and interaction effects of each feature parameter on the bearing capacity but also revealed that geometric parameters are the primary influencing factor. This finding confirms a high degree of consistency between the prediction mechanism of the data-driven model and classical mechanical theories, effectively validating the model’s reliability. This study provides an efficient and reliable tool for the optimal design and rapid evaluation of CFST columns and establishes a new data-driven paradigm for the design and reinforcement of key components in underground structures.
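Random Search with k-fold cross-validation, as applied to CatBoost above, reduces to sampling candidate hyperparameters and scoring each by held-out error. A deliberately tiny stand-in sketch, with one ridge penalty on noiseless linear data instead of CatBoost's full hyperparameter space (all names and data are illustrative):

```python
import random

random.seed(42)

# Toy data: exactly linear, y = 3x (no intercept, no noise).
xs = [float(i) for i in range(1, 21)]
ys = [3.0 * x for x in xs]

def fit_ridge(x, y, lam):
    # Closed-form 1-D ridge without intercept: w = sum(xy) / (sum(x^2) + lam)
    return sum(a * b for a, b in zip(x, y)) / (sum(a * a for a in x) + lam)

def cv_mse(lam, k=5):
    # k-fold CV: hold out every k-th point in turn, average squared error.
    folds = [list(range(i, len(xs), k)) for i in range(k)]
    total, n = 0.0, 0
    for held in folds:
        train = [i for i in range(len(xs)) if i not in held]
        w = fit_ridge([xs[i] for i in train], [ys[i] for i in train], lam)
        total += sum((ys[i] - w * xs[i]) ** 2 for i in held)
        n += len(held)
    return total / n

# Random search: sample candidates, keep the lowest CV error.
candidates = [10 ** random.uniform(-3, 3) for _ in range(30)]
best_lam = min(candidates, key=cv_mse)
```

On this noiseless data any shrinkage hurts, so the search should settle on the smallest sampled penalty; with real data and real models the same loop simply has a less trivial optimum.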

43 pages, 1950 KB  
Review
A Comprehensive Review of Machine Learning and Deep Learning Methods for Flood Inundation Mapping
by Abinash Silwal, Anil Subedi, Rajee Tamrakar, Kshitij Dahal, Dewasis Dahal, Kenneth Okechukwu Ekpetere and Mohamed Zhran
Earth 2026, 7(2), 44; https://doi.org/10.3390/earth7020044 - 9 Mar 2026
Cited by 1 | Viewed by 2384
Abstract
Flood inundation mapping (FIM) is essential in disaster risk management, infrastructure planning, and climate adaptation. Traditional hydrodynamic models, such as the Hydrologic Engineering Center’s River Analysis System (HEC-RAS) and LISFLOOD-Floodplain (LISFLOOD-FP), provide physically interpretable flood simulations but are often data- and computation-intensive and difficult to scale across regions. In recent years, machine learning (ML) and deep learning (DL) approaches have emerged as data-driven alternatives that leverage remote sensing observations, digital elevation models (DEMs), and hydro-climatic datasets to enable scalable and near-real-time flood mapping. Our review synthesizes recent advances in ML-based flood inundation mapping, categorizing methods into traditional machine learning techniques (e.g., Random Forest (RF), Support Vector Machines (SVM), Gradient Boosting (GB)), deep learning architectures (e.g., Convolutional Neural Networks (CNNs), U-Net, Long Short-Term Memory networks (LSTM)), and emerging hybrid and physics-informed frameworks. We evaluate model performance across flood extent and flood depth estimation tasks, highlighting strengths, limitations, and common benchmarking practices reported in the literature. The review identifies key challenges related to model interpretability, data bias, transferability, and regulatory acceptance, and highlights recent progress in explainable artificial intelligence (XAI), uncertainty-aware modeling, and physics-informed learning as pathways toward operational adoption. By unifying terminology, performance metrics, and methodological comparisons, this review provides a coherent framework for advancing trustworthy, scalable, and decision-relevant flood inundation mapping under increasing climate-driven flood risk.

28 pages, 22820 KB  
Article
A Quantitative Assessment of Uncertainty Reduction as a Function of Measurement Campaign Length Using Linear and Machine-Learning MCP Models
by Alejandro Abascal Mendez, Ana Del Castillo Martín, Olga Álvarez Pérez-Aradros, Paulo Henrique Figueiredo Vaz, Ana Patricia Talayero Navales, Roberto Lázaro Gastón and Andrés Llombart Estopiñán
Inventions 2026, 11(2), 23; https://doi.org/10.3390/inventions11020023 - 2 Mar 2026
Viewed by 716
Abstract
This study evaluates the impact of measurement campaign duration on wind resource characterization using three MCP (Measure–Correlate–Predict) models: Total Least Squares (TLS), Multiple Linear Regression (LR), and Quantile Gradient Boosting (GB). The analysis is based on data from 30 meteorological masts (nine primary and twenty-one secondary masts) installed worldwide across different terrains, with up to twenty-seven months of concurrent wind measurements between primary and secondary masts. Fixed campaign durations of 3, 4, 5, 6, 9, and 12 months were simulated using moving intervals to quantify the effect of measurement length on mean wind speed estimation. This working framework also serves to represent conditions typical of campaigns where LIDAR systems are used to complement meteorological mast deployments, as LIDAR units generally operate for shorter periods due to frequent relocation as part of broader measurement strategies. Wind speed estimation was assessed through metrics such as Mean Absolute Error (MAE), relative uncertainty, and monthly uncertainty reduction, taking into account terrain complexity and correlation coefficient (R2) between masts. Results indicate that extending the measurement period improves the accuracy and consistency of wind speed estimates, with significant reductions in uncertainty observed after six months. Across sites, the average monthly uncertainty reduction ranges from 0.13% to 0.41% of the mean wind speed per additional month of measurements, depending on terrain complexity and inter-mast correlation. Linear models (TLS and LR) consistently show better performance in terms of error and uncertainty reduction compared to GB. Based on an extensive and diverse MCP dataset covering multiple terrains and locations, this study provides empirically derived monthly uncertainty-reduction benchmarks for campaign-length optimisation under different site conditions, contributing to more reliable wind resource assessments and, consequently, energy yield estimates.
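Of the three MCP models, Total Least Squares differs from ordinary regression in that it minimizes orthogonal rather than vertical distances, which matters when both masts' measurements carry error. A minimal SVD-based TLS line fit (assuming numpy; the function name and test data are illustrative):

```python
import numpy as np

def tls_fit(x, y):
    # Total Least Squares fit of y ≈ a*x + b, minimizing orthogonal
    # distances: the fitted direction is the top right-singular vector
    # of the mean-centered data matrix.
    x, y = np.asarray(x, float), np.asarray(y, float)
    xc, yc = x - x.mean(), y - y.mean()
    _, _, vt = np.linalg.svd(np.column_stack([xc, yc]))
    dx, dy = vt[0]                    # direction of largest variance
    a = dy / dx
    b = y.mean() - a * x.mean()       # line passes through the centroid
    return a, b

# Exact line y = 2x + 1 should be recovered exactly.
a, b = tls_fit(list(range(10)), [2 * i + 1 for i in range(10)])
```

For noiseless data TLS and ordinary least squares coincide; with errors in both variables, OLS slopes are biased toward zero while TLS remains symmetric in x and y.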

14 pages, 1328 KB  
Proceeding Paper
An Intelligent Prediction–Optimization Framework for Free Chlorine Removal from Industrial Wastewater Using Activated Carbon Filtration
by Alisher Rakhimov, Rustam Bozorov, Shuhrat Mutalov, Jaloliddin Eshbobaev, Mirjalol Yusupov, Farida Islomova and Bokhodir Yunusov
Eng. Proc. 2026, 124(1), 50; https://doi.org/10.3390/engproc2026124050 - 26 Feb 2026
Viewed by 410
Abstract
Free chlorine removal from industrial wastewater using activated carbon filtration requires accurate modeling and optimal control to balance treatment efficiency and adsorbent consumption. In this study, a combined experimental–machine learning–optimization framework was developed to predict and optimize residual chlorine concentration in a pilot-scale activated carbon filtration unit. A total of 200 experimental runs were collected using a pilot activated carbon filtration system by varying flow rate, initial chlorine concentration, pressure, pH, temperature, and carbon dose. Two ensemble learning models, Random Forest (RF) and Gradient Boosting (GB), were trained and validated using five-fold cross-validation. Both models exhibited high predictive accuracy, with GB outperforming RF on the full dataset (R2 = 0.9995, Root Mean Square Error (RMSE) = 0.0355 mg·L−1, Mean Absolute Error (MAE) = 0.0276 mg·L−1) and on the independent test set (R2 = 0.9417). Feature importance and partial dependence analyses revealed that the initial chlorine concentration and activated carbon dose were the dominant controlling variables, while increasing flow rate led to higher residual chlorine levels. A multi-objective optimization strategy based on Pareto dominance was implemented using the trained GB model as a surrogate to simultaneously minimize residual chlorine and carbon consumption. The optimal compromise solution corresponded to an activated carbon dose of approximately 51.5 kg and a residual chlorine concentration of 0.156 mg·L−1 at a flow rate of 43.1 m3·h−1. The proposed framework demonstrates a reliable and cost-effective approach for predictive control and sustainable optimization of dechlorination processes in industrial wastewater treatment.
(This article belongs to the Proceedings of The 6th International Electronic Conference on Applied Sciences)
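The Pareto-dominance filter at the core of the multi-objective step is a one-liner to state: a candidate survives if no other candidate is at least as good in every objective and strictly better in one. A minimal sketch for two minimized objectives (here, residual chlorine and carbon dose; the point set is invented for illustration):

```python
def pareto_front(points):
    # Keep the non-dominated points (minimization in every objective).
    # p dominates q iff p <= q componentwise and p < q in some component.
    def dominates(p, q):
        return (all(a <= b for a, b in zip(p, q))
                and any(a < b for a, b in zip(p, q)))
    return [p for p in points
            if not any(dominates(q, p) for q in points if q != p)]

# Toy (chlorine mg/L, carbon kg) candidates: three lie on the front.
front = pareto_front([(1, 5), (2, 3), (3, 4), (4, 1), (5, 5)])
```

The "optimal compromise" reported in the abstract is then a single point chosen from this front by a scalarization or knee-point rule, not produced by the filter itself.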

35 pages, 1973 KB  
Article
Efficient Recurrent Multi-Layer Neural Network for Multi-Scale Noise and Activity Drift Mitigation in Wideband Cognitive Radio Networks
by Sunil Jatti and Anshul Tyagi
Algorithms 2026, 19(3), 172; https://doi.org/10.3390/a19030172 - 25 Feb 2026
Viewed by 275
Abstract
Wideband spectrum sensing in Cognitive Radio Networks (CRNs) is challenging due to sparse primary user (PU) activity and noise clustering, which obscure signals and generate false alarms. Hence, a novel “Graph Discrete Wavelet Bayesian Kernel Boosted Decision Self-Attention Clustering Neural Network (GDWB-KBSC-NN)” is proposed. When sparse PU activity is masked by irregular interference bursts, traditional sensing algorithms misclassify weak transmissions as noise, leading to low detection reliability. To resolve this, the first hidden layer employs Discrete Wavelet Sparse Bayesian Kernel Analysis (DW-SBK), integrating Discrete Wavelet Packet Transform (DWPT), Sparse Bayesian Learning (SBL), and Kernel PCA. This restores the true sparse pattern of the spectrum, separates interference from actual PU signals, and enhances detection of weak channels. Additionally, PU signals are fragmented due to cross-scale activity drift, where dynamic bandwidth switching and variable burst durations disrupt temporal continuity. Therefore, the second layer incorporates Gradient Boosted Multi-Head Fuzzy Clustering (GB-MHFC), where Gradient Boosted Decision Trees (GBDT) model nonlinear spectral–temporal patterns, Multi-Head Self-Attention (MHSA) captures long- and short-range temporal dependencies, and Fuzzy C-Means Clustering (FCM) groups feature representations into stable PU activity modes, thereby reducing misclassifications and enhancing robustness under highly dynamic CRN conditions. The proposed method demonstrates superior performance with a maximum detection probability of 0.98, classification accuracy of 98%, lowest sensing error of 5.412%, and the fastest sensing time of 3.65 s.
(This article belongs to the Special Issue Energy-Efficient Algorithms for Large-Scale Wireless Sensor Networks)
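The wavelet decomposition used in the first hidden layer can be illustrated with its simplest instance, one level of the Haar transform: pairwise scaled sums give the coarse approximation and pairwise scaled differences give the detail, with total energy preserved. (The paper uses the full DWPT, typically with other bases; this sketch shows only the one-level idea.)

```python
import math

def haar_step(signal):
    # One Haar level: orthonormal pairwise averages (approximation) and
    # differences (detail). Assumes even-length input.
    s = math.sqrt(2.0)
    approx = [(signal[i] + signal[i + 1]) / s for i in range(0, len(signal), 2)]
    detail = [(signal[i] - signal[i + 1]) / s for i in range(0, len(signal), 2)]
    return approx, detail

def haar_inverse(approx, detail):
    # Exact inverse of haar_step: each (a, d) pair restores two samples.
    s = math.sqrt(2.0)
    out = []
    for a, d in zip(approx, detail):
        out += [(a + d) / s, (a - d) / s]
    return out

x = [4.0, 6.0, 10.0, 12.0, 8.0, 6.0, 5.0, 5.0]
approx, detail = haar_step(x)
rec = haar_inverse(approx, detail)
```

A wavelet *packet* transform recursively applies the same step to both the approximation and the detail branches, which is what gives DWPT its uniform frequency tiling across the wideband spectrum.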

26 pages, 5842 KB  
Article
Varietal Identification and Yield Estimation in Potatoes Using UAV RGB Imagery in the Southern Highlands of Peru
by Miguel Tueros, Malú Galindo, Jean Alvarez, Jesús Pozo, Patricia Condezo, Rusbel Gutierrez, Rolando Bautista, Walter Mateu, Omar Paitamala and Daniel Matsusaka
AgriEngineering 2026, 8(2), 65; https://doi.org/10.3390/agriengineering8020065 - 12 Feb 2026
Viewed by 829
Abstract
The cultivation of potatoes is essential for rural food security, and the use of Unmanned Aerial Vehicle Red-Green-Blue (UAV-RGB) imagery allows for precise and cost-effective estimation of yield and identification of varieties, overcoming the limitations of manual assessment. We evaluated four INIA varieties (Bicentenario, Canchán, Shulay and Tahuaqueña) by integrating agronomic measurements (height, number and weight of tubers, leaf health) with color and textural indices derived from RGB orthomosaics. Yield prediction was modeled using Random Forest (RF) and Gradient Boosting (GB); varietal identification was approached with (i) a Convolutional Neural Network (CNN) that classifies RGB images and (ii) classical models such as Random Forest, Support Vector Machines (SVMs), K-Nearest Neighbors (KNNs), Decision Trees and Logistic Regression trained on EfficientNetB0 embeddings. The results showed significant genotypic differences in yield (p < 0.001): Tahuaqueña 13.86 ± 0.27 t ha−1 and Bicentenario 6.65 ± 0.27 t ha−1. The number of tubers (r = 0.52) and plant height (r = 0.23) correlated with yield; RGB indices showed low correlations (r < 0.3) and high redundancy (r > 0.9). RF achieved a better fit (Coefficient of determination, R2 = 0.54; Root Mean Square Error, RMSE = 2.72 t ha−1), excelling in stolon development (R2 = 0.66) and losing precision in maturation due to foliar senescence. In classification, the CNN and RF on embeddings achieved F1-macro ≈ 0.69 and 0.66 (Receiver Operating Characteristic—Area Under the Curve, ROC AUC RF = 0.89), with better identification of Bicentenario and Shulay. We conclude that UAV-RGB is a cost-effective alternative for phenotypic monitoring and varietal selection in high Andean contexts. These findings support the integration of UAV-RGB imagery into breeding and monitoring pipelines in resource-limited Andean systems.
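Gradient boosting regression of the kind used here for yield prediction can be sketched from scratch with depth-1 stumps under squared loss: each round fits a stump to the current residuals and adds it with a small learning rate. A self-contained toy (single feature, invented step-shaped data; hyperparameters illustrative):

```python
def best_stump(x, residual):
    # Depth-1 regression tree: pick the threshold minimizing squared error
    # of left/right mean predictions on the residuals.
    best = None
    for thr in sorted(set(x)):
        left = [r for xi, r in zip(x, residual) if xi <= thr]
        right = [r for xi, r in zip(x, residual) if xi > thr]
        if not left or not right:
            continue
        lm, rm = sum(left) / len(left), sum(right) / len(right)
        err = (sum((r - lm) ** 2 for r in left)
               + sum((r - rm) ** 2 for r in right))
        if best is None or err < best[0]:
            best = (err, thr, lm, rm)
    _, thr, lm, rm = best
    return lambda xi, thr=thr, lm=lm, rm=rm: lm if xi <= thr else rm

def gradient_boost(x, y, rounds=50, lr=0.1):
    # Squared-loss boosting: residuals are the negative gradient, so each
    # stump is fit to what the ensemble still gets wrong.
    f0 = sum(y) / len(y)
    stumps, pred = [], [f0] * len(x)
    for _ in range(rounds):
        residual = [yi - pi for yi, pi in zip(y, pred)]
        stump = best_stump(x, residual)
        stumps.append(stump)
        pred = [pi + lr * stump(xi) for xi, pi in zip(x, pred)]
    return lambda xi: f0 + lr * sum(s(xi) for s in stumps)

x = [0, 1, 2, 3, 4, 5, 6, 7]
y = [1, 1, 1, 1, 5, 5, 5, 5]      # a step the ensemble should learn
model = gradient_boost(x, y)
```

Library implementations (scikit-learn's GB, XGBoost, LightGBM, CatBoost) differ mainly in the base trees, regularization, and loss handling; the residual-fitting loop above is the shared core.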

36 pages, 4432 KB  
Article
Investigating Unsafe Pedestrian Behavior at Urban Road Midblock Crossings Using Machine Learning: Lessons from Alexandria, Egypt
by Ahmed Mahmoud Darwish, Sherif Shokry, Maged Zagow, Marwa Elbany, Ali Qabur, Talal Obaid Alshammari, Ahmed Elkafoury and Mohamed Shaaban Alfiqi
Buildings 2026, 16(3), 505; https://doi.org/10.3390/buildings16030505 - 26 Jan 2026
Viewed by 749
Abstract
Examining pedestrian crossing violations at high-risk road midblock crossings has become essential, particularly in high-speed corridors, following fatal crossing accidents. Hence, this article investigates such behavior in Alexandria, Egypt, as a credible case study in a developing country. A comprehensive dataset of over 2400 field-observed video recordings was used for real-life data collection. Machine learning (ML) models, such as CatBoost and gradient boosting (GB), were employed to predict crossing decisions. The models showed that risky behavior is strongly influenced by waiting time, crossing time, and the number of crossing attempts; the Shapley additive explanation (SHAP) values for these three variables were 3, 2, and 0.60, respectively. CatBoost and gradient boosting achieved the highest predictive performance, and the models indicated strong interpersonal influence within small groups engaging in unsafe road-crossing behavior. SHAP sensitivity analysis further showed that total time (s) and the 40–60-year age group had a significant negative influence on the model prediction, pushing it toward class 0 (crossing illegally). The results also showed that shorter exposure times increase the likelihood of crossing illegally. This research work is among the few studies that employ a behavior-based approach to understanding pedestrian behavior at midblock crossings, and it offers actionable insights for urban designers and transportation planners when designing midblock crossings.

23 pages, 3238 KB  
Article
Agricultural Injury Severity Prediction Using Integrated Data-Driven Analysis: Global Versus Local Explainability Using SHAP
by Omer Mermer, Yanan Liu, Charles A. Jennissen, Milan Sonka and Ibrahim Demir
Safety 2026, 12(1), 6; https://doi.org/10.3390/safety12010006 - 8 Jan 2026
Cited by 1 | Viewed by 672
Abstract
Despite the agricultural sector’s consistently high injury rates, formal reporting is often limited, leading to sparse national datasets that hinder effective safety interventions. To address this, our study introduces a comprehensive framework leveraging advanced ensemble machine learning (ML) models to predict and interpret the severity of agricultural injuries. We use a unique, manually curated dataset of over 2400 agricultural incidents from AgInjuryNews, a public repository of news reports detailing incidents across the United States. We evaluated six ensemble models, including Gradient Boosting (GB), eXtreme Gradient Boosting (XGB), Light Gradient Boosting Machine (LightGBM), Adaptive Boosting (AdaBoost), Histogram-based Gradient Boosting Regression Trees (HistGBRT), and Random Forest (RF), for their accuracy in classifying injury outcomes as fatal or non-fatal. A key contribution of our work is the novel integration of explainable artificial intelligence (XAI), specifically SHapley Additive exPlanations (SHAP), to overcome the “black-box” nature of complex ensemble models. The models demonstrated strong predictive performance, with most achieving an accuracy of approximately 0.71 and an F1-score of 0.81. Through global SHAP analysis, we identified key factors influencing injury severity across the dataset, such as helmet use, victim age, and the type of injury agent. Additionally, our application of local SHAP analysis revealed how specific variables like location and the victim’s role can have varying impacts depending on the context of the incident. These findings provide actionable, context-aware insights for developing targeted policy and safety interventions for a range of stakeholders, from first responders to policymakers, offering a powerful tool for a more proactive approach to agricultural safety.
(This article belongs to the Special Issue Farm Safety, 2nd Edition)
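The SHAP attributions this abstract relies on can be illustrated with a minimal sketch. The study applies the SHAP library to trained ensemble models; the pure-Python example below instead computes exact Shapley values for a tiny hypothetical "severity score" model (the three features and their weights are invented for illustration, not taken from the paper). Exact Shapley values are the quantity SHAP approximates efficiently for real models.

```python
from itertools import combinations
from math import factorial

def shapley_values(model, baseline, x):
    """Exact Shapley attributions for a model over len(x) features.

    Features outside the coalition are held at their baseline values.
    Exponential in the number of features -- illustration only.
    """
    n = len(x)
    phi = [0.0] * n

    def value(coalition):
        z = [x[i] if i in coalition else baseline[i] for i in range(n)]
        return model(z)

    for i in range(n):
        others = [j for j in range(n) if j != i]
        for k in range(n):
            for subset in combinations(others, k):
                s = set(subset)
                # classic Shapley weight |S|! (n-|S|-1)! / n!
                w = factorial(len(s)) * factorial(n - len(s) - 1) / factorial(n)
                phi[i] += w * (value(s | {i}) - value(s))
    return phi

# Hypothetical linear "severity score" over (agent type, helmet use, age).
weights = [2.0, -1.0, 0.5]

def severity_model(z):
    return sum(w * v for w, v in zip(weights, z))

phi = shapley_values(severity_model, baseline=[0, 0, 0], x=[1, 1, 4])
print([round(p, 6) for p in phi])  # [2.0, -1.0, 2.0] for this linear model
```

For a linear model each attribution reduces to weight × (value − baseline), and the attributions sum to the difference between the model's output at `x` and at the baseline (the "efficiency" property that makes SHAP summaries interpretable).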

22 pages, 1021 KB  
Article
A Multiclass Machine Learning Framework for Detecting Routing Attacks in RPL-Based IoT Networks Using a Novel Simulation-Driven Dataset
by Niharika Panda and Supriya Muthuraman
Future Internet 2026, 18(1), 35; https://doi.org/10.3390/fi18010035 - 7 Jan 2026
Cited by 3 | Viewed by 782
Abstract
The explosive growth of the Internet of Things (IoT) has increased the use of resource-constrained Low-Power and Lossy Networks (LLNs), where the IPv6 Routing Protocol for LLNs (RPL) is the de facto routing standard. Despite its lightweight architecture, RPL remains highly susceptible to routing-layer attacks such as Blackhole, Lowered Rank, Version Number manipulation, and Flooding because of the dynamic nature of IoT deployments and the lack of in-protocol security. Lightweight, data-driven intrusion detection methods are necessary since traditional cryptographic countermeasures are frequently infeasible for LLNs. However, the lack of RPL-specific control-plane semantics in current cybersecurity datasets restricts the use of machine learning (ML) for practical anomaly identification. To close this gap, this work creates a novel, large-scale multiclass RPL attack dataset using Contiki-NG’s Cooja simulator, modeling both static and mobile networks under benign and adversarial settings. A protocol-aware feature extraction pipeline is developed to record detailed packet-level and control-plane activity, including DODAG Information Object (DIO), DODAG Information Solicitation (DIS), and Destination Advertisement Object (DAO) message statistics, along with forwarding and dropping patterns and objective-function fluctuations. This dataset is used to evaluate fifteen classifiers, including Logistic Regression (LR), Support Vector Machine (SVM), Decision Tree (DT), k-Nearest Neighbors (KNN), Random Forest (RF), Extra Trees (ET), Gradient Boosting (GB), AdaBoost (AB), and XGBoost (XGB), as well as several ensemble strategies such as soft/hard voting, stacking, and bagging, as part of a comprehensive ML-based detection system. Numerous tests show that ensemble approaches offer better generalization and prediction performance.
With overfitting gaps of less than 0.006 and low cross-validation variance, the Soft Voting Classifier achieves the highest accuracy of 99.47%, closely followed by XGBoost with 99.45% and Random Forest with 99.44%. Full article
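The soft- and hard-voting strategies compared in this abstract can be sketched in a few lines. The example below is a minimal, library-free illustration (the class probabilities and labels are made up, not drawn from the paper's dataset): soft voting averages per-class probability vectors from each classifier before taking the argmax, while hard voting takes a simple majority over predicted labels.

```python
from collections import Counter

def soft_vote(prob_lists):
    """Average class-probability vectors from several classifiers, pick argmax."""
    n_models = len(prob_lists)
    n_classes = len(prob_lists[0])
    avg = [sum(p[c] for p in prob_lists) / n_models for c in range(n_classes)]
    return max(range(n_classes), key=lambda c: avg[c]), avg

def hard_vote(labels):
    """Majority vote over the labels predicted by each classifier."""
    return Counter(labels).most_common(1)[0][0]

# Three hypothetical classifiers scoring one traffic flow over five classes,
# e.g. (benign, blackhole, rank attack, version attack, flooding).
probs = [
    [0.10, 0.60, 0.10, 0.10, 0.10],
    [0.05, 0.40, 0.35, 0.10, 0.10],
    [0.20, 0.45, 0.15, 0.10, 0.10],
]
label, avg = soft_vote(probs)
print(label)                  # class 1 wins on averaged probability
print(hard_vote([1, 1, 2]))   # majority label is 1
```

Soft voting can outperform hard voting because it retains each model's confidence: a classifier that is nearly undecided contributes proportionally less to the final decision than one that is certain.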

27 pages, 7522 KB  
Article
Prediction of the Unconfined Compressive Strength of One-Part Geopolymer-Stabilized Soil Under Acidic Erosion: Comparison of Multiple Machine Learning Models
by Jidong Zhang, Guo Hu, Junyi Zhang and Jun Wu
Materials 2026, 19(1), 209; https://doi.org/10.3390/ma19010209 - 5 Jan 2026
Cited by 1 | Viewed by 543
Abstract
This study employed machine learning to investigate the mechanical behavior of one-part geopolymer (OPG)-stabilized soil subjected to acid erosion. Based on the unconfined compressive strength (UCS) data of acid-eroded OPG-stabilized soil, eight machine learning models, namely, Adaptive Boosting (AdaBoost), Decision Tree (DT), Extra Trees (ET), Gradient Boosting (GB), Light Gradient Boosting Machine (LightGBM), Random Forest (RF), Support Vector Machine (SVM), and eXtreme Gradient Boosting (XGBoost), along with hyper-parameter optimization by a Genetic Algorithm (GA), were used to predict the degradation of the UCS of OPG-stabilized soils under different durations of acid erosion. The results showed that GA-SVM (R2 = 0.9960, MAE = 0.0289) and GA-XGBoost (R2 = 0.9961, MAE = 0.0282) achieved the highest prediction accuracy. SHAP analysis further revealed that solution pH was the dominant factor influencing UCS, followed by the FA/GGBFS ratio, acid-erosion duration, and finally acid type. Two-dimensional partial dependence plots (2D PDPs) combined with SEM images showed that the microstructure of samples eroded by HNO3 was marginally denser than that of samples eroded by H2SO4, yielding a slightly higher UCS. At an FA/GGBFS ratio of 0.25, abundant silica and hydration products formed a dense matrix and markedly improved acid resistance. Further increases in FA content reduced hydration products and caused a sharp drop in UCS. Extending the erosion period from 0 to 120 days and decreasing the pH from 4 to 2 enlarged the pore network and diminished hydration products, resulting in the greatest UCS reduction. These results provide a new approach for applying ML models in geotechnical engineering to predict the UCS of geopolymer-stabilized soils under acidic erosion. Full article
(This article belongs to the Section Construction and Building Materials)
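The GA-based hyper-parameter optimization used in this study can be sketched as a simple real-coded genetic algorithm: tournament-style selection of an elite, uniform crossover, and Gaussian mutation. This is a minimal illustration, not the authors' implementation; the objective `mock_cv_score` is a hypothetical stand-in for a cross-validated model score over SVM-like parameters (C, gamma), with an invented optimum.

```python
import random

def ga_optimize(fitness, bounds, pop_size=20, generations=30, seed=42):
    """Minimal real-coded GA that maximizes `fitness` within `bounds`."""
    rng = random.Random(seed)
    dim = len(bounds)

    def clip(v, lo, hi):
        return max(lo, min(hi, v))

    # Random initial population inside the search bounds.
    pop = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(pop_size)]
    for _ in range(generations):
        elite = sorted(pop, key=fitness, reverse=True)[: pop_size // 2]
        children = []
        while len(elite) + len(children) < pop_size:
            a, b = rng.sample(elite, 2)
            # Uniform crossover: each gene from either parent.
            child = [a[i] if rng.random() < 0.5 else b[i] for i in range(dim)]
            # Gaussian mutation on one gene, clipped back into bounds.
            j = rng.randrange(dim)
            lo, hi = bounds[j]
            child[j] = clip(child[j] + rng.gauss(0, 0.1 * (hi - lo)), lo, hi)
            children.append(child)
        pop = elite + children
    return max(pop, key=fitness)

# Hypothetical tuning target with a known optimum at C=10, gamma=0.1.
def mock_cv_score(params):
    C, gamma = params
    return 1.0 - ((C - 10.0) / 100.0) ** 2 - (gamma - 0.1) ** 2

best = ga_optimize(mock_cv_score, bounds=[(0.1, 100.0), (0.001, 1.0)])
print(best, mock_cv_score(best))  # best (C, gamma) found and its score
```

Because the elite survives every generation, the best score is monotonically non-decreasing, which is what makes even this bare-bones GA a usable wrapper around an expensive cross-validation objective.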

37 pages, 4063 KB  
Article
Data-Driven Optimization of Sustainable Asphalt Overlays Using Machine Learning and Life-Cycle Cost Evaluation
by Ghazi Jalal Kashesh, Hasan H. Joni, Anmar Dulaimi, Abbas Jalal Kaishesh, Adnan Adhab K. Al-Saeedi, Tiago Pinto Ribeiro and Luís Filipe Almeida Bernardo
CivilEng 2026, 7(1), 1; https://doi.org/10.3390/civileng7010001 - 26 Dec 2025
Viewed by 959
Abstract
The growing demand for sustainable pavement materials has driven increased interest in asphalt mixtures incorporating recycled crumb rubber (CR). While CR modification enhances mechanical performance and durability, it often increases initial production costs and energy demand. This study develops an integrated framework that combines machine learning (ML) and economic analysis to identify the optimal balance between performance and cost in CR-modified asphalt overlay mixtures. An experimental dataset of conventional and CR-modified mixtures was used to train and validate multiple ML algorithms, including Random Forest (RF), Gradient Boosting (GB), Artificial Neural Networks (ANNs), and Support Vector Regression (SVR). The RF and ANN models exhibited superior predictive accuracy (R2 > 0.98) for key performance indicators such as Marshall stability, tensile strength ratio, rutting resistance, and resilient modulus. A Cost–Performance Index (CPI) integrating life-cycle cost analysis was developed to quantify trade-offs between performance and economic efficiency. Environmental life-cycle assessment indicated net greenhouse gas reductions of approximately 96 kg CO2-eq per ton of mixture despite higher production-phase emissions. Optimization results indicated that a CR content of approximately 15% and an asphalt binder content of 4.8–5.0% achieve the best performance–cost balance. The study demonstrates that ML-driven optimization provides a powerful, data-based approach for guiding sustainable pavement design and promoting the circular economy in road construction. Full article
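A Cost–Performance Index of the kind described in this abstract can be sketched as a normalized performance-to-cost ratio. The study's exact CPI formulation is not given here, so the function below is one plausible form, and all numbers (candidate mixes, composite performance scores, life-cycle costs, normalization ranges) are hypothetical illustrations, not values from the paper.

```python
def cost_performance_index(performance, cost, perf_range, cost_range):
    """Hypothetical CPI: min-max-normalized performance divided by
    min-max-normalized life-cycle cost (higher is better)."""
    p_lo, p_hi = perf_range
    c_lo, c_hi = cost_range
    p_norm = (performance - p_lo) / (p_hi - p_lo)
    c_norm = (cost - c_lo) / (c_hi - c_lo)
    return p_norm / c_norm if c_norm > 0 else float("inf")

# Hypothetical candidate mixes: (label, composite performance, cost $/ton).
mixes = [
    ("0% CR", 62.0, 95.0),
    ("15% CR", 88.0, 105.0),
    ("25% CR", 90.0, 130.0),
]
perf_range = (60.0, 100.0)
cost_range = (90.0, 140.0)

ranked = sorted(
    mixes,
    key=lambda m: cost_performance_index(m[1], m[2], perf_range, cost_range),
    reverse=True,
)
print(ranked[0][0])  # -> 15% CR: moderate rubber content wins in this toy ranking
```

The design point of such an index is that a mix with the highest raw performance (here the 25% CR candidate) can still rank below a cheaper mix once life-cycle cost is folded in, which mirrors the trade-off the study's optimization resolves.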
