MDPI - Publisher of Open Access Journals

19 pages, 2464 KB

Open Access Article

Stacked BiLSTM–Adaboost Collaborative Model: Construction of a Precision Analysis Model for GABA and Vitamin B9 in the Foxtail Millet

by Erhu Guo, Guoliang Wang, Jiahui Hu, Wenfeng Yan, Peiyue Zhao and Aiying Zhang

Agronomy 2025, 15(9), 2077; https://doi.org/10.3390/agronomy15092077 - 29 Aug 2025

Viewed by 192

Abstract

Amid the trend toward health-conscious consumption, functional foods rich in γ-aminobutyric acid (GABA) and vitamin B9 are gaining prominence. Foxtail millet, a traditional grain naturally abundant in these nutrients, is difficult to assess for quality because conventional methods are time-consuming and destructive, which hinders large-scale screening. This study systematically applies hyperspectral imaging (HSI) to the nondestructive detection of GABA and vitamin B9 in millet. Using spectral data from 190 samples across 19 varieties, we developed a “coarse-fine” feature wavelength selection strategy. First, interval-based algorithms (iRF, iVISSA) screened highly correlated wavelength subsets. Second, model population analysis (MPA) algorithms (CARS, BOSS) identified optimal core wavelengths, improving model efficiency and robustness. On this basis, a stacked BiLSTM–Adaboost model was built, combining bidirectional long short-term memory networks, which capture sequence dependencies, with adaptive boosting, which improves generalization. The approach enables rapid, nondestructive, and precise nutrient detection. It establishes a new pathway for millet nutritional assessment, deepens fundamental research, and supports industrial upgrading, breeding, quality control, and functional food development.

(This article belongs to the Section Precision and Digital Agriculture)

22 pages, 2887 KB

Open Access Article

Autoencoder-Assisted Stacked Ensemble Learning for Lymphoma Subtype Classification: A Hybrid Deep Learning and Machine Learning Approach

by Roseline Oluwaseun Ogundokun, Pius Adewale Owolawi, Chunling Tu and Etienne van Wyk

Tomography 2025, 11(8), 91; https://doi.org/10.3390/tomography11080091 - 18 Aug 2025

Viewed by 340

Abstract

Background: Accurate subtype identification of lymphoma is crucial for effective diagnosis and treatment planning. Although standard deep learning algorithms have demonstrated robustness, they remain prone to overfitting and limited generalization, which calls for more reliable methods. Objectives: This study presents an autoencoder-augmented stacked ensemble learning (SEL) framework that integrates deep feature extraction (DFE) with ensembles of machine learning classifiers to improve lymphoma subtype identification. Methods: A convolutional autoencoder (CAE) was used to obtain high-level feature representations of histopathological images, followed by dimensionality reduction via Principal Component Analysis (PCA). The extracted features were classified with Random Forest (RF), Support Vector Machine (SVM), Multi-Layer Perceptron (MLP), AdaBoost, and Extra Trees classifiers. A Gradient Boosting Machine (GBM) meta-classifier was then applied in an SEL approach to refine the final predictions. Results: All models were evaluated using accuracy, area under the curve (AUC), and Average Precision (AP). The stacked ensemble classifier outperformed every individual model, with 99.04% accuracy, 0.9998 AUC, and 0.9996 AP, far exceeding what regular deep learning (DL) methods would achieve. Among the standalone classifiers, MLP (97.71% accuracy, 0.9986 AUC, 0.9973 AP) and Random Forest (96.71% accuracy, 0.9977 AUC, 0.9953 AP) performed best, while AdaBoost performed worst (68.25% accuracy, 0.8194 AUC, 0.6424 AP). PCA and t-SNE plots confirmed that DFE effectively enhances class discrimination. Conclusion: This study demonstrates a highly accurate and reliable approach to lymphoma classification using autoencoder-assisted ensemble learning, which reduces the misclassification rate and improves diagnostic accuracy. AI-based models are designed to assist pathologists by providing interpretable outputs such as class probabilities and visualizations (e.g., Grad-CAM), enabling them to understand and validate predictions in the diagnostic workflow. Future studies should improve computational efficiency and conduct multi-centre validation to confirm the model’s generalizability across larger collections of histopathological datasets.
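The three metrics reported in this abstract (accuracy, AUC, and AP) can be illustrated with a minimal standard-library sketch. It covers only the binary case, whereas the study classifies several lymphoma subtypes and would need a one-vs-rest or averaged variant. The labels and scores below are made-up toy values, not data from the paper, and the function names are illustrative.

```python
def accuracy(labels, scores, threshold=0.5):
    """Fraction of samples whose thresholded score matches the label."""
    preds = [1 if s >= threshold else 0 for s in scores]
    return sum(p == y for p, y in zip(preds, labels)) / len(labels)


def roc_auc(labels, scores):
    """AUC as the probability that a random positive outranks a random negative."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5  # ties count as half a win
    return wins / (len(pos) * len(neg))


def average_precision(labels, scores):
    """Mean of the precision values measured at each true positive, ranked by score."""
    ranked = sorted(zip(scores, labels), key=lambda pair: -pair[0])
    hits, precisions = 0, []
    for rank, (_, y) in enumerate(ranked, start=1):
        if y == 1:
            hits += 1
            precisions.append(hits / rank)
    return sum(precisions) / hits


# Toy example (hypothetical values): 1 = target subtype, 0 = any other subtype.
toy_labels = [1, 0, 1, 1, 0, 0]
toy_scores = [0.9, 0.8, 0.7, 0.6, 0.3, 0.1]
print(accuracy(toy_labels, toy_scores))
print(roc_auc(toy_labels, toy_scores))
print(average_precision(toy_labels, toy_scores))
```

Accuracy depends on the chosen threshold, while AUC and AP depend only on the ranking of the scores. That is one reason a model can show a large gap between accuracy and AUC, as AdaBoost does in the reported results.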

24 pages, 6356 KB

Open Access Article

Sandy Beach Extraction Method Based on Multi-Source Data and Feature Optimization: A Case in Fujian Province, China

by Jie Meng, Duanyang Xu, Zexing Tao and Quansheng Ge

Remote Sens. 2025, 17(16), 2754; https://doi.org/10.3390/rs17162754 - 8 Aug 2025

Viewed by 472

Abstract

Sandy beaches are vital geomorphic units with ecological, social, and economic significance, playing a key role in coastal protection and ecosystem regulation. However, they are increasingly threatened by climate change and human activities, highlighting the need for large-scale, high-precision monitoring to support sustainable management. Existing remote-sensing-based sandy beach extraction methods suffer from suboptimal feature selection and reliance on single data sources, which limits their generalization and accuracy. This study proposes a sandy beach extraction framework that integrates multi-source data, feature optimization, and collaborative modeling, with Fujian Province, China, as the study area. The framework combines Sentinel-1/2 imagery, nighttime light data, and terrain data to construct a feature set of 44 spectral, index, polarization, texture, and terrain variables. Optimal feature subsets are selected using the Recursive Feature Elimination (RFE) algorithm. Six machine learning models, namely Random Forest (RF), Extreme Gradient Boosting (XGBoost), Light Gradient Boosting Machine (LGBM), Gradient Boosting Machine (GBM), Adaptive Boosting (AdaBoost), and Categorical Boosting (CatBoost), are compared alongside an ensemble learning model. The results indicate the following. (1) All models performed best when integrating all five feature types, with the average overall F1-score and accuracy reaching 0.9714 and 0.9733, respectively. (2) The number of optimal features selected by RFE varied by model, ranging from 19 to 36. The ten most important features across models were Band 2 (B2), Elevation, Band 3 (B3), VVVH_SUM, Spatial Average (SAVG), VH, Enhanced Water Index (EWI), Slope, Variance (VAR), and Normalized Difference Vegetation Index (NDVI). (3) The ensemble learning model outperformed all others, achieving an average overall accuracy, precision, recall, and F1-score of 0.9750, 0.9733, 0.9725, and 0.9734, respectively, under the optimal feature subset. A total of 555 sandy beaches were extracted in Fujian Province, covering an area of 43.60 km² with a total perimeter of 1263.59 km. The framework demonstrates strong adaptability and robustness in complex coastal environments, providing a scalable solution for intelligent sandy beach monitoring and refined resource management.
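As a rough illustration of how a per-pixel feature vector of this kind might be assembled, the sketch below computes NDVI from Sentinel-2 bands (B8 as near-infrared, B4 as red) and combines it with a few of the other named features. Treating VVVH_SUM as the plain sum of the VV and VH backscatter values is an assumption, as are the dictionary keys and the sample pixel values. The EWI is omitted because its formula is not given in the abstract.

```python
def ndvi(nir, red):
    """Normalized Difference Vegetation Index: (NIR - Red) / (NIR + Red)."""
    total = nir + red
    return 0.0 if total == 0 else (nir - red) / total


def build_features(pixel):
    """Assemble a small feature dictionary from one pixel's multi-source values.

    `pixel` is a hypothetical dict holding Sentinel-2 reflectances (B2, B3, B4, B8),
    Sentinel-1 backscatter in dB (VV, VH), and terrain values (elevation, slope).
    """
    return {
        "B2": pixel["B2"],
        "B3": pixel["B3"],
        "NDVI": ndvi(pixel["B8"], pixel["B4"]),
        "VH": pixel["VH"],
        "VVVH_SUM": pixel["VV"] + pixel["VH"],  # assumed definition: VV + VH
        "Elevation": pixel["elevation"],
        "Slope": pixel["slope"],
    }


# Hypothetical pixel values, for illustration only.
sample_pixel = {
    "B2": 0.12, "B3": 0.15, "B4": 0.10, "B8": 0.50,
    "VV": -12.0, "VH": -19.5,
    "elevation": 3.2, "slope": 1.5,
}
print(build_features(sample_pixel))
```

A feature table built this way, one row per pixel or object, would then be the input to a selection step such as RFE and to the classifiers being compared.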

(This article belongs to the Section Ocean Remote Sensing)

18 pages, 1777 KB

Open Access Article

Machine Learning in Sensory Analysis of Mead—A Case Study: Ensembles of Classifiers

by Krzysztof Przybył, Daria Cicha-Wojciechowicz, Natalia Drabińska and Małgorzata Anna Majcher

Molecules 2025, 30(15), 3199; https://doi.org/10.3390/molecules30153199 - 30 Jul 2025

Viewed by 331

Abstract

The aim was to explore the use of machine learning (including cluster mapping and k-means methods) to classify types of mead based on sensory analysis and aromatic compounds. Machine learning is a modern tool that supports detailed analysis, which is valuable because verifying aromatic compounds is challenging. In the first stage, a cluster map analysis was conducted, allowing for the exploratory identification of the most characteristic features of mead. Based on this, k-means clustering was performed to evaluate how well the identified sensory features align with logically consistent groups of observations. In the next stage, experiments were carried out to classify the type of mead using Random Forest (RF), adaptive boosting (AdaBoost), Bootstrap aggregation (Bagging), K-Nearest Neighbors (KNN), and Decision Tree (DT) algorithms. The analysis revealed that RF and KNN were the most effective in classifying mead based on sensory characteristics, whereas AdaBoost consistently produced the lowest accuracy. For classification based on aroma characteristics, however, the Decision Tree algorithm achieved the highest accuracy (0.909), demonstrating its potential for precise classification. The confusion matrix analysis also indicated that acacia mead was easier for the algorithms to identify than tilia or buckwheat mead. The results show the potential of combining an exploratory approach (cluster map with the k-means method) with machine learning. They also show that selecting and optimizing the classification model matters in practice, because the choice of algorithm strongly affects the success of mead identification.
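The k-means step described above can be sketched with a few lines of standard-library Python. This is a plain Lloyd's-algorithm illustration, not the authors' implementation: the two-dimensional "sensory score" points and the starting centroids are hypothetical.

```python
def nearest(point, centroids):
    """Index of the centroid closest to `point` (squared Euclidean distance)."""
    dists = [sum((a - b) ** 2 for a, b in zip(point, c)) for c in centroids]
    return dists.index(min(dists))


def kmeans(points, centroids, iterations=10):
    """Lloyd's algorithm: alternate between assigning points and updating centroids."""
    centroids = [tuple(c) for c in centroids]
    for _ in range(iterations):
        clusters = [[] for _ in centroids]
        for p in points:
            clusters[nearest(p, centroids)].append(p)
        # Move each centroid to the mean of its cluster; keep it if the cluster is empty.
        centroids = [
            tuple(sum(dim) / len(cluster) for dim in zip(*cluster)) if cluster else c
            for cluster, c in zip(clusters, centroids)
        ]
    labels = [nearest(p, centroids) for p in points]
    return centroids, labels


# Hypothetical two-feature sensory scores for six mead samples.
samples = [(1.0, 1.0), (1.5, 2.0), (1.0, 1.5), (8.0, 8.0), (9.0, 8.5), (8.5, 9.0)]
final_centroids, cluster_labels = kmeans(samples, centroids=[(0.0, 0.0), (10.0, 10.0)])
print(final_centroids, cluster_labels)
```

Comparing cluster labels like these with the known mead types is one simple way to check whether the selected sensory features form consistent groups before training a supervised classifier.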

(This article belongs to the Special Issue Analytical Technologies and Intelligent Applications in Future Food)

17 pages, 1149 KB

Open Access Article

The Relationship Between Smartphone and Game Addiction, Leisure Time Management, and the Enjoyment of Physical Activity: A Comparison of Regression Analysis and Machine Learning Models

by Sevinç Namlı, Bekir Çar, Ahmet Kurtoğlu, Eda Yılmaz, Gönül Tekkurşun Demir, Burcu Güvendi, Batuhan Batu and Monira I. Aldhahi

Healthcare 2025, 13(15), 1805; https://doi.org/10.3390/healthcare13151805 - 25 Jul 2025

Viewed by 551

Abstract

Background/Objectives: Smartphone addiction (SA) and gaming addiction (GA) have become risk factors for individuals of all ages in recent years. Especially during adolescence, it has become very difficult for parents to control this situation. Physical activity and the effective use of free time are the most important factors in eliminating such addictions. This study aimed to test a new machine learning method that combines routine regression analysis with the gradient-boosting machine (GBM) and random forest (RF) methods to analyze the relationship of SA and GA with leisure time management (LTM) and the enjoyment of physical activity (EPA) among adolescents. Methods: This study presents the results obtained using our GBM + RF hybrid model, which takes LTM and EPA scores as inputs for predicting SA and GA, following the preprocessing of data collected from 1107 high school students aged 15–19 years. The results were compared with those of routine regression and of the lasso, ElasticNet, RF, GBM, AdaBoost, bagging, support vector regression (SVR), K-nearest neighbors (KNN), multi-layer perceptron (MLP), and light gradient-boosting machine (LightGBM) models. In the GBM + RF model, probability scores obtained from GBM were used as input to RF to produce the final predictions. Model performance was evaluated using the R², mean absolute error (MAE), and mean squared error (MSE) metrics. Results: Classical regression analyses revealed a significant negative relationship between SA scores and both LTM and EPA scores: as LTM and EPA scores increased, SA scores decreased significantly. GA scores, by comparison, showed a significant negative relationship only with LTM scores, and EPA was not a significant determinant of GA. While the classical regression models had relatively low explanatory power, the ML algorithms demonstrated significantly higher prediction accuracy. The best performance for SA prediction was achieved by the hybrid GBM + RF model (MAE = 0.095, MSE = 0.010, R² = 0.9299), whereas the SVR model showed the weakest performance (MAE = 0.310, MSE = 0.096, R² = 0.8615). The hybrid GBM + RF model also showed the highest performance for GA prediction (MAE = 0.090, MSE = 0.014, R² = 0.9699). Conclusions: These findings demonstrate that classical regression analyses have limited explanatory power in capturing complex relationships between variables, whereas ML algorithms, particularly our GBM + RF hybrid model, offer more robust and accurate modeling capabilities for multifactorial cognitive and performance-related predictions.

(This article belongs to the Special Issue Physical Activity and Fitness in the Health Promotion of Children and Adolescents)

23 pages, 4594 KB

Open Access Article

Ensemble Machine Learning Approaches for Bathymetry Estimation in Multi-Spectral Images

by Kazi Aminul Islam, Omar Abul-Hassan, Hongfang Zhang, Victoria Hill, Blake Schaeffer, Richard Zimmerman and Jiang Li

Geomatics 2025, 5(3), 34; https://doi.org/10.3390/geomatics5030034 - 22 Jul 2025

Cited by 1 | Viewed by 426

Abstract

Traditional bathymetry measurements require a large number of person-hours, and many bathymetry records are obsolete or missing. Automated bathymetry measurement would reduce costs and increase accessibility for research and applications. In this paper, we optimized a recent machine learning model, named CatBoostOpt, to estimate bathymetry from high-resolution WorldView-2 (WV-2) multi-spectral optical satellite images. CatBoostOpt was demonstrated across the Florida Big Bend coastline, where the model learned correlations between in situ sound navigation and ranging (sonar) bathymetry measurements and the corresponding multi-spectral reflectance values in WV-2 images to map bathymetry. We evaluated three feature transformations as inputs for bathymetry estimation: raw reflectance, and log-linear and log-ratio transforms of the raw reflectance values in WV-2 images. In addition, we investigated the contribution of each spectral band and found that using all eight spectral bands in WV-2 images offers the best solution for handling complex water quality conditions. We compared CatBoostOpt with linear regression (LR), support vector machine (SVM), random forest (RF), AdaBoost, gradient boosting, and a deep convolutional neural network (DCNN). CatBoostOpt with log-ratio transformed reflectance achieved the best performance, with an average root mean square error (RMSE) of 0.34 and a coefficient of determination (R²) of 0.87.
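The three input representations compared in this abstract can be sketched as follows. The abstract does not spell out the exact transforms, so this assumes a natural log of each band for the log-linear case and Stumpf-style band ratios, ln(n·R_i) / ln(n·R_j), for the log-ratio case. The scaling constant `n = 1000` and the eight reflectance values are illustrative assumptions.

```python
import math


def log_linear(reflectance):
    """Natural log of each band's reflectance (values must be positive)."""
    return [math.log(r) for r in reflectance]


def log_ratio(reflectance, n=1000.0):
    """Ratios ln(n * R_i) / ln(n * R_j) for every band pair i < j.

    `n` is a fixed scaling constant (an assumed value here) that keeps the
    logarithms positive; n * R_j must not equal 1, or the ratio is undefined.
    """
    logs = [math.log(n * r) for r in reflectance]
    return [
        logs[i] / logs[j]
        for i in range(len(logs))
        for j in range(i + 1, len(logs))
    ]


# Hypothetical reflectance values for the eight WV-2 bands at one pixel.
pixel = [0.08, 0.07, 0.06, 0.05, 0.04, 0.03, 0.02, 0.015]
raw_features = pixel
loglin_features = log_linear(pixel)
logratio_features = log_ratio(pixel)
print(len(raw_features), len(loglin_features), len(logratio_features))
```

With eight bands, the pairwise form yields 28 ratio features per pixel. Band ratios of this kind are commonly used in optical bathymetry because they are less sensitive to changes in bottom reflectance than single bands.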

(This article belongs to the Special Issue Advances in Ocean Mapping and Hydrospatial Applications)

17 pages, 4494 KB

Open Access Article

A Fault Detection Method for Multi-Sensor Data of Spring Circuit Breakers Based on the RF-Adaboost Algorithm

by Chuang Wang, Peijie Cong, Sifan Yu, Jing Yuan, Nian Lv, Yu Ling, Zheng Peng, Haoyan Zhang and Hongwei Mei

Energies 2025, 18(14), 3890; https://doi.org/10.3390/en18143890 - 21 Jul 2025

Viewed by 1115

Abstract

As modern power systems grow more complex and intelligent, traditional maintenance approaches for circuit breakers have shown limitations in meeting both reliability and economic requirements. This paper proposes a multi-sensor data fusion fault detection method based on the RF-Adaboost algorithm for spring-operated circuit breakers. Pressure, speed, coil current, and energy storage motor sensors are integrated into the mechanism to acquire multi-source operational data, which are then processed with denoising and feature extraction techniques. A fault detection model is then constructed using the RF-Adaboost classifier. The experimental results demonstrate that the proposed method achieves over 96% accuracy in identifying typical fault states such as coil voltage deviation, reset spring fatigue, and closing spring degradation, outperforming conventional approaches. These results validate the model’s effectiveness and robustness in diagnosing complex mechanical failures in circuit breakers.

(This article belongs to the Special Issue Advanced Control and Monitoring of High Voltage Power Systems)

26 pages, 6787 KB

Open Access Article

Frost Resistance Prediction of Concrete Based on Dynamic Multi-Stage Optimisation Algorithm

by Xuwei Dong, Jiashuo Yuan and Jinpeng Dai

Algorithms 2025, 18(7), 441; https://doi.org/10.3390/a18070441 - 18 Jul 2025

Viewed by 302

Abstract

Concrete in cold regions is often subjected to freeze–thaw cycles, and the harsh environment seriously damages the concrete structure and shortens its service life. The frost resistance of concrete is primarily evaluated by the relative dynamic elastic modulus (RDEM) and the mass loss rate (MLR). To predict frost resistance more accurately, this paper takes four ensemble learning models, random forest (RF), adaptive boosting (AdaBoost), categorical boosting (CatBoost), and extreme gradient boosting (XGBoost), and optimises them with a dynamic multi-stage optimisation algorithm (DMSOA). The models are trained on 7090 data samples with nine input features and with RDEM and MLR as the prediction targets. Six indices are used to evaluate the models: the coefficient of determination (R²), mean square error (MSE), root mean square error (RMSE), mean absolute error (MAE), correlation coefficient (CC), and standard deviation ratio (SDR). The results show that the DMSOA-CatBoost model exhibits the best prediction performance. Its R² values for RDEM and MLR are 0.864 and 0.885, respectively, which are 6.40% and 11.15% higher than those of the original CatBoost model. The model also controls error better, with significantly lower MSE, RMSE, and MAE, and generalizes better. In addition, DMSOA-CatBoost shows clear advantages in prediction accuracy and stability over two mainstream optimisation algorithms (SCA and AOA). This work helps improve the durability and quality of concrete by enabling faster and more accurate prediction of its performance in cold conditions, which in turn supports optimising the concrete mix ratio while saving engineering cost.
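Most of the evaluation indices listed above follow directly from the residuals between measured and predicted values. The sketch below computes R², MSE, RMSE, MAE, and the Pearson correlation coefficient (CC) with the standard library. SDR is left out because its definition varies between papers, and the sample values are invented for illustration.

```python
import math


def regression_metrics(y_true, y_pred):
    """Return R2, MSE, RMSE, MAE, and Pearson CC for paired measurements."""
    n = len(y_true)
    errors = [t - p for t, p in zip(y_true, y_pred)]
    ss_res = sum(e * e for e in errors)
    mean_true = sum(y_true) / n
    mean_pred = sum(y_pred) / n
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    ss_pred = sum((p - mean_pred) ** 2 for p in y_pred)
    covariance = sum((t - mean_true) * (p - mean_pred) for t, p in zip(y_true, y_pred))
    mse = ss_res / n
    return {
        "R2": 1.0 - ss_res / ss_tot,            # share of variance explained
        "MSE": mse,
        "RMSE": math.sqrt(mse),                  # same units as the target
        "MAE": sum(abs(e) for e in errors) / n,
        "CC": covariance / math.sqrt(ss_tot * ss_pred),
    }


# Hypothetical measured vs. predicted values.
measured = [1.0, 2.0, 3.0, 4.0]
predicted = [1.1, 1.9, 3.2, 3.8]
print(regression_metrics(measured, predicted))
```

R² and CC are unitless, while MSE, RMSE, and MAE carry the units of the predicted quantity, so error values for RDEM and MLR are not directly comparable with each other.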

21 pages, 3168 KB

Open Access Article

Prediction on Slip Modulus of Screwed Connection for Timber–Concrete Composite Structures Based on Machine Learning

by Wen-Wu Lu, Yu-Wei Chen, Ji-Gang Xu, Hui-Feng Yang, Hao-Tian Tao, Wei Zheng and Ben-Kai Shi

Buildings 2025, 15(14), 2458; https://doi.org/10.3390/buildings15142458 - 13 Jul 2025

Viewed by 586

Abstract

Screwed connections are widely adopted in timber–concrete composite (TCC) structures. Owing to the diverse connection configurations and complex shear mechanisms, existing empirical models and theoretical formulas cannot accurately and efficiently predict the shear modulus of a screwed connection. Therefore, this study develops machine learning (ML) models to accurately predict the slip modulus. A data set of 222 test results was established by collecting slip modulus values together with ten associated features. Four ML methods, decision tree (DT), random forest (RF), adaptive boosting (AdaBoost), and gradient boosting regression tree (GBRT), were used to build the models. The Shapley Additive Explanation (SHAP) framework was employed to interpret the effects of the features on the slip modulus. GBRT was the most accurate of the four ML methods in terms of four popular quantitative metrics. Moreover, all ML methods showed a clear accuracy advantage over existing analytical methods. The SHAP analysis found that concrete strength, screw inclination, timber density, and timber type have a larger impact on the slip modulus of a screwed connection than the other input features.
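The idea behind the SHAP attributions mentioned above can be shown with a brute-force Shapley value calculation. This sketch averages each feature's marginal contribution over every ordering of the features, which is only feasible for a handful of inputs. SHAP implementations rely on much faster model-specific or sampling-based algorithms. The three-feature `toy_model` is hypothetical and has nothing to do with the fitted slip modulus models.

```python
from itertools import permutations


def shapley_values(predict, x, baseline):
    """Exact Shapley values of `predict` at `x`, relative to `baseline`.

    Features are switched from their baseline value to their actual value one at
    a time, in every possible order, and each feature's change in the prediction
    is averaged over all orders.
    """
    n = len(x)
    contributions = [0.0] * n
    orders = list(permutations(range(n)))
    for order in orders:
        current = list(baseline)
        previous = predict(current)
        for i in order:
            current[i] = x[i]
            updated = predict(current)
            contributions[i] += updated - previous
            previous = updated
    return [c / len(orders) for c in contributions]


def toy_model(v):
    """Hypothetical model: one additive term plus one interaction term."""
    return 2.0 * v[0] + v[1] * v[2]


values = shapley_values(toy_model, x=[1.0, 2.0, 3.0], baseline=[0.0, 0.0, 0.0])
print(values)
```

The attributions always sum to the difference between the prediction at `x` and the prediction at the baseline, and an interaction term is split evenly between the features involved. The same additivity property is what allows per-feature SHAP values to be ranked by importance.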

(This article belongs to the Special Issue Performance Analysis of Timber Composite Structures)

16 pages, 4887 KB

Open Access Article

Composition Design of a Novel High-Temperature Titanium Alloy Based on Data Augmentation Machine Learning

by Xinpeng Fu, Boya Li, Binguo Fu, Tianshun Dong and Jingkun Li

Materials 2025, 18(13), 3099; https://doi.org/10.3390/ma18133099 - 30 Jun 2025

Viewed by 551

Abstract

High-temperature titanium alloys are used mainly in the aerospace, defense, and military industries, for example in the high-temperature parts of rocket and aircraft engines, missile cases, and tail rudders, where they greatly reduce aircraft weight while resisting high temperatures. However, traditional high-temperature titanium alloys contain many types of elements (more than six), which have a complex effect on the solidification, deformation, and phase transformation of the alloys and greatly increase the difficulty of casting and deformation manufacturing of aerospace and military components. Developing low-component high-temperature titanium alloys suitable for hot processing and forming is therefore urgent. This study used data augmentation (Gaussian noise) to expedite the development of a novel quinary high-temperature titanium alloy. Data augmentation effectively improved the generalization of four machine learning models (XGBoost, RF, AdaBoost, Lasso), with the XGBoost model showing the best prediction accuracy (an R² of 0.94, an RMSE of 53.31, and an MAE of 42.93 on the test set). Based on this model, a novel Ti-7.2Al-1.8Mo-2.0Nb-0.4Si (wt.%) alloy was designed and experimentally validated. The UTS of the alloy at 600 °C was 629 MPa, close to the value predicted by the model (649 MPa), with an error of 3.2%. Compared with as-cast Ti1100 and Ti6242S alloys (both containing six elements), the novel quinary alloy has considerable high-temperature (600 °C) mechanical properties and fewer components. Microstructure analysis revealed that the designed alloy is an α+β type alloy with a typical Widmanstätten structure. The alloy fractured in a mixed brittle and ductile mode at both room and high temperatures.
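A minimal sketch of Gaussian-noise data augmentation is given below, together with the arithmetic behind the reported 3.2% error. The abstract does not say how the noise was scaled or whether it was applied to the compositions, the targets, or both. This sketch assumes noise proportional to each feature value (`rel_sigma`) applied to the input features only, and the function names are illustrative.

```python
import random


def augment_with_gaussian_noise(rows, copies=3, rel_sigma=0.01, seed=0):
    """Return the original rows plus `copies` noisy replicas of each row.

    Each value v is perturbed by noise drawn from N(0, (rel_sigma * |v|)^2).
    The relative scaling is an assumption made for this sketch.
    """
    rng = random.Random(seed)
    augmented = [list(row) for row in rows]
    for _ in range(copies):
        for row in rows:
            augmented.append([v + rng.gauss(0.0, rel_sigma * abs(v)) for v in row])
    return augmented


def relative_error_percent(measured, predicted):
    """Absolute difference as a percentage of the measured value."""
    return abs(predicted - measured) / measured * 100.0


# Composition from the abstract, in wt.%: Al, Mo, Nb, Si (balance Ti).
composition = [[7.2, 1.8, 2.0, 0.4]]
print(len(augment_with_gaussian_noise(composition)))

# Measured UTS 629 MPa vs. predicted 649 MPa, as reported in the abstract.
print(round(relative_error_percent(629.0, 649.0), 1))
```

The reported 3.2% corresponds to the 20 MPa difference taken relative to the measured value of 629 MPa. Taken relative to the predicted 649 MPa, the same difference would round to 3.1%.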

(This article belongs to the Section Metals and Alloys)

39 pages, 4402 KB

Open Access Article

Machine Learning and Deep Learning Approaches for Predicting Diabetes Progression: A Comparative Analysis

by Oluwafisayo Babatope Ayoade, Seyed Shahrestani and Chun Ruan

Electronics 2025, 14(13), 2583; https://doi.org/10.3390/electronics14132583 - 26 Jun 2025

Viewed by 1006

Abstract

The global burden of diabetes mellitus (DM) continues to escalate, posing significant challenges to healthcare systems worldwide. This study compares machine learning (ML) and deep learning (DL) methods, their hybrids, and ensemble strategies for predicting the health outcomes of diabetic patients, with the aim of finding solutions that balance computational efficiency and prediction accuracy. Using publicly accessible diabetes datasets, the study systematically assessed a range of predictive models, from conventional ML algorithms to sophisticated DL techniques, in terms of prediction accuracy, processing speed, scalability, resource consumption, and interpretability. The models were evaluated using key performance indicators (KPIs), training times, and memory usage. AdaBoost had the highest F1-score (0.74) on PIMA-768, while RF excelled on PIMA-2000 (~0.73). An RNN led on the 3-class BRFSS survey (0.44), and a feed-forward DNN excelled on the binary BRFSS subset (0.45), while RF achieved perfect accuracy on the EMR dataset (1.00), confirming that model performance is tightly coupled to each dataset’s scale, feature mix, and label structure. The results highlight how lightweight, interpretable ML and DL models perform in resource-constrained environments and for real-time health analytics. The study also compares its results with existing prediction models, confirming the benefits of the selected ML approaches for diabetes-related medical outcomes and providing a reliable and efficient framework for automated diabetes prediction to support proactive disease management and tailored treatment. The study concludes that thorough assessment and validation of the model on current institutional datasets is essential, as this enhances accuracy, security, and confidence in AI-assisted healthcare decision-making.

(This article belongs to the Special Issue Artificial Intelligence Methods for Biomedical Data Processing)

24 pages, 2527 KB

Open Access Article

ISELDP: An Enhanced Dropout Prediction Model Using a Stacked Ensemble Approach for In-Session Learning Platforms

by Saad Alghamdi, Ben Soh and Alice Li

Electronics 2025, 14(13), 2568; https://doi.org/10.3390/electronics14132568 - 25 Jun 2025

Viewed by 474

Abstract

High dropout rates remain a significant challenge in Massive Open Online Courses (MOOCs), making early identification of at-risk students crucial. This study introduces an approach called In-Session Stacked Ensemble Learning for Dropout Prediction (ISELDP), which predicts student dropout during course sessions by combining multiple base learners, namely Adaptive Boosting (AdaBoost), Random Forest (RF), Extreme Gradient Boosting (XGBoost), and Gradient Boosting, into a stacked ensemble with a Multi-Layer Perceptron (MLP) as the meta-learner. Hyperparameters were tuned using Grid Search to optimise model performance. The proposed method was evaluated on in-session student interaction data under two scenarios, one with imbalanced data and one with balanced data. The results show that ISELDP achieves an average accuracy of 88%, outperforming individual baseline models with improvements of up to 2% in accuracy and 2.4% in F1-score.

18 pages, 3319 KB

Open Access Article

Prediction of Flexural Bearing Capacity of Aluminum-Alloy-Reinforced RC Beams Based on Machine Learning

by Chunmei Mo, Jun Huang, Junzhong Huang, Tian Li and Yanxi Yang

Symmetry 2025, 17(6), 944; https://doi.org/10.3390/sym17060944 - 13 Jun 2025

Viewed by 435

Abstract

Reinforced concrete (RC) beams are typically strengthened with aluminum alloy in a symmetrical configuration. To evaluate the flexural performance of strengthened beams, four machine learning (ML)-based models, namely Random Forest (RF), Extreme Gradient Boosting (XGBoost), Adaptive Boosting (Adaboost), and Light Gradient Boosting Machine (LightGBM), were developed for predicting the flexural bearing capacity of aluminum-alloy-strengthened RC beams. A total of 124 experimental samples were collected from the literature to establish a database for the prediction models, with 70% and 30% of the data allocated to the training and testing sets, respectively. K-fold cross-validation and random search were used to tune the hyperparameters of the algorithms, thereby improving model performance. The models were evaluated using statistical indicators, including the coefficient of determination (R²), root mean square error (RMSE), and mean absolute error (MAE). Additionally, absolute error boxplots and Taylor diagrams were used for statistical comparisons of the ML models. SHAP (Shapley Additive Explanations) was employed to analyze the importance of each input parameter and to examine the influence of the feature variables on the prediction results. The results showed that the predicted values of all models correlated well with the experimental values, especially for the LightGBM model, which effectively predicts the flexural bearing capacity of aluminum-alloy-strengthened RC beams. The findings provide a reliable prediction framework for optimizing aluminum-alloy-strengthened concrete structures and a reference for the design of future strengthened structures.
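The data handling described above (a 70/30 split followed by K-fold cross-validation on the training portion) can be sketched with index lists alone. The number of folds, the rounding of the 30% share, and the random seed are assumptions, since the abstract does not specify them.

```python
import random


def train_test_split_indices(n, test_fraction=0.3, seed=0):
    """Shuffle indices 0..n-1 and split them into (train, test) lists."""
    indices = list(range(n))
    random.Random(seed).shuffle(indices)
    n_test = round(n * test_fraction)  # rounding rule is an assumption
    return indices[n_test:], indices[:n_test]


def k_fold_indices(indices, k=5):
    """Yield (train, validation) index lists; each fold is the validation set once."""
    folds = [indices[i::k] for i in range(k)]
    for i in range(k):
        validation = folds[i]
        train = [j for f, fold in enumerate(folds) if f != i for j in fold]
        yield train, validation


# 124 samples, as in the abstract; k = 5 is an illustrative choice.
train_idx, test_idx = train_test_split_indices(124)
print(len(train_idx), len(test_idx))
for fold_train, fold_val in k_fold_indices(train_idx, k=5):
    print(len(fold_train), len(fold_val))
```

In a random search, each sampled hyperparameter configuration would be scored by averaging its validation error over these folds. The held-out test indices are used only once, for the final evaluation.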

(This article belongs to the Section Engineering and Materials)
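
The three indicators named in the abstract above can be illustrated with a minimal sketch. It uses the standard textbook definitions of R², RMSE, and MAE; the function name `regression_metrics` and the sample values are hypothetical and are not taken from the article.

```python
import math

def regression_metrics(y_true, y_pred):
    """Return (R^2, RMSE, MAE) for two equal-length sequences of numbers."""
    n = len(y_true)
    mean_true = sum(y_true) / n
    # Residual sum of squares and total sum of squares
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    r2 = 1.0 - ss_res / ss_tot
    rmse = math.sqrt(ss_res / n)
    mae = sum(abs(t - p) for t, p in zip(y_true, y_pred)) / n
    return r2, rmse, mae

# Hypothetical measured vs. predicted capacities (illustrative numbers only)
y_true = [10.0, 20.0, 30.0, 40.0]
y_pred = [12.0, 18.0, 33.0, 37.0]
r2, rmse, mae = regression_metrics(y_true, y_pred)
print(f"R2={r2:.3f} RMSE={rmse:.3f} MAE={mae:.3f}")
```

A higher R² and lower RMSE and MAE indicate a closer fit, which is the sense in which the abstract compares the four models.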

29 pages, 973 KB

Open AccessArticle

Connected Vehicles Security: A Lightweight Machine Learning Model to Detect VANET Attacks

by Muawia A. Elsadig, Abdelrahman Altigani, Yasir Mohamed, Abdul Hakim Mohamed, Akbar Kannan, Mohamed Bashir and Mousab A. E. Adiel

World Electr. Veh. J. 2025, 16(6), 324; https://doi.org/10.3390/wevj16060324 - 11 Jun 2025

Viewed by 2195

Abstract

Vehicular ad hoc networks (VANETs) aim to manage traffic, prevent accidents, and regulate various aspects of traffic flow. However, owing to their nature, the security of VANETs remains a significant concern. This study surveys VANET vulnerabilities and attacks and reviews a number of recently introduced security models for countering them, with a focus on machine learning detection methods. The review confirms that several challenges remain unsolved. Accordingly, this study introduces a lightweight machine learning model with an information gain feature selection method to detect VANET attacks. A balanced version of the well-known and recent dataset CISDS2017 was developed by applying a random oversampling technique and was then used to train, test, and evaluate the proposed model. In other words, two layers of enhancement were applied: a suitable feature selection technique and a fix for the dataset imbalance problem. The results show that the proposed model, which is based on the Random Forest (RF) classifier, achieved excellent performance in terms of classification accuracy, computational cost, and classification error. It achieved an accuracy of 99.8%, outperforming all benchmark classifiers, including AdaBoost, decision tree (DT), K-nearest neighbors (KNN), and multi-layer perceptron (MLP). To the best of our knowledge, this model outperforms all the existing classification techniques. It also has the lowest processing time, requiring only 69%, 59%, 35%, and 1.4% of the AdaBoost, DT, KNN, and MLP processing times, respectively, and its classification error is negligible. Full article

(This article belongs to the Special Issue Internet of Vehicles and Autonomous Connected Vehicle: Privacy and Security)
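
Random oversampling, the balancing step mentioned in the abstract above, can be sketched as follows. This is a generic illustration under stated assumptions, not the authors' procedure: the function name `random_oversample`, the seed, and the toy labels are all hypothetical.

```python
import random
from collections import Counter

def random_oversample(rows, labels, seed=0):
    """Duplicate randomly chosen minority-class rows until every class
    has as many rows as the largest class."""
    rng = random.Random(seed)
    counts = Counter(labels)
    target = max(counts.values())
    out_rows, out_labels = list(rows), list(labels)
    for cls, count in counts.items():
        # Rows of this class in the original data, sampled with replacement
        pool = [r for r, l in zip(rows, labels) if l == cls]
        for _ in range(target - count):
            out_rows.append(rng.choice(pool))
            out_labels.append(cls)
    return out_rows, out_labels

# Toy data: six "benign" rows and two "attack" rows (illustrative only)
rows = list(range(8))
labels = ["benign"] * 6 + ["attack"] * 2
balanced_rows, balanced_labels = random_oversample(rows, labels)
print(Counter(balanced_labels))
```

Oversampling is normally applied to the training portion only, so that duplicated rows do not leak into the test set; the abstract does not say how the split was handled.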

28 pages, 2698 KB

Open AccessArticle

Comparative Analysis of Machine Learning Methods with Chaotic AdaBoost and Logistic Mapping for Real-Time Sensor Fusion in Autonomous Vehicles: Enhancing Speed and Acceleration Prediction Under Uncertainty

by Mehmet Bilban and Onur İnan

Sensors 2025, 25(11), 3485; https://doi.org/10.3390/s25113485 - 31 May 2025

Viewed by 728

Abstract

This study presents a novel artificial intelligence-driven architecture for real-time sensor fusion in autonomous vehicles (AVs), leveraging Apache Kafka and MongoDB for synchronous and asynchronous data processing to enhance resilience against sensor failures and dynamic conditions. We introduce Chaotic AdaBoost (CAB), a variant of AdaBoost that integrates a logistic chaotic map into its weight update process to overcome the limitations of deterministic ensemble methods. CAB is evaluated alongside k-Nearest Neighbors (kNN), Artificial Neural Networks (ANN), standard AdaBoost (AB), Gradient Boosting (GB), and Random Forest (RF) for speed and acceleration prediction using CARLA simulator data. CAB achieves 99.3% accuracy (MSE: 0.018 for acceleration, 0.010 for speed; MAE: 0.020 for acceleration, 0.012 for speed; R²: 0.993 for acceleration, 0.997 for speed), a mean Time-To-Collision (TTC) of 3.2 s, and a jerk of 0.15 m/s³, outperforming AB (98.5%, MSE: 0.15, TTC: 2.8 s, jerk: 0.22 m/s³), GB (99.1%), ANN (98.2%), RF (97.5%), and kNN (87.0%). The adaptability added by the logistic map reduces MSE by 88% relative to AB and supports robust anomaly detection and data fusion under uncertainty, both critical for AV safety and comfort. Despite a 20% increase in training time (72 s vs. 60 s for AB), CAB’s integration with Kafka’s high-throughput streaming maintains real-time efficacy, offering a scalable framework that advances operational reliability and passenger experience in autonomous driving. Full article

(This article belongs to the Topic Information Sensing Technology for Intelligent/Driverless Vehicle, 2nd Edition)
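
The logistic map referred to in the abstract above is the recurrence x_{n+1} = r · x_n · (1 − x_n). The sketch below only generates such a sequence; how CAB feeds it into the AdaBoost weight update is not described in the abstract, so that step is deliberately not shown. The function name `logistic_map` and the choices r = 4.0 and x0 = 0.2 are illustrative assumptions.

```python
def logistic_map(x0, r=4.0, steps=5):
    """Iterate x_{n+1} = r * x_n * (1 - x_n) and return the generated values."""
    xs = []
    x = x0
    for _ in range(steps):
        x = r * x * (1.0 - x)
        xs.append(x)
    return xs

# r = 4.0 lies in the chaotic regime; for x0 in (0, 1) the values stay in [0, 1]
seq = logistic_map(0.2, r=4.0, steps=3)
print(seq)
```

Because nearby starting values diverge quickly under this recurrence, the sequence behaves like a deterministic source of pseudo-random perturbations, which is the usual motivation for using chaotic maps inside an otherwise deterministic update rule.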
