Search Results (3,998)

Search Parameters:
Keywords = random forest classifier

18 pages, 956 KiB  
Article
An Explainable Radiomics-Based Classification Model for Sarcoma Diagnosis
by Simona Correra, Arnar Evgení Gunnarsson, Marco Recenti, Francesco Mercaldo, Vittoria Nardone, Antonella Santone, Halldór Jónsson and Paolo Gargiulo
Diagnostics 2025, 15(16), 2098; https://doi.org/10.3390/diagnostics15162098 - 20 Aug 2025
Abstract
Objective: This study introduces an explainable, radiomics-based machine learning framework for the automated classification of sarcoma tumors using MRI. The approach aims to empower clinicians, reducing dependence on subjective image interpretation. Methods: A total of 186 MRI scans from 86 patients diagnosed with bone and soft tissue sarcoma were manually segmented to isolate tumor regions and corresponding healthy tissue. From these segmentations, 851 handcrafted radiomic features were extracted, including wavelet-transformed descriptors. A Random Forest classifier was trained to distinguish between tumor and healthy tissue, with hyperparameter tuning performed through nested cross-validation. To ensure transparency and interpretability, model behavior was explored through Feature Importance analysis and Local Interpretable Model-agnostic Explanations (LIME). Results: The model achieved an F1-score of 0.742, with an accuracy of 0.724 on the test set. LIME analysis revealed that texture and wavelet-based features were the most influential in driving the model’s predictions. Conclusions: By enabling accurate and interpretable classification of sarcomas in MRI, the proposed method provides a non-invasive approach to tumor classification, supporting an earlier, more personalized and precision-driven diagnosis. This study highlights the potential of explainable AI to assist in more secure clinical decision-making. Full article
(This article belongs to the Special Issue New Trends in Musculoskeletal Imaging)
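
As a reading aid (not the authors' code), a minimal scikit-learn sketch of the nested cross-validation setup described in this abstract might look as follows; the synthetic features stand in for the 851 radiomic descriptors, and the hyperparameter grid and fold counts are assumptions.

```python
# Minimal sketch of nested cross-validation for a Random Forest classifier.
# Synthetic data stand in for the 851 radiomic features; grids are illustrative.
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV, StratifiedKFold, cross_val_score

X, y = make_classification(n_samples=186, n_features=851, n_informative=40,
                           random_state=0)  # tumor vs. healthy tissue stand-in

inner = StratifiedKFold(n_splits=3, shuffle=True, random_state=0)  # tuning folds
outer = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)  # evaluation folds

search = GridSearchCV(
    RandomForestClassifier(random_state=0),
    param_grid={"n_estimators": [100, 300], "max_depth": [None, 10]},
    scoring="f1", cv=inner)

# The outer loop yields an unbiased F1 estimate while the inner loop tunes.
scores = cross_val_score(search, X, y, scoring="f1", cv=outer)
print(f"nested-CV F1: {scores.mean():.3f} +/- {scores.std():.3f}")
# Explanations (feature importances, LIME) would then be computed on the fitted model.
```
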
14 pages, 2463 KiB  
Article
Gesture-Based Secure Authentication System Using Triboelectric Nanogenerator Sensors
by Doohyun Han, Kun Kim, Jaehee Shin and Jinhyoung Park
Sensors 2025, 25(16), 5170; https://doi.org/10.3390/s25165170 - 20 Aug 2025
Abstract
This study presents a gesture-based authentication system utilizing triboelectric nanogenerator (TENG) sensors. As self-powered devices capable of generating high-voltage outputs without external power, TENG sensors are well-suited for low-power IoT sensors and smart device applications. The proposed system recognizes single tap, double tap, and holding gestures. The electrical characteristics of the sensor were evaluated under varying pressure conditions, confirming a linear relationship between applied force and output voltage. These results demonstrate the sensor’s high sensitivity and precision. A threshold-based classification algorithm was developed by analyzing signal features enabling accurate gesture recognition in real time. To enhance the practicality and scalability of the system, the algorithm was further configured to automatically segment raw sensor signals into gesture intervals and assign corresponding labels. From these segments, time-domain statistical features were extracted to construct a training dataset. A random forest classifier trained on this dataset achieved a high classification accuracy of 98.15% using five-fold cross-validation. The system reduces security risks commonly associated with traditional keypad input, offering a user-friendly and reliable authentication interface. This work confirms the feasibility of TENG-based gesture recognition for smart locks, IoT authentication devices, and wearable electronics, with future improvements expected through AI-based signal processing and multi-sensor integration. Full article
(This article belongs to the Special Issue Wearable Electronics and Self-Powered Sensors)
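
A hedged sketch of the learning stage described above: per-segment time-domain statistics feeding a random forest scored with five-fold cross-validation. The segment generator, feature list, and gesture labels are illustrative assumptions, not the authors' data.

```python
# Time-domain features per signal segment -> Random Forest, 5-fold cross-validation.
# The synthetic "voltage" segments merely mimic three gesture classes.
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)

def time_domain_features(segment: np.ndarray) -> np.ndarray:
    """Simple per-segment statistics: mean, std, peak, RMS, above-threshold count."""
    return np.array([segment.mean(), segment.std(), np.abs(segment).max(),
                     np.sqrt(np.mean(segment ** 2)),
                     np.sum(np.abs(segment) > 0.5 * np.abs(segment).max())])

# Fake segments for single tap, double tap, and holding gestures (50 each).
segments = [rng.normal(scale=s, size=200) for s in (0.5, 1.0, 2.0) for _ in range(50)]
labels = np.repeat(["single_tap", "double_tap", "hold"], 50)

X = np.vstack([time_domain_features(s) for s in segments])
clf = RandomForestClassifier(n_estimators=300, random_state=0)
print("5-fold accuracy:", cross_val_score(clf, X, labels, cv=5).mean())
```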

26 pages, 6361 KiB  
Article
Improving the Generalization Performance of Debris-Flow Susceptibility Modeling by a Stacking Ensemble Learning-Based Negative Sample Strategy
by Jiayi Li, Jialan Zhang, Jingyuan Yu, Yongbo Chu and Haijia Wen
Water 2025, 17(16), 2460; https://doi.org/10.3390/w17162460 - 19 Aug 2025
Abstract
To address the negative sample selection bias and limited interpretability of traditional debris-flow event susceptibility models, this study proposes a framework that enhances generalization by integrating negative sample screening via a stacking ensemble model with an interpretable random forest. Using Wenchuan County, Sichuan Province, as the study area, 19 influencing factors were selected, encompassing topographic, geological, environmental, and anthropogenic variables. First, a stacking ensemble—comprising logistic regression (LR), decision tree (DT), gradient boosting decision tree (GBDT), and random forest (RF)—was employed as a preliminary classifier to identify very low-susceptibility areas as reliable negative samples, achieving a balanced 1:1 ratio of positive to negative instances. Subsequently, a stacking–random forest model (Stacking-RF) was trained for susceptibility zonation, and SHAP (Shapley additive explanations) was applied to quantify each factor’s contribution. The results show that: (1) the stacking ensemble achieved a test-set AUC (area under the receiver operating characteristic curve) of 0.9044, confirming its effectiveness in screening dependable negative samples; (2) the random forest model attained a test-set AUC of 0.9931, with very high-susceptibility zones—covering 15.86% of the study area—encompassing 92.3% of historical debris-flow events; (3) SHAP analysis identified the distance to a road and point-of-interest (POI) kernel density as the primary drivers of debris-flow susceptibility. The method quantified nonlinear impact thresholds, revealing significant susceptibility increases when road distance was less than 500 m or POI kernel density ranged between 50 and 200 units/km2; and (4) cross-regional validation in Qingchuan County demonstrated that the proposed model improved the capture rate for high/very high susceptibility areas by 48.86%, improving it from 4.55% to 53.41%, with a site density of 0.0469 events/km2 in very high-susceptibility zones. Overall, this framework offers a high-precision and interpretable debris-flow risk management tool, highlights the substantial influence of anthropogenic factors such as roads and land development, and introduces a “negative-sample screening with cross-regional generalization” strategy to support land-use planning and disaster prevention in mountainous regions. Full article
(This article belongs to the Special Issue Intelligent Analysis, Monitoring and Assessment of Debris Flow)
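
The two-stage negative-sample strategy can be sketched roughly as below (a reading of the abstract, not the authors' code): a stacking ensemble scores candidate cells, the lowest-susceptibility cells are kept as negatives at a 1:1 ratio, and a final random forest is trained on the balanced set. Factor counts, sample sizes, and the omitted SHAP step are placeholders.

```python
# Stage 1: stacking ensemble (LR, DT, GBDT, RF) screens reliable negative samples.
# Stage 2: Random Forest trained on positives plus the screened negatives.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import (GradientBoostingClassifier, RandomForestClassifier,
                              StackingClassifier)
from sklearn.linear_model import LogisticRegression
from sklearn.tree import DecisionTreeClassifier

X_pos, _ = make_classification(n_samples=300, n_features=19, random_state=1)   # events
X_candidates, _ = make_classification(n_samples=3000, n_features=19, random_state=2)

stack = StackingClassifier(
    estimators=[("lr", LogisticRegression(max_iter=1000)),
                ("dt", DecisionTreeClassifier(max_depth=6)),
                ("gbdt", GradientBoostingClassifier(random_state=0)),
                ("rf", RandomForestClassifier(n_estimators=200, random_state=0))],
    final_estimator=LogisticRegression(max_iter=1000))
X_stage1 = np.vstack([X_pos, X_candidates[:300]])
y_stage1 = np.r_[np.ones(300), np.zeros(300)]
stack.fit(X_stage1, y_stage1)

# Keep the lowest-probability candidate cells as negatives (1:1 with positives).
proba = stack.predict_proba(X_candidates)[:, 1]
negatives = X_candidates[np.argsort(proba)[:len(X_pos)]]

rf_final = RandomForestClassifier(n_estimators=500, random_state=0)
rf_final.fit(np.vstack([X_pos, negatives]), np.r_[np.ones(300), np.zeros(300)])
# SHAP (e.g., shap.TreeExplainer(rf_final)) could then quantify factor contributions.
```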

41 pages, 4171 KiB  
Article
Development of a System for Recognising and Classifying Motor Activity to Control an Upper-Limb Exoskeleton
by Artem Obukhov, Mikhail Krasnyansky, Yaroslav Merkuryev and Maxim Rybachok
Appl. Syst. Innov. 2025, 8(4), 114; https://doi.org/10.3390/asi8040114 - 19 Aug 2025
Abstract
This paper addresses the problem of recognising and classifying hand movements to control an upper-limb exoskeleton. To solve this problem, a multisensory system based on the fusion of data from electromyography (EMG) sensors, inertial measurement units (IMUs), and virtual reality (VR) trackers is proposed, which provides highly accurate detection of users’ movements. Signal preprocessing (noise filtering, segmentation, normalisation) and feature extraction were performed to generate input data for regression and classification models. Various machine learning algorithms are used to recognise motor activity, ranging from classical algorithms (logistic regression, k-nearest neighbors, decision trees) and ensemble methods (random forest, AdaBoost, eXtreme Gradient Boosting, stacking, voting) to deep neural networks, including convolutional neural networks (CNNs), gated recurrent units (GRUs), and transformers. The algorithm for integrating machine learning models into the exoskeleton control system is considered. In experiments aimed at abandoning proprietary tracking systems (VR trackers), absolute position regression was performed using data from IMU sensors with 14 regression algorithms: The random forest ensemble provided the best accuracy (mean absolute error = 0.0022 metres). The task of classifying activity categories out of nine types is considered below. Ablation analysis showed that IMU and VR trackers produce a sufficient informative minimum, while adding EMG also introduces noise, which degrades the performance of simpler models but is successfully compensated for by deep networks. In the classification task using all signals, the maximum result (99.2%) was obtained on Transformer; the fully connected neural network generated slightly worse results (98.4%). When using only IMU data, fully connected neural network, Transformer, and CNN–GRU networks provide 100% accuracy. Experimental results confirm the effectiveness of the proposed architectures for motor activity classification, as well as the use of a multi-sensor approach that allows one to compensate for the limitations of individual types of sensors. The obtained results make it possible to continue research in this direction towards the creation of control systems for upper exoskeletons, including those used in rehabilitation and virtual simulation systems. Full article
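
For the position-regression experiment mentioned above, a rough scikit-learn analogue is sketched below; the IMU feature dimensions and the target trajectory are invented for illustration, and only the best-performing random forest regressor is shown.

```python
# Random Forest regression of hand position from windowed IMU features,
# scored by mean absolute error as in the reported experiment (data are synthetic).
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import mean_absolute_error
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
X = rng.normal(size=(2000, 24))                            # per-window IMU statistics
y = X[:, :3].sum(axis=1) + 0.01 * rng.normal(size=2000)    # stand-in position signal

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.2, random_state=0)
reg = RandomForestRegressor(n_estimators=300, random_state=0).fit(X_tr, y_tr)
print("MAE (arbitrary units):", mean_absolute_error(y_te, reg.predict(X_te)))
```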

20 pages, 2173 KiB  
Article
Pain State Classification of Stiff Knee Joint Using Electromyogram for Robot-Based Post-Fracture Rehabilitation Training
by Yang Zheng, Dimao He, Yuan He, Xiangrui Kong, Xiaochen Fan, Min Li, Guanghua Xu and Jichao Yin
Sensors 2025, 25(16), 5142; https://doi.org/10.3390/s25165142 - 19 Aug 2025
Abstract
Knee joint stiffness occurs after fracture around the knee and severely limits the joint's range of motion (ROM). During mobility training, knee joints need to be flexed to the maximum angle position (maxAP) that can induce pain at an appropriate level in order to pull apart intra-articular adhesive structures while avoiding secondary injuries. However, the maxAP varies with training and is mostly determined by the pain level of patients. In this study, the feasibility of utilizing electromyogram (EMG) activities to detect the maxAP was investigated. Specifically, the maxAP detection was converted into a binary classification between pain level three of the numerical rating scales (pain) and below (painless) according to clinical requirements. Firstly, 12 post-fracture patients with knee joint stiffness participated in Experiment I, with a therapist performing routine mobility training and EMG signals being recorded from knee flexors and extensors. The results showed that the extracted EMG features were significantly different between the pain and painless states. Then, the maxAP estimation performance was tested on a knee rehabilitation robot in Experiment II, with another seven patients being involved. The support vector machine and random forest models were used to classify between pain and painless states and obtained a mean accuracy of 87.90% ± 4.55% and 89.10% ± 4.39%, respectively, leading to an average estimation bias of 6.5° ± 5.1° and 4.5° ± 3.5°. These results indicated that pain-induced EMG can be used to accurately classify pain states for maxAP estimation in post-fracture mobility training, which can potentially facilitate the application of robotic techniques in fracture rehabilitation. Full article
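
A minimal sketch of the pain/painless classification step, assuming generic per-channel EMG features and scikit-learn's SVC and RandomForestClassifier rather than the authors' exact feature set or protocol.

```python
# Binary pain vs. painless classification from EMG features with SVM and Random Forest.
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

rng = np.random.default_rng(0)
X = rng.normal(size=(400, 8))        # toy per-channel EMG features (flexors + extensors)
y = (X[:, 0] + 0.5 * X[:, 1] + rng.normal(scale=0.5, size=400) > 0).astype(int)

svm = make_pipeline(StandardScaler(), SVC(kernel="rbf", C=1.0))
rf = RandomForestClassifier(n_estimators=300, random_state=0)
for name, model in [("SVM", svm), ("Random Forest", rf)]:
    print(name, "accuracy:", cross_val_score(model, X, y, cv=5).mean())
```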

23 pages, 14694 KiB  
Article
PLCNet: A 3D-CNN-Based Plant-Level Classification Network Hyperspectral Framework for Sweetpotato Virus Disease Detection
by Qiaofeng Zhang, Wei Wang, Han Su, Gaoxiang Yang, Jiawen Xue, Hui Hou, Xiaoyue Geng, Qinghe Cao and Zhen Xu
Remote Sens. 2025, 17(16), 2882; https://doi.org/10.3390/rs17162882 - 19 Aug 2025
Abstract
Sweetpotato virus disease (SPVD) poses a significant threat to global sweetpotato production; therefore, early, accurate field-scale detection is necessary. To address the limitations of the currently utilized assays, we propose PLCNet (Plant-Level Classification Network), a rapid, non-destructive SPVD identification framework using UAV-acquired hyperspectral imagery. High-resolution data from early sweetpotato growth stages were processed via three feature selection methods—Random Forest (RF), Minimum Redundancy Maximum Relevance (mRMR), and Local Covariance Matrix (LCM)—in combination with 24 vegetation indices. Variance Inflation Factor (VIF) analysis reduced multicollinearity, yielding an optimized SPVD-sensitive feature set. First, using the RF-selected bands and vegetation indices, we benchmarked four classifiers—Support Vector Machine (SVM), Gradient Boosting Decision Tree (GBDT), Residual Network (ResNet), and 3D Convolutional Neural Network (3D-CNN). Under identical inputs, the 3D-CNN achieved superior performance (OA = 96.55%, Macro F1 = 95.36%, UA_mean = 0.9498, PA_mean = 0.9504), outperforming SVM, GBDT, and ResNet. Second, with the same spectral–spatial features and 3D-CNN backbone, we compared a pixel-level baseline (CropdocNet) against our plant-level PLCNet. CropdocNet exhibited spatial fragmentation and isolated errors, whereas PLCNet’s two-stage pipeline—deep feature extraction followed by connected-component analysis and majority voting—aggregated voxel predictions into coherent whole-plant labels, substantially reducing noise and enhancing biological interpretability. By integrating optimized feature selection, deep learning, and plant-level post-processing, PLCNet delivers a scalable, high-throughput solution for precise SPVD monitoring in agricultural fields. Full article
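
The band-selection stage (random forest ranking followed by a variance inflation factor screen) could look roughly like the sketch below; the 30 stand-in features, the greedy ordering, and the VIF threshold of 10 are assumptions rather than values from the paper.

```python
# Rank candidate bands/indices by Random Forest importance, then drop features
# whose VIF against the retained set exceeds a threshold (multicollinearity screen).
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(0)
X = rng.normal(size=(500, 30))          # stand-in spectral bands + vegetation indices
y = rng.integers(0, 2, size=500)        # healthy vs. SPVD-symptomatic labels (toy)

rf = RandomForestClassifier(n_estimators=300, random_state=0).fit(X, y)
ranked = np.argsort(rf.feature_importances_)[::-1]

def vif(col: np.ndarray, others: np.ndarray) -> float:
    """VIF = 1 / (1 - R^2) of a feature regressed on the already-kept features."""
    r2 = LinearRegression().fit(others, col).score(others, col)
    return 1.0 / max(1.0 - r2, 1e-6)

kept = [ranked[0]]
for j in ranked[1:]:
    if vif(X[:, j], X[:, kept]) < 10.0:
        kept.append(j)
print("selected feature indices:", kept[:10])
```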

23 pages, 8824 KiB  
Article
Investigating Green View Perception in Non-Street Areas by Combining Baidu Street View and Sentinel-2 Images
by Hongyan Wang, Xianghong Che and Xinru Yang
Sustainability 2025, 17(16), 7485; https://doi.org/10.3390/su17167485 - 19 Aug 2025
Abstract
Urban greening distribution critically impacts residents’ quality of life and environmental sustainability. While the Green View Index (GVI), derived from street view imagery, is widely adopted for urban green space assessment, its limitation lies in the inability to capture non-street-area vegetation. Remote sensing imagery, conversely, provides full-coverage urban vegetation data. This study focuses on Beijing’s Third Ring Road area, employing DeepLabv3+ to calculate a street-view-based GVI as a predictor. Correlations between the GVI and Sentinel-2 spectral bands, along with two vegetation indices, such as the Normalized Difference Vegetation Index (NDVI) and Fractional Vegetation Cover (FVC), were analyzed under varying buffer radius. Regression and classification models were subsequently developed for GVI prediction. The optimal classifier was then applied to estimate green perception levels in non-street zones. The results demonstrated that (1) at a 25 m buffer radius, the near-infrared band, NDVI, and FVC exhibited the highest correlations with the GVI, reaching 0.553, 0.75, and 0.752, respectively. (2) Among the five machine learning regression models evaluated, the random forest algorithm demonstrated superior performance in GVI estimation, achieving a coefficient of determination (R2) of 0.787, with a root mean square error (RMSE) of 0.063 and a mean absolute error (MAE) of 0.045. (3) When evaluating categorical perception levels of urban greenery, the Extremely Randomized Trees classifier (Extra Trees) demonstrated superior performance in green vision perception level estimation, achieving an accuracy (ACC) score of 0.652. (4) The green perception level in non-road areas within Beijing’s Third Ring Road is 56.8%, which is considered relatively poor. Moreover, the green perception level within the Second Ring Road is even lower than that in the area between the Second and Third Ring roads. This study is expected to provide valuable insights and references for the adjustment and optimization of green perception distribution in Beijing, thereby supporting more informed urban planning and the development of sustainable, human-centered green spaces across the city. Full article
(This article belongs to the Special Issue Remote Sensing in Landscape Quality Assessment)
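
An illustrative regression-only sketch of the reported setup, with the near-infrared band, NDVI, and FVC predicting the GVI via a random forest; the numbers are synthetic and the classification branch (Extra Trees on perception levels) is omitted.

```python
# Random Forest regression of the Green View Index (GVI) from NIR, NDVI, and FVC,
# reporting R2, RMSE, and MAE as in the abstract (synthetic data).
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import mean_absolute_error, mean_squared_error, r2_score
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
nir, ndvi, fvc = rng.uniform(size=(3, 1000))
gvi = 0.6 * ndvi + 0.3 * fvc + 0.05 * rng.normal(size=1000)   # toy relationship

X = np.column_stack([nir, ndvi, fvc])
X_tr, X_te, y_tr, y_te = train_test_split(X, gvi, test_size=0.3, random_state=0)
model = RandomForestRegressor(n_estimators=500, random_state=0).fit(X_tr, y_tr)
pred = model.predict(X_te)
print("R2  :", r2_score(y_te, pred))
print("RMSE:", mean_squared_error(y_te, pred) ** 0.5)
print("MAE :", mean_absolute_error(y_te, pred))
```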

19 pages, 2268 KiB  
Article
Toward the Implementation of Text-Based Web Page Classification and Filtering Solution for Low-Resource Home Routers Using a Machine Learning Approach
by Audronė Janavičiūtė, Agnius Liutkevičius and Nerijus Morkevičius
Electronics 2025, 14(16), 3280; https://doi.org/10.3390/electronics14163280 - 18 Aug 2025
Abstract
Restricting and filtering harmful content on the Internet is a serious problem that is often addressed even at the state and legislative levels. Existing solutions for restricting and filtering online content are usually installed on end-user devices, are easily circumvented, and are difficult to adapt to larger groups of users with different filtering needs. To mitigate this problem, this study proposed a model of a web page classification and filtering solution suitable for use on home routers or other low-resource web page filtering devices. The proposed system combines the constantly updated web page category list approach with machine learning-based text classification methods. Unlike existing web page filtering solutions, such an approach does not require additional software on the client side, is more difficult for ordinary users to circumvent, and can be implemented on common low-resource routers intended for home and organizational use. This study evaluated the feasibility of the proposed solution by creating less resource-demanding implementations of machine learning-based web page classification methods adapted for low-resource home routers, which can classify and filter unwanted Internet pages in real time based on the text of the page. The experimental evaluation of softmax regression, decision tree, random forest, and linear SVM (support vector machine) machine learning methods implemented in the C/C++ programming language was performed on a commercial home router (Asus RT-AC85P) with 256 MB RAM (random access memory) and a MediaTek MT7621AT 880 MHz CPU (central processing unit). The implementation of the linear SVM classifier demonstrated the best accuracy of 0.9198 and required 1.86 s to process a web page. The random forest model was only slightly faster (1.56 s to process a web page), while its accuracy reached only 0.7879. Full article
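
The training side of such a classifier can be prototyped in a few lines before porting the fitted weights to C/C++; the toy corpus and categories below are invented, and only the linear SVM variant (the most accurate in the study) is shown.

```python
# TF-IDF + linear SVM text classifier; the fitted vocabulary and coefficients
# could later be exported as plain arrays for dot-product inference on a router.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.pipeline import make_pipeline
from sklearn.svm import LinearSVC

texts = ["online casino bonus slots", "school homework maths lesson",
         "poker bets jackpot win", "science museum kids exhibition"]
labels = ["blocked", "allowed", "blocked", "allowed"]

clf = make_pipeline(TfidfVectorizer(), LinearSVC())
clf.fit(texts, labels)
print(clf.predict(["free slots jackpot"]))
```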

22 pages, 4931 KiB  
Article
Advanced Cybersecurity Framework for Detecting Fake Data Using Optimized Feature Selection and Stacked Ensemble Learning
by Abrar M. Alajlan
Electronics 2025, 14(16), 3275; https://doi.org/10.3390/electronics14163275 - 18 Aug 2025
Abstract
As smart cities continue to generate vast quantities of data, data integrity is increasingly threatened by instances of fraud. Anomalous or fake data degrade data quality and affect decision-making systems and predictive analytics. Hence, an effective and intelligent fake data detection model was designed by combining an advanced feature selection method with a robust ensemble classification framework. Initially, the raw data are preprocessed through normalization, feature transformation, and noise filtering, which enhances the reliability of the model. Dimensionality issues are mitigated by eliminating redundant features via the proposed Elite Tuning Strategy-Enhanced Polar Bear Optimization algorithm, which simulates the hunting behavior of polar bears to balance exploration and exploitation. The proposed Stacking Ensemble-based Random AdaBoost Quadratic Discriminant model leverages the merits of diverse base learners, including AdaBoost, Quadratic Discriminant Analysis, and Random Forest, to classify the feature subset; their predictions are combined into a meta-feature vector and passed to a meta-classifier, such as a multilayer perceptron or logistic regression model, which predicts the final outcome. This hierarchical architecture provides resilience against noise and improves generalization and prediction accuracy. The experimental results show that the proposed method outperforms existing approaches in terms of accuracy, precision, and latency, yielding values of 98.78%, 98.75%, and 16 ms, respectively, on the UNSW-NB15 dataset. Full article
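
A condensed, hedged sketch of the ensemble layer alone: AdaBoost, QDA, and random forest base learners stacked under a logistic-regression meta-classifier. The optimizer-driven feature selection stage is omitted and the data are synthetic.

```python
# Stacked ensemble: AdaBoost + QDA + Random Forest base learners,
# logistic-regression meta-classifier over their predicted probabilities.
from sklearn.datasets import make_classification
from sklearn.discriminant_analysis import QuadraticDiscriminantAnalysis
from sklearn.ensemble import (AdaBoostClassifier, RandomForestClassifier,
                              StackingClassifier)
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=4000, n_features=30, n_informative=12,
                           random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.25, random_state=0)

stack = StackingClassifier(
    estimators=[("ada", AdaBoostClassifier(random_state=0)),
                ("qda", QuadraticDiscriminantAnalysis()),
                ("rf", RandomForestClassifier(n_estimators=300, random_state=0))],
    final_estimator=LogisticRegression(max_iter=1000),
    stack_method="predict_proba", cv=5)
stack.fit(X_tr, y_tr)
print("accuracy:", accuracy_score(y_te, stack.predict(X_te)))
```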

19 pages, 2569 KiB  
Article
CNN-Random Forest Hybrid Method for Phenology-Based Paddy Rice Mapping Using Sentinel-2 and Landsat-8 Satellite Images
by Dodi Sudiana, Sayyidah Hanifah Putri, Dony Kushardono, Anton Satria Prabuwono, Josaphat Tetuko Sri Sumantyo and Mia Rizkinia
Computers 2025, 14(8), 336; https://doi.org/10.3390/computers14080336 - 18 Aug 2025
Abstract
The agricultural sector plays a vital role in achieving the second Sustainable Development Goal: “Zero Hunger”. To ensure food security, agriculture must remain resilient and productive. In Indonesia, a major rice-producing country, the conversion of agricultural land for non-agricultural uses poses a serious threat to food availability. Accurate and timely mapping of paddy rice is therefore crucial. This study proposes a phenology-based mapping approach using a Convolutional Neural Network-Random Forest (CNN-RF) Hybrid model with multi-temporal Sentinel-2 and Landsat-8 imagery. Image processing and analysis were conducted using the Google Earth Engine platform. Raw spectral bands and four vegetation indices—NDVI, EVI, LSWI, and RGVI—were extracted as input features for classification. The CNN-RF Hybrid classifier demonstrated strong performance, achieving an overall accuracy of 0.950 and a Cohen’s Kappa coefficient of 0.893. These results confirm the effectiveness of the proposed method for mapping paddy rice in Indramayu Regency, West Java, using medium-resolution optical remote sensing data. The integration of phenological characteristics and deep learning significantly enhances classification accuracy. This research supports efforts to monitor and preserve paddy rice cultivation areas amid increasing land use pressures, contributing to national food security and sustainable agricultural practices. Full article
(This article belongs to the Special Issue Machine Learning Applications in Pattern Recognition)
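
A simplified stand-in for the classification stage, with hand-computed vegetation indices feeding a random forest; the paper's CNN feature extractor and RGVI index are not reproduced here, and the reflectance values and labelling rule are synthetic.

```python
# Vegetation indices (NDVI, EVI, LSWI) from reflectance bands -> Random Forest.
# The paper couples these features with a CNN extractor; this sketch keeps only
# the index + RF part for brevity.
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)
blue = rng.uniform(0.02, 0.12, 5000)
red = rng.uniform(0.02, 0.25, 5000)
nir = rng.uniform(0.10, 0.60, 5000)
swir = rng.uniform(0.05, 0.40, 5000)

ndvi = (nir - red) / (nir + red)
evi = 2.5 * (nir - red) / (nir + 6 * red - 7.5 * blue + 1)
lswi = (nir - swir) / (nir + swir)

X = np.column_stack([blue, red, nir, swir, ndvi, evi, lswi])
y = ((ndvi > 0.4) & (lswi > 0.0)).astype(int)     # toy "paddy rice" labelling rule
rf = RandomForestClassifier(n_estimators=300, random_state=0)
print("5-fold accuracy:", cross_val_score(rf, X, y, cv=5).mean())
```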

22 pages, 2087 KiB  
Article
Explainable AI-Based Feature Selection Approaches for Raman Spectroscopy
by Nicola Rossberg, Rekha Gautam, Katarzyna Komolibus, Barry O’Sullivan and Andrea Visentin
Diagnostics 2025, 15(16), 2063; https://doi.org/10.3390/diagnostics15162063 - 18 Aug 2025
Abstract
Background: Raman Spectroscopy is a non-invasive technique capable of characterising tissue constituents and detecting conditions such as cancer with high accuracy. Machine learning techniques can automate this task and discover relevant data patterns. However, the high-dimensional, multicollinear nature of Raman data makes their deployment and explainability challenging. A model’s transparency and ability to explain decision pathways have become crucial for medical integration. Consequently, an effective method of feature-reduction while minimising information loss is sought. Methods: Two new feature selection methods for Raman spectroscopy are introduced. These methods are based on explainable deep learning approaches, considering Convolutional Neural Networks and Transformers. Their features are extracted using GradCam and attention scores, respectively. The performance of the extracted features is compared to established feature selection approaches across four classifiers and three datasets. Results: We compared the proposed method against established feature selection approaches over three real-world datasets and different compression levels. Comparable accuracy levels were obtained using only 10% of features. Model-based approaches are the most accurate. Using Convolutional Neural Networks and Random Forest-assigned feature importance performs best when maintaining between 5–20% of features, while LinearSVC with L1 penalisation leads to higher accuracy when selecting only 1% of them. The proposed Convolutional Neural Networks-based GradCam approach has the highest average accuracy. Conclusions: No approach is found to perform best in all scenarios, suggesting that multiple alternatives should be assessed in each application. Full article
(This article belongs to the Section Machine Learning and Artificial Intelligence in Diagnostics)
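
Two of the baseline selectors mentioned above (random forest importances and an L1-penalised LinearSVC) can be compared at a fixed 10% feature budget as sketched below on synthetic spectra; the GradCAM- and attention-based selectors proposed in the paper are not reproduced.

```python
# Compare feature subsets chosen by Random Forest importance vs. L1 LinearSVC
# weights, each evaluated with a downstream classifier at a 10% feature budget.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.svm import LinearSVC

X, y = make_classification(n_samples=300, n_features=1000, n_informative=30,
                           random_state=0)          # stand-in Raman spectra
budget = X.shape[1] // 10                           # keep 10% of the features

rf = RandomForestClassifier(n_estimators=300, random_state=0).fit(X, y)
rf_idx = np.argsort(rf.feature_importances_)[::-1][:budget]

svc = LinearSVC(penalty="l1", dual=False, C=0.1).fit(X, y)
l1_idx = np.argsort(np.abs(svc.coef_).ravel())[::-1][:budget]

for name, idx in [("RF importance", rf_idx), ("L1 LinearSVC", l1_idx)]:
    score = cross_val_score(LogisticRegression(max_iter=2000), X[:, idx], y, cv=5).mean()
    print(f"{name}: accuracy with {budget} features = {score:.3f}")
```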

17 pages, 6221 KiB  
Article
Machine Learning-Based Flood Risk Assessment in Urban Watershed: Mapping Flood Susceptibility in Charlotte, North Carolina
by Sujan Shrestha, Dewasis Dahal, Nishan Bhattarai, Sunil Regmi, Roshan Sewa and Ajay Kalra
Geographies 2025, 5(3), 43; https://doi.org/10.3390/geographies5030043 - 18 Aug 2025
Abstract
Flood impacts are intensifying due to the increasing frequency and severity of factors such as severe weather events, climate change, and unplanned urbanization. This study focuses on Briar Creek in Charlotte, North Carolina, an area historically affected by flooding. Three machine learning algorithms —bagging (random forest), extreme gradient boosting (XGBoost), and logistic regression—were used to develop a flood susceptibility model that incorporates topographical, hydrological, and meteorological variables. Key predictors included slope, aspect, curvature, flow velocity, flow concentration, discharge, and 8 years of rainfall data. A flood inventory of 750 data points was compiled from historic flood records. The dataset was divided into training (70%) and testing (30%) subsets, and model performance was evaluated using accuracy metrics, confusion matrices, and classification reports. The results indicate that logistic regression outperformed both XGBoost and bagging in terms of predictive accuracy. According to the logistic regression model, the study area was classified into five flood risk zones: 5.55% as very high risk, 8.66% as high risk, 12.04% as moderate risk, 21.56% as low risk, and 52.20% as very low risk. The resulting flood susceptibility map constitutes a valuable tool for emergency preparedness and infrastructure planning in high-risk zones. Full article
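
A compact sketch of the workflow under stated assumptions: synthetic predictors, a 70/30 split, and scikit-learn's GradientBoostingClassifier in place of XGBoost to stay dependency-free; the quantile-based zoning at the end is one plausible way to derive five risk classes from predicted probabilities.

```python
# Three classifiers on a 70/30 split, then map predicted flood probabilities
# from the logistic model into five susceptibility zones by quantile.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier, RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=750, n_features=7, n_informative=5,
                           random_state=0)   # slope, aspect, curvature, flow, rainfall...
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)

models = {"logistic regression": LogisticRegression(max_iter=1000),
          "bagging (random forest)": RandomForestClassifier(n_estimators=300, random_state=0),
          "gradient boosting": GradientBoostingClassifier(random_state=0)}
for name, m in models.items():
    print(name, accuracy_score(y_te, m.fit(X_tr, y_tr).predict(X_te)))

proba = models["logistic regression"].predict_proba(X_te)[:, 1]
zones = np.digitize(proba, np.quantile(proba, [0.2, 0.4, 0.6, 0.8]))
print("points per zone (very low .. very high):", np.bincount(zones, minlength=5))
```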

22 pages, 5884 KiB  
Article
From Shadows to Signatures: Interpreting Bypass Diode Faults in PV Modules Under Partial Shading Through Data-Driven Models
by Hatice Gül Sezgin-Ugranlı
Electronics 2025, 14(16), 3270; https://doi.org/10.3390/electronics14163270 - 18 Aug 2025
Abstract
Bypass diode faults are among the most hard-to-detect but impactful anomalies in photovoltaic (PV) systems, especially under partial shading conditions, where their electrical signatures often resemble those caused by non-critical irradiance variations. This study presents a systematic simulation-based investigation into how different bypass diode fault types—short-circuited, open-circuited, and healthy—affect the electrical behavior of PV strings under diverse irradiance profiles. A high-resolution MATLAB/Simulink model is developed to simulate 27 unique diode fault configurations across multiple shading scenarios, enabling the extraction of key features from resulting I–V curves. These features include global and local maximum power point parameters, open-circuit voltage, and short-circuit current. To address the challenge of feature redundancy and classification ambiguity, a preprocessing step is applied to remove near-duplicate instances and improve model generalization. An artificial neural network (ANN) model is then trained to classify the number of faulty bypass diodes based on these features. Comparative evaluations are conducted with support vector machines and random forests. The results indicate that the ANN achieves the highest test accuracy (93.57%) and average AUC (0.9925), outperforming other classifiers in both robustness and discriminative power. These findings highlight the importance of feature-informed, data-driven approaches for fault detection in PV systems and demonstrate the feasibility of diode fault classification without precise fault localization. Full article
(This article belongs to the Special Issue Renewable Energy Power and Artificial Intelligence)
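
One way to prototype the classifier comparison is sketched below, with toy I-V summary features (global maximum power, open-circuit voltage, short-circuit current) and scikit-learn's MLPClassifier standing in for the ANN; the Simulink-derived dataset is replaced by a crude generator.

```python
# Classify the number of faulty bypass diodes from I-V curve summary features,
# comparing an MLP ("ANN"), an SVM, and a Random Forest.
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score
from sklearn.neural_network import MLPClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

rng = np.random.default_rng(0)
n_faulty = rng.integers(0, 3, size=600)                      # 0, 1, or 2 faulty diodes
p_max = 300 - 60 * n_faulty + rng.normal(scale=10, size=600) # global MPP power (W)
v_oc = 120 - 15 * n_faulty + rng.normal(scale=3, size=600)   # open-circuit voltage (V)
i_sc = 8 + rng.normal(scale=0.2, size=600)                   # short-circuit current (A)
X = np.column_stack([p_max, v_oc, i_sc])

models = {
    "ANN (MLP)": make_pipeline(StandardScaler(),
                               MLPClassifier(hidden_layer_sizes=(32,),
                                             max_iter=2000, random_state=0)),
    "SVM": make_pipeline(StandardScaler(), SVC()),
    "Random Forest": RandomForestClassifier(n_estimators=300, random_state=0)}
for name, m in models.items():
    print(name, "accuracy:", cross_val_score(m, X, n_faulty, cv=5).mean())
```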

22 pages, 2887 KiB  
Article
Autoencoder-Assisted Stacked Ensemble Learning for Lymphoma Subtype Classification: A Hybrid Deep Learning and Machine Learning Approach
by Roseline Oluwaseun Ogundokun, Pius Adewale Owolawi, Chunling Tu and Etienne van Wyk
Tomography 2025, 11(8), 91; https://doi.org/10.3390/tomography11080091 - 18 Aug 2025
Abstract
Background: Accurate subtype identification of lymphoma cancer is crucial for effective diagnosis and treatment planning. Although standard deep learning algorithms have demonstrated robustness, they are still prone to overfitting and limited generalization, necessitating more reliable and robust methods. Objectives: This study presents an autoencoder-augmented stacked ensemble learning (SEL) framework integrating deep feature extraction (DFE) and ensembles of machine learning classifiers to improve lymphoma subtype identification. Methods: Convolutional autoencoder (CAE) was utilized to obtain high-level feature representations of histopathological images, followed by dimensionality reduction via Principal Component Analysis (PCA). Various models were utilized for classifying extracted features, i.e., Random Forest (RF), Support Vector Machine (SVM), Multi-Layer Perceptron (MLP), AdaBoost, and Extra Trees classifiers. A Gradient Boosting Machine (GBM) meta-classifier was utilized in an SEL approach to further fine-tune final predictions. Results: All the models were tested using accuracy, area under the curve (AUC), and Average Precision (AP) metrics. The stacked ensemble classifier performed better than all the individual models with a 99.04% accuracy, 0.9998 AUC, and 0.9996 AP, far exceeding what regular deep learning (DL) methods would achieve. Of standalone classifiers, MLP (97.71% accuracy, 0.9986 AUC, 0.9973 AP) and Random Forest (96.71% accuracy, 0.9977 AUC, 0.9953 AP) provided the best prediction performance, while AdaBoost was the poorest performer (68.25% accuracy, 0.8194 AUC, 0.6424 AP). PCA and t-SNE plots confirmed that DFE effectively enhances class discrimination. Conclusion: This study demonstrates a highly accurate and reliable approach to lymphoma classification by using autoencoder-assisted ensemble learning, reducing the misclassification rate and significantly enhancing the accuracy of diagnosis. AI-based models are designed to assist pathologists by providing interpretable outputs such as class probabilities and visualizations (e.g., Grad-CAM), enabling them to understand and validate predictions in the diagnostic workflow. Future studies should enhance computational efficacy and conduct multi-centre validation studies to confirm the model’s generalizability on extensive collections of histopathological datasets. Full article
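
A condensed sketch of the classification stage only, with PCA-reduced synthetic features standing in for the convolutional-autoencoder embeddings; the base-learner line-up and the gradient-boosting meta-learner follow the abstract, but every hyperparameter is a placeholder.

```python
# Stacked ensemble (RF, SVM, MLP, AdaBoost, Extra Trees) with a Gradient Boosting
# meta-classifier on PCA-reduced features (stand-ins for autoencoder embeddings).
from sklearn.datasets import make_classification
from sklearn.decomposition import PCA
from sklearn.ensemble import (AdaBoostClassifier, ExtraTreesClassifier,
                              GradientBoostingClassifier, RandomForestClassifier,
                              StackingClassifier)
from sklearn.neural_network import MLPClassifier
from sklearn.svm import SVC

X, y = make_classification(n_samples=900, n_features=256, n_informative=40,
                           n_classes=3, random_state=0)   # toy lymphoma subtypes
Z = PCA(n_components=50, random_state=0).fit_transform(X)

stack = StackingClassifier(
    estimators=[("rf", RandomForestClassifier(n_estimators=200, random_state=0)),
                ("svm", SVC(probability=True, random_state=0)),
                ("mlp", MLPClassifier(max_iter=1500, random_state=0)),
                ("ada", AdaBoostClassifier(random_state=0)),
                ("et", ExtraTreesClassifier(n_estimators=200, random_state=0))],
    final_estimator=GradientBoostingClassifier(random_state=0), cv=3)
stack.fit(Z, y)
print("training accuracy (illustrative only):", stack.score(Z, y))
```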

23 pages, 5632 KiB  
Article
Classification of Rockburst Intensity Grades: A Method Integrating k-Medoids-SMOTE and BSLO-RF
by Qinzheng Wu, Bing Dai, Danli Li, Hanwen Jia and Penggang Li
Appl. Sci. 2025, 15(16), 9045; https://doi.org/10.3390/app15169045 - 16 Aug 2025
Abstract
Precise forecasting of rockburst intensity categories is vital to safeguarding operational safety and refining design protocols in deep underground engineering. This study proposes an intelligent forecasting framework through the integration of k-medoids-SMOTE and the BSLO-optimized Random Forest (BSLO-RF) algorithm. A curated dataset encompassing 351 rockburst instances, stratified into four intensity grades, was compiled via systematic literature synthesis. To mitigate data imbalance and outlier interference, z-score normalization and k-medoids-SMOTE oversampling were implemented, with t-SNE visualization confirming improved inter-class distinguishability. Notably, the BSLO algorithm was utilized for hyperparameter tuning of the Random Forest model, thereby strengthening its global search and local refinement capabilities. Comparative analyses revealed that the optimized BSLO-RF framework outperformed conventional machine learning methods (e.g., BSLO-SVM, BSLO-BP), achieving an average prediction accuracy of 89.16% on the balanced dataset—accompanied by a recall of 87.5% and F1-score of 0.88. It exhibited superior performance in predicting extreme grades: 93.3% accuracy for Level I (no rockburst) and 87.9% for Level IV (severe rockburst), exceeding BSLO-SVM (75.8% for Level IV) and BSLO-BP (72.7% for Level IV). Field validation via the Zhongnanshan Tunnel project further corroborated its reliability, yielding an 80% prediction accuracy (four out of five cases correctly classified) and verifying its adaptability to complex geological settings. This research introduces a robust intelligent classification approach for rockburst intensity, offering actionable insights for risk assessment and mitigation in deep mining and tunneling initiatives. Full article
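
A rough analogue built from standard components (requires scikit-learn and imbalanced-learn): plain SMOTE and RandomizedSearchCV stand in for the paper's k-medoids-SMOTE and BSLO optimizer, with z-score normalization and a four-class random forest on synthetic data.

```python
# z-score scaling -> SMOTE rebalancing of four intensity grades -> tuned Random Forest.
# (Plain SMOTE and random search replace the paper's k-medoids-SMOTE and BSLO.)
from imblearn.over_sampling import SMOTE
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import RandomizedSearchCV, train_test_split
from sklearn.preprocessing import StandardScaler

X, y = make_classification(n_samples=351, n_features=7, n_informative=5,
                           n_classes=4, weights=[0.1, 0.2, 0.3, 0.4],
                           random_state=0)
X = StandardScaler().fit_transform(X)                 # z-score normalization
X_tr, X_te, y_tr, y_te = train_test_split(X, y, stratify=y, random_state=0)
X_bal, y_bal = SMOTE(random_state=0).fit_resample(X_tr, y_tr)

search = RandomizedSearchCV(
    RandomForestClassifier(random_state=0),
    param_distributions={"n_estimators": [200, 400, 800],
                         "max_depth": [None, 6, 12],
                         "min_samples_leaf": [1, 2, 4]},
    n_iter=10, cv=5, random_state=0)
search.fit(X_bal, y_bal)
print("test accuracy:", search.best_estimator_.score(X_te, y_te))
```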
