Search Results (77)

Search Parameters:
Keywords = high-dimensional imbalanced data

23 pages, 1296 KB  
Article
Sparse Regularized Autoencoders-Based Radiomics Data Augmentation for Improved EGFR Mutation Prediction in NSCLC
by Muhammad Asif Munir, Reehan Ali Shah, Urooj Waheed, Muhammad Aqeel Aslam, Zeeshan Rashid, Mohammed Aman, Muhammad I. Masud and Zeeshan Ahmad Arfeen
Future Internet 2025, 17(11), 495; https://doi.org/10.3390/fi17110495 - 29 Oct 2025
Abstract
Lung cancer (LC) remains a leading cause of cancer mortality worldwide, where accurate and early identification of gene mutations such as epidermal growth factor receptor (EGFR) is critical for precision treatment. However, machine learning-based radiomics approaches often face challenges due to the small and imbalanced nature of the datasets. This study proposes a comprehensive framework based on Generic Sparse Regularized Autoencoders with Kullback–Leibler divergence (GSRA-KL) to generate high-quality synthetic radiomics data and overcome these limitations. A systematic approach generated 63 synthetic radiomics datasets by tuning a novel kl_weight regularization hyperparameter across three hidden-layer sizes, optimized using Optuna for computational efficiency. A rigorous assessment was conducted to evaluate the impact of hyperparameter tuning across 63 synthetic datasets, with a focus on the EGFR gene mutation. This evaluation utilized resemblance-dimension scores (RDS), novel utility-dimension scores (UDS), and t-SNE visualizations to ensure the validation of data quality, revealing that GSRA-KL achieves excellent performance (RDS > 0.45, UDS > 0.7), especially when class distribution is balanced, while remaining competitive with the Tabular Variational Autoencoder (TVAE). Additionally, a comprehensive statistical correlation analysis demonstrated strong and significant monotonic relationships among resemblance-based performance metrics up to moderate scaling (≤1.0*), confirming the robustness and stability of inter-metric associations under varying configurations. Complementary computational cost evaluation further indicated that moderate kl_weight values yield an optimal balance between reconstruction accuracy and resource utilization, with Spearman correlations revealing improved reconstruction quality (MSE ρ=0.78, p<0.001) at reduced computational overhead. 
The ablation-style analysis confirmed that including the KL divergence term meaningfully enhances the generative capacity of GSRA-KL over its baseline counterpart. Furthermore, the GSRA-KL framework achieved substantial improvements in computational efficiency compared to prior PSO-based optimization methods, resulting in reduced memory usage and training time. Overall, GSRA-KL represents an incremental yet practical advancement for augmenting small and imbalanced high-dimensional radiomics datasets, showing promise for improved mutation prediction and downstream precision oncology studies. Full article
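
The KL-divergence sparsity term that GSRA-KL tunes through its kl_weight hyperparameter can be illustrated with a minimal sketch. This is a generic formulation of a KL sparsity penalty for an autoencoder, not the authors' implementation; the function and parameter names are our own:

```python
import math

def kl_sparsity_penalty(rho, rho_hat):
    """KL divergence between a target activation rate rho and the
    observed mean activation rho_hat of each hidden unit."""
    total = 0.0
    for r in rho_hat:
        total += rho * math.log(rho / r) + (1 - rho) * math.log((1 - rho) / (1 - r))
    return total

def sparse_ae_loss(x, x_recon, rho_hat, rho=0.05, kl_weight=0.1):
    """Reconstruction MSE plus the weighted KL sparsity term; raising
    kl_weight trades reconstruction accuracy for sparser codes."""
    mse = sum((a - b) ** 2 for a, b in zip(x, x_recon)) / len(x)
    return mse + kl_weight * kl_sparsity_penalty(rho, rho_hat)
```

The penalty is zero when every hidden unit fires at exactly the target rate and grows as activations drift away from it, which is what makes kl_weight a natural tuning knob.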

20 pages, 7276 KB  
Article
Semantic Segmentation of Coral Reefs Using Convolutional Neural Networks: A Case Study in Kiritimati, Kiribati
by Dominica E. Harrison, Gregory P. Asner, Nicholas R. Vaughn, Calder E. Guimond and Julia K. Baum
Remote Sens. 2025, 17(21), 3529; https://doi.org/10.3390/rs17213529 - 24 Oct 2025
Abstract
Habitat complexity plays a critical role in coral reef ecosystems by enhancing habitat availability, increasing ecological resilience, and offering coastal protection. Structure-from-motion (SfM) photogrammetry has become a standard approach for quantifying habitat complexity in reef monitoring programs. However, a major bottleneck remains in the two-dimensional (2D) classification of benthic cover in three-dimensional (3D) models, where experts are required to manually annotate individual colonies and identify coral species or taxonomic groups. With recent advances in deep learning and computer vision, automated classification of benthic habitats is possible. While some semi-automated tools exist, they are often limited in scope or do not provide semantic segmentation. In this investigation, we trained a convolutional neural network with the ResNet101 architecture on three years (2015, 2017, and 2019) of human-annotated 2D orthomosaics from Kiritimati, Kiribati. Our model accuracy ranged from 71% to 95%, with an overall accuracy of 84% and a mean intersection over union (mIoU) of 0.82, despite highly imbalanced training data, and it demonstrated successful generalizability when applied to new, untrained 2023 plots. Successful automation depends on training data that captures local ecological variation. As coral monitoring efforts move toward standardized workflows, locally developed models will be key to achieving fully automated, high-resolution classification of benthic communities across diverse reef environments. Full article
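
The mean intersection-over-union metric reported above is computed per class and then averaged over classes. A minimal sketch (our own, operating on flattened label sequences rather than 2D masks):

```python
def iou_per_class(pred, truth, classes):
    """Intersection-over-union for each class over flat label lists."""
    ious = {}
    for c in classes:
        inter = sum(1 for p, t in zip(pred, truth) if p == c and t == c)
        union = sum(1 for p, t in zip(pred, truth) if p == c or t == c)
        ious[c] = inter / union if union else float("nan")
    return ious

def mean_iou(pred, truth, classes):
    """Average the per-class IoUs, skipping classes absent from both maps."""
    ious = [v for v in iou_per_class(pred, truth, classes).values() if v == v]
    return sum(ious) / len(ious)
```

Because each class contributes equally to the mean regardless of its pixel count, mIoU is a stricter summary than overall accuracy on imbalanced benthic cover.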

35 pages, 3978 KB  
Article
A Dynamic Surrogate-Assisted Hybrid Breeding Algorithm for High-Dimensional Imbalanced Feature Selection
by Yujun Ma, Binjing Liao and Zhiwei Ye
Symmetry 2025, 17(10), 1735; https://doi.org/10.3390/sym17101735 - 14 Oct 2025
Abstract
With the growing complexity of high-dimensional imbalanced datasets in critical fields such as medical diagnosis and bioinformatics, feature selection has become essential to reduce computational costs, alleviate model bias, and improve classification performance. DS-IHBO, a dynamic surrogate-assisted feature selection algorithm integrating relevance-based redundant feature filtering and an improved hybrid breeding algorithm, is presented in this paper. Departing from traditional surrogate-assisted approaches that use static approximations, DS-IHBO employs a dynamic surrogate switching mechanism capable of adapting to diverse data distributions and imbalance ratios through multiple surrogate units built via clustering. It enhances the hybrid breeding algorithm with asymmetric stratified population initialization, adaptive differential operators, and t-distribution mutation strategies to strengthen its global exploration and convergence accuracy. Tests on 12 real-world imbalanced datasets (4–98% imbalance) show that DS-IHBO achieves a 3.48% improvement in accuracy, a 4.80% improvement in F1 score, and an 83.85% reduction in computational time compared with leading methods. These results demonstrate its effectiveness for high-dimensional imbalanced feature selection and strong potential for real-world applications. Full article
(This article belongs to the Section Computer)
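
Relevance-based redundant feature filtering of the kind DS-IHBO applies before its search phase can be sketched generically. This is a simple Pearson-correlation filter of our own devising for illustration; the paper's exact relevance and redundancy criteria may differ:

```python
import math

def pearson(a, b):
    """Pearson correlation between two equal-length numeric sequences."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    va = math.sqrt(sum((x - ma) ** 2 for x in a))
    vb = math.sqrt(sum((y - mb) ** 2 for y in b))
    return cov / (va * vb)

def filter_redundant(features, labels, relevance_min=0.1, redundancy_max=0.9):
    """Keep features sufficiently correlated with the label; among those,
    drop any feature near-duplicated by an earlier kept feature."""
    kept = []
    for j, col in enumerate(features):
        if abs(pearson(col, labels)) < relevance_min:
            continue  # irrelevant to the target
        if any(abs(pearson(col, features[k])) > redundancy_max for k in kept):
            continue  # redundant with an already-kept feature
        kept.append(j)
    return kept
```

Pre-filtering like this shrinks the search space the metaheuristic must explore, which is where much of the reported runtime reduction in such pipelines comes from.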

29 pages, 3803 KB  
Article
Spatio-Temporal Coupling of Carbon Efficiency, Carbon Sink, and High-Quality Development in the Greater Chang-Zhu-Tan Urban Agglomeration: Patterns and Influences
by Yong Guo, Lang Yi, Jianbo Zhao, Guangyu Zhu and Dan Sun
Sustainability 2025, 17(19), 8957; https://doi.org/10.3390/su17198957 - 9 Oct 2025
Cited by 1
Abstract
Under the framework of the “dual carbon” goals, promoting the coordinated development of carbon emission efficiency, carbon sink capacity, and high-quality growth has become a critical issue for regional sustainability. Using panel data from 2006 to 2021, this study systematically investigates the three-dimensional coupling coordination among carbon emission efficiency, carbon sink capacity, and high-quality development in the Greater Chang-Zhu-Tan urban agglomeration. The spatiotemporal evolution, spatial correlation characteristics, and influencing factors of the coupling coordination were also explored. The results indicate that the coupling coordination system exhibits an evolutionary trend of overall stability with localized differentiation. The overall coupling degree remains in the “running-in” stage, while the coordination level is still in a marginally coordinated state. Spatially, the pattern has shifted from “northern leadership” to “multi-polar support,” with Yueyang achieving intermediate coordination, four cities including Changde reaching primary coordination, and three cities including Loudi remaining imbalanced. Spatial correlation has weakened from significant to insignificant, with Xiangtan showing a “low–low” cluster and Hengyang displaying a “high–low” cluster. The evolution of hot and cold spots has moved from marked differentiation to a more balanced distribution, as reflected by the disappearance of cold spots. The empirical analysis confirms a three-dimensional coupling mechanism: ecologically rich regions attain high coordination through carbon sink synergies; economically advanced areas achieve decoupling through innovation-driven development; while traditional industrial cities, despite facing the “green paradox,” demonstrate potential for leapfrog progress through transformation. Among the influencing factors, industrial structure upgrading emerged as the primary driver of spatial differentiation, though with a negative impact. 
Government support also exhibited a negative effect, whereas the interaction between environmental regulation and both government support and economic development was found to be significant. Full article

25 pages, 666 KB  
Article
Continual Learning for Intrusion Detection Under Evolving Network Threats
by Chaoqun Guo, Xihan Li, Jubao Cheng, Shunjie Yang and Huiquan Gong
Future Internet 2025, 17(10), 456; https://doi.org/10.3390/fi17100456 - 4 Oct 2025
Abstract
In the face of ever-evolving cyber threats, modern intrusion detection systems (IDS) must achieve long-term adaptability without sacrificing performance on previously encountered attacks. Traditional IDS approaches often rely on static training assumptions, making them prone to forgetting old patterns, underperforming in label-scarce conditions, and struggling with imbalanced class distributions as new attacks emerge. To overcome these limitations, we present a continual learning framework tailored for adaptive intrusion detection. Unlike prior methods, our approach is designed to operate under real-world network conditions characterized by high-dimensional, sparse traffic data and task-agnostic learning sequences. The framework combines three core components: a clustering-based memory strategy that selectively retains informative historical samples using DP-Means; multi-level knowledge distillation that aligns current and previous model states at output and intermediate feature levels; and a meta-learning-driven class reweighting mechanism that dynamically adjusts to shifting attack distributions. Empirical evaluations on benchmark intrusion detection datasets demonstrate the framework’s ability to maintain high detection accuracy while effectively mitigating forgetting. Notably, it delivers reliable performance in continually changing environments where the availability of labeled data is limited, making it well-suited for real-world cybersecurity systems. Full article
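
The DP-Means procedure behind the clustering-based memory strategy can be sketched as follows. This is a minimal, illustrative implementation (parameter names are ours): unlike k-means, the number of clusters is not fixed; a point farther than a penalty distance lam from every centroid spawns a new cluster.

```python
def dp_means(points, lam, iters=10):
    """DP-Means clustering: k grows whenever a point lies farther than
    lam from all existing centroids. points are equal-length tuples."""
    centroids = [points[0]]
    assign = [0] * len(points)
    for _ in range(iters):
        for i, p in enumerate(points):
            dists = [sum((a - b) ** 2 for a, b in zip(p, c)) ** 0.5
                     for c in centroids]
            j = min(range(len(dists)), key=dists.__getitem__)
            if dists[j] > lam:
                centroids.append(p)        # spawn a new cluster at p
                assign[i] = len(centroids) - 1
            else:
                assign[i] = j
        # recompute each centroid as the mean of its members
        for j in range(len(centroids)):
            members = [points[i] for i in range(len(points)) if assign[i] == j]
            if members:
                centroids[j] = tuple(sum(d) / len(members) for d in zip(*members))
    return centroids, assign
```

In a replay-memory setting, one representative per cluster can then be retained, which keeps the buffer small while still covering the distinct traffic patterns seen so far.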

19 pages, 3159 KB  
Article
Optimizing Traffic Accident Severity Prediction with a Stacking Ensemble Framework
by Imad El Mallahi, Jamal Riffi, Hamid Tairi, Nikola S. Nikolov, Mostafa El Mallahi and Mohamed Adnane Mahraz
World Electr. Veh. J. 2025, 16(10), 561; https://doi.org/10.3390/wevj16100561 - 1 Oct 2025
Abstract
Road traffic crashes (RTCs) have emerged as a major global cause of fatalities, with the number of accident-related deaths rising rapidly each day. To mitigate this issue, it is essential to develop early prediction methods that help drivers and riders understand accident statistics relevant to their region. These methods should consider key factors such as speed limits, compliance with traffic signs and signals, pedestrian crossings, right-of-way rules, weather conditions, driver negligence, fatigue, and the impact of excessive speed on RTC occurrences. Raising awareness of these factors enables individuals to exercise greater caution, thereby contributing to accident prevention. A promising approach to improving road traffic accident severity classification is the stacking ensemble method, which leverages multiple machine learning models. This technique addresses challenges such as imbalanced datasets and high-dimensional features by combining predictions from various base models into a meta-model, ultimately enhancing classification accuracy. The ensemble approach exploits the diverse strengths of different models, capturing multiple aspects of the data to improve predictive performance. The effectiveness of stacking depends on the careful selection of base models with complementary strengths, ensuring robust and reliable predictions. Additionally, advanced feature engineering and selection techniques can further optimize the model’s performance. Within the field of artificial intelligence, various machine learning (ML) techniques have been explored to support decision making in tackling RTC-related issues. These methods aim to generate precise reports and insights. However, the stacking method has demonstrated significantly superior performance compared to existing approaches, making it a valuable tool for improving road safety. Full article

11 pages, 919 KB  
Proceeding Paper
Implementation of Predictive Analytics in Healthcare Using Hybrid Deep Learning Models
by Poonam Kargotra, Irfan Ramzan Parray, Arun Malik and Ivana Lucia Kharisma
Eng. Proc. 2025, 107(1), 67; https://doi.org/10.3390/engproc2025107067 - 8 Sep 2025
Abstract
Predictive analytics has emerged as a powerful tool for improving decision-making in healthcare, particularly in disease prediction and patient management. However, conventional architectures may find it difficult to handle key characteristics of healthcare data, such as high dimensionality and unstructured content. This work addresses the shortcomings of traditional ML strategies by fusing deep learning approaches with existing models to improve predictive performance. Specifically, we propose three hybrid models: (1) Random Forest and Neural Networks (RF + NN), (2) XGBoost and Neural Networks (XGBoost + NN), and (3) Autoencoder and Random Forest (Autoencoder + RF). The goal is to compare these models’ ability to predict healthcare outcomes using standard performance metrics: accuracy, precision, recall, and F1-score. An important research gap revealed by the literature review is that most models tend to achieve higher precision at the cost of recall, and vice versa. Our proposed hybrid models combine the strengths of feature selection from traditional algorithms (RF, XGBoost) with the advanced pattern recognition capabilities of Neural Networks (NNs) and autoencoders, aiming for a more balanced predictive performance. The RF + NN model produced the highest accuracy at 96.81%, with a precision of 90.48% and a recall of 70.08%. The XGBoost + NN model, despite a slightly lower accuracy of 96.75%, was better at identifying true positives, with a recall of 73.54%. The Autoencoder + RF model achieved the best precision (91.36%) but the worst recall (66.22%). 
Accordingly, these findings imply that, for the same level of predictive accuracy, the hybrid models are better suited to handling imbalanced problems and provide directions for better healthcare predictive systems in the future. Full article
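
The precision/recall trade-off these comparisons turn on is computed from the positive-class confusion counts; a minimal sketch:

```python
def prf1(y_true, y_pred):
    """Precision, recall, and F1 for the positive class (label 1)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return precision, recall, f1
```

Because F1 is the harmonic mean of precision and recall, a model that buys precision by sacrificing recall (or vice versa) is penalized, which is why F1 is the usual tiebreaker on imbalanced healthcare data.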

20 pages, 2671 KB  
Article
Multivariate Time Series Anomaly Detection Based on Inverted Transformer with Multivariate Memory Gate
by Yuan Ma, Weiwei Liu, Changming Xu, Luyi Bai, Ende Zhang and Junwei Wang
Entropy 2025, 27(9), 939; https://doi.org/10.3390/e27090939 - 8 Sep 2025
Abstract
In the industrial IoT, it is vital to detect anomalies in multivariate time series, yet it faces numerous challenges, including highly imbalanced datasets, complex and high-dimensional data, and large disparities across variables. Despite the recent surge in proposals for deep learning-based methods, these approaches typically treat the multivariate data at each point in time as a unique token, weakening the personalized features and dependency relationships between variables. As a result, their performance tends to degrade under highly imbalanced conditions, and reconstruction-based models are prone to overfitting abnormal patterns, leading to excessive reconstruction of anomalous inputs. In this paper, we propose ITMMG, an inverted Transformer with a multivariate memory gate. ITMMG employs an inverted token embedding strategy and multivariate memory to capture deep dependencies among variables and the normal patterns of individual variables. The experimental results obtained demonstrate that the proposed method exhibits superior performance in terms of detection accuracy and robustness compared with existing baseline methods across a range of standard time series anomaly detection datasets. This significantly reduces the probability of misclassifying anomalous samples during reconstruction. Full article
(This article belongs to the Section Information Theory, Probability and Statistics)
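
Reconstruction-based anomaly detection of the kind discussed above scores each window by how poorly the model reconstructs it and thresholds that score. A generic sketch (ours); the `reconstruct` callable stands in for any trained reconstruction model such as ITMMG:

```python
def anomaly_scores(windows, reconstruct):
    """Score each window by mean squared reconstruction error; a model
    trained on normal data reconstructs anomalies poorly, so a high
    error marks a likely anomaly."""
    scores = []
    for w in windows:
        r = reconstruct(w)
        scores.append(sum((a - b) ** 2 for a, b in zip(w, r)) / len(w))
    return scores

def detect(windows, reconstruct, threshold):
    """Flag windows whose reconstruction error exceeds the threshold."""
    return [s > threshold for s in anomaly_scores(windows, reconstruct)]
```

The memory-gate idea in ITMMG addresses the failure mode this sketch exposes: if the model learns to reconstruct anomalies too well, their scores fall below the threshold, so constraining reconstruction toward stored normal patterns keeps anomalous inputs poorly reconstructed.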

27 pages, 2279 KB  
Article
HQRNN-FD: A Hybrid Quantum Recurrent Neural Network for Fraud Detection
by Yao-Chong Li, Yi-Fan Zhang, Rui-Qing Xu, Ri-Gui Zhou and Yi-Lin Dong
Entropy 2025, 27(9), 906; https://doi.org/10.3390/e27090906 - 27 Aug 2025
Cited by 1
Abstract
Detecting financial fraud is a critical aspect of modern intelligent financial systems. Despite the advances brought by deep learning in predictive accuracy, challenges persist—particularly in capturing complex, high-dimensional nonlinear features. This study introduces a novel hybrid quantum recurrent neural network for fraud detection (HQRNN-FD). The model utilizes variational quantum circuits (VQCs) incorporating angle encoding, data reuploading, and hierarchical entanglement to project transaction features into quantum state spaces, thereby facilitating quantum-enhanced feature extraction. For sequential analysis, the model integrates a recurrent neural network (RNN) with a self-attention mechanism to effectively capture temporal dependencies and uncover latent fraudulent patterns. To mitigate class imbalance, the synthetic minority over-sampling technique (SMOTE) is employed during preprocessing, enhancing both class representation and model generalizability. Experimental evaluations reveal that HQRNN-FD attains an accuracy of 0.972 on publicly available fraud detection datasets, outperforming conventional models by 2.4%. In addition, the framework exhibits robustness against quantum noise and improved predictive performance with increasing qubit numbers, validating its efficacy and scalability for imbalanced financial classification tasks. Full article
(This article belongs to the Special Issue Quantum Computing in the NISQ Era)
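
SMOTE, used here to mitigate class imbalance, generates synthetic minority samples by interpolating between an existing minority sample and one of its nearest minority neighbours. A minimal sketch of the core idea (ours, not the reference imbalanced-learn implementation):

```python
import random

def smote(minority, n_new, k=1, seed=0):
    """Generate n_new synthetic samples: pick a minority point, pick one
    of its k nearest minority neighbours, and interpolate between them."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        x = rng.choice(minority)
        # nearest neighbours of x among the other minority samples
        others = sorted((p for p in minority if p is not x),
                        key=lambda p: sum((a - b) ** 2 for a, b in zip(x, p)))
        nb = rng.choice(others[:k])
        t = rng.random()  # interpolation factor in [0, 1)
        synthetic.append(tuple(a + t * (b - a) for a, b in zip(x, nb)))
    return synthetic
```

Every synthetic point lies on a segment between two real minority points, so SMOTE densifies the minority region rather than simply duplicating samples, which reduces the overfitting that plain oversampling invites.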

31 pages, 2542 KB  
Article
ECR-MobileNet: An Imbalanced Largemouth Bass Parameter Prediction Model with Adaptive Contrastive Regression and Dependency-Graph Pruning
by Hao Peng, Cheng Ouyang, Lin Yang, Jingtao Deng, Mingyu Tan, Yahui Luo, Wenwu Hu, Pin Jiang and Yi Wang
Animals 2025, 15(16), 2443; https://doi.org/10.3390/ani15162443 - 20 Aug 2025
Abstract
The precise, non-destructive monitoring of fish length and weight is a core technology for advancing intelligent aquaculture. However, this field faces dual challenges: traditional contact-based measurements induce stress and yield loss. In addition, existing computer vision methods are hindered by prediction biases from imbalanced data and the deployment bottleneck of balancing high accuracy with model lightweighting. This study aims to overcome these challenges by developing an efficient and robust deep learning framework. We propose ECR-MobileNet, a lightweight framework built on MobileNetV3-Small. It features three key innovations: an efficient channel attention (ECA) module to enhance feature discriminability, an original adaptive multi-scale contrastive regression (AMCR) loss function that extends contrastive learning to multi-dimensional regression for length and weight simultaneously to mitigate data imbalance, and a dependency-graph-based (DepGraph) structured pruning technique that synergistically optimizes model size and performance. On our multi-scene largemouth bass dataset, the pruned ECR-MobileNet-P model comprehensively outperformed 14 mainstream benchmarks. It achieved an R2 of 0.9784 and a root mean square error (RMSE) of 0.4296 cm for length prediction, as well as an R2 of 0.9740 and an RMSE of 0.0202 kg for weight prediction. The model’s parameter count is only 0.52 M, with a computational load of 0.07 giga floating-point operations per second (GFLOPs) and a CPU latency of 10.19 ms, achieving Pareto optimality. This study provides an edge-deployable solution for stress-free biometric monitoring in aquaculture and establishes an innovative methodological paradigm for imbalanced regression and task-oriented model compression. Full article
(This article belongs to the Section Aquatic Animals)

30 pages, 1637 KB  
Article
Life Insurance Fraud Detection: A Data-Driven Approach Utilizing Ensemble Learning, CVAE, and Bi-LSTM
by Markapurapu John Dana Ebinezer and Bondalapu Chaitanya Krishna
Appl. Sci. 2025, 15(16), 8869; https://doi.org/10.3390/app15168869 - 12 Aug 2025
Abstract
Insurance fraud detection is a significant challenge due to increasing fraudulent claims, class imbalance, and the growing complexity of fraudulent behaviour. Traditional machine learning models often struggle to generalize effectively when applied to high-dimensional and imbalanced datasets. This study proposes a data-driven framework for intelligent fraud detection employing three distinct modelling strategies: chaotic variational autoencoders (CVAEs), bidirectional long short-term memory (Bi-LSTM), and a hybrid random forest + Bi-LSTM technique. This study aims to evaluate and compare the effectiveness of generative, sequential, and ensemble-based models in identifying rare fraudulent claims within a created dataset of 4000 life insurance applications containing 83 features. Following extensive preprocessing and model training, the CVAE achieved the highest accuracy (83.75%) but failed to detect many fraudulent cases due to its low recall (3.28%). The Bi-LSTM model outperformed the CVAE in recall (5.98%) and F1-score, effectively capturing temporal dependencies within the data. The hybrid RF + Bi-LSTM model matched Bi-LSTM in recall but showed more stable ROC and precision–recall curves, indicating robustness and interpretability. This hybrid approach balances the strengths of feature-driven and sequential modelling, making it suitable for operational deployment. While Bi-LSTM achieved the best statistical performance, the hybrid model offers enhanced reliability in threshold-sensitive fraud applications. Full article

24 pages, 2667 KB  
Article
Transformer-Driven Fault Detection in Self-Healing Networks: A Novel Attention-Based Framework for Adaptive Network Recovery
by Parul Dubey, Pushkar Dubey and Pitshou N. Bokoro
Mach. Learn. Knowl. Extr. 2025, 7(3), 67; https://doi.org/10.3390/make7030067 - 16 Jul 2025
Abstract
Fault detection and remaining useful life (RUL) prediction are critical tasks in self-healing network (SHN) environments and industrial cyber–physical systems. These domains demand intelligent systems capable of handling dynamic, high-dimensional sensor data. However, existing optimization-based approaches often struggle with imbalanced datasets, noisy signals, and delayed convergence, limiting their effectiveness in real-time applications. This study utilizes two benchmark datasets—EFCD and SFDD—which represent electrical and sensor fault scenarios, respectively. These datasets pose challenges due to class imbalance and complex temporal dependencies. To address this, we propose a novel hybrid framework combining Attention-Augmented Convolutional Neural Networks (AACNN) with transformer encoders, enhanced through Enhanced Ensemble-SMOTE for balancing the minority class. The model captures spatial features and long-range temporal patterns and learns effectively from imbalanced data streams. The novelty lies in the integration of attention mechanisms and adaptive oversampling in a unified fault-prediction architecture. Model evaluation is based on multiple performance metrics, including accuracy, F1-score, MCC, RMSE, and score*. The results show that the proposed model outperforms state-of-the-art approaches, achieving up to 97.14% accuracy and a score* of 0.419, with faster convergence and improved generalization across both datasets. Full article

34 pages, 2216 KB  
Article
An Optimized Transformer–GAN–AE for Intrusion Detection in Edge and IIoT Systems: Experimental Insights from WUSTL-IIoT-2021, EdgeIIoTset, and TON_IoT Datasets
by Ahmad Salehiyan, Pardis Sadatian Moghaddam and Masoud Kaveh
Future Internet 2025, 17(7), 279; https://doi.org/10.3390/fi17070279 - 24 Jun 2025
Cited by 4
Abstract
The rapid expansion of Edge and Industrial Internet of Things (IIoT) systems has intensified the risk and complexity of cyberattacks. Detecting advanced intrusions in these heterogeneous and high-dimensional environments remains challenging. As the IIoT becomes integral to critical infrastructure, ensuring security is crucial to prevent disruptions and data breaches. Traditional IDS approaches often fall short against evolving threats, highlighting the need for intelligent and adaptive solutions. While deep learning (DL) offers strong capabilities for pattern recognition, single-model architectures often lack robustness. Thus, hybrid and optimized DL models are increasingly necessary to improve detection performance and address data imbalance and noise. In this study, we propose an optimized hybrid DL framework that combines a transformer, generative adversarial network (GAN), and autoencoder (AE) components, referred to as Transformer–GAN–AE, for robust intrusion detection in Edge and IIoT environments. To enhance the training and convergence of the GAN component, we integrate an improved chimp optimization algorithm (IChOA) for hyperparameter tuning and feature refinement. The proposed method is evaluated using three recent and comprehensive benchmark datasets, WUSTL-IIoT-2021, EdgeIIoTset, and TON_IoT, widely recognized as standard testbeds for IIoT intrusion detection research. Extensive experiments are conducted to assess the model’s performance compared to several state-of-the-art techniques, including standard GAN, convolutional neural network (CNN), deep belief network (DBN), time-series transformer (TST), bidirectional encoder representations from transformers (BERT), and extreme gradient boosting (XGBoost). Evaluation metrics include accuracy, recall, AUC, and run time. Results demonstrate that the proposed Transformer–GAN–AE framework outperforms all baseline methods, achieving a best accuracy of 98.92%, along with superior recall and AUC values. 
The integration of IChOA enhances GAN stability and accelerates training by optimizing hyperparameters. Together with the transformer for temporal feature extraction and the AE for denoising, the hybrid architecture effectively addresses complex, imbalanced intrusion data. The proposed optimized Transformer–GAN–AE model demonstrates high accuracy and robustness, offering a scalable solution for real-world Edge and IIoT intrusion detection. Full article
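The abstract above describes tuning GAN hyperparameters with a chimp-style population optimizer (IChOA). As a minimal, hedged sketch of that idea: the toy loss surface, the hyperparameter names, and the simplified single-leader update below are all illustrative assumptions, not the paper's IChOA, which uses the full driver/barrier/chaser/attacker role structure on a real Transformer–GAN–AE.

```python
import random

# Toy surrogate for validation loss as a function of two hypothetical GAN
# hyperparameters: log10(learning rate) and latent dimension.
def surrogate_loss(log_lr, latent_dim):
    # Minimum near log_lr = -3.5 (lr ~ 3e-4) and latent_dim = 64.
    return (log_lr + 3.5) ** 2 + ((latent_dim - 64) / 64.0) ** 2

def population_search(pop_size=20, iters=40, seed=0):
    """Minimal population-based search: remember the best individual seen so
    far and pull the rest toward it with shrinking random steps. This mimics
    the exploit/explore balance of chimp-style optimizers without IChOA's
    multiple search roles."""
    rng = random.Random(seed)
    pop = [(rng.uniform(-6, -1), rng.uniform(8, 256)) for _ in range(pop_size)]
    best = min(pop, key=lambda p: surrogate_loss(*p))
    for t in range(iters):
        step = 1.0 - t / iters  # step size decays over iterations
        new_pop = []
        for log_lr, dim in pop:
            log_lr += step * rng.uniform(-1, 1) * (best[0] - log_lr) * 2
            dim += step * rng.uniform(-1, 1) * (best[1] - dim) * 2
            # Clamp both hyperparameters back into their search ranges.
            new_pop.append((min(max(log_lr, -6), -1), min(max(dim, 8), 256)))
        pop = new_pop
        cand = min(pop, key=lambda p: surrogate_loss(*p))
        if surrogate_loss(*cand) < surrogate_loss(*best):
            best = cand  # elitism: the incumbent is only ever replaced by a better point
    return best

best_log_lr, best_dim = population_search()
```

Because the incumbent is replaced only by strictly better candidates, the best-seen loss is non-increasing across iterations, which is the stability property the optimizer contributes to GAN training here.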

22 pages, 6402 KB  
Article
A Study on Airborne Hyperspectral Tree Species Classification Based on the Synergistic Integration of Machine Learning and Deep Learning
by Dabing Yang, Jinxiu Song, Chaohua Huang, Fengxin Yang, Yiming Han and Ruirui Wang
Forests 2025, 16(6), 1032; https://doi.org/10.3390/f16061032 - 19 Jun 2025
Abstract
Against the backdrop of global climate change and increasing ecological pressure, the refined monitoring of forest resources and accurate tree species identification have become essential tasks for sustainable forest management. Hyperspectral remote sensing, with its high spectral resolution, shows great promise in tree species classification. However, traditional methods face limitations in extracting joint spatial–spectral features, particularly in complex forest environments, due to the “curse of dimensionality” and the scarcity of labeled samples. To address these challenges, this study proposes a synergistic classification approach that combines the spatial feature extraction capabilities of deep learning with the generalization advantages of machine learning. Specifically, a 2D convolutional neural network (2DCNN) is integrated with a support vector machine (SVM) classifier to enhance classification accuracy and model robustness under limited sample conditions. Using UAV-based hyperspectral imagery collected from a typical plantation area in Fuzhou City, Jiangxi Province, and ground-truth data for labeling, a highly imbalanced sample split strategy (1:99) is adopted. The 2DCNN is further evaluated in conjunction with six classifiers—CatBoost, decision tree (DT), k-nearest neighbors (KNN), LightGBM, random forest (RF), and SVM—for comparison. The 2DCNN-SVM combination is identified as the optimal model. In the classification of Masson pine, Chinese fir, and eucalyptus, this method achieves an overall accuracy (OA) of 97.56%, average accuracy (AA) of 97.47%, and a Kappa coefficient of 0.9665, significantly outperforming traditional approaches. The results demonstrate that the 2DCNN-SVM model offers superior feature representation and generalization capabilities in high-dimensional, small-sample scenarios, markedly improving tree species classification accuracy in complex forest settings. 
This study validates the model’s potential for application in small-sample forest remote sensing and provides theoretical support and technical guidance for high-precision tree species identification and dynamic forest monitoring. Full article
(This article belongs to the Section Forest Inventory, Modeling and Remote Sensing)
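The highly imbalanced 1:99 train/test split described in the abstract can be sketched as a per-class (stratified) partition. The helper below is a minimal illustration, not the authors' code; the species labels and counts are hypothetical stand-ins for the pixel-level ground truth of the study area.

```python
import random
from collections import defaultdict

def stratified_split(labels, train_frac=0.01, seed=42):
    """Partition sample indices into train/test sets class by class, so each
    class keeps its proportion while only `train_frac` of its samples are
    used for training (the 1:99 small-sample regime). At least one sample
    per class always lands in the training set."""
    by_class = defaultdict(list)
    for idx, lab in enumerate(labels):
        by_class[lab].append(idx)
    rng = random.Random(seed)
    train, test = [], []
    for lab, idxs in by_class.items():
        rng.shuffle(idxs)
        n_train = max(1, round(len(idxs) * train_frac))
        train.extend(idxs[:n_train])
        test.extend(idxs[n_train:])
    return sorted(train), sorted(test)

# Hypothetical label vector: three species with unequal pixel counts.
labels = ["masson_pine"] * 5000 + ["chinese_fir"] * 3000 + ["eucalyptus"] * 2000
train_idx, test_idx = stratified_split(labels)
```

Stratifying per class matters here because a plain random 1% draw could leave a rare species with no training pixels at all.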

36 pages, 1717 KB  
Article
Generative Adversarial and Transformer Network Synergy for Robust Intrusion Detection in IoT Environments
by Pardis Sadatian Moghaddam, Ali Vaziri, Sarvenaz Sadat Khatami, Francisco Hernando-Gallego and Diego Martín
Future Internet 2025, 17(6), 258; https://doi.org/10.3390/fi17060258 - 12 Jun 2025
Cited by 5
Abstract
Intrusion detection in the Internet of Things (IoT) environments is increasingly critical due to the rapid proliferation of connected devices and the growing sophistication of cyber threats. Traditional detection methods often fall short in identifying multi-class attacks, particularly in the presence of high-dimensional and imbalanced IoT traffic. To address these challenges, this paper proposes a novel hybrid intrusion detection framework that integrates transformer networks with generative adversarial networks (GANs), aiming to enhance both detection accuracy and robustness. In the proposed architecture, the transformer component effectively models temporal and contextual dependencies within traffic sequences, while the GAN component generates synthetic data to improve feature diversity and mitigate class imbalance. Additionally, an improved non-dominated sorting biogeography-based optimization (INSBBO) algorithm is employed to fine-tune the hyper-parameters of the hybrid model, further enhancing learning stability and detection performance. The model is trained and evaluated on the CIC-IoT-2023 and TON_IoT datasets, which contain a diverse range of real-world IoT traffic and attack scenarios. Experimental results show that our hybrid framework consistently outperforms baseline methods in both binary and multi-class intrusion detection tasks. The transformer-GAN achieves a multi-class classification accuracy of 99.67%, with an F1-score of 99.61%, and an area under the curve (AUC) of 99.80% on the CIC-IoT-2023 dataset, and achieves 98.84% accuracy, 98.79% F1-score, and 99.12% AUC on the TON_IoT dataset. The superiority of the proposed model was further validated through statistically significant t-test results, lower execution time compared to baselines, and minimal standard deviation across runs, indicating both efficiency and stability. The proposed framework offers a promising approach for enhancing the security and resilience of next-generation IoT systems. 
Full article
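The recall and AUC figures reported in this abstract can be computed directly from detector outputs. The snippet below is a minimal illustration of those two metrics (not the authors' evaluation code); the six flows, their scores, and the 0.5 decision threshold are hypothetical. AUC is computed via the rank-sum (Mann–Whitney) formulation: the probability that a random positive scores above a random negative, counting ties as one half.

```python
def recall(y_true, y_pred, positive=1):
    """Fraction of actual positives (e.g. attack flows) that were flagged."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p == positive)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p != positive)
    return tp / (tp + fn)

def auc(y_true, scores):
    """Binary ROC-AUC as P(score of random positive > score of random
    negative), with ties counted as 0.5."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Hypothetical detector outputs on six flows (1 = attack, 0 = benign).
y_true = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.5, 0.3, 0.1]
y_pred = [1 if s >= 0.5 else 0 for s in scores]
# recall(y_true, y_pred) -> 2/3; auc(y_true, scores) -> 8/9
```

Note that AUC is threshold-free (it depends only on score rankings), whereas recall depends on the chosen decision threshold, which is why both are reported for imbalanced intrusion data.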
