Search Results (1,207)

Search Parameters:
Keywords = imbalanced data

18 pages, 5279 KiB  
Article
Optimization-Incorporated Deep Learning Strategy to Automate L3 Slice Detection and Abdominal Segmentation in Computed Tomography
by Seungheon Chae, Seongwon Chae, Tae Geon Kang, Sung Jin Kim and Ahnryul Choi
Bioengineering 2025, 12(4), 367; https://doi.org/10.3390/bioengineering12040367 (registering DOI) - 31 Mar 2025
Abstract
This study introduces a deep learning-based strategy to automatically detect the L3 slice and segment abdominal tissues from computed tomography (CT) images. Accurate measurement of muscle and fat composition at the L3 level is critical, as it can serve as a prognostic biomarker for cancer diagnosis and treatment. However, current manual approaches are time-consuming and prone to class imbalance, since L3 slices constitute only a small fraction of the entire CT dataset. In this study, we propose an optimization-incorporated strategy that integrates the augmentation ratio and class weight adjustment as correction design variables within deep learning models. In this retrospective study, the CT dataset was privately collected from 150 prostate cancer and bladder cancer patients at the Department of Urology of Gangneung Asan Hospital. A ResNet50 classifier was used to detect the L3 slice, while standard Unet, Swin-Unet, and SegFormer models were employed to segment abdominal tissues. Bayesian optimization determined the optimal augmentation ratios and class weights, mitigating the imbalanced distribution of L3 slices and abdominal tissues. Evaluation of CT data from the 150 prostate and bladder cancer patients showed that the optimized models reduced the slice detection error to approximately 0.68 ± 1.26 slices and achieved a Dice coefficient of up to 0.987 ± 0.001 for abdominal tissue segmentation, improvements over models that did not consider the correction design variables. This study confirms that balancing class distribution and properly tuning model parameters enhances performance. The proposed approach may provide reliable and automated biomarkers for early cancer diagnosis and personalized treatment planning.
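As a reminder of the segmentation metric reported here (not the authors' code), the Dice coefficient measures overlap between a predicted and a reference binary mask; a minimal numpy sketch:

```python
import numpy as np

def dice_coefficient(pred, target, eps=1e-7):
    """Dice similarity between two binary masks (1 = tissue, 0 = background)."""
    pred = np.asarray(pred, dtype=bool)
    target = np.asarray(target, dtype=bool)
    intersection = np.logical_and(pred, target).sum()
    # 2|A∩B| / (|A| + |B|); eps guards against two empty masks
    return (2.0 * intersection + eps) / (pred.sum() + target.sum() + eps)

a = np.array([[1, 1], [0, 0]])
b = np.array([[1, 0], [0, 0]])
# Overlap of 1 pixel over mask sizes 2 and 1 -> Dice = 2*1 / (2+1) = 2/3
score = dice_coefficient(a, b)
```

A Dice of 0.987, as reported above, therefore indicates near-perfect overlap with the reference masks.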

23 pages, 1178 KiB  
Article
A Novel Methodology to Develop Mining Stope Stability Graphs on Imbalanced Datasets Using Probabilistic Approaches
by Lucas de Almeida Gama Paixao, William Pratt Rogers and Erisvaldo Bitencourt de Jesus
Mining 2025, 5(2), 24; https://doi.org/10.3390/mining5020024 (registering DOI) - 30 Mar 2025
Abstract
Predicting and analyzing the stability of underground stopes is critical for ensuring worker safety, reducing dilution, and maintaining operational efficiency in mining. Traditional stability graphs are widely used but often criticized for oversimplifying the stability phenomenon and relying on subjective classifications. Additionally, the imbalanced nature of stope stability datasets poses challenges for traditional machine learning and statistical models, which often bias predictions toward the majority class. This study proposes a novel methodology for developing site-specific stability graphs using probabilistic modeling and machine learning techniques, addressing the limitations of traditional graphs and the challenges of imbalanced datasets. The approach rebalances the dataset using the Synthetic Minority Over-Sampling Technique (SMOTE) and applies permutation-importance feature selection to identify the key features driving instability; these features are then used to construct a two-dimensional stability graph that offers both improved performance and interpretability. The results indicate that the proposed graph outperforms traditional stability graphs, particularly in identifying unstable stopes, even under highly imbalanced data conditions. They also highlight the importance of operational and geometric variables in stope stability, providing actionable insights for mine planners. This study thus demonstrates the potential of integrating modern probabilistic techniques into mining geotechnics, paving the way for more accurate and adaptive stability assessment tools. Future work includes extending the methodology to multi-mine datasets and exploring dynamic stability graph frameworks.
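SMOTE, the rebalancing technique used in this study, synthesizes new minority-class samples by interpolating between a minority sample and one of its nearest minority-class neighbours. The paper presumably used a library implementation (e.g., imbalanced-learn); this numpy sketch only illustrates the interpolation idea:

```python
import numpy as np

def smote_oversample(X_min, n_new, k=3, rng=None):
    """Generate n_new synthetic minority samples by interpolating a random
    minority sample toward one of its k nearest minority-class neighbours."""
    rng = np.random.default_rng(rng)
    X_min = np.asarray(X_min, dtype=float)
    synthetic = []
    for _ in range(n_new):
        i = rng.integers(len(X_min))
        d = np.linalg.norm(X_min - X_min[i], axis=1)  # distances to all others
        d[i] = np.inf                                  # exclude the sample itself
        j = rng.choice(np.argsort(d)[:k])              # one of the k nearest
        gap = rng.random()                             # interpolation factor in [0, 1)
        synthetic.append(X_min[i] + gap * (X_min[j] - X_min[i]))
    return np.array(synthetic)

X_min = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
X_new = smote_oversample(X_min, n_new=6, rng=0)
# Each synthetic point lies on a segment between two original minority points
```

Because each synthetic point is a convex combination of two real minority samples, the oversampled set stays inside the minority class's feature-space hull.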

20 pages, 1969 KiB  
Article
SlantNet: A Lightweight Neural Network for Thermal Fault Classification in Solar PV Systems
by Hrach Ayunts, Sos Agaian and Artyom Grigoryan
Electronics 2025, 14(7), 1388; https://doi.org/10.3390/electronics14071388 - 30 Mar 2025
Abstract
The rapid growth of solar photovoltaic (PV) installations worldwide has increased the need for effective monitoring and maintenance of these vital renewable energy assets. PV systems are crucial in reducing greenhouse gas emissions and diversifying electricity generation; however, they often experience faults and damage during manufacturing or operation, significantly impacting their performance. While thermal infrared imaging provides a promising non-invasive method for detecting common defects such as hotspots, cracks, and bypass diode failures, current deep learning approaches for fault classification generally rely on computationally intensive architectures or closed-source solutions, constraining their practical use in real-time situations involving low-resolution thermal data. To tackle these challenges, we introduce SlantNet, a lightweight neural network crafted to classify thermal PV defects efficiently and accurately. At its core, SlantNet incorporates an innovative Slant Convolution (SC) layer that utilizes the slant transformation to enhance directional feature extraction and capture the subtle thermal gradient variations essential for fault detection. We complement this architectural advancement with a thermal-specific image enhancement augmentation strategy that employs adaptive contrast adjustments to bolster model robustness under the noisy and class-imbalanced conditions typically encountered in field applications. Extensive experimental validation on a comprehensive solar panel defect detection benchmark dataset showcases SlantNet’s exceptional performance: our method achieves 95.1% classification accuracy while reducing computational overhead by approximately 60% compared to leading models.
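The abstract does not detail SlantNet's "adaptive contrast adjustments"; a common generic form of such an augmentation is a percentile-based contrast stretch, sketched here purely as an illustration (the percentile choices are assumptions, not the paper's):

```python
import numpy as np

def contrast_stretch(img, low_pct=2.0, high_pct=98.0):
    """Rescale intensities so the chosen percentiles map to [0, 1].
    Robust to outlier pixels, unlike min-max normalization."""
    lo, hi = np.percentile(img, [low_pct, high_pct])
    stretched = (img - lo) / max(hi - lo, 1e-7)
    return np.clip(stretched, 0.0, 1.0)

# A low-contrast "thermal frame" spanning only [0.2, 0.4]
img = np.linspace(0.2, 0.4, 100).reshape(10, 10)
aug = contrast_stretch(img)  # now spans the full [0, 1] range
```

Stretching low-contrast thermal frames this way makes subtle gradient variations, such as those around hotspots, easier for a network to pick up.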

20 pages, 4435 KiB  
Article
OMAL: A Multi-Label Active Learning Approach from Data Streams
by Qiao Fang, Chen Xiang, Jicong Duan, Benallal Soufiyan, Changbin Shao, Xibei Yang, Sen Xu and Hualong Yu
Entropy 2025, 27(4), 363; https://doi.org/10.3390/e27040363 - 29 Mar 2025
Abstract
With the rapid growth of digital computing, communication, and storage devices applied in various real-world scenarios, more and more data have been collected and stored to drive the development of machine learning techniques. The data that emerge in real-world applications also tend to become more complex. In this study, we consider a complex data type, multi-label data, acquired under a time constraint in a dynamic online scenario. Under such conditions, constructing a learning model faces two challenges: it must dynamically adapt to variations in label correlations and imbalanced data distributions, and it incurs higher labeling costs. To address these two issues, we propose a novel online multi-label active learning (OMAL) algorithm whose active query strategy simultaneously considers uncertainty (the average entropy of prediction probabilities) and diversity (the average cosine distance between feature vectors). Specifically, to capture label correlations, we use a classifier chain (CC) as the multi-label learning model and design a label co-occurrence ranking strategy to arrange the label sequence in the CC. To adapt to the naturally imbalanced distribution of multi-label data, we select the weighted extreme learning machine (WELM) as the basic binary classifier in the CC. Experimental results on ten benchmark multi-label datasets transformed into streams show that the proposed method is superior to several popular static multi-label active learning algorithms in terms of both the Macro-F1 and Micro-F1 metrics, indicating its specific adaptation to the dynamic data stream environment.
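The two query criteria named in the abstract, average entropy of prediction probabilities and average cosine distance to already-labeled samples, can be combined into a single query score. This sketch assumes an equal weighting `alpha`, which is not specified by the paper:

```python
import numpy as np

def entropy_uncertainty(probs, eps=1e-12):
    """Average per-label binary entropy of predicted probabilities (bits)."""
    p = np.clip(np.asarray(probs, dtype=float), eps, 1 - eps)
    return float(np.mean(-p * np.log2(p) - (1 - p) * np.log2(1 - p)))

def cosine_diversity(x, labeled):
    """Average cosine distance between candidate x and labeled feature vectors."""
    x = np.asarray(x, dtype=float)
    labeled = np.asarray(labeled, dtype=float)
    sims = labeled @ x / (np.linalg.norm(labeled, axis=1) * np.linalg.norm(x))
    return float(np.mean(1.0 - sims))

def query_score(probs, x, labeled, alpha=0.5):
    # alpha balances uncertainty vs. diversity (an assumption, not from the paper)
    return alpha * entropy_uncertainty(probs) + (1 - alpha) * cosine_diversity(x, labeled)

# A maximally uncertain candidate (all label probs 0.5) that is also orthogonal
# to every labeled vector receives the maximum score of 1.0
s = query_score([0.5, 0.5, 0.5], np.array([1.0, 0.0]), np.array([[0.0, 1.0]]))
```

Candidates with the highest scores would be the ones submitted to the oracle for labeling.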
(This article belongs to the Section Signal and Data Analysis)

24 pages, 24485 KiB  
Article
Impact of Image Preprocessing and Crack Type Distribution on YOLOv8-Based Road Crack Detection
by Luxin Fan, Saihong Tang, Mohd Khairol Anuar b. Mohd Ariffin, Mohd Idris Shah Ismail and Xinming Wang
Sensors 2025, 25(7), 2180; https://doi.org/10.3390/s25072180 - 29 Mar 2025
Abstract
Road crack detection is crucial for ensuring pavement safety and optimizing maintenance strategies. This study investigated the impact of image preprocessing methods and dataset balance on the performance of YOLOv8s-based crack detection. Four datasets (CFD, Crack500, CrackTree200, and CrackVariety) were evaluated using three image formats: RGB, grayscale (five conversion methods), and binarized images. The experimental results indicate that RGB images consistently achieved the highest detection accuracy, confirming that preserving color-based contrast and texture information benefits YOLOv8’s feature extraction. Grayscale conversion showed dataset-dependent variations, with different methods performing best on different datasets, while binarization generally degraded detection accuracy, except in the balanced CrackVariety dataset. Furthermore, this study highlights that dataset balance significantly impacts model performance, as imbalanced datasets (CFD, Crack500, CrackTree200) led to biased predictions favoring dominant crack classes. In contrast, CrackVariety’s balanced distribution resulted in more stable and generalized detection. These findings suggest that dataset balance has a greater influence on detection accuracy than preprocessing methods. Future research should focus on data augmentation and resampling strategies to mitigate class imbalance, as well as explore multi-modal fusion approaches for further performance enhancements.
(This article belongs to the Section Fault Diagnosis & Sensors)

18 pages, 1501 KiB  
Article
Tree Species Classification at the Pixel Level Using Deep Learning and Multispectral Time Series in an Imbalanced Context
by Florian Mouret, David Morin, Milena Planells and Cécile Vincent-Barbaroux
Remote Sens. 2025, 17(7), 1190; https://doi.org/10.3390/rs17071190 - 27 Mar 2025
Abstract
This paper investigates tree species classification using the Sentinel-2 multispectral satellite image time series (SITS). Despite its importance for many applications and users, such mapping is often unavailable or outdated. The value of using SITS to classify tree species on a large scale has been demonstrated in numerous studies. However, many methods proposed in the literature still rely on a standard machine learning algorithm, usually the random forest (RF) algorithm. Our analysis shows that the use of deep learning (DL) models can lead to a significant improvement in classification results, especially in an imbalanced context where the RF algorithm tends to predict the majority class. In our case study in central France with 10 tree species, we obtained an overall accuracy (OA) of around 95% and an F1-macro score of around 80% using three different benchmark DL architectures (fully connected, convolutional, and attention-based networks). In contrast, using the RF algorithm, the OA and F1 scores obtained were 92% and 60%, indicating that the minority classes are poorly classified. Our results also show that DL models are robust to imbalanced data, although small improvements can be obtained by specifically addressing this issue. Validation on independent in situ data shows that all models struggle to predict in areas not well covered by training data, but even in this situation, the RF algorithm is largely outperformed by deep learning models for minority classes. The proposed framework can be easily implemented as a strong baseline, even with a limited amount of reference data.
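The gap between overall accuracy and F1-macro reported here is characteristic of imbalanced classification: OA is dominated by the majority class, while F1-macro weights every class equally. A toy illustration (not the study's data) of how a majority-class predictor scores well on accuracy but poorly on macro-F1:

```python
import numpy as np

def macro_f1(y_true, y_pred, classes):
    """Unweighted mean of per-class F1 scores."""
    f1s = []
    for c in classes:
        tp = np.sum((y_pred == c) & (y_true == c))
        fp = np.sum((y_pred == c) & (y_true != c))
        fn = np.sum((y_pred != c) & (y_true == c))
        f1s.append(2 * tp / max(2 * tp + fp + fn, 1))  # guard empty class
    return float(np.mean(f1s))

# 90 samples of class 0, 10 of class 1; the model always predicts the majority
y_true = np.array([0] * 90 + [1] * 10)
y_pred = np.zeros(100, dtype=int)
accuracy = float(np.mean(y_true == y_pred))   # 0.9 despite ignoring class 1
f1 = macro_f1(y_true, y_pred, [0, 1])          # ~0.47: minority F1 of 0 drags it down
```

This is why the study reports F1-macro alongside OA when comparing RF with the DL baselines.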

17 pages, 8265 KiB  
Article
Automated Foveal Avascular Zone Segmentation in Optical Coherence Tomography Angiography Across Multiple Eye Diseases Using Knowledge Distillation
by Peter Racioppo, Aya Alhasany, Nhuan Vu Pham, Ziyuan Wang, Giulia Corradetti, Gary Mikaelian, Yannis M. Paulus, SriniVas R. Sadda and Zhihong Hu
Bioengineering 2025, 12(4), 334; https://doi.org/10.3390/bioengineering12040334 - 23 Mar 2025
Abstract
Optical coherence tomography angiography (OCTA) is a noninvasive imaging technique used to visualize retinal blood flow and identify changes in vascular density and enlargement or distortion of the foveal avascular zone (FAZ), which are indicators of various eye diseases. Although several automated FAZ detection and segmentation algorithms have been developed for use with OCTA, their performance can vary significantly due to differences in data accessibility of OCTA in different retinal pathologies, and differences in image quality in different subjects and/or different OCTA devices. For example, data from subjects with direct macular damage, such as in age-related macular degeneration (AMD), are more readily available in eye clinics, while data on macular damage due to systemic diseases like Alzheimer’s disease are often less accessible; data from healthy subjects may have better OCTA quality than subjects with ophthalmic pathologies. Typically, segmentation algorithms make use of convolutional neural networks and, more recently, vision transformers, which make use of both long-range context and fine-grained detail. However, transformers are known to be data-hungry, and may overfit small datasets, such as those common for FAZ segmentation in OCTA, to which there is limited access in clinical practice. To improve model generalization in low-data or imbalanced settings, we propose a multi-condition transformer-based architecture that uses four teacher encoders to distill knowledge into a shared base model, enabling the transfer of learned features across multiple datasets. These include intra-modality distillation using OCTA datasets from four ocular conditions: healthy aging eyes, Alzheimer’s disease, AMD, and diabetic retinopathy; and inter-modality distillation incorporating color fundus photographs of subjects undergoing laser photocoagulation therapy. Our multi-condition model achieved a mean Dice Index of 83.8% with pretraining, outperforming single-condition models (mean of 83.1%) across all conditions. Pretraining on color fundus photocoagulation images improved the average Dice Index by a small margin on all conditions except AMD (1.1% on single-condition models, and 0.1% on multi-condition models). Our architecture demonstrates potential for broader applications in detecting and analyzing ophthalmic and systemic diseases across diverse imaging datasets and settings.
(This article belongs to the Special Issue AI in OCT (Optical Coherence Tomography) Image Analysis)

16 pages, 5234 KiB  
Article
Edge and Node Enhancement Graph Convolutional Network: Imbalanced Graph Node Classification Method Based on Edge-Node Collaborative Enhancement
by Jiadong Tian, Jiali Lin and Dagang Li
Mathematics 2025, 13(7), 1038; https://doi.org/10.3390/math13071038 - 22 Mar 2025
Abstract
In addressing the issue of node classification with imbalanced data distribution, traditional models exhibit significant limitations. Conventional improvement methods, such as node replication or weight adjustment, often focus solely on nodes, neglecting connection relationships. However, numerous studies have demonstrated that optimizing edge distribution can improve the quality of node embeddings. In this paper, we propose the Edge and Node Collaborative Enhancement method (ENE-GCN). This method identifies potentially associated node pairs by similarity measures and constructs a hybrid adjacency matrix, which enlarges the fitting space of node embeddings. Subsequently, an adversarial generation strategy is employed to augment the minority class nodes, thereby constructing a balanced sample set. Compared to existing methods, our approach achieves collaborative enhancement of both edges and nodes in a concise manner, improving embedding quality and balancing the training scenario. Experimental comparisons on four public graph datasets reveal that, compared to baseline methods, our proposed method achieves notable improvements in Recall and AUC metrics, particularly in sparsely connected datasets.

22 pages, 7589 KiB  
Article
GDP Estimation by Integrating Qimingxing-1 Nighttime Light, Street-View Imagery, and Points of Interest: An Empirical Study in Dongguan City
by Zejia Chen, Chengzhi Zhang, Suixuan Qiu and Jinyao Lin
Remote Sens. 2025, 17(7), 1127; https://doi.org/10.3390/rs17071127 - 21 Mar 2025
Abstract
In the context of economic globalization, the issue of imbalanced regional development has become increasingly prominent. Misreporting in traditional economic censuses has made it difficult to accurately reflect economic conditions, increasing the demand for precise GDP estimation. While nighttime light data, point of interest (POI) data, and street-view imagery (SVI) have been utilized in economic research, each data source has limitations when used independently. Furthermore, previous studies have rarely used high-resolution (over 30 m) nighttime light data. To address these limitations, we constructed both random forest and decision tree models and compared different indicator combinations for estimating GDP at the town scale in Dongguan: (1) Qimingxing-1 nighttime light data only; (2) Qimingxing-1 nighttime light and SVI data; and (3) Qimingxing-1 nighttime light, SVI, and POI data. The random forest model performed better than the decision tree, with its correlation coefficient improving from 0.9604 (nighttime light only) to 0.9710 (nighttime light and SVI) and reaching 0.9796 with full integration. Moreover, the Friedman test and SHAP values further demonstrated the reliability of our model. These findings indicate that the integrated model provides a more accurate reflection of economic development levels and offers a more effective tool for regional economic estimation.

17 pages, 1513 KiB  
Article
Cascade-Based Input-Doubling Classifier for Predicting Survival in Allogeneic Bone Marrow Transplants: Small Data Case
by Ivan Izonin, Roman Tkachenko, Nazarii Hovdysh, Oleh Berezsky, Kyrylo Yemets and Ivan Tsmots
Computation 2025, 13(4), 80; https://doi.org/10.3390/computation13040080 - 21 Mar 2025
Abstract
In the field of transplantology, where medical decisions are heavily dependent on complex data analysis, the challenge of small data has become increasingly prominent. Transplantology, which focuses on the transplantation of organs and tissues, requires exceptional accuracy and precision in predicting outcomes, assessing risks, and tailoring treatment plans. However, the inherent limitations of small datasets present significant obstacles. This paper introduces an advanced input-doubling classifier designed to improve survival predictions for allogeneic bone marrow transplants. The approach utilizes two artificial intelligence tools: the first, a probabilistic neural network, generates output signals that expand the independent attributes of the augmented dataset, while the second, a machine learning algorithm, performs the final classification. This method, based on the cascading principle, facilitates the development of novel algorithms for preparing and applying the enhanced input-doubling technique to classification tasks. The proposed method was tested on a small binary classification dataset within transplantology. Optimal parameters for the method were identified using the Dual Annealing algorithm. Comparative analysis of the improved method against several existing approaches revealed a substantial improvement in accuracy across various performance metrics, underscoring its practical benefits.
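The cascading principle described above, appending a first model's output probabilities to the input attributes before a second model classifies the augmented vectors, can be sketched in a few lines. The paper's stages are a probabilistic neural network and a tuned second classifier; this sketch substitutes simple nearest-centroid models for both stages to show the data flow only:

```python
import numpy as np

def centroid_probs(X, centroids):
    """Stage 1: soft class scores from inverse distances to class centroids."""
    d = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
    inv = 1.0 / (d + 1e-7)
    return inv / inv.sum(axis=1, keepdims=True)

def cascade_predict(X, centroids_1, prototypes_2):
    # Expand the attribute space with stage-1 predicted class probabilities
    X_aug = np.hstack([X, centroid_probs(X, centroids_1)])
    # Stage 2: classify the augmented vectors (nearest augmented prototype)
    d = np.linalg.norm(X_aug[:, None, :] - prototypes_2[None, :, :], axis=2)
    return d.argmin(axis=1)

# Toy two-class setup with centroids at (0,0) and (1,1)
c1 = np.array([[0.0, 0.0], [1.0, 1.0]])
prototypes = np.hstack([c1, centroid_probs(c1, c1)])  # augmented class prototypes
y = cascade_predict(np.array([[0.1, 0.1], [0.9, 0.9]]), c1, prototypes)
# y -> [0, 1]
```

The point of the expansion is that stage 2 sees both the raw attributes and stage 1's opinion of each sample, which can help on small datasets where neither view alone is reliable.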
(This article belongs to the Special Issue Artificial Intelligence Applications in Public Health: 2nd Edition)

24 pages, 689 KiB  
Article
Topic Classification of Interviews on Emergency Remote Teaching
by Spyridon Tzimiris, Stefanos Nikiforos, Maria Nefeli Nikiforos, Despoina Mouratidis and Katia Lida Kermanidis
Information 2025, 16(4), 253; https://doi.org/10.3390/info16040253 - 21 Mar 2025
Abstract
This study explores the application of transformer-based language models for automated Topic Classification of qualitative datasets from interviews conducted in Modern Greek. The interviews captured the views of parents, teachers, and school directors regarding Emergency Remote Teaching (ERT). Identifying key themes in such interviews is crucial for informed decision-making in educational policy. Each dataset was segmented into sentences and labeled with one of four topics. The dataset was imbalanced, adding complexity to the classification task. The GreekBERT model was fine-tuned for Topic Classification, with preprocessing including accent stripping, lowercasing, and tokenization. The findings revealed GreekBERT’s effectiveness in achieving balanced performance across all themes, outperforming conventional machine learning models. The highest evaluation metric achieved was a macro-F1-score of 0.76, averaged across all classes, highlighting the effectiveness of the proposed approach. This study contributes the following: (i) datasets capturing diverse educational community perspectives in Modern Greek, (ii) a comparative evaluation of conventional ML models versus transformer-based models, (iii) an investigation of how domain-specific language enhances the performance and accuracy of Topic Classification models, showcasing their effectiveness on specialized datasets and the benefits of fine-tuned GreekBERT for such tasks, and (iv) an empirical investigation of the relationships between extracted topics and relevant variables, capturing the complexities of ERT. These contributions offer reliable, scalable solutions for policymakers, enabling data-driven educational policies that address challenges in remote learning and enhance decision-making based on comprehensive qualitative evidence.

22 pages, 4846 KiB  
Article
Drilling Condition Identification Method for Imbalanced Datasets
by Yibing Yu, Huilin Yang, Fengjia Peng and Xi Wang
Appl. Sci. 2025, 15(6), 3362; https://doi.org/10.3390/app15063362 - 19 Mar 2025
Abstract
To address the challenges posed by class imbalance and temporal dependency in drilling condition data and enhance the accuracy of condition identification, this study proposes an integrated method combining feature engineering, data resampling, and deep learning model optimization. Firstly, a feature selection strategy based on weighted symmetrical uncertainty is employed, assigning higher weights to critical features that distinguish minority classes, thereby enhancing class contrast and improving the classification capability of the model. Secondly, a sliding-window-based Synthetic Minority Oversampling Technique (SMOTE) algorithm is developed, which generates new minority-class samples while preserving temporal dependencies, achieving balanced data distribution among classes. Finally, a coupled model integrating bidirectional long short-term memory (BiLSTM) networks and gated recurrent units (GRUs) is constructed. The BiLSTM component captures global contextual information, while the GRU efficiently learns features from complex sequential data. The proposed approach was validated using logging data from 14 wells and compared against existing models, including RNN, CNN, FCN, and LSTM. The experimental results demonstrated that the proposed method achieved classification F1 score improvements of 8.95%, 9.58%, 10.25%, and 8.59%, respectively, over these traditional models. Additionally, classification loss values were reduced by 0.32, 0.3315, 0.2893, and 0.2246, respectively. These findings underscore the significant improvements in both accuracy and balance achieved by the proposed method for drilling condition identification. The results indicate that the proposed approach effectively addresses class imbalance and temporal dependency issues in drilling condition data, substantially enhancing classification performance for complex sequential data. This work provides a practical and efficient solution for drilling condition recognition.
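The weighted symmetrical uncertainty used here builds on plain symmetrical uncertainty, SU(X, Y) = 2·I(X; Y) / (H(X) + H(Y)), a normalized mutual-information measure in [0, 1]. A numpy sketch for discrete features follows; the class-dependent weighting is the paper's contribution and is not reproduced:

```python
import numpy as np

def entropy(labels):
    """Shannon entropy (bits) of a sequence of discrete values."""
    _, counts = np.unique(labels, return_counts=True)
    p = counts / counts.sum()
    return float(-np.sum(p * np.log2(p)))

def symmetrical_uncertainty(x, y):
    """SU(X, Y) = 2 * I(X; Y) / (H(X) + H(Y)), in [0, 1]."""
    hx, hy = entropy(x), entropy(y)
    # Joint entropy via paired values encoded as strings
    hxy = entropy([f"{a}|{b}" for a, b in zip(x, y)])
    mi = hx + hy - hxy  # mutual information
    return 2.0 * mi / (hx + hy) if hx + hy > 0 else 0.0

x = [0, 0, 1, 1]
su_identical = symmetrical_uncertainty(x, x)               # 1.0: fully informative
su_independent = symmetrical_uncertainty(x, [0, 1, 0, 1])  # 0.0: independent
```

Features with high SU against the class label (here, boosted further for minority classes) would be the ones retained by the selection strategy.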

27 pages, 7641 KiB  
Article
Generating Synthetic Datasets with Deep Learning Models for Human Physical Fatigue Analysis
by Arsalan Lambay, Ying Liu, Phillip Morgan and Ze Ji
Machines 2025, 13(3), 235; https://doi.org/10.3390/machines13030235 - 13 Mar 2025
Abstract
There has been a growth of collaborative robots in Industry 5.0 due to the research in automation involving human-centric workplace design. It has had a substantial impact on industrial processes; however, physical exertion in human workers is still an issue, requiring solutions that [...] Read more.
The rise of collaborative robots in Industry 5.0, driven by research into automation and human-centric workplace design, has had a substantial impact on industrial processes; however, physical exertion in human workers remains an issue, requiring solutions that combine technological innovation with human-centric development. By analysing real-world data, machine learning (ML) models can detect physical fatigue. However, such models typically depend on sensor-based data collection, which is often expensive and constrained. To close this gap, synthetic data generation (SDG) uses methods such as tabular generative adversarial networks (GANs) to produce statistically realistic datasets that improve ML model training while providing scalability and cost-effectiveness. This study presents an innovative approach that uses a conditional GAN with auxiliary conditioning to generate synthetic datasets containing the essential features for detecting human physical fatigue in industrial scenarios. This approach enhances the SDG process by effectively handling the heterogeneous and imbalanced nature of human fatigue data, which includes tabular, categorical, and time-series data points. The generated datasets are used to train specialised ML models, such as ensemble models, to learn the extracted features of the original dataset and then identify signs of physical fatigue. The trained ML models undergo rigorous testing on authentic, real-world data to evaluate their sensitivity and specificity in recognising how closely the generated data match actual human physical fatigue in industrial settings. This research aims to give researchers an innovative method for tackling the data-scarcity challenges of data-driven ML and to further enhance ML efficiency through training on synthetic data.
This study not only provides an approach for creating complex, realistic datasets but also helps bridge the data challenges of Industry 5.0, supporting innovation and worker well-being through improved detection capabilities. Full article
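The evaluation protocol the abstract describes — train on generated data, test on held-out real data — can be sketched in a few lines. This is a minimal illustration, not the paper's method: a class-conditional Gaussian sampler stands in for the trained conditional GAN, and a simulated imbalanced dataset stands in for the fatigue recordings; all names and parameters below are illustrative.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import balanced_accuracy_score
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)

def sample_conditional(X, y, n_per_class):
    """Draw synthetic rows class-by-class from a Gaussian fitted to each
    class (a crude stand-in for a trained conditional generator)."""
    Xs, ys = [], []
    for c in np.unique(y):
        Xc = X[y == c]
        mu = Xc.mean(axis=0)
        # Small ridge keeps the sample covariance positive-definite.
        cov = np.cov(Xc, rowvar=False) + 1e-6 * np.eye(Xc.shape[1])
        Xs.append(rng.multivariate_normal(mu, cov, size=n_per_class))
        ys.append(np.full(n_per_class, c))
    return np.vstack(Xs), np.concatenate(ys)

# "Real" fatigue-like tabular data (simulated for this demo), imbalanced 9:1.
X, y = make_classification(n_samples=2000, n_features=10, weights=[0.9, 0.1],
                           n_clusters_per_class=1, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, stratify=y, random_state=0)

# Train on a balanced synthetic set, then evaluate on held-out real data
# (the "train on synthetic, test on real" protocol).
X_syn, y_syn = sample_conditional(X_tr, y_tr, n_per_class=1000)
clf = RandomForestClassifier(random_state=0).fit(X_syn, y_syn)
bal_acc = balanced_accuracy_score(y_te, clf.predict(X_te))
print(round(bal_acc, 3))
```

Note that the synthetic training set is deliberately balanced even though the real data are not — one of the advantages of conditional generation the abstract highlights.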

19 pages, 709 KiB  
Article
Design Particularities of Quadrature Chaos Shift Keying Communication System with Enhanced Noise Immunity for IoT Applications
by Darja Cirjulina, Ruslans Babajans and Deniss Kolosovs
Entropy 2025, 27(3), 296; https://doi.org/10.3390/e27030296 - 12 Mar 2025
Viewed by 215
Abstract
This article investigates synchronization noise immunity in quadrature chaos shift keying (QCSK) communication systems and its profound impact on system performance. The study focuses on Colpitts and Vilnius chaos oscillators in different synchronization configurations, and the reliability of the system in each configuration is assessed using bit error rate (BER) estimation. The research considers synchronization imbalances and demonstrates their effect on the accuracy of data detection and overall transmission stability. The article proposes an approach to optimal bit detection in the case of imbalanced synchronization and correlated chaotic signals in data transmission. The study demonstrates the practical importance of the proposed decision-making technique, revealing that certain adjustments can significantly enhance the system's noise resilience. Full article

32 pages, 2561 KiB  
Article
A Novel Explainable Attention-Based Meta-Learning Framework for Imbalanced Brain Stroke Prediction
by Inam Abousaber
Sensors 2025, 25(6), 1739; https://doi.org/10.3390/s25061739 - 11 Mar 2025
Viewed by 291
Abstract
The accurate prediction of brain stroke is critical for effective diagnosis and management, yet the imbalanced nature of medical datasets often hampers the performance of conventional machine learning models. To address this challenge, we propose a novel meta-learning framework that integrates advanced hybrid resampling techniques, ensemble-based classifiers, and explainable artificial intelligence (XAI) to enhance predictive performance and interpretability. The framework employs SMOTE and SMOTEENN for handling class imbalance, dynamic feature selection to reduce noise, and a meta-learning approach that combines predictions from Random Forest and LightGBM classifiers, further refined by a deep learning-based meta-classifier. The model uses SHAP (Shapley Additive Explanations) to provide transparent insights into feature contributions, increasing trust in its predictions. Evaluated on three datasets, DF-1, DF-2, and DF-3, the proposed framework consistently outperformed state-of-the-art methods, achieving accuracy and F1-Scores of 0.992189 and 0.992579 on DF-1, 0.980297 and 0.981916 on DF-2, and 0.981901 and 0.983365 on DF-3. These results validate the robustness and effectiveness of the approach, significantly improving the detection of minority-class instances while maintaining overall performance. This work establishes a reliable solution for stroke prediction and provides a foundation for applying meta-learning and explainable AI to other imbalanced medical prediction tasks. Full article
(This article belongs to the Collection Deep Learning in Biomedical Informatics and Healthcare)
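The pipeline shape the abstract describes — oversample the minority class, then stack base learners under a meta-classifier — can be sketched as follows. This is a minimal illustration under stated substitutions, not the paper's framework: a hand-rolled SMOTE (without SMOTEENN's edited-nearest-neighbours cleaning step), `GradientBoostingClassifier` standing in for LightGBM, logistic regression standing in for the deep meta-classifier, and a simulated dataset standing in for the stroke data.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import (GradientBoostingClassifier,
                              RandomForestClassifier, StackingClassifier)
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import f1_score
from sklearn.model_selection import train_test_split
from sklearn.neighbors import NearestNeighbors

rng = np.random.default_rng(0)

def smote(X, y, minority_class, n_new, k=5):
    """SMOTE: synthesize minority samples by interpolating between a
    minority point and one of its k nearest minority neighbours."""
    Xm = X[y == minority_class]
    nbrs = NearestNeighbors(n_neighbors=k + 1).fit(Xm)
    _, idx = nbrs.kneighbors(Xm)                 # idx[:, 0] is the point itself
    base = rng.integers(len(Xm), size=n_new)
    neigh = idx[base, rng.integers(1, k + 1, size=n_new)]
    lam = rng.random((n_new, 1))                 # interpolation coefficients
    X_new = Xm[base] + lam * (Xm[neigh] - Xm[base])
    return (np.vstack([X, X_new]),
            np.concatenate([y, np.full(n_new, minority_class)]))

# Simulated imbalanced data (~7% positive class) as a stand-in dataset.
X, y = make_classification(n_samples=1500, weights=[0.93, 0.07], random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, stratify=y, random_state=0)

# Oversample the training minority class up to parity, then fit the stack.
X_bal, y_bal = smote(X_tr, y_tr, minority_class=1,
                     n_new=(y_tr == 0).sum() - (y_tr == 1).sum())
stack = StackingClassifier(
    estimators=[("rf", RandomForestClassifier(random_state=0)),
                ("gb", GradientBoostingClassifier(random_state=0))],
    final_estimator=LogisticRegression(max_iter=1000))
stack.fit(X_bal, y_bal)
f1 = f1_score(y_te, stack.predict(X_te))
print(round(f1, 3))
```

Note that resampling is applied only to the training split; the test split keeps its original imbalance, which is why minority-class F1 (rather than accuracy) is the metric reported.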
