Journal Description
Big Data and Cognitive Computing is an international, peer-reviewed, open access journal on big data and cognitive computing published monthly online by MDPI.
- Open Access: free for readers, with article processing charges (APC) paid by authors or their institutions.
- High Visibility: indexed within Scopus, ESCI (Web of Science), dblp, Inspec, Ei Compendex, and other databases.
- Journal Rank: JCR - Q1 (Computer Science, Theory and Methods) / CiteScore - Q1 (Computer Science Applications)
- Rapid Publication: manuscripts are peer-reviewed and a first decision is provided to authors approximately 23.1 days after submission; acceptance to publication takes 4.6 days (median values for papers published in this journal in the second half of 2025).
- Recognition of Reviewers: reviewers who provide timely, thorough peer-review reports receive vouchers entitling them to a discount on the APC of their next publication in any MDPI journal, in appreciation of the work done.
- Journal Cluster of Artificial Intelligence: AI, AI in Medicine, Algorithms, BDCC, MAKE, MTI, Stats, Virtual Worlds and Computers.
Impact Factor: 4.4 (2024); 5-Year Impact Factor: 4.2 (2024)
Latest Articles
Evaluating Computational Approaches for Harmful Content Analysis: Promise, Pitfalls and Tools for Responsible Research
Big Data Cogn. Comput. 2026, 10(5), 143; https://doi.org/10.3390/bdcc10050143 (registering DOI) - 2 May 2026
Abstract
This manuscript develops and demonstrates a practical framework for evaluating automated classifiers used in communication research, using harmful language detection as an illustrative case. We combine (a) a structured review of documentation practices for 27 publicly available classifiers and their associated annotation processes with (b) a cross-dataset evaluation that re-tests each model beyond its original training context. Across 27 datasets, we extract and compare reporting on construct definitions, annotator instructions, and inter-annotator agreement, and we quantify generalization by applying each model to multiple out-of-domain test sets. We also benchmark a contemporary large language model (GPT-5) under a consistent prompting protocol to illustrate how LLM-based classification compares to fine-tuned classifiers. Results show that documentation is uneven and often insufficient for theory-driven measurement, inter-annotator agreement varies widely across datasets, and cross-dataset performance frequently drops substantially relative to within-dataset evaluations. Building on these findings and existing validation guidance, we provide a reusable checklist and decision flow to help researchers select, justify, and report classifier-based measures in ways that support transparency and cumulative science. Recommendations for researchers, reviewers, and journal editors stress aligning model selection with standards of validity, reliability, and transparency.
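As an illustration of the kind of reliability and generalization checks the framework calls for, the sketch below computes inter-annotator agreement and macro-F1 across several held-out datasets; the annotations, the `datasets` mapping, and the fitted `model` are placeholders, not the authors' materials.

```python
# Minimal sketch (placeholder data, not the paper's code): annotation
# reliability and cross-dataset generalization checks for a text classifier.
from sklearn.metrics import cohen_kappa_score, f1_score

# Hypothetical labels from two annotators on the same eight items.
coder_a = [1, 0, 1, 1, 0, 0, 1, 0]
coder_b = [1, 0, 0, 1, 0, 1, 1, 0]
print("Cohen's kappa:", round(cohen_kappa_score(coder_a, coder_b), 3))

def cross_dataset_f1(model, datasets):
    """Score one fitted classifier on several out-of-domain test sets.

    `datasets` maps a dataset name to (X_test, y_test); `model` exposes predict().
    """
    return {name: f1_score(y, model.predict(X), average="macro")
            for name, (X, y) in datasets.items()}
```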
(This article belongs to the Section Data Mining and Machine Learning)
Open Access Article
Towards Improved Clinical Adoption of AI Segmentation Models: Benchmarking High-Performance Models for Resource-Constrained Settings
by
Emmanuel Chibuikem Nnadozie, Susana Merino-Caviedes, Daniel A. de Luis-Román, Marcos Martín-Fernández and Carlos Alberola-López
Big Data Cogn. Comput. 2026, 10(5), 142; https://doi.org/10.3390/bdcc10050142 (registering DOI) - 2 May 2026
Abstract
High-performance medical segmentation models are often benchmarked on high-end GPUs. Such benchmarks do not provide useful performance insights for point-of-care low-end devices. This work, firstly, posits that to achieve improved clinical adoption of AI-powered segmentation models, especially in reduced manpower settings like rural hospitals, we need benchmarks that provide actionable insights on the degree to which high-performance models address five deployment constraints viz: resource-effectiveness for low-end computing devices, clinically acceptable accuracy, clinically compatible execution times, localization of user data, and user-based finetuning. In this work, five state-of-the-art foundation segmentation models and one target-specific model were systematically evaluated on three multi-organ medical datasets. Furthermore, the best-ranking foundation model and target-specific model were benchmarked on three low-end devices. Our findings show that lightweight foundation models provided the best performance trade-off and are easily user-fine-tuned on custom datasets. Target-specific models provide high accuracy out-of-the-box, but may require significant optimisation to deliver comparably fast execution times and user-based finetuning on low-end devices. The methods and results from this research provide actionable insights on high-performance medical segmentation models for low-end computing devices, as a necessary step towards improved adoption in resource-limited clinical settings.
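The deployment-constraint benchmarks described above hinge on measuring inference latency on the target device itself; a minimal sketch of such a measurement follows, with a toy stand-in layer because the actual foundation and target-specific models are not reproduced here.

```python
# Sketch (assumptions: PyTorch installed; the Conv3d layer is only a stand-in
# for a real segmentation network): CPU inference latency on a low-end device.
import time
import torch

model = torch.nn.Conv3d(1, 2, kernel_size=3, padding=1)  # placeholder model
model.eval()
x = torch.randn(1, 1, 32, 64, 64)                        # toy 3D patch

with torch.no_grad():
    model(x)                                              # warm-up, excluded from timing
    t0 = time.perf_counter()
    for _ in range(10):
        model(x)
latency_ms = (time.perf_counter() - t0) / 10 * 1000
print(f"mean per-patch latency: {latency_ms:.1f} ms")
```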
(This article belongs to the Special Issue Next-Generation Medical Image Analysis: Multimodal, Decentralized, Fair and Reasoning-Centric Approaches)
Open Access Article
BERT-Based Models for Normalization of Adverse Drug Event Expressions in Social Media to Standard Medical Terminology for Drug Safety Analysis
by
Fan Dong, Wenjing Guo, Jie Liu, Ann Varghese, Weida Tong, Tucker A. Patterson and Huixiao Hong
Big Data Cogn. Comput. 2026, 10(5), 141; https://doi.org/10.3390/bdcc10050141 (registering DOI) - 2 May 2026
Abstract
Social media platforms host abundant and timely descriptions of medication experiences that can complement traditional pharmacovigilance systems. Yet the linguistic informality of these data presents a major challenge for mapping adverse drug event (ADE) expressions to standardized medical terminology. In this study, we developed BERT-based language models to classify ADE mentions from social media into MedDRA System Organ Classes (SOCs). Using the SMM4H and CADEC corpora, as well as their combination, we performed 20 iterations of 20% holdout validation for 3-, 6-, 22-, and 25-SOC classification tasks with a selected fixed training configuration (learning rate, batch size, and training epochs) based on training-loss convergence. The models achieved accuracies ranging from 75% to 94%, demonstrating strong performance for SOC-level classification of noisy and informal ADE expressions under the evaluated settings. These results are based on a controlled mention-level evaluation using deduplicated adverse drug event strings and do not establish document-level or real-world deployment generalization. This work provides a systematic evaluation of BERT-based models for SOC-level classification of ADEs and demonstrates consistent performance within the evaluated datasets and label granularities. While direct comparison with prior studies is limited by differences in datasets and evaluation protocols, the results demonstrate that transformer-based models can effectively classify ADEs into SOCs. These findings support the use of transformer-based normalization for SOC-level aggregation of user-reported adverse events and their integration into large-scale social media pharmacovigilance pipelines as a downstream component under controlled conditions.
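For readers unfamiliar with this normalization setup, the sketch below shows how a BERT-style sequence classifier maps free-text ADE mentions to SOC indices with the Hugging Face transformers API; the checkpoint name, example mentions, and the 25-label head are placeholders rather than the authors' fine-tuned models.

```python
# Sketch (placeholder checkpoint; a real run needs a model fine-tuned on
# SOC-labelled ADE mentions): BERT-based SOC classification of ADE text.
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

model_name = "bert-base-uncased"   # placeholder, not the paper's checkpoint
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSequenceClassification.from_pretrained(model_name, num_labels=25)
model.eval()

mentions = ["couldn't sleep at all last night", "my stomach has been killing me"]
enc = tokenizer(mentions, padding=True, truncation=True, max_length=64,
                return_tensors="pt")
with torch.no_grad():
    logits = model(**enc).logits            # shape: (batch, 25 SOC classes)
print(logits.argmax(dim=-1).tolist())       # predicted SOC indices
```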
(This article belongs to the Section Data Mining and Machine Learning)
Open Access Article
BWT-Enhanced Compression for GIS Raster Data: A Hybrid AV1-Inspired Approach with Burrows–Wheeler Transform
by
Yair Wiseman
Big Data Cogn. Comput. 2026, 10(5), 140; https://doi.org/10.3390/bdcc10050140 - 1 May 2026
Abstract
The AVIF (AV1 Image File Format) is a modern, royalty-free image format that leverages the AV1 video codec for superior compression efficiency, supporting both lossy and lossless modes. Its entropy encoding relies on a multi-symbol context-adaptive arithmetic coder (range coding with adaptive cumulative distribution functions (CDFs)), which is effective for general imagery but may not optimally exploit the repetitive structures common in Geographic Information System (GIS) maps/data. This paper proposes replacing AVIF’s entropy encoder with the Burrows–Wheeler Transform (BWT), a reversible preprocessing algorithm that rearranges data to create runs of similar symbols, enhancing subsequent compression. We detail the technical steps for modification, drawing from AV1’s open-source implementation, and explain why BWT is advantageous for GIS raster maps/data, which often feature large uniform areas, limited color palettes, and spatial redundancies. Empirical evidence from related studies on BWT-based image compression shows improvements in lossless scenarios, potentially considerably reducing file sizes over standard methods while preserving data integrity critical for geospatial analysis. This swap could improve storage, transmission, and processing efficiency in GIS applications, such as remote sensing and cartography. The discussion includes challenges like computational overhead and compatibility, with recommendations for implementations. The resulting BWT-AVIF hybrid produces a non-standard AV1 bit-stream that is not compliant with the AV1 or AVIF specifications and therefore requires custom decoders. It is presented here as a research prototype for GIS-specific compression rather than a compliant AVIF extension.
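To make the preprocessing idea concrete, the toy sketch below applies a naive Burrows–Wheeler Transform to a raster-like byte row and compares run-length structure before and after; it illustrates the transform only and is not the proposed AV1/AVIF modification.

```python
# Toy illustration of the BWT (not the AV1 codec patch): the transform tends to
# group symbols that precede similar contexts, which helps run-length/entropy coding.
def bwt(data: bytes, sentinel: bytes = b"\x00") -> bytes:
    s = data + sentinel                           # sentinel marks the original end of data
    rotations = sorted(s[i:] + s[:i] for i in range(len(s)))
    return bytes(rot[-1] for rot in rotations)    # last column of the sorted rotations

def run_lengths(data: bytes):
    runs, prev, count = [], None, 0
    for b in data:                                # b is the integer byte value
        if b == prev:
            count += 1
        else:
            if prev is not None:
                runs.append((prev, count))
            prev, count = b, 1
    runs.append((prev, count))
    return runs

row = b"AAAABBBAAAACCCCAAAA"                      # GIS-like raster row with uniform runs
print(run_lengths(row))
print(run_lengths(bwt(row)))
```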
(This article belongs to the Special Issue Intelligent Communication and Sensor Networks for Advanced Signal Processing)
Open Access Article
A Hybrid Artificial Intelligence Framework for Reliable and Seamless Vertical Handover in Next-Generation Heterogeneous Networks
by
Sunisa Kunarak
Big Data Cogn. Comput. 2026, 10(5), 139; https://doi.org/10.3390/bdcc10050139 - 29 Apr 2026
Abstract
Next-generation heterogeneous wireless networks (HetNets) comprising LTE macro-cells, 5G New Radio (NR) small cells, and WiFi 6 access points aim to provide seamless connectivity under diverse mobility scenarios. However, vertical handover (VHO) remains a performance bottleneck because of the highly variable radio environments, dynamic user mobility, stringent quality of service (QoS) requirements, and the coexistence of multi-tier access technologies. Existing handover approaches based on deep learning and deep reinforcement learning (DRL) suffer from limitations: deep learning models lack decision-making capabilities, whereas DRL models, particularly deep Q-network (DQN)-based policies, face Q-value overestimation and unstable convergence. To overcome these limitations, this paper introduces a Hybrid deep double-Q networks (DDQN)–bidirectional long short-term memory (Bi-LSTM) Framework that integrates bi-directional mobility prediction and DRL-based adaptive decision-making. The Bi-LSTM module captures forward and backward temporal dependencies and predicts future Received Signal Strength (RSS) trajectories, mobility dynamics, and cell-edge transitions. The DDQN module stabilizes the action value estimation, mitigates overestimation bias, and enables context-aware handover decisions. A multi-tier simulation environment consisting of LTE, 5G NR, and WiFi 6 networks was developed using realistic path loss, shadowing, interference, and mobility models. Extensive evaluations demonstrated substantial improvements in mobility prediction accuracy, handover stability, radio link reliability, throughput efficiency, and latency reduction compared to conventional RSS-based and DQN-based schemes. The findings highlight the effectiveness of integrating predictive intelligence with reinforcement learning for reliable mobility management in 5G-Advanced and emerging 6G networks.
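As background on the overestimation issue mentioned above, the short sketch below contrasts the standard DQN target with the double-DQN (DDQN) target, in which the online network selects the next action and the target network evaluates it; the Q-values, reward, and action set are illustrative placeholders.

```python
# Worked sketch with placeholder numbers (not the paper's implementation):
# double-DQN target vs. vanilla DQN target for one handover decision step.
import numpy as np

rng = np.random.default_rng(0)
n_actions = 3                                  # e.g., stay on LTE / go to 5G NR / go to WiFi 6
q_online_next = rng.normal(size=n_actions)     # Q_online(s', .)
q_target_next = rng.normal(size=n_actions)     # Q_target(s', .)
reward, gamma = 1.0, 0.95

a_star = int(np.argmax(q_online_next))               # online net picks the action
y_ddqn = reward + gamma * q_target_next[a_star]      # target net evaluates it
y_dqn  = reward + gamma * np.max(q_target_next)      # vanilla DQN target (overestimation-prone)
print(f"DDQN target {y_ddqn:.3f} vs DQN target {y_dqn:.3f}")
```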
Open Access Article
GPU-TOPSIS: A Complete Vectorized and Parallel Reformulation of the TOPSIS Method for Large-Scale Multi-Criteria Decision Making
by
Latifa Boubekri, Hassnae Aberkane, Mohammed Chaouki Abounaima and Loubna Lamrini
Big Data Cogn. Comput. 2026, 10(5), 138; https://doi.org/10.3390/bdcc10050138 - 28 Apr 2026
Abstract
The TOPSIS (Technique for Order Preference by Similarity to Ideal Solution) method is one of the most widely used multi-criteria decision-making (MCDM) approaches in industrial, financial, and scientific fields. However, its sequential computational cost of O(m × n), where m denotes the number of alternatives and n the number of criteria, becomes prohibitive when decision matrices have several million rows. Despite its geometric interpretability and simplicity, classical TOPSIS faces two key computational bottlenecks at scale: (i) Euclidean distance calculations O(m × n) dominating the total cost, and (ii) the O(m × log m) sorting step, both inherently sequential and memory-bound on CPUs. To overcome these limitations, we propose GPU-TOPSIS, a fully vectorized and parallel reformulation of TOPSIS based on tensor execution on graphics processing units (GPUs), whose main contributions are: (i) a formally correct reformulation of TOPSIS as a GPU tensor pipeline preserving mathematical fidelity to the original method; (ii) a two-pass fragment-processing algorithm guaranteeing exact mathematical equivalence with monolithic TOPSIS, while reducing the memory footprint from O(m × n) to O(m_t × n), where m_t < m is the size of each independently processed fragment; (iii) three independent implementations on CuPy, PyTorch, and TensorFlow, ensuring the framework’s portability and genericity. Experimental evaluations on real data from the Amazon Products 2023 dataset, using matrices of up to 200 million alternatives (via the 2-pass formulation), demonstrate speedups of up to 4.75× compared to the reference CPU implementation (NumPy), with inter-backend score differences below 5 × 10^−8 and 100% ranking overlap across all tested Top-K thresholds. A perturbation sensitivity analysis of the criteria weights and cross-backend consistency tests confirm that GPU acceleration fully preserves robustness and decision reliability, making GPU-TOPSIS a practical, open, and reproducible solution for large-scale multi-criteria decision making in Big Data environments.
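For orientation, the sketch below shows the classical TOPSIS pipeline written as pure array operations in NumPy; the decision matrix, weights, and benefit/cost mask are toy values, and the paper's GPU backends (CuPy, PyTorch, TensorFlow) would execute the same vectorized steps on device memory.

```python
# Minimal vectorized TOPSIS sketch in NumPy (toy data; not the GPU-TOPSIS code).
import numpy as np

def topsis(X, w, benefit):
    """X: (m, n) decision matrix, w: (n,) weights, benefit: (n,) boolean mask."""
    R = X / np.linalg.norm(X, axis=0)             # vector normalization per criterion
    V = R * w                                      # weighted normalized matrix
    ideal = np.where(benefit, V.max(axis=0), V.min(axis=0))
    anti  = np.where(benefit, V.min(axis=0), V.max(axis=0))
    d_pos = np.linalg.norm(V - ideal, axis=1)      # distance to the ideal solution
    d_neg = np.linalg.norm(V - anti, axis=1)       # distance to the anti-ideal solution
    return d_neg / (d_pos + d_neg)                 # closeness coefficient, higher is better

X = np.array([[250., 16., 12.], [200., 16., 8.], [300., 32., 16.]])   # 3 alternatives, 3 criteria
scores = topsis(X, w=np.array([0.5, 0.3, 0.2]), benefit=np.array([False, True, True]))
print(scores, scores.argsort()[::-1])              # closeness scores and resulting ranking
```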
Open Access Review
A Review of Key Technologies for Systems Based on Non-Volatile Memory
by
Yuhan Zhang, Zehang Wang, Yuanfang Chen, Chunfeng Du and Jing Chen
Big Data Cogn. Comput. 2026, 10(5), 137; https://doi.org/10.3390/bdcc10050137 - 27 Apr 2026
Abstract
With the continuous growth of data-intensive applications and artificial intelligence workloads, traditional dynamic random access memory (DRAM) is increasingly struggling to meet demands in terms of capacity scale, energy consumption constraints, and data retention after power failure. Consequently, non-volatile memory (NVM) has emerged as a crucial technology for bridging the gap between the memory and storage layers. However, due to inherent differences in write life, read–write performance variations, and consistency guarantee after failure, the systematic application of NVM still faces a series of challenges. Addressing these issues, this paper takes as its starting point the adaptation of medium characteristics and system design, and summarizes the research progress in aspects such as write optimization, consistency and security coordination mechanisms, data structure modification under hybrid memory architecture, and cross-layer resource collaboration. It also conducts an in-depth analysis of representative solutions and evaluation methods. The review results show that current research has shifted from improving a single performance bottleneck to multi-mechanism collaborative optimization. Various technical approaches have proven complementary in alleviating write amplification, enhancing persistence efficiency, and optimizing access patterns. This paper demonstrates that achieving stable and scalable application of NVM requires establishing a more systematic collaborative design concept between durability, security, and performance. As AI training workloads and big data analytics place increasing demands on memory bandwidth and persistence, the techniques surveyed here provide a foundational basis for next-generation memory-centric computing infrastructures.
(This article belongs to the Special Issue Internet Intelligence for Cybersecurity)
Open Access Article
A Robust Ensemble Learning Approach to URL-Based Phishing Webpage Detection
by
Abdellah Rezoug and Mohamed Bader-el-den
Big Data Cogn. Comput. 2026, 10(5), 136; https://doi.org/10.3390/bdcc10050136 - 27 Apr 2026
Abstract
The proliferation of online fraud has resulted in substantial financial damage to individuals and organizations alike, with web phishing emerging as one of the most pervasive and harmful attack vectors. In response, this paper proposes the Stacking Ensemble Models Generator (SEMG), a URL-based phishing detection approach that leverages a multi-objective Genetic Algorithm to jointly optimize Precision and Recall in the selection and configuration of stacking ensemble models. An initial pool of base learners is trained on labeled datasets and subsequently evolved through genetic operators toward a globally optimal ensemble. Experimental evaluation across five datasets sourced from Mendeley and UCI repositories demonstrates that SEMG consistently surpasses individual base learners and compares favorably against existing methods, attaining performance across all metrics on D2 while matching or exceeding state-of-the-art results on the remaining benchmarks. These outcomes underscore the framework’s robustness and its potential for deployment in real-world phishing detection systems.
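The fragment below shows a plain stacking ensemble scored on precision and recall, the two objectives the SEMG search optimizes; the synthetic features, the chosen base learners, and the meta-learner are placeholders, and the genetic-algorithm search itself is not reproduced.

```python
# Sketch (synthetic data, fixed ensemble; SEMG evolves the member set and
# configuration with a multi-objective GA, which is not shown here).
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier, StackingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import precision_score, recall_score
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

X, y = make_classification(n_samples=2000, n_features=30, random_state=0)  # stand-in for URL features
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.25, random_state=0)

stack = StackingClassifier(
    estimators=[("rf", RandomForestClassifier(n_estimators=100, random_state=0)),
                ("knn", KNeighborsClassifier())],
    final_estimator=LogisticRegression(max_iter=1000),
)
stack.fit(X_tr, y_tr)
pred = stack.predict(X_te)
print("precision:", precision_score(y_te, pred), "recall:", recall_score(y_te, pred))
```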
(This article belongs to the Section Data Mining and Machine Learning)
Open Access Article
Enhancing Adversarial Transferability via Fourier-Based Input Transformation
by
Zilin Tian, Xin Wang, Yunfei Long and Liguo Zhang
Big Data Cogn. Comput. 2026, 10(5), 135; https://doi.org/10.3390/bdcc10050135 - 27 Apr 2026
Abstract
Adversarial transferability makes black-box attacks practical and exposes weaknesses of deep neural networks for computer vision, image recognition, and visual understanding. Among various transferability-enhancing methods, input transformation is one of the most effective strategies. However, existing methods often ignore the decoupling of style and semantics in the input image, as well as the need for customized transformation strategies, resulting in limited performance gains or suboptimal outcomes. In this paper, we propose a novel Fourier-based perspective for input transformation generalization in the context of vision adversarial attacks. The main observations are that the Fourier amplitude captures stylistic information and the phase encompasses richer semantics which are crucial for visual understanding. Motivated by this, we develop a Fourier-based strategy, which performs a stylistic transform and semantic mixup on the input examples to improve transferability. To avoid inconsistent semantics of augmented images for the surrogate model, we mix the original images with the augmentations to maintain semantic consistency and mitigate imprecise gradients. Extensive experiments on ImageNet-compatible datasets demonstrate that our method consistently outperforms existing input transformation attacks.
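The core Fourier manipulation can be sketched in a few lines: blend amplitude spectra (style) across images while keeping the original phase (semantics), then mix the result back with the clean image. The arrays and mixing ratios below are illustrative placeholders, not the authors' attack settings.

```python
# Sketch of amplitude/phase decoupling with NumPy FFTs (toy grayscale arrays).
import numpy as np

rng = np.random.default_rng(0)
img, style_ref = rng.random((224, 224)), rng.random((224, 224))   # placeholder images

F_img, F_ref = np.fft.fft2(img), np.fft.fft2(style_ref)
amp_mix = 0.7 * np.abs(F_img) + 0.3 * np.abs(F_ref)   # blend amplitude spectra (style)
phase   = np.angle(F_img)                              # keep original phase (semantics)
augmented = np.real(np.fft.ifft2(amp_mix * np.exp(1j * phase)))

mixed_input = 0.5 * augmented + 0.5 * img              # mix back to keep semantics consistent
print(mixed_input.shape)
```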
(This article belongs to the Section Artificial Intelligence and Multi-Agent Systems)
Open Access Article
A Physically Regularized Control-Oriented State Model and Nonlinear Model Predictive Control Framework for an Ice Rink Refrigeration System
by
Alexander A. Karmanov and Petr V. Nikitin
Big Data Cogn. Comput. 2026, 10(5), 134; https://doi.org/10.3390/bdcc10050134 - 26 Apr 2026
Abstract
Energy-intensive refrigeration systems require predictive models that remain informative under counterfactual control trajectories, not only on archived operation. This paper develops a control-oriented multi-step state model and a nonlinear model predictive control framework for an indoor ice-rink refrigeration system. Historical state, control, and exogenous variables are encoded jointly with an admissible future control trajectory, and a normalized thermal-balance residual is added to the training objective. A lightweight conditioned transformer predicts ice temperature, return-glycol temperature, supply-glycol temperature, and compressor power over a 30 min horizon. The selected weakly regularized model with regularization coefficient λphys = 0.001 decreases the normalized thermal-balance root-mean-square error on the horizon tail by 30.29% relative to the base model while increasing the average ice-temperature root-mean-square error by only 1.90%. In a surrogate-based counterfactual four-day evaluation, the resulting nonlinear model predictive controller reduces predicted daily energy by 4.84%, terminal violation share by 17.32%, mean absolute terminal ice-temperature deviation by 18.74%, and the mean objective value by 30.82% relative to historical admissible setpoint tracking. The mean full control cycle time is 0.0311 s, confirming real-time feasibility for a 5 min supervisory update interval. All controller results are surrogate-based rather than field-deployed and therefore represent receding-horizon benchmark results under learned-model evaluation, not realized field savings.
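The physics-regularization idea can be summarized as a composite training objective; the sketch below adds a weighted thermal-balance penalty to a multi-step prediction loss, with tensor shapes and the residual itself as placeholders since the paper's exact formulation is not reproduced here.

```python
# Sketch of a physics-regularized objective (placeholder shapes and residual).
import torch

def training_loss(pred, target, balance_residual, lambda_phys=0.001):
    """pred/target: predicted vs. measured state trajectories;
    balance_residual: normalized thermal-balance violation of the rollout."""
    data_term = torch.mean((pred - target) ** 2)        # multi-step prediction error
    physics_term = torch.mean(balance_residual ** 2)    # penalize non-physical rollouts
    return data_term + lambda_phys * physics_term

pred, target = torch.randn(8, 6, 4), torch.randn(8, 6, 4)   # (batch, horizon steps, 4 states)
residual = torch.randn(8, 6)                                 # illustrative per-step residual
print(training_loss(pred, target, residual).item())
```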
(This article belongs to the Section Data Mining and Machine Learning)
Open Access Article
Mamba-Based Video Analysis for Blood Pressure Estimation
by
Walaa Othman, Batol Hamoud, Nikolay Shilov, Alexey Kashevnik and Alexander Mayatin
Big Data Cogn. Comput. 2026, 10(5), 133; https://doi.org/10.3390/bdcc10050133 - 26 Apr 2026
Abstract
Blood pressure monitoring is important for overall health assessment, yet traditional cuff-based methods are intrusive and unsuitable for continuous monitoring. This paper proposes a contactless approach for blood pressure estimation from facial videos using a bidirectional Mamba-based architecture with uncertainty quantification. Our method processes 64-frame video segments through a hierarchical 3D convolutional encoder to extract spatiotemporal features, then applies bidirectional state-space modeling to capture temporal dynamics efficiently. The model was evaluated on the Vitals for Vision (V4V) dataset, achieving mean absolute errors of 13.15 mmHg for systolic and 9.56 mmHg for diastolic blood pressure, outperforming prior methods while requiring significantly fewer computational resources than attention-based approaches. While these results do not meet clinical-grade diagnostic standards, they demonstrate the feasibility of contactless blood pressure estimation for non-clinical applications such as wellness monitoring, preliminary health screening, and continuous remote observation, where unobtrusive and computationally efficient monitoring is desirable.
(This article belongs to the Section Data Mining and Machine Learning)
Open Access Article
Adversarial Evaluation of Large Language Models for Building Robust Offensive Language Detection in Moroccan Arabic
by
Soufiyan Ouali, Kanza Raisi, Asmaa Mourhir, El Habib Nfaoui and Said El Garouani
Big Data Cogn. Comput. 2026, 10(5), 132; https://doi.org/10.3390/bdcc10050132 - 24 Apr 2026
Abstract
Offensive language detection is crucial for ensuring safe and inclusive digital environments. Identifying harmful content protects users and supports healthier online interactions. Despite advances in transformer-based models, particularly Large Language Models (LLMs), their application to this task remains underexplored for low-resource languages such as Moroccan Arabic, especially compared with high-resource languages. This study evaluates the performance of various open- and closed-source LLMs for offensive language detection in Moroccan Darija. The evaluated models include general-purpose LLMs such as LLaMA, Mistral, and Gemma, as well as Arabic-focused models such as ArabianGPT, Falcon Arabic, and Atlas-Chat. We also experiment with reasoning models such as DeepSeek and GPT-4. Beyond traditional evaluation metrics, we investigate the robustness of these LLMs and examine the impact of adversarial training on their performance. Moreover, we contribute to the field by creating a large, high-quality dataset. Our evaluation revealed that GPT-4o Mini achieved the best overall performance, reaching an F1-score of 88%. However, robustness testing under black-box and white-box adversarial attacks exposed notable vulnerabilities, with attack success rates reaching 30%, thereby highlighting the need for enhancement. Despite the complex morphology and linguistic variability of Moroccan Darija, adversarial training resulted in a notable improvement in both overall model performance and robustness against adversarial attacks, yielding an average increase of 20.89% in resistance to attacks. Furthermore, this approach enabled GPT-4o Mini to achieve an F1-score of 91%, surpassing the current state-of-the-art performance by 6%. These results highlight the importance of incorporating adversarial approaches in low-resource dialectal settings to effectively address linguistic variability and data scarcity.
(This article belongs to the Special Issue Natural Language Processing Applications in Big Data)
Open Access Article
FEM-Based Hybrid Compression Framework with Pipeline Implementation for Efficient Deep Neural Networks on Tiny ImageNet
by
Areej Hamza, Amel Tuama and Asraf Mohamed Moubark
Big Data Cogn. Comput. 2026, 10(5), 131; https://doi.org/10.3390/bdcc10050131 - 22 Apr 2026
Abstract
The high accuracy achieved by deep learning techniques has made them indispensable in computer vision applications. However, their substantial memory demands and high computational complexity limit their deployment in resource-constrained environments. To address this challenge, this study introduces a Feature Enhancement Module (FEM) as part of a unified hybrid compression framework that combines mixed-precision quantization and structured pruning to improve model efficiency. Experimental results on the Tiny ImageNet dataset using ResNet50 and MobileNetV3 architectures demonstrate the strong adaptability and scalability of the proposed approach. Compared with state-of-the-art compression methods, the proposed FEM-based framework achieves up to 6% improvement in Top-1 accuracy, while reducing memory usage by 32.26% and improving inference speed by 66%. Furthermore, the ablation study demonstrates that incorporating the FEM module leads to up to 24% improvement over the baseline model, highlighting its effectiveness. The results further show that FEM effectively preserves inter-channel feature representation stability even under aggressive compression, making it well suited for real-time processing and practical Artificial Intelligence (AI) applications. By maintaining semantic richness while significantly reducing computational cost, the proposed method bridges the gap between high-performance deep models and lightweight, deployable solutions. Overall, the FEM-based hybrid compression framework establishes a scalable and architecture-independent foundation for sustainable deep learning in resource-limited environments.
Open Access Article
A Data-Driven Machine Learning Framework for Multi-Criteria ESG Evaluation
by
Zhiyuan Wang, Tristan Lim, Yun Teng and Chongwu Xia
Big Data Cogn. Comput. 2026, 10(5), 130; https://doi.org/10.3390/bdcc10050130 - 22 Apr 2026
Abstract
This study proposes a novel data-driven machine learning (ML) framework for multi-criteria environmental, social, and governance (ESG) evaluation. The framework aims to address the transparency, consistency, and subjectivity limitations of existing ESG evaluation systems by employing a fully data-driven, modular, and ML-supported architecture. It comprises three main modules: (i) ESG data preprocessing with missing-data imputation by the MissForest algorithm; (ii) a three-plane ESG feature selection workflow that integrates clustering, feature importance, and classification algorithms to identify representative ESG indicators; and (iii) a hybrid weighting and ranking procedure that combines unsupervised principal component analysis (PCA), criteria importance through inter-criteria correlation (CRITIC), and technique for order preference by similarity to ideal solution (TOPSIS) methods. A recent 2024 real-world application involving 57 listed Chinese pharmaceutical and biotechnology companies and 70 ESG indicators demonstrates the framework’s practical utility in producing transparent and objective ESG rankings. The main contributions of this work are fourfold: (1) the development of an end-to-end, entirely data-driven ML framework for ESG evaluation; (2) the introduction of an innovative three-plane ESG feature selection workflow within the framework; (3) the first study for designing a hybrid PCA-CRITIC-TOPSIS approach in ESG weighting and ranking; (4) the validation of the framework through a real-world industry application using recent and authentic ESG data.
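To make the hybrid weighting stage more tangible, the sketch below computes CRITIC weights from a normalized indicator matrix (contrast intensity times conflict with the other criteria); the random matrix stands in for the 57-company, 70-indicator ESG data, and the PCA and TOPSIS stages are omitted.

```python
# Sketch of the CRITIC weighting step only (random stand-in for the ESG matrix).
import numpy as np

rng = np.random.default_rng(0)
X = rng.random((57, 70))                                     # companies x ESG indicators

Z = (X - X.min(axis=0)) / (X.max(axis=0) - X.min(axis=0))    # min-max normalization
sigma = Z.std(axis=0)                                        # contrast intensity per criterion
corr = np.corrcoef(Z, rowvar=False)                          # inter-criteria correlation
info = sigma * np.sum(1.0 - corr, axis=1)                    # information content C_j
weights = info / info.sum()                                  # CRITIC weights, sum to 1
print(weights[:5], weights.sum())
```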
(This article belongs to the Section Data Mining and Machine Learning)
Open Access Review
Fuzz Driver Generation: A Survey and Outlook from the Perspective of Data Sources
by
Xiao Feng, Shuaibing Lu, Taotao Gu, Yuanping Nie, Qian Yan, Mucheng Yang, Jinyang Chen and Xiaohui Kuang
Big Data Cogn. Comput. 2026, 10(4), 129; https://doi.org/10.3390/bdcc10040129 - 21 Apr 2026
Abstract
Fuzzing is an essential element of software supply chain security governance. Despite its importance, the widespread adoption of library fuzzing is limited by the significant costs associated with constructing fuzz drivers. Without a clear entry point, the reachable path space of the target library is determined by the interplay of API call sequences, parameter dependencies, and state constraints. As a result, fuzz drivers must achieve not only successful builds but also provide sufficient semantic context to enable exploration of deeper state machine interactions, thereby avoiding premature stagnation at superficial validation logic. To systematically assess advancements in automated fuzz driver generation, this paper develops a taxonomy organized around the primary data sources used to derive driver-generation constraints, categorizing existing approaches into four technological trajectories: Usage Artifact Mining, Source Code Constraint Inference, Binary Semantics Recovery, and Heterogeneous Data Fusion. Large language models are increasingly integrated into these workflows as generators and as components for constraint alignment and repair. To address inconsistencies in experimental methodologies, this paper introduces a bounded comparability-oriented evaluation perspective focused on three dimensions: validity, reachability-related evidence, and reproducibility and cost. Together with a disclosure and reporting protocol for metric comparability, this perspective clarifies the information needed for cross-study comparison and examines the unique features and inherent limitations of each technical trajectory. Based on these findings, three key directions for future research are identified: facilitating structural evolution in response to coverage plateaus to address deep logic unreachability; coordinating dynamic closed-loop orchestration that utilizes on-demand heterogeneous data retrieval to resolve context challenges; and developing language-agnostic driver representations with pluggable adaptation mechanisms to improve cross-ecosystem portability and scalability.
(This article belongs to the Special Issue Machine Learning Methodologies and Applications in Cybersecurity Data Analysis)
Open Access Article
A Reservoir Computing Approach for Synchronizing Discrete-Time 3D Chaotic Systems
by
Vismaya V. S, Swetha P, Jubin K. Babu, Diya Gijo, Varada M. T, Adithya K. K, Ekaterina Kopets and Sishu Shankar Muni
Big Data Cogn. Comput. 2026, 10(4), 128; https://doi.org/10.3390/bdcc10040128 - 21 Apr 2026
Abstract
Reservoir computing (RC) is an efficient framework for processing time-series data. This work investigates the synchronization of two independently trained reservoir computers that, after training, operate without external input from the chaotic system and interact solely through symmetric linear coupling. This approach addresses a gap in existing reservoir computing-based synchronization studies, which predominantly rely on master–slave or system-driven configurations. In this work, we first build and train two reservoir computing models based on 3D nonlinear chaotic maps and hyperchaotic systems and then introduce a symmetric linear coupling mechanism between them. This study demonstrates that reservoir computing can accurately reproduce the short-term dynamics of chaotic systems and provides insight into the interactions between learned dynamical models, while also helping us understand how complex systems connect and operate collectively. We use this systematic approach to establish a framework for understanding how two trained reservoir computers interact under varying coupling strengths, enabling a detailed investigation of their synchronization behavior. To demonstrate the adaptability of the proposed framework to diverse dynamical behaviors, we systematically investigated three discrete chaotic and hyperchaotic systems: (1) discrete 3D sinusoidal map with discrete Lorenz attractor, (2) 3D sinusoidal map with conjoined Lorenz twin attractor, and (3) 3D quadratic hyperchaotic map. For performance evaluation, we trained coupled RCs and computed the synchronization error for different coupling strengths. We also present phase portraits and time-series plots of the attractors and RCs, along with the synchronization error as a function of the coupling strength, thereby demonstrating the possibility of synchronization of two linearly coupled RCs, which are independently trained on discrete, three-dimensional chaotic and hyperchaotic systems.
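The coupling mechanism itself is simple to state: after training, each reservoir runs autonomously on its own read-out, and the two states are blended symmetrically with strength eps. The toy sketch below shows only that mechanism; the weights are random placeholders, so no trained dynamics or guaranteed synchronization are implied.

```python
# Toy sketch of two autonomous reservoirs with symmetric linear coupling
# (random, untrained weights; read-out training on the chaotic maps is omitted).
import numpy as np

rng = np.random.default_rng(1)
N, dim, eps = 100, 3, 0.2                       # reservoir size, map dimension, coupling strength
W    = [rng.normal(scale=0.1, size=(N, N)) for _ in range(2)]
Win  = [rng.normal(size=(N, dim)) for _ in range(2)]
Wout = [rng.normal(scale=0.1, size=(dim, N)) for _ in range(2)]
r    = [rng.normal(size=N), rng.normal(size=N)]

for _ in range(200):
    y = [Wout[k] @ r[k] for k in range(2)]                       # autonomous outputs
    r_new = [np.tanh(W[k] @ r[k] + Win[k] @ y[k]) for k in range(2)]
    r[0] = (1 - eps) * r_new[0] + eps * r_new[1]                 # symmetric linear coupling
    r[1] = (1 - eps) * r_new[1] + eps * r_new[0]

print(np.linalg.norm(Wout[0] @ r[0] - Wout[1] @ r[1]))           # synchronization error
```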
Open Access Editorial
Generative AI and Large Language Models
by
Fabrizio Marozzo and Riccardo Cantini
Big Data Cogn. Comput. 2026, 10(4), 127; https://doi.org/10.3390/bdcc10040127 - 21 Apr 2026
Abstract
In recent years, generative artificial intelligence and, in particular, large language models (LLMs) have rapidly transformed the landscape of data analysis, knowledge extraction, content generation, and intelligent decision support [...]
(This article belongs to the Special Issue Generative AI and Large Language Models)
Open Access Article
Edge Node Deployment for Turbidity Estimation in Farm Ponds
by
Martin Moreno, Iván Trejo-Zúñiga, Víctor Alejandro González-Huitrón, René Francisco Santana-Cruz, Raúl García García and Gabriela Pineda Chacón
Big Data Cogn. Comput. 2026, 10(4), 126; https://doi.org/10.3390/bdcc10040126 - 18 Apr 2026
Abstract
Image-based AI offers a low-cost alternative to traditional turbidity sensors in farm ponds, yet the prevailing shift toward Vision Transformers (ViTs) critically overlooks two field realities: the chronic scarcity of annotated data (Small Data) and the strict computational limits of edge hardware. This study presents a frugal computer vision framework that challenges the need for complex architectures in environmental screening. By systematically benchmarking six deep learning models across a calibrated high-turbidity dataset (200–800 NTU, 700 images) under standardized capture conditions, we demonstrate that traditional Convolutional Neural Networks (CNNs) possess a crucial inductive bias for this task. Specifically, ResNet-50 significantly outperformed modern ViTs in both accuracy (96.3% vs. 80.0%) and data efficiency, effectively capturing spatial scattering patterns without the massive data requirements that hindered transformer convergence. Deployed on a resource-constrained Raspberry Pi 4, the CNN-based system achieved an inference latency of 46 ms, demonstrated in an initial hardware-in-the-loop field proof-of-concept (82.4% agreement under baseline, calm-weather conditions). This edge-native approach not only provides actionable spatial turbidity maps to guide on-farm filtration and livestock management decisions but also establishes a critical architectural baseline: under controlled capture protocols, mature CNNs consistently outperform ViTs, establishing them as the optimal architecture for frugal, small-data agricultural Internet of Things (IoT) deployments.
Open Access Article
LST-AGCN: A Novel Unified Lightweight Attention Framework for Efficient Skeleton-Based Action Recognition
by
Khadija Lasri, Khalid El Fazazy, Adnane Mohamed Mahraz, Hamid Tairi and Jamal Riffi
Big Data Cogn. Comput. 2026, 10(4), 125; https://doi.org/10.3390/bdcc10040125 - 18 Apr 2026
Abstract
While Graph Convolutional Networks (GCNs) have revolutionized skeleton-based action recognition, existing methods face a critical efficiency–accuracy dilemma: state-of-the-art approaches achieve high performance through computationally expensive multi-stream fusion (joint, bone, joint motion, and bone motion) and deep architectures, limiting real-world deployment on resource-constrained devices. We propose LST-AGCN (Lightweight Spatial–Temporal Attention Graph Convolutional Network), introducing three technical contributions that address this challenge: (1) Unified Attention Module (UAM)—a framework that integrates channel, spatial, and temporal attention through a single compact operation, significantly reducing attention parameters compared to separate attention mechanisms; (2) Depthwise Separable Attention Mechanism (DSAM)—a factorization using depthwise separable convolutions that reduces the complexity of attention operations; and (3) Efficient Topology-Aware Fusion (ETAF)—an adaptive Joint-wise Attention strategy that captures fine-grained spatial relationships without quadratic complexity growth. Extensive experiments on the NTU RGB+D 60 and NTU RGB+D 120 datasets demonstrate that LST-AGCN achieves strong performance using only the joint modality (86.14%/94.0% and 79.5%/82.0% Top-1 accuracy with 99.0% Top-5 on cross-view) while requiring 14.11 M parameters and 19.02 GFLOPs, delivering efficient inference suitable for edge deployment.
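For context on the DSAM factorization, the sketch below compares parameter counts of a standard temporal convolution and its depthwise-separable counterpart in PyTorch; channel count and kernel size are illustrative, and the network's actual attention and graph layers are not reproduced.

```python
# Sketch of the depthwise-separable factorization (illustrative sizes only).
import torch.nn as nn

c_in, c_out, k = 64, 64, 9

standard = nn.Conv2d(c_in, c_out, kernel_size=(k, 1), padding=(k // 2, 0))
separable = nn.Sequential(
    nn.Conv2d(c_in, c_in, kernel_size=(k, 1), padding=(k // 2, 0), groups=c_in),  # depthwise
    nn.Conv2d(c_in, c_out, kernel_size=1),                                        # pointwise
)

count = lambda m: sum(p.numel() for p in m.parameters())
print(count(standard), count(separable))   # far fewer parameters after factorization
```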
(This article belongs to the Special Issue Advances in Artificial Intelligence for Computer Vision, Augmented Reality Virtual Reality and Metaverse)
Open Access Article
Understanding the Global Trends of 2025 Through the Defly Compass Methodology
by
Mabel López Bordao, Antonia Ferrer Sapena, Carlos A. Reyes Pérez and Enrique A. Sánchez Pérez
Big Data Cogn. Comput. 2026, 10(4), 124; https://doi.org/10.3390/bdcc10040124 - 17 Apr 2026
Abstract
This study aims to identify and synthesize the major global trends that shaped 2025 by applying the DeflyCompass methodology to a curated corpus of strategic foresight reports. The study synthesizes insights from 23 strategic reports published by leading international organizations, including the World Economic Forum, Accenture, Euromonitor, and major technology firms. Methodologically, DeflyCompass operationalizes a structured hybrid human–AI pipeline comprising the deployment of multi-agent AI systems, automated knowledge graph construction, semantic clustering, and hybrid human–AI validation processes, reducing an initial set of 816 preliminary signals to a validated catalog of 50 high-priority trends across six PESTEL domains: Political, Economic, Social, Technological, Environmental, and Legal/Governance. Key findings indicate that artificial intelligence functions as a systemic enabling technology across all domains, climate and sustainability imperatives permeate multiple domains, geopolitical fragmentation introduces systemic tension, and trust deficits emerge as a critical vulnerability. The study contributes a replicable and scalable framework for global-level strategic foresight that operationalizes human–AI integration within a rigorous expert-driven validation process, complementing existing hybrid analytical approaches in the literature. Implications extend to decision-making in technology governance, sustainability strategy, social adaptation, and scenario planning, highlighting the necessity of integrating AI augmentation with human expertise for effective future-oriented planning.
Topics
Topic in Actuators, Algorithms, BDCC, Future Internet, JMMP, Machines, Robotics, Systems
Smart Product Design and Manufacturing on Industrial Internet
Topic Editors: Pingyu Jiang, Jihong Liu, Ying Liu, Jihong Yan
Deadline: 30 June 2026
Topic in Sensors, Electronics, Technologies, AI, Entropy, Quantum Reports, BDCC
Responsible Classic/Quantum AI Technologies for Industrial Applications
Topic Editors: Youyang Qu, Khandakar Ahmed, Zhiyi Tian
Deadline: 31 July 2026
Topic in AI, BDCC, Future Internet, Information, Sustainability
Big Data and Artificial Intelligence, 3rd Edition
Topic Editors: Miltiadis D. Lytras, Andreea Claudia Serban
Deadline: 30 August 2026
Topic in Computers, Electronics, Future Internet, IoT, Network, Sensors, JSAN, Technologies, BDCC
Challenges and Future Trends of Wireless Networks
Topic Editors: Stefano Scanzio, Ramez Daoud, Jetmir Haxhibeqiri, Pedro Santos
Deadline: 30 September 2026
Special Issues
Special Issue in BDCC
Machine Learning Applications in Natural Language Processing
Guest Editors: Ying Weng, Kecheng Liu, Chao Li
Deadline: 20 May 2026
Special Issue in BDCC
Application of Digital Technology in Financial Development
Guest Editors: Wei Li, Michael C. S. Wong
Deadline: 30 May 2026
Special Issue in BDCC
Human-Centered and Sustainable Artificial Intelligence: Emerging Perspectives in HCI
Guest Editors: Luciano Alessandro Ipsaro Palesi, Damiano Perri, Kouzeleas Stelios
Deadline: 31 May 2026
Special Issue in BDCC
Internet Intelligence for Cybersecurity
Guest Editor: Hui Tian
Deadline: 12 June 2026