Cognitive Networks and Text Analysis Identify Anxiety as a Key Dimension of Distress in Genuine Suicide Notes
Image First or Text First? Optimising the Sequencing of Modalities in Large Language Model Prompting and Reasoning Tasks
Enhancing Recommendation Systems with Real-Time Adaptive Learning and Multi-Domain Knowledge Graphs
State of the Art and Future Directions of Small Language Models: A Systematic Review
CNN-Based Framework for Classifying COVID-19, Pneumonia, and Normal Chest X-Rays

Journal Description

Big Data and Cognitive Computing

Big Data and Cognitive Computing is an international, peer-reviewed, open access journal on big data and cognitive computing published monthly online by MDPI.

Open Access: free for readers, with article processing charges (APC) paid by authors or their institutions.
High Visibility: indexed within Scopus, ESCI (Web of Science), dblp, Inspec, Ei Compendex, and other databases.
Journal Rank: JCR - Q1 (Computer Science, Theory and Methods) / CiteScore - Q1 (Computer Science Applications)
Rapid Publication: manuscripts are peer-reviewed and a first decision is provided to authors approximately 24.5 days after submission; acceptance to publication is undertaken in 4.6 days (median values for papers published in this journal in the first half of 2025).
Recognition of Reviewers: reviewers who provide timely, thorough peer-review reports receive vouchers entitling them to a discount on the APC of their next publication in any MDPI journal, in appreciation of the work done.

Impact Factor: 4.4 (2024); 5-Year Impact Factor: 4.2 (2024)

ISSN: 2504-2289

Latest Articles

26 pages, 3558 KB

Open Access Article

Application of Inverse Optimization Algorithms in Neural Network Models for Short-Term Stock Price Forecasting

by Ekaterina Gribanova, Roman Gerasimov and Elena Viktorenko

Big Data Cogn. Comput. 2025, 9(9), 235; https://doi.org/10.3390/bdcc9090235 - 9 Sep 2025

This paper introduces novel inverse optimization algorithms (RC and DC) for neural network training in stock price forecasting in an attempt to overcome the traditional gradient descent limitation of local minima convergence. The key novelty is a stochastic algorithm for inverse problems adapted to neural network training, where target function values decrease iteratively through selective weight modification. Experimental analysis used closing price data from 40 Russian companies, comparing traditional activation functions (linear, sigmoid, tanh) with specialized functions (sincos, cloglogm, mish) across perceptrons and single-hidden-layer networks. Key findings show the superiority of the DC method for single-layer networks, while RC proves most effective for hidden-layer networks. The linear activation function with the RC algorithm delivered optimal results in most experiments, challenging conventional nonlinear activation preferences. The optimal architecture, namely, a single hidden layer with two neurons, achieved the best prediction accuracy in 70% of cases. The research confirms that inverse optimization algorithms can provide higher training efficiency than classical gradient methods, offering practical improvements for financial forecasting. Full article

18 pages, 1041 KB

Open Access Article

Hierarchical Discourse-Semantic Modeling for Zero Pronoun Resolution in Chinese

by Tingxin Wei, Jiabin Li, Xiaoling Ye and Weiguang Qu

Big Data Cogn. Comput. 2025, 9(9), 234; https://doi.org/10.3390/bdcc9090234 - 9 Sep 2025

Understanding discourse context is fundamental to human language comprehension. Despite the remarkable progress achieved by Large Language Models, they still struggle with discourse-level anaphora resolution, particularly in Chinese. One major challenge is zero anaphora, a prevalent linguistic phenomenon in which referential elements are omitted, increasing complexity and ambiguity for computational models. To address this issue, we introduce CDAMR (Chinese Discourse Abstract Meaning Representation), a novel annotated corpus that systematically labels zero pronouns across diverse syntactic positions along with their discourse-level coreference chains. In addition, we present a hierarchical discourse-semantic enhanced model that separately encodes local discourse semantics and global discourse semantics, and models their interactions via structured multi-attention mechanisms. Experiments on both CDAMR and OntoNotes demonstrate the approach’s cross-corpus generalizability and effectiveness, achieving F1 scores of 59.86% and 60.54%, respectively. Ablation studies further confirm that discourse-level semantics significantly enhance zero pronoun resolution. These findings highlight the value of cognitively inspired discourse modeling and the importance of comprehensive discourse annotations for languages with limited explicit syntactic cues such as Chinese. Full article

(This article belongs to the Special Issue Natural Language Processing Applications in Big Data)

14 pages, 2024 KB

Open Access Article

Field Robotics Education Through Educational Escape Rooms—A Design Study

by Robert Ross and Matthew Felicetti

Big Data Cogn. Comput. 2025, 9(9), 233; https://doi.org/10.3390/bdcc9090233 - 8 Sep 2025

One challenge faced by many educators is strongly engaging students to improve their intrinsic motivation in learning. This paper describes the design and beta testing of two educational escape rooms targeted towards teaching students concepts related to field robotics—an area in which educational escape rooms have yet to be used. These table-top activities are designed to strongly engage students with robotics-centric puzzles, a fun narrative, and collaborative problem-solving, with validation provided by an electronic decoder box. The sets of puzzles were beta-tested by teams of academics with a robotics background and by undergraduate students. The results indicate that participants had a high level of enjoyment and intrinsic motivation to partake in the activities, although the difficulty and in-game dynamics of some of the tasks will need to be modified for widespread deployment in the classroom. Full article

(This article belongs to the Special Issue Field Robotics and Artificial Intelligence (AI))

25 pages, 4064 KB

Open Access Article

Variable Working Condition Fault Diagnosis Method for Rotating Machinery Based on Dual-Task Cognitive Cost Sensitivity

by Qianwen Jiang, Jinghua Xu, Shuyou Zhang, Xiaojian Liu and Kang Wang

Big Data Cogn. Comput. 2025, 9(9), 232; https://doi.org/10.3390/bdcc9090232 - 8 Sep 2025

Accurate fault diagnosis of rotating machinery in complex environments and under changing operating conditions remains a key challenge in industrial systems. In this paper, we propose a novel fault diagnosis algorithm named dual-task cognitive cost sensitivity (DCCS), designed for high-accuracy diagnosis of rotary bearing faults and small-sample scenarios under variable working conditions. The method integrates four modules: CNN for local feature extraction, LSTM for temporal features, Softmax for classification, and a DCCS-based hyperparameter optimization module. A dual-task learning objective is formulated by combining losses from both full-condition and few-shot variable-condition datasets, with adaptive cost-sensitive weighting to balance learning focus. The integration of cognitive cost sensitivity with transfer learning enhances the model’s adaptability, allowing it to flexibly generalize across different operating conditions. Experiments on the CWRU dataset demonstrate that the method achieves 99.33% accuracy within fewer training epochs and shows strong robustness to noise. Compared with mainstream optimization methods, DCCS offers higher efficiency with reduced computation time. In cross-condition diagnosis, it improves accuracy by up to 10.94 percentage points over the original Alpha Evolution algorithm, effectively addressing the challenge of limited samples in varying environments. Full article

(This article belongs to the Special Issue Smart Manufacturing in the AI Era)

28 pages, 21851 KB

Open Access Article

A Critical Assessment of Modern Generative Models’ Ability to Replicate Artistic Styles

by Andrea Asperti, Franky George, Tiberio Marras, Razvan Ciprian Stricescu and Fabio Zanotti

Big Data Cogn. Comput. 2025, 9(9), 231; https://doi.org/10.3390/bdcc9090231 - 6 Sep 2025

In recent years, advancements in generative artificial intelligence have led to the development of sophisticated tools capable of mimicking diverse artistic styles, opening new possibilities for digital creativity and artistic expression. This paper presents a critical assessment of the style replication capabilities of contemporary generative models, evaluating their strengths and limitations across multiple dimensions. We examine how effectively these models reproduce traditional artistic styles while maintaining structural integrity and compositional balance in the generated images. The analysis is based on a new large dataset of AI-generated works imitating artistic styles of the past, holding potential for a wide range of applications: the “AI-Pastiche” dataset. This study is supported by extensive user surveys, collecting diverse opinions on the dataset and investigating both technical and aesthetic challenges, including the ability to generate outputs that are realistic and visually convincing, the versatility of models in handling a wide range of artistic styles, and the extent to which they adhere to the content and stylistic specifications outlined in prompts, preserving cohesion and integrity in generated images. This paper aims to provide a comprehensive overview of the current state of generative tools in style replication, offering insights into their technical and artistic limitations, potential advancements in model design and training methodologies, and emerging opportunities for enhancing digital artistry, human–AI collaboration, and the broader creative landscape. Full article

30 pages, 3813 KB

Open Access Article

Analysis of the Effect of Attention Mechanism on the Accuracy of Deep Learning Models for Fake News Detection

by Kristína Machová, Marián Mach and Viliam Balara

Big Data Cogn. Comput. 2025, 9(9), 230; https://doi.org/10.3390/bdcc9090230 - 4 Sep 2025

The main objective of the paper is to verify whether the integration of attention mechanisms can improve the effectiveness of online fake news detection models. The models were trained using selected deep learning methods suitable for text processing, such as CNN (Convolutional Neural Network), LSTM (Long Short-Term Memory), BiLSTM (Bidirectional LSTM), GRU (Gated Recurrent Unit), and transformer. The novelty of the paper lies in the addition of attention mechanisms to each of those models and the comparison of their performance across two datasets, LIAR and WELFake. Afterwards, the resulting changes in detection performance were analyzed. The paper also describes the issue of toxicity in the online space, how it affects society, its sources, and methods to tackle it. Furthermore, the article describes the individual deep learning methods and the principles of the attention mechanism. Finally, it was shown that the attention mechanism can increase the accuracy of basic models for fake news detection; however, the differences are insignificant in the case of the LIAR dataset. The reason for this can be found in the dataset itself. In contrast, adding the attention mechanism to models on the WELFake dataset yielded a significant improvement, with an average accuracy of 0.967 and an average F1-rate of 0.968. These results were better than those of experiments with the simple transformer. Comparison of the results showed that it makes sense to enrich the basic neural network models with attention mechanisms, especially the multi-head attention mechanism. The key finding is that attention mechanisms can enhance fake news detection performance when applied to high-quality, well-balanced datasets. Full article
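
For readers unfamiliar with the attention mechanism referred to in this abstract, the following is a minimal sketch of standard scaled dot-product self-attention in NumPy. It is an illustration only and not the authors' implementation: the function name, the toy token matrix, and the dimensions are assumptions, and the abstract does not specify how attention is wired into the CNN, LSTM, BiLSTM, or GRU models.

```python
import numpy as np

def scaled_dot_product_attention(queries, keys, values):
    """Return attention outputs and weights for 2-D query/key/value matrices."""
    d_k = queries.shape[-1]
    # Similarity of every query with every key, scaled by sqrt(d_k).
    scores = queries @ keys.T / np.sqrt(d_k)
    # Numerically stable row-wise softmax.
    scores = scores - scores.max(axis=-1, keepdims=True)
    weights = np.exp(scores)
    weights = weights / weights.sum(axis=-1, keepdims=True)
    # Each output row is a weighted average of the value rows.
    return weights @ values, weights

# Hypothetical input: 5 token embeddings of dimension 8 (self-attention).
rng = np.random.default_rng(0)
tokens = rng.normal(size=(5, 8))
context, attn = scaled_dot_product_attention(tokens, tokens, tokens)
```

Each row of `attn` sums to 1, so every output vector in `context` is a convex combination of the input token vectors. Multi-head attention repeats this computation over several learned projections of the same input.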

24 pages, 2532 KB

Open Access Article

Improved Particle Swarm Optimization Based on Fuzzy Controller Fusion of Multiple Strategies for Multi-Robot Path Planning

by Jialing Hu, Yanqi Zheng, Siwei Wang and Changjun Zhou

Big Data Cogn. Comput. 2025, 9(9), 229; https://doi.org/10.3390/bdcc9090229 - 2 Sep 2025

Robots play a crucial role in experimental smart cities and are ubiquitous in daily life, especially in complex environments where multiple robots are often needed to solve problems collaboratively. Researchers have found that swarm intelligence optimization algorithms perform well in planning robot paths, but traditional swarm intelligence algorithms are not tailored to robot path planning in difficult scenarios. Therefore, this paper introduces a fuzzy controller, a mutation factor, exponential noise, and other strategies on top of particle swarm optimization to address this problem. Based on the moving speed of different particles at different stages of the algorithm, a fuzzy controller determines the individual learning factor and social learning factor of each particle. Using the leader particle and a random particle, a new dynamically balanced mutation factor is designed, in which counters of consecutive non-updates and consecutive updates of the fitness value during iteration control the proportion of elite individuals and random individuals. Finally, exponential noise is used to update the population matrix every 50 iterations, balancing the local search ability and global exploration ability of the algorithm. To test the proposed algorithm, it is simulated on simple scenarios, complex scenarios, and random maps containing different numbers of static and dynamic obstacles, and compared with eight other algorithms. The proposed algorithm yields a smaller average path deviation error, a shorter average untraveled distance to the target, fewer robot movement steps, and shorter paths than the other eight algorithms. This advantage in multi-robot cooperative path planning makes it practical in many fields, such as logistics and distribution and industrial automation. Full article
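
As background for the individual and social learning factors mentioned in this abstract, the sketch below shows plain particle swarm optimization with fixed coefficients. It is not the paper's algorithm: the fuzzy controller, mutation factor, and exponential noise are omitted, and the function name, parameter values, search bounds, and test objective are all assumptions chosen for illustration.

```python
import numpy as np

def pso_minimize(objective, dim, n_particles=20, n_iters=100,
                 inertia=0.7, c1=1.5, c2=1.5, seed=0):
    """Basic PSO. c1 is the individual learning factor, c2 the social one."""
    rng = np.random.default_rng(seed)
    pos = rng.uniform(-5.0, 5.0, size=(n_particles, dim))
    vel = np.zeros_like(pos)
    pbest = pos.copy()                       # best position seen by each particle
    pbest_val = np.array([objective(p) for p in pos])
    gbest = pbest[pbest_val.argmin()].copy() # best position seen by the swarm
    for _ in range(n_iters):
        r1 = rng.random(pos.shape)
        r2 = rng.random(pos.shape)
        # Velocity mixes momentum, pull toward own best, pull toward swarm best.
        vel = inertia * vel + c1 * r1 * (pbest - pos) + c2 * r2 * (gbest - pos)
        pos = pos + vel
        vals = np.array([objective(p) for p in pos])
        improved = vals < pbest_val
        pbest[improved] = pos[improved]
        pbest_val[improved] = vals[improved]
        gbest = pbest[pbest_val.argmin()].copy()
    return gbest, float(pbest_val.min())

# Hypothetical objective: the 2-D sphere function, minimized at the origin.
best, best_val = pso_minimize(lambda x: float(np.sum(x ** 2)), dim=2)
```

In this fixed-coefficient form, `c1` and `c2` stay constant for every particle. The abstract describes replacing those constants with values produced by a fuzzy controller that reacts to particle speed, which is where the proposed method departs from this baseline.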

(This article belongs to the Special Issue Evolutionary Computation and Artificial Intelligence: Building a Sustainable Future for Smart Cities)

40 pages, 5180 KB

Open Access Article

E-SATNet: Evaluating Student Satisfaction with Lecturer Responses in Asynchronous Online Discussions Using Sentiment and Semantic Similarity Analysis

by Sulis Sandiwarno, Dana Indra Sensuse, Harry Budi Santoso, Deden Sumirat Hidayat, Ally S. Nyamawe and Abdallah Yousif

Big Data Cogn. Comput. 2025, 9(9), 228; https://doi.org/10.3390/bdcc9090228 - 2 Sep 2025

Assessing e-learning students’ satisfaction with lecturers’ interactions in asynchronous forums is essential for enhancing teaching and learning processes. The discussion forum allows students to share comments and ideas with peers or lecturers, stimulating diverse perspectives and improving learning efficacy. However, lecturers’ responses are often similar or redundant to previous students’ comments, limiting feedback depth and potentially reducing students’ perceived value of the interaction. Machine learning classifiers have been widely used to assess satisfaction based on sentiment or semantic similarity. However, integrating sentiment and semantic similarity between students’ comments or opinions and lecturers’ responses in asynchronous online discussion forums has received limited attention and may be improved. Through this research, we propose a novel model called E-learning Satisfaction Assessment using Textual Neural Network (E-SATNet). The E-SATNet model has two main sub-networks. The first sub-network employs a Convolutional Neural Network (CNN) to extract sentiment-related features from students’ reactions to lecturers’ responses. The second sub-network utilizes a Bidirectional Long Short-Term Memory (BiLSTM) to extract semantic features from lecturers’ responses and compute their similarity with the overall discussion content. Evaluation results show that E-SATNet effectively assesses satisfaction, achieving an average F1-score of 88.12. Full article

(This article belongs to the Special Issue Natural Language Processing Applications in Big Data)

41 pages, 1205 KB

Open Access Article

A Novel Framework for Evaluating Polarization in Online Social Networks

by Christopher Buratti, Michele Marchetti, Federica Parlapiano, Domenico Ursino and Luca Virgili

Big Data Cogn. Comput. 2025, 9(9), 227; https://doi.org/10.3390/bdcc9090227 - 1 Sep 2025

In online communities, polarization refers to the phenomenon in which individuals become more divided and extreme in their opinions due to their exposure to specific content. In this paper, we present a network-based framework for evaluating polarization levels in Online Social Networks (OSNs). Starting from a dataset of comments, our framework creates a network of user interactions and leverages the Louvain algorithm, Rao’s Quadratic Entropy, and ego networks to assess the polarization level of communities and the most influential users. To test our framework, we leveraged a dataset of tweets about climate change. After performing Extraction, Transformation and Loading activities on the dataset, we evaluated its labels, identified communities, and analyzed their polarization level and that of the most influential users. We also analyzed the ego networks of believers and deniers and the aggressiveness of the corresponding tweets. Our analysis revealed the existence of polarized communities and homophily among the most influential users. It also showed that the type of communication used to disseminate information influences the polarization level of both communities and individual users. These results demonstrate our framework’s ability to support the polarization analysis in OSNs. Full article
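
Rao's Quadratic Entropy, named in this abstract, has a compact definition: Q = Σᵢ Σⱼ dᵢⱼ pᵢ pⱼ, the expected dissimilarity between two items drawn at random from a population. The sketch below computes it for a toy community. How the paper maps communities and stance labels onto proportions and distances is not stated in the abstract, so the two-stance setup and the 0/1 distance matrix are assumptions.

```python
import numpy as np

def rao_quadratic_entropy(proportions, distances):
    """Expected dissimilarity between two randomly drawn members."""
    p = np.asarray(proportions, dtype=float)
    d = np.asarray(distances, dtype=float)
    p = p / p.sum()           # normalize counts to proportions
    return float(p @ d @ p)   # sum_i sum_j d_ij * p_i * p_j

# Hypothetical community with two stances at distance 1 from each other.
stance_distance = np.array([[0.0, 1.0],
                            [1.0, 0.0]])
mixed = rao_quadratic_entropy([50, 50], stance_distance)      # evenly split
one_sided = rao_quadratic_entropy([100, 0], stance_distance)  # single stance
```

Here `mixed` evaluates to 0.5 and `one_sided` to 0.0: a community holding a single stance has no internal diversity, while an evenly split one reaches the maximum for two categories at unit distance.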

(This article belongs to the Special Issue Advances in Complex Networks)

20 pages, 1357 KB

Open Access Article

FedPLDSE: Submodel Extraction for Federated Learning in Heterogeneous Smart City Devices

by Xiaochi Hou, Zhigang Wang, Xinhao Wang and Junfeng Zhao

Big Data Cogn. Comput. 2025, 9(9), 226; https://doi.org/10.3390/bdcc9090226 - 30 Aug 2025

Federated learning enables collaborative model training across distributed devices while preserving data privacy. However, in real-world environments such as smart cities, heterogeneous and resource-constrained edge devices often render existing methods impractical. Low-power sensors and cameras struggle to complete full-model training, while high-performance devices remain idly waiting for others. Knowledge distillation approaches rely on public datasets that are rarely available or poorly aligned with urban data, which limits their effectiveness in deployment. These limitations lead to inefficiencies, unstable convergence, and poor adaptability in diverse urban networks. Partial training alleviates some challenges by allowing clients to train submodels tailored to their capacity, but existing methods still incur high computational costs for identifying important parameters and suffer from uneven parameter updates, reducing model effectiveness. To address these challenges, we propose Parameter-Level Dynamic Submodel Extraction (PLDSE), a lightweight and adaptive framework for federated learning. PLDSE estimates parameter importance using gradient-based scores on a server-side validation set, reducing overhead while accurately identifying critical parameters. In addition, it integrates a rolling scheduling mechanism to rotate unselected parameters, ensuring full coverage and consistent model updates. Experiments on CIFAR-10, CIFAR-100, and Fashion-MNIST demonstrate superior accuracy and faster convergence, with PLDSE achieving 62.82% on CIFAR-100 under low heterogeneity and 61.51% under high heterogeneity, outperforming prior methods. Full article
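
The two ideas this abstract combines, gradient-based importance scores and a rolling schedule over unselected parameters, can be sketched roughly as follows. This is a hypothetical illustration and not PLDSE itself: the function name, the `top_fraction` split, the use of absolute gradient values as scores, and the toy gradient vector are all assumptions.

```python
import numpy as np

def select_submodel(importance, capacity, round_idx, top_fraction=0.5):
    """Pick `capacity` parameter indices for one client in one round.

    Part of the budget always goes to the highest-scoring parameters;
    the rest rotates through the remaining ones so all get trained.
    """
    n = importance.size
    if capacity > n:
        raise ValueError("capacity cannot exceed the number of parameters")
    n_top = int(capacity * top_fraction)
    order = np.argsort(-importance)          # indices by descending importance
    top, rest = order[:n_top], order[n_top:]
    n_roll = capacity - n_top
    if rest.size == 0 or n_roll == 0:
        return np.sort(top)
    # Rolling window over the unselected parameters, advancing each round.
    start = (round_idx * n_roll) % rest.size
    rolled = np.take(rest, np.arange(start, start + n_roll), mode="wrap")
    return np.sort(np.concatenate([top, rolled]))

# Hypothetical per-parameter gradients measured on a server-side validation set.
grads = np.array([0.9, 0.1, 0.4, 0.05, 0.7, 0.2, 0.3, 0.6, 0.15, 0.25])
importance = np.abs(grads)
rounds = [select_submodel(importance, capacity=4, round_idx=r) for r in range(4)]
```

With ten parameters and a capacity of four, indices 0 and 4 are selected in every round, and the remaining eight are covered in pairs over four rounds. That is the "full coverage" property the abstract attributes to the rolling mechanism.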

(This article belongs to the Special Issue Evolutionary Computation and Artificial Intelligence: Building a Sustainable Future for Smart Cities)

19 pages, 1788 KB

Open Access Article

Can Telematics Improve Driving Style? The Use of Behavioral Data in Motor Insurance

by Alberto Cevolini, Elena Morotti, Elena Esposito, Lorenzo Romanelli, Riccardo Tisseur and Cristiano Misani

Big Data Cogn. Comput. 2025, 9(9), 225; https://doi.org/10.3390/bdcc9090225 - 29 Aug 2025

Motor insurance can use telematics data not only to understand individual driving style but also to implement innovative coaching strategies that feed back to the drivers, through an app, the aggregated information extracted from the data. The purpose is to encourage an improvement in their driving style. A precondition for this improvement is that drivers are digitally engaged, that is, they interact with the app. This paper proposes a narrow understanding of the term engagement, referring to users’ interactions with the app. This interaction is also a behavior producing specific data that can be tracked and used by insurance companies. Based on the empirical investigation of the dataset of a company selling a telematics motor insurance policy, our research investigates if there is a correlation between engagement with the app and improvement of driving style. The analysis distinguishes different groups of users with different driving abilities, and takes into account time differences. Our findings contribute to clarifying the methodological challenges that must be addressed when exploring engagement and coaching effectiveness in proactive insurance policies. We conclude by discussing the possibility and difficulties of tracking and using second-order behavioral data related to policyholder engagement with the app. Full article

25 pages, 4657 KB

Open Access Article

Identifying Methodological Language in Psychology Abstracts: A Machine Learning Approach Using NLP and Embedding-Based Clustering

by Konstantinos G. Stathakis, George Papageorgiou and Christos Tjortjis

Big Data Cogn. Comput. 2025, 9(9), 224; https://doi.org/10.3390/bdcc9090224 - 29 Aug 2025

Research articles are valuable resources for Information Retrieval and Natural Language Processing (NLP) tasks, offering opportunities to analyze key components of scholarly content. This study investigates the presence of methodological terminology in psychology research over the past 30 years (1995–2024) by applying a novel NLP and Machine Learning pipeline to a large corpus of 85,452 abstracts, as well as the extent to which this terminology forms distinct thematic groupings. Combining glossary-based extraction, contextualized language model embeddings, and dual-mode clustering, this study offers a scalable framework for the exploration of methodological transparency in scientific text via deep semantic structures. A curated glossary of 365 method-related keywords served as a gold-standard reference for term identification, using direct and fuzzy string matching. Retrieved terms were encoded with SciBERT, averaging embeddings across contextual occurrences to produce unified vectors. These vectors were clustered using unsupervised and weighted unsupervised approaches, yielding six and ten clusters, respectively. Cluster composition was analyzed using weighted statistical measures to assess term importance within and across groups. A total of 78.16% of the examined abstracts contained glossary terms, with an average of 1.8 terms per abstract, highlighting an increasing presence of methodological terminology in psychology and reflecting a shift toward greater transparency in research reporting. This work goes beyond the use of static vectors by incorporating contextual understanding in the examination of methodological terminology, while offering a scalable and generalizable approach to semantic analysis in scientific texts, with implications for meta-research, domain-specific lexicon development, and automated scientific knowledge discovery. Full article
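
The "direct and fuzzy string matching" step described in this abstract can be illustrated with Python's standard `difflib` module. This is a minimal sketch under assumptions: the four-term glossary, the tokenizer, and the 0.85 similarity cutoff are hypothetical stand-ins, and the abstract does not say which matching library or threshold the authors used.

```python
import difflib
import re

# Hypothetical stand-in for the paper's curated 365-term glossary.
GLOSSARY = ["regression", "meta-analysis", "questionnaire", "longitudinal"]

def find_glossary_terms(abstract, glossary=GLOSSARY, cutoff=0.85):
    """Return glossary terms found in the text by exact or fuzzy match."""
    tokens = re.findall(r"[a-z][a-z\-]+", abstract.lower())
    found = set()
    for token in tokens:
        if token in glossary:               # direct match
            found.add(token)
            continue
        # Fuzzy match: closest glossary entry above the similarity cutoff.
        close = difflib.get_close_matches(token, glossary, n=1, cutoff=cutoff)
        if close:
            found.add(close[0])
    return sorted(found)

terms = find_glossary_terms(
    "A longitudinal study using questionaire data and regresion."
)
```

The misspellings "questionaire" and "regresion" are still mapped to their glossary entries, so `terms` is `['longitudinal', 'questionnaire', 'regression']`. A lower cutoff catches more variants at the cost of more false matches.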

(This article belongs to the Special Issue Machine Learning Applications in Natural Language Processing)

33 pages, 4621 KB

Open Access Article

Data Obfuscation for Privacy-Preserving Machine Learning Using Quantum Symmetry Properties

by Sebastian Raubitzek, Sebastian Schrittwieser, Alexander Schatten and Kevin Mallinger

Big Data Cogn. Comput. 2025, 9(9), 223; https://doi.org/10.3390/bdcc9090223 - 29 Aug 2025

This study introduces a data obfuscation technique that leverages the exponential map of Lie-group generators. Originating from quantum machine learning frameworks, the method injects controlled noise into these generators, deliberately breaking symmetry and obscuring the source data while retaining predictive utility. Experiments on open medical datasets show that classifiers trained on obfuscated features match or slightly exceed the baseline accuracy obtained on raw data. This work demonstrates how Lie-group theory can advance privacy in sensitive domains by providing simultaneous data obfuscation and augmentation. Full article
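
To make the idea of "injecting noise into Lie-group generators" concrete, the sketch below applies the exponential map of a perturbed so(3) generator to three-dimensional feature vectors. The choice of SO(3), the Gaussian noise model, the Taylor-series matrix exponential, and all names are assumptions for illustration; the abstract does not state which groups, generators, or noise distributions the authors use.

```python
import numpy as np

# Basis of so(3): the generators of 3-D rotations (skew-symmetric matrices).
GENERATORS = np.array([
    [[0, 0, 0], [0, 0, -1], [0, 1, 0]],
    [[0, 0, 1], [0, 0, 0], [-1, 0, 0]],
    [[0, -1, 0], [1, 0, 0], [0, 0, 0]],
], dtype=float)

def matrix_exp(a, terms=30):
    """Truncated Taylor series for exp(a); adequate for small matrices."""
    result = np.eye(a.shape[0])
    term = np.eye(a.shape[0])
    for k in range(1, terms):
        term = term @ a / k
        result = result + term
    return result

def obfuscate(features, angles, noise_scale, rng):
    """Transform feature rows by exp(generator + noise)."""
    generator = np.tensordot(angles, GENERATORS, axes=1)
    # Unstructured noise pushes the generator out of so(3), breaking symmetry.
    noise = rng.normal(scale=noise_scale, size=generator.shape)
    transform = matrix_exp(generator + noise)
    return features @ transform.T

rng = np.random.default_rng(0)
x = rng.normal(size=(4, 3))                  # hypothetical 3-feature records
angles = np.array([0.3, -0.2, 0.5])
clean = obfuscate(x, angles, noise_scale=0.0, rng=rng)
noisy = obfuscate(x, angles, noise_scale=0.05, rng=rng)
```

With `noise_scale=0.0` the transform is a pure rotation and preserves the length of every feature vector. With nonzero noise the map is no longer orthogonal, which corresponds to the deliberate symmetry breaking the abstract describes.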

18 pages, 526 KB

Open Access Article

DPBD: Disentangling Preferences via Borrowing Duration for Book Recommendation

by Zhifang Liao, Liping Chen, Yuelan Qi and Fei Li

Big Data Cogn. Comput. 2025, 9(9), 222; https://doi.org/10.3390/bdcc9090222 - 28 Aug 2025

Traditional book recommendation methods predominantly rely on collaborative filtering and context-based approaches. However, existing methods fail to account for the order of users’ book borrowings and the duration they hold them, both of which are crucial indicators reflecting users’ book preferences. To address this challenge, we propose a book recommendation framework called DPBD, which disentangles preferences based on borrowing duration, thereby explicitly modeling temporal patterns in library borrowing behaviors. The DPBD model adopts a dual-path neural architecture comprising the following: (1) The item-level path utilizes self-attention networks to encode historical borrowing sequences while incorporating borrowing duration as an adaptive weighting mechanism for attention score refinement. (2) The feature-level path employs gated fusion modules to effectively aggregate multi-source item attributes (e.g., category and title), followed by self-attention networks to model feature transition patterns. The framework subsequently combines both path representations through fully connected layers to generate user preference embeddings for next-book recommendation. Extensive experiments conducted on two real-world university library datasets demonstrate the superior performance of the proposed DPBD model compared with baseline methods. Specifically, the model achieved 13.67% and 15.75% on HR@1 and 15.75% and 12.90% on NDCG@1 across the two datasets. Full article
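
The HR@k and NDCG@k metrics reported in this abstract are commonly computed per user as shown below for next-item recommendation with a single held-out target. This follows the usual definitions and is only a sketch: the abstract does not state the authors' exact evaluation protocol, and the book identifiers are hypothetical.

```python
import math

def hit_rate_at_k(ranked_items, target, k):
    """1.0 if the held-out item appears in the top-k list, else 0.0."""
    return 1.0 if target in ranked_items[:k] else 0.0

def ndcg_at_k(ranked_items, target, k):
    """Discounted gain of the single relevant item; ideal DCG is 1."""
    for rank, item in enumerate(ranked_items[:k], start=1):
        if item == target:
            return 1.0 / math.log2(rank + 1)
    return 0.0

# Hypothetical ranking for one user whose next borrowed book is "b2".
ranking = ["b1", "b2", "b3", "b4"]
hr = hit_rate_at_k(ranking, "b2", k=3)
ndcg = ndcg_at_k(ranking, "b2", k=3)
```

Here the target sits at rank 2, so `hr` is 1.0 and `ndcg` is 1 / log2(3), roughly 0.63. Dataset-level scores are the averages of these per-user values.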

17 pages, 2183 KB

Open Access Article

Data-Driven Pseudo-Crack Cognition and Removal for Intelligent Pavement Inspection with Gradient Priority and Self-Attention

by Renping Xie, Lin Liu, Mengyao Chen, Chenxi Pang and Ming Tao

Big Data Cogn. Comput. 2025, 9(9), 221; https://doi.org/10.3390/bdcc9090221 - 27 Aug 2025

Road surface cracks are the most common and significant defects encountered in concrete pavement inspection. However, the presence of crack-like edges on objects such as water stains, fallen leaves, and ruts often results in the false detection of concrete pavement cracks. To better recognize pseudo-cracks, we first construct a novel dataset containing real pseudo-crack images for training and evaluation. To distinguish pseudo-cracks within images, a gradient prior is introduced to enhance the network’s perception of the detailed changes in crack edges, thereby improving its crack localization capability. Next, a self-attention mechanism is employed to focus on the extraction of global crack features, effectively mitigating interference from pseudo-crack features. Subsequently, deep global semantic features are fused with shallow detail features through dense connections, enriching feature extraction while circumventing the issue of edge gradient disappearance often encountered in deeper networks. Finally, the concatenation of deep global features with shallow detail features enhances the utilization of effective features, enabling robust pseudo-crack removal and preserving the continuity and integrity of the detected cracks. To validate the effectiveness of the proposed approach, we conduct comparative experiments with several crack detection methods across multiple datasets. The results demonstrate that our method achieves superior performance in both quantitative indicators and visual effects.

27 pages, 57533 KB

Open Access Article

Assessing the Influence of Feedback Strategies on Errors in Crowdsourced Annotation of Tumor Images

by Jose Alejandro Libreros, Edwin Gamboa, Erik Henke and Matthias Hirth

Big Data Cogn. Comput. 2025, 9(9), 220; https://doi.org/10.3390/bdcc9090220 - 26 Aug 2025

Crowdsourcing makes distributed human intelligence available at scale for tasks that involve human judgment, with use cases across many application areas. However, crowdworkers completing the tasks may have limited or no background knowledge about the tasks they solve, given the wide variety of tasks available. Therefore, the tasks, even at a micro scale, also need to include appropriate training so that crowdworkers can complete them successfully. However, training crowdworkers efficiently in a short time for complex tasks poses a challenge and remains an unresolved issue. This paper addresses this challenge by empirically comparing different training strategies for crowdworkers and evaluating their impact on the crowdworkers’ task results. We compare a basic training strategy, a strategy based on previous errors made by other crowdworkers, and the addition of instant feedback during training and task completion. Our results show that adding instant feedback during both the training phase and the task itself yields more attention from the workers in difficult tasks and hence reduces errors and improves the results. We conclude that more attention is retained when the content of instant feedback includes information about mistakes previously made by other crowdworkers.

(This article belongs to the Topic Applications of Image and Video Processing in Medical Imaging)

33 pages, 4547 KB

Open Access Systematic Review

A Systematic Literature Review of Artificial Intelligence in Prehospital Emergency Care

by Omar Elfahim, Kokou Laris Edjinedja, Johan Cossus, Mohamed Youssfi, Oussama Barakat and Thibaut Desmettre

Big Data Cogn. Comput. 2025, 9(9), 219; https://doi.org/10.3390/bdcc9090219 - 26 Aug 2025

Background: The emergency medical services (EMS) sector, as a complex system, presents substantial hurdles in providing excellent treatment while operating within limited resources, prompting greater adoption of artificial intelligence (AI) as a tool for improving operational efficiency. While AI models have proved beneficial in healthcare operations, there is limited explainability and interpretability, as well as a lack of data used in their application and technological advancement. Methods: The scoping review was conducted according to the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) guidelines for scoping reviews, using PubMed, IEEE Xplore, and Web of Science, with a procedure of double screening and extraction. The search included articles published from 2018 to the beginning of 2025. Studies were excluded if they did not explicitly identify an artificial intelligence (AI) component, lacked relevance to emergency department (ED) or prehospital contexts, failed to report measurable outcomes or evaluations, or did not exploit real-world data. We analyzed the data source used, clinical subclasses, AI domains, ML algorithms, their performance, as well as potential roles for large language models (LLMs) in future applications. Results: A comprehensive PRISMA-guided methodology was used to search academic databases, finding 1181 papers on prehospital emergency treatment from 2018 to 2025, with 65 articles identified after an extensive screening procedure. The results reveal a significant increase in AI publications. A notable technological advancement in the application of AI in EMS using different types of data was explored. Conclusions: These findings highlighted that AI and ML have emerged as revolutionary innovations with huge potential in the fields of healthcare and medicine. There are several promising AI interventions that can improve prehospital emergency care, particularly for out-of-hospital cardiac arrest and triage prioritization scenarios. Implications for EMS Practice: Integrating AI methods into prehospital care can optimize the use of available resources, as well as triage and dispatch efficiency. LLMs may have the potential to improve understanding and assist in decision-making under pressure in emergency situations by combining various forms of recorded data. However, there is a need to emphasize continued research and strong collaboration between AI experts and EMS physicians to ensure the safe, ethical, and effective integration of AI into EMS practice.

(This article belongs to the Topic AI for Natural Disasters Detection, Prediction and Modeling)

29 pages, 848 KB

Open Access Article

Applying Additional Auxiliary Context Using Large Language Model for Metaphor Detection

by Takuya Hayashi and Minoru Sasaki

Big Data Cogn. Comput. 2025, 9(9), 218; https://doi.org/10.3390/bdcc9090218 - 25 Aug 2025

Metaphor detection is challenging in natural language processing (NLP) because it requires recognizing nuanced semantic shifts beyond literal meaning, and conventional models often falter when contextual cues are limited. We propose a method to enhance metaphor detection by augmenting input sentences with auxiliary context generated by ChatGPT. In our approach, ChatGPT produces semantically relevant sentences that are inserted before, after, or on both sides of a target sentence, allowing us to analyze the impact of context position and length on classification. Experiments on three benchmark datasets (MOH-X, VUA_All, VUA_Verb) show that this context-enriched input consistently outperforms the no-context baseline across accuracy, precision, recall, and F1-score, with the MOH-X dataset achieving the largest F1 gain. These improvements are statistically significant based on two-tailed t-tests. Our findings demonstrate that generative models can effectively enrich context for metaphor understanding, highlighting context placement and quantity as critical factors. Finally, we outline future directions, including advanced prompt engineering, optimizing context lengths, and extending this approach to multilingual metaphor detection.
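The three context placements mentioned in the abstract (before, after, or on both sides of the target sentence) can be pictured with a minimal sketch. It assumes the auxiliary sentences have already been generated and simply joins them with the target; the function name, the even split for the "both" case, and the sample sentences are hypothetical and not taken from the paper.

```python
from typing import Sequence


def build_input(target: str, context: Sequence[str], position: str = "both") -> str:
    """Join a target sentence with auxiliary context sentences.

    position is "before", "after", or "both". For "both", the context is
    split as evenly as possible around the target (an assumed convention).
    """
    sentences = list(context)
    if position == "before":
        parts = sentences + [target]
    elif position == "after":
        parts = [target] + sentences
    elif position == "both":
        half = len(sentences) // 2
        parts = sentences[:half] + [target] + sentences[half:]
    else:
        raise ValueError(f"unknown position: {position!r}")
    return " ".join(parts)


# Hypothetical target sentence and pre-generated auxiliary context.
target = "He drowned in paperwork."
context = ["The deadline was close.", "His desk was covered in forms."]
for position in ("before", "after", "both"):
    print(position, "->", build_input(target, context, position))
```

The resulting string would then be passed to whichever classifier is being evaluated, and varying `position` and the number of context sentences gives the kind of placement and length comparison the abstract describes.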

25 pages, 4100 KB

Open Access Article

An Adaptive Unsupervised Learning Approach for Credit Card Fraud Detection

by John Adejoh, Nsikak Owoh, Moses Ashawa, Salaheddin Hosseinzadeh, Alireza Shahrabi and Salma Mohamed

Big Data Cogn. Comput. 2025, 9(9), 217; https://doi.org/10.3390/bdcc9090217 - 25 Aug 2025

Credit card fraud remains a major cause of financial loss around the world. Traditional fraud detection methods that rely on supervised learning often struggle because fraudulent transactions are rare compared to legitimate ones, leading to imbalanced datasets. Additionally, the models must be retrained frequently, as fraud patterns change over time and require new labeled data for retraining. To address these challenges, this paper proposes an ensemble unsupervised learning approach for credit card fraud detection that combines Autoencoders (AEs), Self-Organizing Maps (SOMs), and Restricted Boltzmann Machines (RBMs), integrated with an Adaptive Reconstruction Threshold (ART) mechanism. The ART dynamically adjusts anomaly detection thresholds by leveraging the clustering properties of SOMs, effectively overcoming the limitations of static threshold approaches in machine learning and deep learning models. The proposed models, AE-ASOMs (Autoencoder–Adaptive Self-Organizing Maps) and RBM-ASOMs (Restricted Boltzmann Machines–Adaptive Self-Organizing Maps), were evaluated on the Kaggle Credit Card Fraud Detection and IEEE-CIS datasets. Our AE-ASOM model achieved an accuracy of 0.980 and an F1-score of 0.967, while the RBM-ASOM model achieved an accuracy of 0.975 and an F1-score of 0.955. Compared to models such as One-Class SVM and Isolation Forest, our approach demonstrates higher detection accuracy and significantly reduces false positive rates. In addition to its performance, the model offers considerable computational efficiency with a training time of 200.52 s and memory usage of 3.02 megabytes.

(This article belongs to the Special Issue Transforming Cyber Security Provision Through Utilizing Artificial Intelligence)

22 pages, 23322 KB

Open Access Article

MS-PreTE: A Multi-Scale Pre-Training Encoder for Mobile Encrypted Traffic Classification

by Ziqi Wang, Yufan Qiu, Yaping Liu, Shuo Zhang and Xinyi Liu

Big Data Cogn. Comput. 2025, 9(8), 216; https://doi.org/10.3390/bdcc9080216 - 21 Aug 2025

Mobile traffic classification serves as a fundamental component in network security systems. In recent years, pre-training methods have significantly advanced this field. However, as mobile traffic is typically mixed with third-party services, the deep integration of such shared services results in highly similar TCP flow characteristics across different applications. This makes it challenging for existing traffic classification methods to effectively identify mobile traffic. To address the challenge, we propose MS-PreTE, a two-phase pre-training framework for mobile traffic classification. MS-PreTE introduces a novel multi-level representation model to preserve traffic information from diverse perspectives and hierarchical levels. Furthermore, MS-PreTE incorporates a focal-attention mechanism to enhance the model’s capability in discerning subtle differences among similar traffic flows. Evaluations demonstrate that MS-PreTE achieves state-of-the-art performance on three mobile application datasets, boosting the F1 score for Cross-platform (iOS) to 99.34% (up by 2.1%), Cross-platform (Android) to 98.61% (up by 1.6%), and NUDT-Mobile-Traffic to 87.70% (up by 2.47%). Moreover, MS-PreTE exhibits strong generalization capabilities across four real-world traffic datasets.

(This article belongs to the Special Issue Machine Learning Methodologies and Applications in Cybersecurity Data Analysis)

News

3 September 2025
Join Us at the MDPI at the University of Toronto Career Fair, 23 September 2025, Toronto, ON, Canada

1 September 2025
MDPI INSIGHTS: The CEO’s Letter #26 – CUJS, Head of Ethics, Open Peer Review, AIS 2025, Reviewer Recognition

11 August 2025
Meet Us at the 18th European Congress and Exhibition on Advanced Materials and Processes – FEMS EUROMAT 2025, 14–18 September 2025, Granada, Spain

Topics

Topic in IJERPH, JPM, Healthcare, BDCC, Applied Sciences, Sensors

eHealth and mHealth: Challenges and Prospects, 2nd Edition
Topic Editors: Antonis Billis, Manuel Dominguez-Morales, Anton Civit
Deadline: 31 October 2025

Topic in Actuators, Algorithms, BDCC, Future Internet, JMMP, Machines, Robotics, Systems

Smart Product Design and Manufacturing on Industrial Internet
Topic Editors: Pingyu Jiang, Jihong Liu, Ying Liu, Jihong Yan
Deadline: 31 December 2025

Topic in Computers, Information, AI, Electronics, Technologies, BDCC

Graph Neural Networks and Learning Systems
Topic Editors: Huijia Li, Jun Hu, Weichen Zhao, Jie Cao
Deadline: 31 January 2026

Topic in AI, BDCC, Fire, GeoHazards, Remote Sensing

AI for Natural Disasters Detection, Prediction and Modeling
Topic Editors: Moulay A. Akhloufi, Mozhdeh Shahbazi
Deadline: 31 March 2026

Special Issues

Special Issue in BDCC

Energy Conservation Towards a Low-Carbon and Sustainability Future
Guest Editors: Yongming Han, Xuan Hu
Deadline: 25 September 2025

Special Issue in BDCC

Application of Machine and Deep Learning in Cyber-Physical Systems (CPSs)
Guest Editors: Yongxin Liu, Jian Wang, Shuteng Niu, Thomas Yang, Dahai Liu, Prashant Shekhar, Hong Liu
Deadline: 30 September 2025

Special Issue in BDCC

Application of Artificial Intelligence in Traffic Management
Guest Editors: Weihao Ma, Dongfang Ma
Deadline: 30 September 2025

Special Issue in BDCC

Recent Advances in Machine Learning Methods for Imperfect Large-Scale Data
Guest Editors: Ximing Li, Bo Fu, Changchun Li
Deadline: 30 September 2025
