Big Data Cogn. Comput., Volume 8, Issue 1 (January 2024) – 10 articles

Cover Story: This paper analyzes the performance of knowledge-based and generative artificial intelligence (AI) educational chatbots concerning trust and Grice's Cooperative Principle, which incorporates various aspects to assess the quality of communication. The results indicate that both systems are capable of providing accurate answers. However, generative AI is more trusted than knowledge-based chatbots and offers greater flexibility and verbosity while facing challenges such as hallucinations and lack of guidance that are not present in knowledge-based systems. The paper also shows that adapting generative chatbots to domain-specific data can mitigate these problems and improve verbosity, flexibility, and accuracy.
  • Issues are regarded as officially published after their release is announced to the table of contents alert mailing list.
  • You may sign up for e-mail alerts to receive table of contents of newly released issues.
  • PDF is the official format for papers published in both HTML and PDF forms. To view a paper in PDF format, click on the "PDF Full-text" link and open it with the free Adobe Reader.
3 pages, 150 KiB  
Editorial
Quality and Security of Critical Infrastructure Systems
by Ivan Izonin, Tetiana Hovorushchenko and Shishir Kumar Shandilya
Big Data Cogn. Comput. 2024, 8(1), 10; https://doi.org/10.3390/bdcc8010010 - 22 Jan 2024
Viewed by 1344
Abstract
The amount of information is constantly growing, and thus, the issue of information security is becoming more acute [...]
(This article belongs to the Special Issue Quality and Security of Critical Infrastructure Systems)
17 pages, 10960 KiB  
Article
Deep Learning and YOLOv8 Utilized in an Accurate Face Mask Detection System
by Christine Dewi, Danny Manongga, Hendry, Evangs Mailoa and Kristoko Dwi Hartomo
Big Data Cogn. Comput. 2024, 8(1), 9; https://doi.org/10.3390/bdcc8010009 - 16 Jan 2024
Cited by 1 | Viewed by 2886
Abstract
Face mask detection is a technological application that employs computer vision methodologies to ascertain the presence or absence of a face mask on an individual depicted in an image or video. This technology gained significant attention and adoption during the COVID-19 pandemic, as wearing face masks became an important measure to prevent the spread of the virus. Face mask detection helps to enforce mask-wearing guidelines, which can significantly reduce the spread of respiratory illnesses, including COVID-19. Wearing masks in densely populated areas provides individuals with protection and hinders the spread of airborne particles that transmit viruses. The application of deep learning models in object recognition has shown significant progress, leading to promising outcomes in the identification and localization of objects within images. The primary aim of this study is to annotate and classify face mask entities depicted in authentic images. To mitigate the spread of COVID-19 within public settings, individuals can use face masks made from materials specifically designed for medical purposes. This study utilizes YOLOv8, a state-of-the-art object detection algorithm, to accurately detect and identify face masks. For the evaluation, we conducted an experiment in which we combined the Face Mask Dataset (FMD) and the Medical Mask Dataset (MMD) into a single dataset. The suggested model improved on the detection performance of an earlier research study using the FMD and MMD, reaching a “Good” level of 99.1%, up from 98.6%. Our study demonstrates that the model scheme we have provided is a reliable method for detecting faces that are obscured by medical masks. Additionally, after the completion of the study, a comparative analysis was conducted to examine the findings in conjunction with those of related research. The proposed detector demonstrated superior performance compared to previous research in terms of both accuracy and precision.
23 pages, 1527 KiB  
Article
Evaluating the Robustness of Deep Learning Models against Adversarial Attacks: An Analysis with FGSM, PGD and CW
by William Villegas-Ch, Angel Jaramillo-Alcázar and Sergio Luján-Mora
Big Data Cogn. Comput. 2024, 8(1), 8; https://doi.org/10.3390/bdcc8010008 - 16 Jan 2024
Viewed by 2285
Abstract
This study evaluated the generation of adversarial examples and the subsequent robustness of an image classification model. The attacks were performed using the Fast Gradient Sign Method, the Projected Gradient Descent method, and the Carlini and Wagner attack to perturb the original images and analyze their impact on the model’s classification accuracy. Additionally, image manipulation techniques were investigated as defensive measures against adversarial attacks. The results highlighted the model’s vulnerability to adversarial examples: the Fast Gradient Sign Method effectively altered the original classifications, while the Carlini and Wagner method proved less effective. Promising approaches such as noise reduction, image compression, and Gaussian blurring were presented as effective countermeasures. These findings underscore the importance of addressing the vulnerability of machine learning models and the need to develop robust defenses against adversarial examples. This article emphasizes the urgency of addressing the threat posed by harmful standards in machine learning models, highlighting the relevance of implementing effective countermeasures and image manipulation techniques to mitigate the effects of adversarial attacks. These efforts are crucial to safeguarding model integrity and trust in an environment marked by constantly evolving hostile threats. An average 25% decrease in accuracy was observed for the VGG16 model when exposed to the Fast Gradient Sign Method and Projected Gradient Descent attacks, and an even more significant 35% decrease with the Carlini and Wagner method.
(This article belongs to the Special Issue Security, Privacy, and Trust in Artificial Intelligence Applications)
28 pages, 1136 KiB  
Review
A Survey of Incremental Deep Learning for Defect Detection in Manufacturing
by Reenu Mohandas, Mark Southern, Eoin O’Connell and Martin Hayes
Big Data Cogn. Comput. 2024, 8(1), 7; https://doi.org/10.3390/bdcc8010007 - 5 Jan 2024
Viewed by 1975
Abstract
Deep learning-based visual cognition has greatly improved the accuracy of defect detection, reducing processing times and increasing product throughput across a variety of manufacturing use cases. There is, however, a continuing need for rigorous procedures to dynamically update model-based detection methods that use sequential streaming during the training phase. This paper reviews how new process, training or validation information is rigorously incorporated in real time when detection exceptions arise during inspection. In particular, consideration is given to how new tasks, classes or decision pathways are added to existing models or datasets in a controlled fashion. An analysis of studies from the incremental learning literature is presented, where the emphasis is on the mitigation of process complexity challenges such as catastrophic forgetting. Further, practical implementation issues that are known to affect the complexity of deep learning model architecture, including memory allocation for incoming sequential data and incremental learning accuracy, are considered. The paper highlights case study results and methods that have been used to successfully mitigate such real-time manufacturing challenges.
(This article belongs to the Topic Electronic Communications, IOT and Big Data)
27 pages, 5013 KiB  
Article
Enhancing Credit Card Fraud Detection: An Ensemble Machine Learning Approach
by Abdul Rehman Khalid, Nsikak Owoh, Omair Uthmani, Moses Ashawa, Jude Osamor and John Adejoh
Big Data Cogn. Comput. 2024, 8(1), 6; https://doi.org/10.3390/bdcc8010006 - 3 Jan 2024
Cited by 2 | Viewed by 6010
Abstract
In the era of digital advancements, the escalation of credit card fraud necessitates the development of robust and efficient fraud detection systems. This paper delves into the application of machine learning models, specifically focusing on ensemble methods, to enhance credit card fraud detection. Through an extensive review of existing literature, we identified limitations in current fraud detection technologies, including issues like data imbalance, concept drift, false positives/negatives, limited generalisability, and challenges in real-time processing. To address some of these shortcomings, we propose a novel ensemble model that integrates a Support Vector Machine (SVM), K-Nearest Neighbor (KNN), Random Forest (RF), Bagging, and Boosting classifiers. This ensemble model tackles the dataset imbalance problem associated with most credit card datasets by implementing under-sampling and the Synthetic Minority Over-sampling Technique (SMOTE) on some machine learning algorithms. The evaluation of the model utilises a dataset comprising transaction records from European credit card holders, providing a realistic scenario for assessment. The methodology of the proposed model encompasses data pre-processing, feature engineering, model selection, and evaluation, with Google Colab computational capabilities facilitating efficient model training and testing. Comparative analysis between the proposed ensemble model, traditional machine learning methods, and individual classifiers reveals the superior performance of the ensemble in mitigating challenges associated with credit card fraud detection. Across accuracy, precision, recall, and F1-score metrics, the ensemble outperforms existing models. This paper underscores the efficacy of ensemble methods as a valuable tool in the battle against fraudulent transactions. The findings presented lay the groundwork for future advancements in the development of more resilient and adaptive fraud detection systems, which will become crucial as credit card fraud techniques continue to evolve.
26 pages, 6098 KiB  
Article
Unveiling Sentiments: A Comprehensive Analysis of Arabic Hajj-Related Tweets from 2017–2022 Utilizing Advanced AI Models
by Hanan M. Alghamdi
Big Data Cogn. Comput. 2024, 8(1), 5; https://doi.org/10.3390/bdcc8010005 - 2 Jan 2024
Cited by 1 | Viewed by 2144
Abstract
Sentiment analysis plays a crucial role in understanding public opinion and social media trends. It involves analyzing the emotional tone and polarity of a given text. When applied to Arabic text, this task becomes particularly challenging due to the language’s complex morphology, right-to-left script, and intricate nuances in expressing emotions. Social media has emerged as a powerful platform for individuals to express their sentiments, especially regarding religious and cultural events. Consequently, studying sentiment analysis in the context of Hajj has become a captivating subject. This research paper presents a comprehensive sentiment analysis of tweets discussing the annual Hajj pilgrimage over a six-year period. By employing a combination of machine learning and deep learning models, this study successfully conducted sentiment analysis on a sizable dataset consisting of Arabic tweets. The process involves pre-processing, feature extraction, and sentiment classification. The objective was to uncover the prevailing sentiments associated with Hajj over different years, before, during, and after each Hajj event. Importantly, the results presented in this study highlight that BERT, an advanced transformer-based model, outperformed other models in accurately classifying sentiment. This underscores its effectiveness in capturing the complexities inherent in Arabic text.
(This article belongs to the Special Issue Advances in Natural Language Processing and Text Mining)
19 pages, 692 KiB  
Article
BNMI-DINA: A Bayesian Cognitive Diagnosis Model for Enhanced Personalized Learning
by Yiming Chen and Shuang Liang
Big Data Cogn. Comput. 2024, 8(1), 4; https://doi.org/10.3390/bdcc8010004 - 29 Dec 2023
Viewed by 1553
Abstract
In the field of education, cognitive diagnosis is crucial for achieving personalized learning. The widely adopted DINA (Deterministic Inputs, Noisy And gate) model uncovers students’ mastery of essential skills necessary to answer questions correctly. However, existing DINA-based approaches overlook the dependency between knowledge points, and their model training process is computationally inefficient for large datasets. In this paper, we propose a new cognitive diagnosis model called BNMI-DINA, which stands for Bayesian Network-based Multiprocess Incremental DINA. Our proposed model aims to enhance personalized learning by providing accurate and detailed assessments of students’ cognitive abilities. By incorporating a Bayesian network, BNMI-DINA establishes the dependency relationship between knowledge points, enabling more accurate evaluations of students’ mastery levels. To enhance model convergence speed, key steps of our proposed algorithm are parallelized. We also provide theoretical proof of the convergence of BNMI-DINA. Extensive experiments demonstrate that our approach effectively enhances model accuracy and reduces computational time compared to state-of-the-art cognitive diagnosis models.
12 pages, 2495 KiB  
Article
Semantic Similarity of Common Verbal Expressions in Older Adults through a Pre-Trained Model
by Marcos Orellana, Patricio Santiago García, Guillermo Daniel Ramon, Jorge Luis Zambrano-Martinez, Andrés Patiño-León, María Verónica Serrano and Priscila Cedillo
Big Data Cogn. Comput. 2024, 8(1), 3; https://doi.org/10.3390/bdcc8010003 - 29 Dec 2023
Viewed by 1683
Abstract
Health problems in older adults lead to situations where communication with peers, family and caregivers becomes challenging for seniors; therefore, it is necessary to use alternative methods to facilitate communication. In this context, Augmentative and Alternative Communication (AAC) methods are widely used to support this population segment. Moreover, with Artificial Intelligence (AI), and specifically, machine learning algorithms, AAC can be improved. Although there have been several studies in this field, it is interesting to analyze common phrases used by seniors, depending on their context (i.e., slang and everyday expressions typical of their age). This paper proposes a semantic analysis of the common phrases of older adults and their corresponding meanings through Natural Language Processing (NLP) techniques and a pre-trained language model using semantic textual similarity to represent the older adults’ phrases with their corresponding graphic images (pictograms). The results show good scores achieved in the semantic similarity between the phrases of the older adults and the definitions, so the relationship between the phrase and the pictogram has a high degree of probability.
(This article belongs to the Special Issue Advances in Natural Language Processing and Text Mining)
20 pages, 724 KiB  
Article
Knowledge-Based and Generative-AI-Driven Pedagogical Conversational Agents: A Comparative Study of Grice’s Cooperative Principles and Trust
by Matthias Wölfel, Mehrnoush Barani Shirzad, Andreas Reich and Katharina Anderer
Big Data Cogn. Comput. 2024, 8(1), 2; https://doi.org/10.3390/bdcc8010002 - 26 Dec 2023
Cited by 1 | Viewed by 2393
Abstract
The emergence of generative language models (GLMs), such as OpenAI’s ChatGPT, is changing the way we communicate with computers and has a major impact on the educational landscape. While GLMs have great potential to support education, their use is not unproblematic, as they suffer from hallucinations and misinformation. In this paper, we investigate how a very limited amount of domain-specific data, from lecture slides and transcripts, can be used to build knowledge-based and generative educational chatbots. We found that knowledge-based chatbots allow full control over the system’s response but lack the verbosity and flexibility of GLMs. The answers provided by GLMs are more trustworthy and offer greater flexibility, but their correctness cannot be guaranteed. Adapting GLMs to domain-specific data trades flexibility for correctness.
(This article belongs to the Special Issue Artificial Intelligence and Natural Language Processing)
25 pages, 582 KiB  
Article
Distributed Bayesian Inference for Large-Scale IoT Systems
by Eleni Vlachou, Aristeidis Karras, Christos Karras, Leonidas Theodorakopoulos, Constantinos Halkiopoulos and Spyros Sioutas
Big Data Cogn. Comput. 2024, 8(1), 1; https://doi.org/10.3390/bdcc8010001 - 19 Dec 2023
Cited by 1 | Viewed by 1875
Abstract
In this work, we present a Distributed Bayesian Inference Classifier for Large-Scale Systems and assess its performance and scalability on distributed environments such as PySpark. The presented classifier consistently showcases efficient inference time, irrespective of the variations in the size of the test set, implying a robust ability to handle escalating data sizes without a proportional increase in computational demands. Notably, although memory usage increases with growing test set sizes throughout the experiments, this increment is sublinear, demonstrating the proficiency of the classifier in memory resource management. This behavior is consistent with the typical tendencies of PySpark tasks, which witness increasing memory consumption due to data partitioning and various data operations as datasets expand. CPU resource utilization, which is another crucial factor, also remains stable, emphasizing the capability of the classifier to manage larger computational workloads without significant resource strain. From a classification perspective, the Bayesian Logistic Regression Spark Classifier consistently achieves reliable performance metrics, with a particular focus on high specificity, indicating its aptness for applications where pinpointing true negatives is crucial. In summary, based on all experiments conducted under various data sizes, our classifier emerges as a top contender for scalability-driven applications in IoT systems, highlighting its dependable performance, adept resource management, and consistent prediction accuracy.