Machine Learning and Knowledge Extraction

11 pages, 892 KB

Open AccessArticle

Cross-Validation Visualized: A Narrative Guide to Advanced Methods

by Johannes Allgaier and Rüdiger Pryss

Mach. Learn. Knowl. Extr. 2024, 6(2), 1378-1388; https://doi.org/10.3390/make6020065 - 20 Jun 2024

Cited by 58 | Viewed by 11611

This study delves into the multifaceted nature of cross-validation (CV) techniques in machine learning model evaluation and selection, underscoring the challenge of choosing the most appropriate method due to the plethora of available variants. It aims to clarify and standardize terminology such as [...] Read more.

This study delves into the multifaceted nature of cross-validation (CV) techniques in machine learning model evaluation and selection, underscoring the challenge of choosing the most appropriate method due to the plethora of available variants. It aims to clarify and standardize terminology such as sets, groups, folds, and samples pivotal in the CV domain, and introduces an exhaustive compilation of advanced CV methods like leave-one-out, leave-p-out, Monte Carlo, grouped, stratified, and time-split CV within a hold-out CV framework. Through graphical representations, the paper enhances the comprehension of these methodologies, facilitating more informed decision making for practitioners. It further explores the synergy between different CV strategies and advocates for a unified approach to reporting model performance by consolidating essential metrics. The paper culminates in a comprehensive overview of the CV techniques discussed, illustrated with practical examples, offering valuable insights for both novice and experienced researchers in the field. Full article

(This article belongs to the Section Visualization)

► Show Figures

Figure 1

17 pages, 2445 KB

Open AccessArticle

Image Text Extraction and Natural Language Processing of Unstructured Data from Medical Reports

by Ivan Malashin, Igor Masich, Vadim Tynchenko, Andrei Gantimurov, Vladimir Nelyub and Aleksei Borodulin

Mach. Learn. Knowl. Extr. 2024, 6(2), 1361-1377; https://doi.org/10.3390/make6020064 - 18 Jun 2024

Cited by 7 | Viewed by 7274

Abstract

This study presents an integrated approach for automatically extracting and structuring information from medical reports, captured as scanned documents or photographs, through a combination of image recognition and natural language processing (NLP) techniques like named entity recognition (NER). The primary aim was to [...] Read more.

This study presents an integrated approach for automatically extracting and structuring information from medical reports, captured as scanned documents or photographs, through a combination of image recognition and natural language processing (NLP) techniques like named entity recognition (NER). The primary aim was to develop an adaptive model for efficient text extraction from medical report images. This involved utilizing a genetic algorithm (GA) to fine-tune optical character recognition (OCR) hyperparameters, ensuring maximal text extraction length, followed by NER processing to categorize the extracted information into required entities, adjusting parameters if entities were not correctly extracted based on manual annotations. Despite the diverse formats of medical report images in the dataset, all in Russian, this serves as a conceptual example of information extraction (IE) that can be easily extended to other languages. Full article

(This article belongs to the Section Data)

► Show Figures

Figure 1

18 pages, 2378 KB

Open AccessReview

A Review of Orebody Knowledge Enhancement Using Machine Learning on Open-Pit Mine Measure-While-Drilling Data

by Daniel M. Goldstein, Chris Aldrich and Louisa O’Connor

Mach. Learn. Knowl. Extr. 2024, 6(2), 1343-1360; https://doi.org/10.3390/make6020063 - 18 Jun 2024

Cited by 8 | Viewed by 3314

Abstract

Measure while drilling (MWD) refers to the acquisition of real-time data associated with the drilling process, including information related to the geological characteristics encountered in hard-rock mining. The availability of large quantities of low-cost MWD data from blast holes compared to expensive and [...] Read more.

Measure while drilling (MWD) refers to the acquisition of real-time data associated with the drilling process, including information related to the geological characteristics encountered in hard-rock mining. The availability of large quantities of low-cost MWD data from blast holes compared to expensive and sparsely collected orebody knowledge (OBK) data from exploration drill holes make the former more desirable for characterizing pre-excavation subsurface conditions. Machine learning (ML) plays a critical role in the real-time or near-real-time analysis of MWD data to enable timely enhancement of OBK for operational purposes. Applications can be categorized into three areas, focused on the mechanical properties of the rock mass, the lithology of the rock, as well as, related to that, the estimation of the geochemical species in the rock mass. From a review of the open literature, the following can be concluded: (i) The most important MWD metrics are the rate of penetration (rop), torque (tor), weight on bit (wob), bit air pressure (bap), and drill rotation speed (rpm). (ii) Multilayer perceptron analysis has mostly been used, followed by Gaussian processes and other methods, mainly to identify rock types. (iii) Recent advances in deep learning methods designed to deal with unstructured data, such as borehole images and vibrational signals, have not yet been fully exploited, although this is an emerging trend. (iv) Significant recent developments in explainable artificial intelligence could also be used to better advantage in understanding the association between MWD metrics and the mechanical and geochemical structure and properties of drilled rock. Full article

(This article belongs to the Topic Big Data Intelligence: Methodologies and Applications)

► Show Figures

Figure 1

20 pages, 5568 KB

Open AccessArticle

Extracting Interpretable Knowledge from the Remote Monitoring of COVID-19 Patients

by Melina Tziomaka, Athanasios Kallipolitis, Andreas Menychtas, Parisis Gallos, Christos Panagopoulos, Alice Georgia Vassiliou, Edison Jahaj, Ioanna Dimopoulou, Anastasia Kotanidou and Ilias Maglogiannis

Mach. Learn. Knowl. Extr. 2024, 6(2), 1323-1342; https://doi.org/10.3390/make6020062 - 18 Jun 2024

Cited by 1 | Viewed by 1786

Abstract

Apart from providing user-friendly applications that support digitized healthcare routines, the use of wearable devices has proven to increase the independence of patients in a healthcare setting. By applying machine learning techniques to real health-related data, important conclusions can be drawn for unsolved [...] Read more.

Apart from providing user-friendly applications that support digitized healthcare routines, the use of wearable devices has proven to increase the independence of patients in a healthcare setting. By applying machine learning techniques to real health-related data, important conclusions can be drawn for unsolved issues related to disease prognosis. In this paper, various machine learning techniques are examined and analyzed for the provision of personalized care to COVID-19 patients with mild symptoms based on individual characteristics and the comorbidities they have, while the connection between the stimuli and predictive results are utilized for the evaluation of the system’s transparency. The results, jointly analyzing wearable and electronic health record data for the prediction of a daily dyspnea grade and the duration of fever, are promising in terms of evaluation metrics even in a specified stratum of patients. The interpretability scheme provides useful insight concerning factors that greatly influenced the results. Moreover, it is demonstrated that the use of wearable devices for remote monitoring through cloud platforms is feasible while providing awareness of a patient’s condition, leading to the early detection of undesired changes and reduced visits for patient screening. Full article

(This article belongs to the Topic Big Data Intelligence: Methodologies and Applications)

► Show Figures

Figure 1

25 pages, 1059 KB

Open AccessArticle

Interaction Difference Hypothesis Test for Prediction Models

by Thomas Welchowski and Dominic Edelmann

Mach. Learn. Knowl. Extr. 2024, 6(2), 1298-1322; https://doi.org/10.3390/make6020061 - 14 Jun 2024

Viewed by 1514

Abstract

Machine learning research focuses on the improvement of prediction performance. Progress was made with black-box models that flexibly adapt to the given data. However, due to their increased complexity, black-box models are more difficult to interpret. To address this issue, techniques for interpretable [...] Read more.

Machine learning research focuses on the improvement of prediction performance. Progress was made with black-box models that flexibly adapt to the given data. However, due to their increased complexity, black-box models are more difficult to interpret. To address this issue, techniques for interpretable machine learning have been developed, yet there is still a lack of methods to reliably identify interaction effects between predictors under uncertainty. In this work, we present a model-agnostic hypothesis test for the identification of interaction effects in black-box machine learning models. The test statistic is based on the difference between the variance of the estimated prediction function and a version of the estimated prediction function without interaction effects derived via partial dependence functions. The properties of the proposed hypothesis test were explored in simulations of linear and nonlinear models. The proposed hypothesis test can be applied to any black-box prediction model, and the null hypothesis of the test can be flexibly specified according to the research question of interest. Furthermore, the test is computationally fast to apply, as the null distribution does not require the resampling or refitting of black-box prediction models. Full article

(This article belongs to the Section Learning)

► Show Figures

Figure 1

17 pages, 1472 KB

Open AccessArticle

Advanced Multi-Label Image Classification Techniques Using Ensemble Methods

by Tamás Katona, Gábor Tóth, Mátyás Petró and Balázs Harangi

Mach. Learn. Knowl. Extr. 2024, 6(2), 1281-1297; https://doi.org/10.3390/make6020060 - 7 Jun 2024

Cited by 2 | Viewed by 3843

Abstract

Chest X-rays are vital in healthcare for diagnosing various conditions due to their low Radiation exposure, widespread availability, and rapid interpretation. However, their interpretation requires specialized expertise, which can limit scalability and delay diagnoses. This study addresses the multi-label classification challenge of chest [...] Read more.

Chest X-rays are vital in healthcare for diagnosing various conditions due to their low Radiation exposure, widespread availability, and rapid interpretation. However, their interpretation requires specialized expertise, which can limit scalability and delay diagnoses. This study addresses the multi-label classification challenge of chest X-ray images using the Chest X-ray14 dataset. We propose a novel online ensemble technique that differs from previous penalty-based methods by focusing on combining individual model losses with the overall ensemble loss. This approach enhances interaction and feedback among models during training. Our method integrates multiple pre-trained CNNs using strategies like combining CNNs through an additional fully connected layer and employing a label-weighted average for outputs. This multi-layered approach leverages the strengths of each model component, improving classification accuracy and generalization. By focusing solely on image data, our ensemble model addresses the challenges posed by null vectors and diverse pathologies, advancing computer-aided radiology. Full article

► Show Figures

Figure 1

18 pages, 2035 KB

Open AccessReview

Machine Learning in Geosciences: A Review of Complex Environmental Monitoring Applications

by Maria Silvia Binetti, Carmine Massarelli and Vito Felice Uricchio

Mach. Learn. Knowl. Extr. 2024, 6(2), 1263-1280; https://doi.org/10.3390/make6020059 - 5 Jun 2024

Cited by 13 | Viewed by 8365

Abstract

This is a systematic literature review of the application of machine learning (ML) algorithms in geosciences, with a focus on environmental monitoring applications. ML algorithms, with their ability to analyze vast quantities of data, decipher complex relationships, and predict future events, and they [...] Read more.

This is a systematic literature review of the application of machine learning (ML) algorithms in geosciences, with a focus on environmental monitoring applications. ML algorithms, with their ability to analyze vast quantities of data, decipher complex relationships, and predict future events, and they offer promising capabilities to implement technologies based on more precise and reliable data processing. This review considers several vulnerable and particularly at-risk themes as landfills, mining activities, the protection of coastal dunes, illegal discharges into water bodies, and the pollution and degradation of soil and water matrices in large industrial complexes. These case studies about environmental monitoring provide an opportunity to better examine the impact of human activities on the environment, with a specific focus on water and soil matrices. The recent literature underscores the increasing importance of ML in these contexts, highlighting a preference for adapted classic models: random forest (RF) (the most widely used), decision trees (DTs), support vector machines (SVMs), artificial neural networks (ANNs), convolutional neural networks (CNNs), principal component analysis (PCA), and much more. In the field of environmental management, the following methodologies offer invaluable insights that can steer strategic planning and decision-making based on more accurate image classification, prediction models, object detection and recognition, map classification, data classification, and environmental variable predictions. Full article

(This article belongs to the Collection Extravaganza Feature Papers on Hot Topics in Machine Learning and Knowledge Extraction)

► Show Figures

Figure 1

20 pages, 951 KB

Open AccessReview

Bayesian Networks for the Diagnosis and Prognosis of Diseases: A Scoping Review

by Kristina Polotskaya, Carlos S. Muñoz-Valencia, Alejandro Rabasa, Jose A. Quesada-Rico, Domingo Orozco-Beltrán and Xavier Barber

Mach. Learn. Knowl. Extr. 2024, 6(2), 1243-1262; https://doi.org/10.3390/make6020058 - 4 Jun 2024

Cited by 19 | Viewed by 12175

Abstract

Bayesian networks (BNs) are probabilistic graphical models that leverage Bayes’ theorem to portray dependencies and cause-and-effect relationships between variables. These networks have gained prominence in the field of health sciences, particularly in diagnostic processes, by allowing the integration of medical knowledge into models [...] Read more.

Bayesian networks (BNs) are probabilistic graphical models that leverage Bayes’ theorem to portray dependencies and cause-and-effect relationships between variables. These networks have gained prominence in the field of health sciences, particularly in diagnostic processes, by allowing the integration of medical knowledge into models and addressing uncertainty in a probabilistic manner. Objectives: This review aims to provide an exhaustive overview of the current state of Bayesian networks in disease diagnosis and prognosis. Additionally, it seeks to introduce readers to the fundamental methodology of BNs, emphasising their versatility and applicability across varied medical domains. Employing a meticulous search strategy with MeSH descriptors in diverse scientific databases, we identified 190 relevant references. These were subjected to a rigorous analysis, resulting in the retention of 60 papers for in-depth review. The robustness of our approach minimised the risk of selection bias. Results: The selected studies encompass a wide range of medical areas, providing insights into the statistical methodology, implementation feasibility, and predictive accuracy of BNs, as evidenced by an average area under the curve (AUC) exceeding 75%. The comprehensive analysis underscores the adaptability and efficacy of Bayesian networks in diverse clinical scenarios. The majority of the examined studies demonstrate the potential of BNs as reliable adjuncts to clinical decision-making. The findings of this review affirm the role of Bayesian networks as accessible and versatile artificial intelligence tools in healthcare. They offer a viable solution to address complex medical challenges, facilitating timely and informed decision-making under conditions of uncertainty. The extensive exploration of Bayesian networks presented in this review highlights their significance and growing impact in the realm of disease diagnosis and prognosis. It underscores the need for further research and development to optimise their capabilities and broaden their applicability in addressing diverse and intricate healthcare challenges. Full article

(This article belongs to the Collection Extravaganza Feature Papers on Hot Topics in Machine Learning and Knowledge Extraction)

► Show Figures

Figure 1

33 pages, 5392 KB

Open AccessArticle

An Analysis of Radio Frequency Transfer Learning Behavior

by Lauren J. Wong, Braeden Muller, Sean McPherson and Alan J. Michaels

Mach. Learn. Knowl. Extr. 2024, 6(2), 1210-1242; https://doi.org/10.3390/make6020057 - 3 Jun 2024

Cited by 2 | Viewed by 2134

Abstract

Transfer learning (TL) techniques, which leverage prior knowledge gained from data with different distributions to achieve higher performance and reduced training time, are often used in computer vision (CV) and natural language processing (NLP), but have yet to be fully utilized in the [...] Read more.

Transfer learning (TL) techniques, which leverage prior knowledge gained from data with different distributions to achieve higher performance and reduced training time, are often used in computer vision (CV) and natural language processing (NLP), but have yet to be fully utilized in the field of radio frequency machine learning (RFML). This work systematically evaluates how the training domain and task, characterized by the transmitter (Tx)/receiver (Rx) hardware and channel environment, impact radio frequency (RF) TL performance for example automatic modulation classification (AMC) and specific emitter identification (SEI) use-cases. Through exhaustive experimentation using carefully curated synthetic and captured datasets with varying signal types, channel types, signal to noise ratios (SNRs), carrier/center frequencys (CFs), frequency offsets (FOs), and Tx and Rx devices, actionable and generalized conclusions are drawn regarding how best to use RF TL techniques for domain adaptation and sequential learning. Consistent with trends identified in other modalities, our results show that RF TL performance is highly dependent on the similarity between the source and target domains/tasks, but also on the relative difficulty of the source and target domains/tasks. Results also discuss the impacts of channel environment and hardware variations on RF TL performance and compare RF TL performance using head re-training and model fine-tuning methods. Full article

(This article belongs to the Section Learning)

► Show Figures

Figure 1

17 pages, 13820 KB

Open AccessArticle

Design and Implementation of a Self-Supervised Algorithm for Vein Structural Patterns Analysis Using Advanced Unsupervised Techniques

by Swati Rastogi, Siddhartha Prakash Duttagupta and Anirban Guha

Mach. Learn. Knowl. Extr. 2024, 6(2), 1193-1209; https://doi.org/10.3390/make6020056 - 31 May 2024

Cited by 1 | Viewed by 1744

Abstract

Compared to other identity verification systems applications, vein patterns have the lowest potential for being used fraudulently. The present research examines the practicability of gathering vascular data from NIR images of veins. In this study, we propose a self-supervision learning algorithm that envisions [...] Read more.

Compared to other identity verification systems applications, vein patterns have the lowest potential for being used fraudulently. The present research examines the practicability of gathering vascular data from NIR images of veins. In this study, we propose a self-supervision learning algorithm that envisions an automated process to retrieve vascular patterns computationally using unsupervised approaches. This new self-learning algorithm sorts the vascular patterns into clusters and then uses 2D image data to recuperate the extracted vascular patterns linked to NIR templates. Our work incorporates multi-scale filtering followed by multi-scale feature extraction, recognition, identification, and matching. We design the ORC, GPO, and RDM algorithms with these inclusions and finally develop the vascular pattern mining model to visualize the computational retrieval of vascular patterns from NIR imageries. As a result, the developed self-supervised learning algorithm shows a 96.7% accuracy rate utilizing appropriate image quality assessment parameters. In our work, we also contend that we provide strategies that are both theoretically sound and practically efficient for concerns such as how many clusters should be used for specific tasks, which clustering technique should be used, how to set the threshold for single linkage algorithms, and how much data should be excluded as outliers. Consequently, we aim to circumvent Kleinberg’s impossibility while attaining significant clustering to develop a self-supervised learning algorithm using unsupervised methodologies. Full article

(This article belongs to the Topic Applications in Image Analysis and Pattern Recognition)

► Show Figures

Figure 1

23 pages, 5195 KB

Open AccessArticle

Uncertainty in XAI: Human Perception and Modeling Approaches

by Teodor Chiaburu, Frank Haußer and Felix Bießmann

Mach. Learn. Knowl. Extr. 2024, 6(2), 1170-1192; https://doi.org/10.3390/make6020055 - 27 May 2024

Cited by 10 | Viewed by 4430

Abstract

Artificial Intelligence (AI) plays an increasingly integral role in decision-making processes. In order to foster trust in AI predictions, many approaches towards explainable AI (XAI) have been developed and evaluated. Surprisingly, one factor that is essential for trust has been underrepresented in XAI [...] Read more.

Artificial Intelligence (AI) plays an increasingly integral role in decision-making processes. In order to foster trust in AI predictions, many approaches towards explainable AI (XAI) have been developed and evaluated. Surprisingly, one factor that is essential for trust has been underrepresented in XAI research so far: uncertainty, both with respect to how it is modeled in Machine Learning (ML) and XAI as well as how it is perceived by humans relying on AI assistance. This review paper provides an in-depth analysis of both aspects. We review established and recent methods to account for uncertainty in ML models and XAI approaches and we discuss empirical evidence on how model uncertainty is perceived by human users of XAI systems. We summarize the methodological advancements and limitations of methods and human perception. Finally, we discuss the implications of the current state of the art in model development and research on human perception. We believe highlighting the role of uncertainty in XAI will be helpful to both practitioners and researchers and could ultimately support more responsible use of AI in practical applications. Full article

► Show Figures

Figure 1

16 pages, 1625 KB

Open AccessArticle

Fine-Tuning Artificial Neural Networks to Predict Pest Numbers in Grain Crops: A Case Study in Kazakhstan

by Galiya Anarbekova, Luis Gonzaga Baca Ruiz, Akerke Akanova, Saltanat Sharipova and Nazira Ospanova

Mach. Learn. Knowl. Extr. 2024, 6(2), 1154-1169; https://doi.org/10.3390/make6020054 - 26 May 2024

Cited by 5 | Viewed by 1907

Abstract

This study investigates the application of different ML methods for predicting pest outbreaks in Kazakhstan for grain crops. Comprehensive data spanning from 2005 to 2022, including pest population metrics, meteorological data, and geographical parameters, were employed to train the neural network for forecasting [...] Read more.

This study investigates the application of different ML methods for predicting pest outbreaks in Kazakhstan for grain crops. Comprehensive data spanning from 2005 to 2022, including pest population metrics, meteorological data, and geographical parameters, were employed to train the neural network for forecasting the population dynamics of Phyllotreta vittula pests in Kazakhstan. By evaluating various network configurations and hyperparameters, this research considers the application of MLP, MT-ANN, LSTM, transformer, and SVR. The transformer consistently demonstrates superior predictive accuracy in terms of MSE. Additionally, this work highlights the impact of several training hyperparameters such as epochs and batch size on predictive accuracy. Interestingly, the second season exhibits unique responses, stressing the effect of some features on model performance. By advancing our understanding of fine-tuning ANNs for accurate pest prediction in grain crops, this research contributes to the development of more precise and efficient pest control strategies. In addition, the consistent dominance of the transformer model makes it suitable for its implementation in practical applications. Finally, this work contributes to sustainable agricultural practices by promoting targeted interventions and potentially reducing reliance on chemical pesticides. Full article

(This article belongs to the Section Network)

► Show Figures

Figure 1

9 pages, 1615 KB

Open AccessArticle

Evaluation of AI ChatBots for the Creation of Patient-Informed Consent Sheets

by Florian Jürgen Raimann, Vanessa Neef, Marie Charlotte Hennighausen, Kai Zacharowski and Armin Niklas Flinspach

Mach. Learn. Knowl. Extr. 2024, 6(2), 1145-1153; https://doi.org/10.3390/make6020053 - 24 May 2024

Cited by 7 | Viewed by 2957

Abstract

Introduction: Large language models (LLMs), such as ChatGPT, are a topic of major public interest, and their potential benefits and threats are a subject of discussion. The potential contribution of these models to health care is widely discussed. However, few studies to date [...] Read more.

Introduction: Large language models (LLMs), such as ChatGPT, are a topic of major public interest, and their potential benefits and threats are a subject of discussion. The potential contribution of these models to health care is widely discussed. However, few studies to date have examined LLMs. For example, the potential use of LLMs in (individualized) informed consent remains unclear. Methods: We analyzed the performance of the LLMs ChatGPT 3.5, ChatGPT 4.0, and Gemini with regard to their ability to create an information sheet for six basic anesthesiologic procedures in response to corresponding questions. We performed multiple attempts to create forms for anesthesia and analyzed the results checklists based on existing standard sheets. Results: None of the LLMs tested were able to create a legally compliant information sheet for any basic anesthesiologic procedure. Overall, fewer than one-third of the risks, procedural descriptions, and preparations listed were covered by the LLMs. Conclusions: There are clear limitations of current LLMs in terms of practical application. Advantages in the generation of patient-adapted risk stratification within individual informed consent forms are not available at the moment, although the potential for further development is difficult to predict. Full article

(This article belongs to the Collection Extravaganza Feature Papers on Hot Topics in Machine Learning and Knowledge Extraction)

► Show Figures

Figure 1

19 pages, 714 KB

Open AccessArticle

Locally-Scaled Kernels and Confidence Voting

by Elizabeth Hofer and Martin v. Mohrenschildt

Mach. Learn. Knowl. Extr. 2024, 6(2), 1126-1144; https://doi.org/10.3390/make6020052 - 23 May 2024

Cited by 3 | Viewed by 1255

Abstract

Classification, the task of discerning the class of an unlabeled data point using information from a set of labeled data points, is a well-studied area of machine learning with a variety of approaches. Many of these approaches are closely linked to the selection [...] Read more.

Classification, the task of discerning the class of an unlabeled data point using information from a set of labeled data points, is a well-studied area of machine learning with a variety of approaches. Many of these approaches are closely linked to the selection of metrics or the generalizing of similarities defined by kernels. These metrics or similarity measures often require their parameters to be tuned in order to achieve the highest accuracy for each dataset. For example, an extensive search is required to determine the value of K or the choice of distance metric in K-NN classification. This paper explores a method of kernel construction that when used in classification performs consistently over a variety of datasets and does not require the parameters to be tuned. Inspired by dimensionality reduction techniques (DRT), we construct a kernel-based similarity measure that captures the topological structure of the data. This work compares the accuracy of K-NN classifiers, computed with specific operating parameters that obtain the highest accuracy per dataset, to a single trial of the here-proposed kernel classifier with no specialized parameters on standard benchmark sets. The here-proposed kernel used with simple classifiers has comparable accuracy to the ‘best-case’ K-NN classifiers without requiring the tuning of operating parameters. Full article

► Show Figures

Figure 1

12 pages, 793 KB

Open AccessArticle

The Human-Centred Design of a Universal Module for Artificial Intelligence Literacy in Tertiary Education Institutions

by Daswin De Silva, Shalinka Jayatilleke, Mona El-Ayoubi, Zafar Issadeen, Harsha Moraliyage and Nishan Mills

Mach. Learn. Knowl. Extr. 2024, 6(2), 1114-1125; https://doi.org/10.3390/make6020051 - 18 May 2024

Cited by 4 | Viewed by 3424

Abstract

Generative Artificial Intelligence (AI) is heralding a new era in AI for performing a spectrum of complex tasks that are indistinguishable from humans. Alongside language and text, Generative AI models have been built for all other modalities of digital data, image, video, audio, [...] Read more.

Generative Artificial Intelligence (AI) is heralding a new era in AI for performing a spectrum of complex tasks that are indistinguishable from humans. Alongside language and text, Generative AI models have been built for all other modalities of digital data, image, video, audio, and code. The full extent of Generative AI and its opportunities, challenges, contributions, and risks are still being explored by academic researchers, industry practitioners, and government policymakers. While this deep understanding of Generative AI continues to evolve, the lack of fluency, literacy, and effective interaction with Generative and conventional AI technologies are common challenges across all domains. Tertiary education institutions are uniquely positioned to address this void. In this article, we present the human-centred design of a universal AI literacy module, followed by its four primary constructs that provide core competence in AI to coursework and research students and academic and professional staff in a tertiary education setting. In comparison to related work in AI literacy, our design is inclusive due to the collaborative approach between multiple stakeholder groups and is comprehensive given the descriptive formulation of the primary constructs of this module with exemplars of how they activate core operational competence across the four groups. Full article

(This article belongs to the Section Data)

► Show Figures

Figure 1

27 pages, 6643 KB

Open AccessArticle

Assessment of Software Vulnerability Contributing Factors by Model-Agnostic Explainable AI

by Ding Li, Yan Liu and Jun Huang

Mach. Learn. Knowl. Extr. 2024, 6(2), 1087-1113; https://doi.org/10.3390/make6020050 - 16 May 2024

Cited by 5 | Viewed by 2488

Abstract

Software vulnerability detection aims to proactively reduce the risk to software security and reliability. Despite advancements in deep-learning-based detection, a semantic gap still remains between learned features and human-understandable vulnerability semantics. In this paper, we present an XAI-based framework to assess program code [...] Read more.

Software vulnerability detection aims to proactively reduce the risk to software security and reliability. Despite advancements in deep-learning-based detection, a semantic gap still remains between learned features and human-understandable vulnerability semantics. In this paper, we present an XAI-based framework to assess program code in a graph context as feature representations and their effect on code vulnerability classification into multiple Common Weakness Enumeration (CWE) types. Our XAI framework is deep-learning-model-agnostic and programming-language-neutral. We rank the feature importance of 40 syntactic constructs for each of the top 20 distributed CWE types from three datasets in Java and C++. By means of four metrics of information retrieval, we measure the similarity of human-understandable CWE types using each CWE type’s feature contribution ranking learned from XAI methods. We observe that the subtle semantic difference between CWE types occurs after the variation in neighboring features’ contribution rankings. Our study shows that the XAI explanation results have approximately 78% Top-1 to 89% Top-5 similarity hit rates and a mean average precision of 0.70 compared with the baseline of CWE similarity identified by the open community experts. Our framework allows for code vulnerability patterns to be learned and contributing factors to be assessed at the same stage. Full article

(This article belongs to the Special Issue Advances in Explainable Artificial Intelligence (XAI): 2nd Edition)

► Show Figures

Figure 1

15 pages, 1886 KB

Open AccessArticle

Improving Time Series Regression Model Accuracy via Systematic Training Dataset Augmentation and Sampling

by Robin Ströbel, Marcus Mau, Alexander Puchta and Jürgen Fleischer

Mach. Learn. Knowl. Extr. 2024, 6(2), 1072-1086; https://doi.org/10.3390/make6020049 - 11 May 2024

Cited by 3 | Viewed by 3155

Abstract

This study addresses a significant gap in the field of time series regression modeling by highlighting the central role of data augmentation in improving model accuracy. The primary objective is to present a detailed methodology for systematic sampling of training datasets through data [...] Read more.

This study addresses a significant gap in the field of time series regression modeling by highlighting the central role of data augmentation in improving model accuracy. The primary objective is to present a detailed methodology for systematic sampling of training datasets through data augmentation to improve the accuracy of time series regression models. Therefore, different augmentation techniques are compared to evaluate their impact on model accuracy across different datasets and model architectures. In addition, this research highlights the need for a standardized approach to creating training datasets using multiple augmentation methods. The lack of a clear framework hinders the easy integration of data augmentation into time series regression pipelines. Our systematic methodology promotes model accuracy while providing a robust foundation for practitioners to seamlessly integrate data augmentation into their modeling practices. The effectiveness of our approach is demonstrated using process data from two milling machines. Experiments show that the optimized training dataset improves the generalization ability of machine learning models in 86.67% of the evaluated scenarios. However, the prediction accuracy of models trained on a sufficient dataset remains largely unaffected. Based on these results, sophisticated sampling strategies such as Quadratic Weighting of multiple augmentation approaches may be beneficial. Full article

(This article belongs to the Section Data)

► Show Figures

Figure 1

17 pages, 9163 KB

Open AccessArticle

EyeXNet: Enhancing Abnormality Detection and Diagnosis via Eye-Tracking and X-ray Fusion

by Chihcheng Hsieh, André Luís, José Neves, Isabel Blanco Nobre, Sandra Costa Sousa, Chun Ouyang, Joaquim Jorge and Catarina Moreira

Mach. Learn. Knowl. Extr. 2024, 6(2), 1055-1071; https://doi.org/10.3390/make6020048 - 9 May 2024

Cited by 2 | Viewed by 3022

Abstract

Integrating eye gaze data with chest X-ray images in deep learning (DL) has led to contradictory conclusions in the literature. Some authors assert that eye gaze data can enhance prediction accuracy, while others consider eye tracking irrelevant for predictive tasks. We argue that [...] Read more.

Integrating eye gaze data with chest X-ray images in deep learning (DL) has led to contradictory conclusions in the literature. Some authors assert that eye gaze data can enhance prediction accuracy, while others consider eye tracking irrelevant for predictive tasks. We argue that this disagreement lies in how researchers process eye-tracking data as most remain agnostic to the human component and apply the data directly to DL models without proper preprocessing. We present EyeXNet, a multimodal DL architecture that combines images and radiologists’ fixation masks to predict abnormality locations in chest X-rays. We focus on fixation maps during reporting moments as radiologists are more likely to focus on regions with abnormalities and provide more targeted regions to the predictive models. Our analysis compares radiologist fixations in both silent and reporting moments, revealing that more targeted and focused fixations occur during reporting. Our results show that integrating the fixation masks in a multimodal DL architecture outperformed the baseline model in five out of eight experiments regarding average Recall and six out of eight regarding average Precision. Incorporating fixation masks representing radiologists’ classification patterns in a multimodal DL architecture benefits lesion detection in chest X-ray (CXR) images, particularly when there is a strong correlation between fixation masks and generated proposal regions. This highlights the potential of leveraging fixation masks to enhance multimodal DL architectures for CXR image analysis. This work represents a first step towards human-centered DL, moving away from traditional data-driven and human-agnostic approaches. Full article

(This article belongs to the Collection Extravaganza Feature Papers on Hot Topics in Machine Learning and Knowledge Extraction)

► Show Figures

Figure 1

46 pages, 3360 KB

Open AccessReview

Categorical Data Clustering: A Bibliometric Analysis and Taxonomy

by Maya Cendana and Ren-Jieh Kuo

Mach. Learn. Knowl. Extr. 2024, 6(2), 1009-1054; https://doi.org/10.3390/make6020047 - 7 May 2024

Cited by 6 | Viewed by 4853

Abstract

Numerous real-world applications apply categorical data clustering to find hidden patterns in the data. The K-modes-based algorithm is a popular algorithm for solving common issues in categorical data, from outlier and noise sensitivity to local optima, utilizing metaheuristic methods. Many studies have [...] Read more.

Numerous real-world applications apply categorical data clustering to find hidden patterns in the data. The K-modes-based algorithm is a popular algorithm for solving common issues in categorical data, from outlier and noise sensitivity to local optima, utilizing metaheuristic methods. Many studies have focused on increasing clustering performance, with new methods now outperforming the traditional K-modes algorithm. It is important to investigate this evolution to help scholars understand how the existing algorithms overcome the common issues of categorical data. Using a research-area-based bibliometric analysis, this study retrieved articles from the Web of Science (WoS) Core Collection published between 2014 and 2023. This study presents a deep analysis of 64 articles to develop a new taxonomy of categorical data clustering algorithms. This study also discusses the potential challenges and opportunities in possible alternative solutions to categorical data clustering. Full article

(This article belongs to the Topic Big Data Intelligence: Methodologies and Applications)

► Show Figures

Figure 1

22 pages, 2777 KB

Open AccessArticle

Multilayer Perceptron Neural Network with Arithmetic Optimization Algorithm-Based Feature Selection for Cardiovascular Disease Prediction

by Fahad A. Alghamdi, Haitham Almanaseer, Ghaith Jaradat, Ashraf Jaradat, Mutasem K. Alsmadi, Sana Jawarneh, Abdullah S. Almurayh, Jehad Alqurni and Hayat Alfagham

Mach. Learn. Knowl. Extr. 2024, 6(2), 987-1008; https://doi.org/10.3390/make6020046 - 5 May 2024

Cited by 23 | Viewed by 3628

Abstract

In the healthcare field, diagnosing disease is the most concerning issue. Various diseases including cardiovascular diseases (CVDs) significantly influence illness or death. On the other hand, early and precise diagnosis of CVDs can decrease chances of death, resulting in a better and healthier [...] Read more.

In the healthcare field, diagnosing disease is the most concerning issue. Various diseases including cardiovascular diseases (CVDs) significantly influence illness or death. On the other hand, early and precise diagnosis of CVDs can decrease chances of death, resulting in a better and healthier life for patients. Researchers have used traditional machine learning (ML) techniques for CVD prediction and classification. However, many of them are inaccurate and time-consuming due to the unavailability of quality data including imbalanced samples, inefficient data preprocessing, and the existing selection criteria. These factors lead to an overfitting or bias issue towards a certain class label in the prediction model. Therefore, an intelligent system is needed which can accurately diagnose CVDs. We proposed an automated ML model for various kinds of CVD prediction and classification. Our prediction model consists of multiple steps. Firstly, a benchmark dataset is preprocessed using filter techniques. Secondly, a novel arithmetic optimization algorithm is implemented as a feature selection technique to select the best subset of features that influence the accuracy of the prediction model. Thirdly, a classification task is implemented using a multilayer perceptron neural network to classify the instances of the dataset into two class labels, determining whether they have a CVD or not. The proposed ML model is trained on the preprocessed data and then tested and validated. Furthermore, for the comparative analysis of the model, various performance evaluation metrics are calculated including overall accuracy, precision, recall, and F1-score. As a result, it has been observed that the proposed prediction model can achieve 88.89% accuracy, which is the highest in a comparison with the traditional ML techniques. Full article

(This article belongs to the Section Learning)

► Show Figures

Figure 1

22 pages, 11463 KB

Open AccessArticle

VOD: Vision-Based Building Energy Data Outlier Detection

by Jinzhao Tian, Tianya Zhao, Zhuorui Li, Tian Li, Haipei Bie and Vivian Loftness

Mach. Learn. Knowl. Extr. 2024, 6(2), 965-986; https://doi.org/10.3390/make6020045 - 3 May 2024

Cited by 3 | Viewed by 2997

Abstract

Outlier detection plays a critical role in building operation optimization and data quality maintenance. However, existing methods often struggle with the complexity and variability of building energy data, leading to poorly generalized and explainable results. To address the gap, this study introduces a [...] Read more.

Outlier detection plays a critical role in building operation optimization and data quality maintenance. However, existing methods often struggle with the complexity and variability of building energy data, leading to poorly generalized and explainable results. To address the gap, this study introduces a novel Vision-based Outlier Detection (VOD) approach, leveraging computer vision models to spot outliers in the building energy records. The models are trained to identify outliers by analyzing the load shapes in 2D time series plots derived from the energy data. The VOD approach is tested on four years of workday time-series electricity consumption data from 290 commercial buildings in the United States. Two distinct models are developed for different usage purposes, namely a classification model for broad-level outlier detection and an object detection model for the demands of precise pinpointing of outliers. The classification model is also interpreted via Grad-CAM to enhance its usage reliability. The classification model achieves an F1 score of 0.88, and the object detection model achieves an Average Precision (AP) of 0.84. VOD is a very efficient path to identifying energy consumption outliers in building operations, paving the way for the enhancement of building energy data quality, operation efficiency, and energy savings. Full article

(This article belongs to the Collection Extravaganza Feature Papers on Hot Topics in Machine Learning and Knowledge Extraction)

► Show Figures

Figure 1

21 pages, 7555 KB

Open AccessArticle

Quantum-Enhanced Representation Learning: A Quanvolutional Autoencoder Approach against DDoS Threats

by Pablo Rivas, Javier Orduz, Tonni Das Jui, Casimer DeCusatis and Bikram Khanal

Mach. Learn. Knowl. Extr. 2024, 6(2), 944-964; https://doi.org/10.3390/make6020044 - 1 May 2024

Cited by 6 | Viewed by 3844

Abstract

Motivated by the growing threat of distributed denial-of-service (DDoS) attacks and the emergence of quantum computing, this study introduces a novel “quanvolutional autoencoder” architecture for learning representations. The architecture leverages the computational advantages of quantum mechanics to improve upon traditional machine learning techniques. [...] Read more.

Motivated by the growing threat of distributed denial-of-service (DDoS) attacks and the emergence of quantum computing, this study introduces a novel “quanvolutional autoencoder” architecture for learning representations. The architecture leverages the computational advantages of quantum mechanics to improve upon traditional machine learning techniques. Specifically, the quanvolutional autoencoder employs randomized quantum circuits to analyze time-series data from DDoS attacks, offering a robust alternative to classical convolutional neural networks. Experimental results suggest that the quanvolutional autoencoder performs similarly to classical models in visualizing and learning from DDoS hive plots and leads to faster convergence and learning stability. These findings suggest that quantum machine learning holds significant promise for advancing data analysis and visualization in cybersecurity. The study highlights the need for further research in this fast-growing field, particularly for unsupervised anomaly detection. Full article

(This article belongs to the Collection Extravaganza Feature Papers on Hot Topics in Machine Learning and Knowledge Extraction)

► Show Figures

Figure 1

27 pages, 2009 KB

Open AccessReview

A Comprehensive Summary of the Application of Machine Learning Techniques for CO₂-Enhanced Oil Recovery Projects

by Xuejia Du, Sameer Salasakar and Ganesh Thakur

Mach. Learn. Knowl. Extr. 2024, 6(2), 917-943; https://doi.org/10.3390/make6020043 - 29 Apr 2024

Cited by 13 | Viewed by 5206

Abstract

This paper focuses on the current application of machine learning (ML) in enhanced oil recovery (EOR) through CO₂ injection, which exhibits promising economic and environmental benefits for climate-change mitigation strategies. Our comprehensive review explores the diverse use cases of ML techniques in [...] Read more.

This paper focuses on the current application of machine learning (ML) in enhanced oil recovery (EOR) through CO₂ injection, which exhibits promising economic and environmental benefits for climate-change mitigation strategies. Our comprehensive review explores the diverse use cases of ML techniques in CO₂-EOR, including aspects such as minimum miscible pressure (MMP) prediction, well location optimization, oil production and recovery factor prediction, multi-objective optimization, Pressure–Volume–Temperature (PVT) property estimation, Water Alternating Gas (WAG) analysis, and CO₂-foam EOR, from 101 reviewed papers. We catalog relative information, including the input parameters, objectives, data sources, train/test/validate information, results, evaluation, and rating score for each area based on criteria such as data quality, ML-building process, and the analysis of results. We also briefly summarized the benefits and limitations of ML methods in petroleum industry applications. Our detailed and extensive study could serve as an invaluable reference for employing ML techniques in the petroleum industry. Based on the review, we found that ML techniques offer great potential in solving problems in the majority of CO₂-EOR areas involving prediction and regression. With the generation of massive amounts of data in the everyday oil and gas industry, machine learning techniques can provide efficient and reliable preliminary results for the industry. Full article

(This article belongs to the Topic AI and Computational Methods for Modelling, Simulations and Optimizing of Advanced Systems: Innovations in Complexity)

► Show Figures

Figure 1

19 pages, 2051 KB

Open AccessPerspective

Concept Paper for a Digital Expert: Systematic Derivation of (Causal) Bayesian Networks Based on Ontologies for Knowledge-Based Production Steps

by Manja Mai-Ly Pfaff-Kastner, Ken Wenzel and Steffen Ihlenfeldt

Mach. Learn. Knowl. Extr. 2024, 6(2), 898-916; https://doi.org/10.3390/make6020042 - 25 Apr 2024

Cited by 3 | Viewed by 2141

Abstract

Despite increasing digitalization and automation, complex production processes often require human judgment/decision-making adaptability. Humans can abstract and transfer knowledge to new situations. People in production are an irreplaceable resource. This paper presents a new concept for digitizing human expertise and their ability to [...] Read more.

Despite increasing digitalization and automation, complex production processes often require human judgment/decision-making adaptability. Humans can abstract and transfer knowledge to new situations. People in production are an irreplaceable resource. This paper presents a new concept for digitizing human expertise and their ability to make knowledge-based decisions in the production area based on ontologies and causal Bayesian networks for further research. Dedicated approaches for the ontology-based creation of Bayesian networks exist in the literature. Therefore, we first comprehensively analyze previous studies and summarize the approaches. We then add the causal perspective, which has often not been an explicit subject of consideration. We see a research gap in the systematic and structured approach to ontology-based generation of causal graphs (CGs). At the current state of knowledge, the semantic understanding of a domain formalized in an ontology can contribute to developing a generic approach to derive a CG. The ontology functions as a knowledge base by formally representing knowledge and experience. Causal inference calculations can mathematically imitate the human decision-making process under uncertainty. Therefore, a systematic ontology-based approach to building a CG can allow digitizing the human ability to make decisions based on experience and knowledge. Full article

(This article belongs to the Section Network)

► Show Figures

Figure 1

21 pages, 3643 KB

Open AccessArticle

Enhancing Legal Sentiment Analysis: A Convolutional Neural Network–Long Short-Term Memory Document-Level Model

by Bolanle Abimbola, Enrique de La Cal Marin and Qing Tan

Mach. Learn. Knowl. Extr. 2024, 6(2), 877-897; https://doi.org/10.3390/make6020041 - 19 Apr 2024

Cited by 15 | Viewed by 4576

Abstract

This research investigates the application of deep learning in sentiment analysis of Canadian maritime case law. It offers a framework for improving maritime law and legal analytic policy-making procedures. The automation of legal document extraction takes center stage, underscoring the vital role sentiment [...] Read more.

This research investigates the application of deep learning in sentiment analysis of Canadian maritime case law. It offers a framework for improving maritime law and legal analytic policy-making procedures. The automation of legal document extraction takes center stage, underscoring the vital role sentiment analysis plays at the document level. Therefore, this study introduces a novel strategy for sentiment analysis in Canadian maritime case law, combining sentiment case law approaches with state-of-the-art deep learning techniques. The overarching goal is to systematically unearth hidden biases within case law and investigate their impact on legal outcomes. Employing Convolutional Neural Network (CNN)- and long short-term memory (LSTM)-based models, this research achieves a remarkable accuracy of 98.05% for categorizing instances. In contrast, conventional machine learning techniques such as support vector machine (SVM) yield an accuracy rate of 52.57%, naïve Bayes at 57.44%, and logistic regression at 61.86%. The superior accuracy of the CNN and LSTM model combination underscores its usefulness in legal sentiment analysis, offering promising future applications in diverse fields like legal analytics and policy design. These findings mark a significant choice for AI-powered legal tools, presenting more sophisticated and sentiment-aware options for the legal profession. Full article

(This article belongs to the Section Learning)

► Show Figures

Figure 1

35 pages, 2155 KB

Open AccessArticle

A Comprehensive Survey on Deep Learning Methods in Human Activity Recognition

by Michail Kaseris, Ioannis Kostavelis and Sotiris Malassiotis

Mach. Learn. Knowl. Extr. 2024, 6(2), 842-876; https://doi.org/10.3390/make6020040 - 18 Apr 2024

Cited by 36 | Viewed by 12424

Abstract

Human activity recognition (HAR) remains an essential field of research with increasing real-world applications ranging from healthcare to industrial environments. As the volume of publications in this domain continues to grow, staying abreast of the most pertinent and innovative methodologies can be challenging. [...] Read more.

Human activity recognition (HAR) remains an essential field of research with increasing real-world applications ranging from healthcare to industrial environments. As the volume of publications in this domain continues to grow, staying abreast of the most pertinent and innovative methodologies can be challenging. This survey provides a comprehensive overview of the state-of-the-art methods employed in HAR, embracing both classical machine learning techniques and their recent advancements. We investigate a plethora of approaches that leverage diverse input modalities including, but not limited to, accelerometer data, video sequences, and audio signals. Recognizing the challenge of navigating the vast and ever-growing HAR literature, we introduce a novel methodology that employs large language models to efficiently filter and pinpoint relevant academic papers. This not only reduces manual effort but also ensures the inclusion of the most influential works. We also provide a taxonomy of the examined literature to enable scholars to have rapid and organized access when studying HAR approaches. Through this survey, we aim to inform researchers and practitioners with a holistic understanding of the current HAR landscape, its evolution, and the promising avenues for future exploration. Full article

(This article belongs to the Section Learning)

► Show Figures

Figure 1

15 pages, 4325 KB

Open AccessArticle

Impact of Nature of Medical Data on Machine and Deep Learning for Imbalanced Datasets: Clinical Validity of SMOTE Is Questionable

by Seifollah Gholampour

Mach. Learn. Knowl. Extr. 2024, 6(2), 827-841; https://doi.org/10.3390/make6020039 - 15 Apr 2024

Cited by 14 | Viewed by 4838

Abstract

Dataset imbalances pose a significant challenge to predictive modeling in both medical and financial domains, where conventional strategies, including resampling and algorithmic modifications, often fail to adequately address minority class underrepresentation. This study theoretically and practically investigates how the inherent nature of medical [...] Read more.

Dataset imbalances pose a significant challenge to predictive modeling in both medical and financial domains, where conventional strategies, including resampling and algorithmic modifications, often fail to adequately address minority class underrepresentation. This study theoretically and practically investigates how the inherent nature of medical data affects the classification of minority classes. It employs ten machine and deep learning classifiers, ranging from ensemble learners to cost-sensitive algorithms, across comparably sized medical and financial datasets. Despite these efforts, none of the classifiers achieved effective classification of the minority class in the medical dataset, with sensitivity below 5.0% and area under the curve (AUC) below 57.0%. In contrast, the similar classifiers applied to the financial dataset demonstrated strong discriminative power, with overall accuracy exceeding 95.0%, sensitivity over 73.0%, and AUC above 96.0%. This disparity underscores the unpredictable variability inherent in the nature of medical data, as exemplified by the dispersed and homogeneous distribution of the minority class among other classes in principal component analysis (PCA) graphs. The application of the synthetic minority oversampling technique (SMOTE) introduced 62 synthetic patients based on merely 20 original cases, casting doubt on its clinical validity and the representation of real-world patient variability. Furthermore, post-SMOTE feature importance analysis, utilizing SHapley Additive exPlanations (SHAP) and tree-based methods, contradicted established cerebral stroke parameters, further questioning the clinical coherence of synthetic dataset augmentation. These findings call into question the clinical validity of the SMOTE technique and underscore the urgent need for advanced modeling techniques and algorithmic innovations for predicting minority-class outcomes in medical datasets without depending on resampling strategies. This approach underscores the importance of developing methods that are not only theoretically robust but also clinically relevant and applicable to real-world clinical scenarios. Consequently, this study underscores the importance of future research efforts to bridge the gap between theoretical advancements and the practical, clinical applications of models like SMOTE in healthcare. Full article

(This article belongs to the Topic Communications Challenges in Health and Well-Being)

► Show Figures

Figure 1

27 pages, 1266 KB

Open AccessArticle

A Meta Algorithm for Interpretable Ensemble Learning: The League of Experts

by Richard Vogel, Tobias Schlosser, Robert Manthey, Marc Ritter, Matthias Vodel, Maximilian Eibl and Kristan Alexander Schneider

Mach. Learn. Knowl. Extr. 2024, 6(2), 800-826; https://doi.org/10.3390/make6020038 - 9 Apr 2024

Cited by 1 | Viewed by 2906

Abstract

Background. The importance of explainable artificial intelligence and machine learning (XAI/XML) is increasingly being recognized, aiming to understand how information contributes to decisions, the method’s bias, or sensitivity to data pathologies. Efforts are often directed to post hoc explanations [...] Read more.

Background. The importance of explainable artificial intelligence and machine learning (XAI/XML) is increasingly being recognized, aiming to understand how information contributes to decisions, the method’s bias, or sensitivity to data pathologies. Efforts are often directed to post hoc explanations of black box models. These approaches add additional sources for errors without resolving their shortcomings. Less effort is directed into the design of intrinsically interpretable approaches. Methods. We introduce an intrinsically interpretable methodology motivated by ensemble learning: the League of Experts (LoE) model. We establish the theoretical framework first and then deduce a modular meta algorithm. In our description, we focus primarily on classification problems. However, LoE applies equally to regression problems. Specific to classification problems, we employ classical decision trees as classifier ensembles as a particular instance. This choice facilitates the derivation of human-understandable decision rules for the underlying classification problem, which results in a derived rule learning system denoted as RuleLoE. Results. In addition to 12 KEEL classification datasets, we employ two standard datasets from particularly relevant domains—medicine and finance—to illustrate the LoE algorithm. The performance of LoE with respect to its accuracy and rule coverage is comparable to common state-of-the-art classification methods. Moreover, LoE delivers a clearly understandable set of decision rules with adjustable complexity, describing the classification problem. Conclusions. LoE is a reliable method for classification and regression problems with an accuracy that seems to be appropriate for situations in which underlying causalities are in the center of interest rather than just accurate predictions or classifications. Full article

(This article belongs to the Special Issue Advances in Explainable Artificial Intelligence (XAI): 2nd Edition)

► Show Figures

Figure 1

11 pages, 1606 KB

Open AccessArticle

Effective Data Reduction Using Discriminative Feature Selection Based on Principal Component Analysis

by Faith Nwokoma, Justin Foreman and Cajetan M. Akujuobi

Mach. Learn. Knowl. Extr. 2024, 6(2), 789-799; https://doi.org/10.3390/make6020037 - 3 Apr 2024

Cited by 6 | Viewed by 3173

Abstract

Effective data reduction must retain the greatest possible amount of informative content of the data under examination. Feature selection is the default for dimensionality reduction, as the relevant features of a dataset are usually retained through this method. In this study, we used [...] Read more.

Effective data reduction must retain the greatest possible amount of informative content of the data under examination. Feature selection is the default for dimensionality reduction, as the relevant features of a dataset are usually retained through this method. In this study, we used unsupervised learning to discover the top-k discriminative features present in the large multivariate IoT dataset used. We used the statistics of principal component analysis to filter the relevant features based on the ranks of the features along the principal directions while also considering the coefficients of the components. The selected number of principal components was used to decide the number of features to be selected in the SVD process. A number of experiments were conducted using different benchmark datasets, and the effectiveness of the proposed method was evaluated based on the reconstruction error. The potency of the results was verified by subjecting the algorithm to a large IoT dataset, and we compared the performance based on accuracy and reconstruction error to the results of the benchmark datasets. The performance evaluation showed consistency with the results obtained with the benchmark datasets, which were of high accuracy and low reconstruction error. Full article

(This article belongs to the Topic Big Data Intelligence: Methodologies and Applications)

► Show Figures

Figure 1

19 pages, 2757 KB

Open AccessArticle

Birthweight Range Prediction and Classification: A Machine Learning-Based Sustainable Approach

by Dina A. Alabbad, Shahad Y. Ajibi, Raghad B. Alotaibi, Noura K. Alsqer, Rahaf A. Alqahtani, Noor M. Felemban, Atta Rahman, Sumayh S. Aljameel, Mohammed Imran Basheer Ahmed and Mustafa M. Youldash

Mach. Learn. Knowl. Extr. 2024, 6(2), 770-788; https://doi.org/10.3390/make6020036 - 1 Apr 2024

Cited by 14 | Viewed by 4953

Abstract

An accurate prediction of fetal birth weight is crucial in ensuring safe delivery without health complications for the mother and baby. The uncertainty surrounding the fetus’s birth situation, including its weight range, can lead to significant risks for both mother and baby. As [...] Read more.

An accurate prediction of fetal birth weight is crucial in ensuring safe delivery without health complications for the mother and baby. The uncertainty surrounding the fetus’s birth situation, including its weight range, can lead to significant risks for both mother and baby. As there is a standard birth weight range, if the fetus exceeds or falls below this range, it can result in considerable health problems. Although ultrasound imaging is commonly used to predict fetal weight, it does not always provide accurate readings, which may lead to unnecessary decisions such as early delivery and cesarian section. Besides that, no supporting system is available to predict the weight range in Saudi Arabia. Therefore, leveraging the available technologies to build a system that can serve as a second opinion for doctors and health professionals is essential. Machine learning (ML) offers significant advantages to numerous fields and can address various issues. As such, this study aims to utilize ML techniques to build a predictive model to predict the birthweight range of infants into low, normal, or high. For this purpose, two datasets were used: one from King Fahd University Hospital (KFHU), Saudi Arabia, and another publicly available dataset from the Institute of Electrical and Electronics Engineers (IEEE) data port. KFUH’s best result was obtained with the Extra Trees model, achieving an accuracy, precision, recall, and F1-score of 98%, with a specificity of 99%. On the other hand, using the Random Forest model, the IEEE dataset attained an accuracy, precision, recall, and F1-score of 96%, respectively, with a specificity of 98%. These results suggest that the proposed ML system can provide reliable predictions, which could be of significant value for doctors and health professionals in Saudi Arabia. Full article

(This article belongs to the Special Issue Sustainable Applications for Machine Learning)

► Show Figures

Figure 1

Journal Menu

Journal Browser

Mach. Learn. Knowl. Extr., Volume 6, Issue 2 (June 2024) – 32 articles

Further Information

Guidelines

MDPI Initiatives

Follow MDPI