Machine Learning and Knowledge Extraction doi: 10.3390/make6010033
Authors: Foziya Ahmed Mohammed Kula Kekeba Tune Beakal Gizachew Assefa Marti Jett Seid Muhie
In this review, we compiled convolutional neural network (CNN) methods which have the potential to automate the manual, costly and error-prone processing of medical images. We attempted to provide a thorough survey of improved architectures, popular frameworks, activation functions, ensemble techniques, hyperparameter optimizations, performance metrics, relevant datasets and data preprocessing strategies that can be used to design robust CNN models. We also used machine learning algorithms for the statistical modeling of the current literature to uncover latent topics, method gaps, prevalent themes and potential future advancements. The statistical modeling results indicate a temporal shift in favor of improved CNN designs, such as a shift from the use of a CNN architecture to a CNN-transformer hybrid. The insights from statistical modeling point that the surge of CNN practitioners into the medical imaging field, partly driven by the COVID-19 challenge, catalyzed the use of CNN methods for detecting and diagnosing pathological conditions. This phenomenon likely contributed to the sharp increase in the number of publications on the use of CNNs for medical imaging, both during and after the pandemic. Overall, the existing literature has certain gaps in scope with respect to the design and optimization of CNN architectures and methods specifically for medical imaging. Additionally, there is a lack of post hoc explainability of CNN models and slow progress in adopting CNNs for low-resource medical imaging. This review ends with a list of open research questions that have been identified through statistical modeling and recommendations that can potentially help set up more robust, improved and reproducible CNN experiments for medical imaging.
]]>Machine Learning and Knowledge Extraction doi: 10.3390/make6010032
Authors: Leandra Lukomski Juan Pisula Naita Wirsik Alexander Damanakis Jin-On Jung Karl Knipper Rabi Datta Wolfgang Schröder Florian Gebauer Thomas Schmidt Alexander Quaas Katarzyna Bozek Christiane Bruns Felix Popp
AIM: In this study, we use Artificial Intelligence (AI), including Machine (ML) and Deep Learning (DL), to predict the long-term survival of resectable esophageal cancer (EC) patients in a high-volume surgical center. Our objective is to evaluate the predictive efficacy of AI methods for survival prognosis across different time points of oncological treatment. This involves comparing models trained with clinical data, integrating either Tumor, Node, Metastasis (TNM) classification or tumor biomarker analysis, for long-term survival predictions. METHODS: In this retrospective study, 1002 patients diagnosed with EC between 1996 and 2021 were analyzed. The original dataset comprised 55 pre- and postoperative patient characteristics and 55 immunohistochemically evaluated biomarkers following surgical intervention. To predict the five-year survival status, four AI methods (Random Forest RF, XG Boost XG, Artificial Neural Network ANN, TabNet TN) and Logistic Regression (LR) were employed. The models were trained using three predefined subsets of the training dataset as follows: (I) the baseline dataset (BL) consisting of pre-, intra-, and postoperative data, including the TNM but excluding tumor biomarkers, (II) clinical data accessible at the time of the initial diagnostic workup (primary staging dataset, PS), and (III) the PS dataset including tumor biomarkers from tissue microarrays (PS + biomarkers), excluding TNM status. We used permutation feature importance for feature selection to identify only important variables for AI-driven reduced datasets and subsequent model retraining. RESULTS: Model training on the BL dataset demonstrated similar predictive performances for all models (Accuracy, ACC: 0.73/0.74/0.76/0.75/0.73; AUC: 0.78/0.82/0.83/0.80/0.79 RF/XG/ANN/TN/LR, respectively). The predictive performance and generalizability declined when the models were trained with the PS dataset. Surprisingly, the inclusion of biomarkers in the PS dataset for model training led to improved predictions (PS dataset vs. PS dataset + biomarkers; ACC: 0.70 vs. 0.77/0.73 vs. 0.79/0.71 vs. 0.75/0.69 vs. 0.72/0.63 vs. 0.66; AUC: 0.77 vs. 0.83/0.80 vs. 0.85/0.76 vs. 0.86/0.70 vs. 0.76/0.70 vs. 0.69 RF/XG/ANN/TN/LR, respectively). The AI models outperformed LR when trained with the PS datasets. The important features shared after AI-driven feature selection in all models trained with the BL dataset included histopathological lymph node status (pN), histopathological tumor size (pT), clinical tumor size (cT), age at the time of surgery, and postoperative tracheostomy. Following training with the PS dataset with biomarkers, the important predictive features included patient age at the time of surgery, TP-53 gene mutation, Mesothelin expression, thymidine phosphorylase (TYMP) expression, NANOG homebox protein expression, and indoleamine 2,3-dioxygenase (IDO) expressed on tumor-infiltrating lymphocytes, as well as tumor-infiltrating Mast- and Natural killer cells. CONCLUSION: Different AI methods similarly predict the long-term survival status of patients with EC and outperform LR, the state-of-the-art classification model. Survival status can be predicted with similar predictive performance with patient data at an early stage of treatment when utilizing additional biomarker analysis. This suggests that individual survival predictions can be made early in cancer treatment by utilizing biomarkers, reducing the necessity for the pathological TNM status post-surgery. This study identifies important features for survival predictions that vary depending on the timing of oncological treatment.
]]>Machine Learning and Knowledge Extraction doi: 10.3390/make6010031
Authors: Soma Onishi Masahiro Nishimura Ryota Fujimura Yoichi Hayashi
Although machine learning models are widely used in critical domains, their complexity and poor interpretability remain problematic. Decision trees (DTs) and rule-based models are known for their interpretability, and numerous studies have investigated techniques for approximating tree ensembles using DTs or rule sets, even though these approximators often overlook interpretability. These methods generate three types of rule sets: DT based, unordered, and decision list based. However, very few metrics exist that can distinguish and compare these rule sets. Therefore, the present study proposes an interpretability metric to allow for comparisons of interpretability between different rule sets and investigates the interpretability of the rules generated by the tree ensemble approximators. We compare these rule sets with the Recursive-Rule eXtraction algorithm (Re-RX) with J48graft to offer insights into the interpretability gap. The results indicate that Re-RX with J48graft can handle categorical and numerical attributes separately, has simple rules, and achieves a high interpretability, even when the number of rules is large. RuleCOSI+, a state-of-the-art method, showed significantly lower results regarding interpretability, but had the smallest number of rules.
]]>Machine Learning and Knowledge Extraction doi: 10.3390/make6010030
Authors: Heesung Shim Jonathan E. Allen W. F. Drew Bennett
Decades of drug development research have explored a vast chemical space for highly active compounds. The exponential growth of virtual libraries enables easy access to billions of synthesizable molecules. Computational modeling, particularly molecular docking, utilizes physics-based calculations to prioritize molecules for synthesis and testing. Nevertheless, the molecular docking process often yields docking poses with favorable scores that prove to be inaccurate with experimental testing. To address these issues, several approaches using machine learning (ML) have been proposed to filter incorrect poses based on the crystal structures. However, most of the methods are limited by the availability of structure data. Here, we propose a new pose classification approach, PECAN2 (Pose Classification with 3D Atomic Network 2), without the need for crystal structures, based on a 3D atomic neural network with Point Cloud Network (PCN). The new approach uses the correlation between docking scores and experimental data to assign labels, instead of relying on the crystal structures. We validate the proposed classifier on multiple datasets including human mu, delta, and kappa opioid receptors and SARS-CoV-2 Mpro. Our results demonstrate that leveraging the correlation between docking scores and experimental data alone enhances molecular docking performance by filtering out false positives and false negatives.
]]>Machine Learning and Knowledge Extraction doi: 10.3390/make6010029
Authors: Loay Hassan Mohamed Abdel-Nasser Adel Saleh Domenec Puig
Digital breast tomosynthesis (DBT) is a 3D breast cancer screening technique that can overcome the limitations of standard 2D digital mammography. However, DBT images often suffer from artifacts stemming from acquisition conditions, a limited angular range, and low radiation doses. These artifacts have the potential to degrade the performance of automated breast tumor classification tools. Notably, most existing automated breast tumor classification methods do not consider the effect of DBT image quality when designing the classification models. In contrast, this paper introduces a novel deep learning-based framework for classifying breast tumors in DBT images. This framework combines global image quality-aware features with tumor texture descriptors. The proposed approach employs a two-branch model: in the top branch, a deep convolutional neural network (CNN) model is trained to extract robust features from the region of interest that includes the tumor. In the bottom branch, a deep learning model named TomoQA is trained to extract global image quality-aware features from input DBT images. The quality-aware features and the tumor descriptors are then combined and fed into a fully-connected layer to classify breast tumors as benign or malignant. The unique advantage of this model is the combination of DBT image quality-aware features with tumor texture descriptors, which helps accurately classify breast tumors as benign or malignant. Experimental results on a publicly available DBT image dataset demonstrate that the proposed framework achieves superior breast tumor classification results, outperforming all existing deep learning-based methods.
]]>Machine Learning and Knowledge Extraction doi: 10.3390/make6010028
Authors: Danial Hooshyar Roger Azevedo Yeongwook Yang
Artificial neural networks (ANNs) have proven to be among the most important artificial intelligence (AI) techniques in educational applications, providing adaptive educational services. However, their educational potential is limited in practice due to challenges such as the following: (i) the difficulties in incorporating symbolic educational knowledge (e.g., causal relationships and practitioners’ knowledge) in their development, (ii) a propensity to learn and reflect biases, and (iii) a lack of interpretability. As education is classified as a ‘high-risk’ domain under recent regulatory frameworks like the EU AI Act—highlighting its influence on individual futures and discrimination risks—integrating educational insights into ANNs is essential. This ensures that AI applications adhere to essential educational restrictions and provide interpretable predictions. This research introduces NSAI, a neural-symbolic AI approach that integrates neural networks with knowledge representation and symbolic reasoning. It injects and extracts educational knowledge into and from deep neural networks to model learners’ computational thinking, aiming to enhance personalized learning and develop computational thinking skills. Our findings revealed that the NSAI approach demonstrates better generalizability compared to deep neural networks trained on both original training data and data enriched by SMOTE and autoencoder methods. More importantly, we found that, unlike traditional deep neural networks, which mainly relied on spurious correlations in their predictions, the NSAI approach prioritizes the development of robust representations that accurately capture causal relationships between inputs and outputs. This focus significantly reduces the reinforcement of biases and prevents misleading correlations in the models. Furthermore, our research showed that the NSAI approach enables the extraction of rules from the trained network, facilitating interpretation and reasoning during the path to predictions, as well as refining the initial educational knowledge. These findings imply that neural-symbolic AI not only overcomes the limitations of ANNs in education but also holds broader potential for transforming educational practices and outcomes through trustworthy and interpretable applications.
]]>Machine Learning and Knowledge Extraction doi: 10.3390/make6010027
Authors: Stephen Fox Vitor Fortes Rey
Hybrid machine learning encompasses predefinition of rules and ongoing learning from data. Human organizations can implement hybrid machine learning (HML) to automate some of their operations. Human organizations need to ensure that their HML implementations are aligned with human ethical requirements as defined in laws, regulations, standards, etc. The purpose of the study reported here was to investigate technical opportunities for representing human ethical requirements in HML. The study sought to represent two types of human ethical requirements in HML: locally simple and locally complex. The locally simple case is road traffic regulations. This can be considered to be a relatively simple case because human ethical requirements for road safety, such as stopping at red traffic lights, are defined clearly and have limited scope for personal interpretation. The locally complex case is diagnosis procedures for functional disorders, which can include medically unexplained symptoms. This case can be considered to be locally complex because human ethical requirements for functional disorder healthcare are less well defined and are more subject to personal interpretation. Representations were made in a type of HML called Algebraic Machine Learning. Our findings indicate that there are technical opportunities to represent human ethical requirements in HML because of its combination of human-defined top down rules and bottom up data-driven learning. However, our findings also indicate that there are limitations to representing human ethical requirements: irrespective of what type of machine learning is used. These limitations arise from fundamental challenges in defining complex ethical requirements, and from potential for opposing interpretations of their implementation. Furthermore, locally simple ethical requirements can contribute to wider ethical complexity.
]]>Machine Learning and Knowledge Extraction doi: 10.3390/make6010026
Authors: Cezary Maszczyk Marek Sikora Łukasz Wróbel
Most rule induction algorithms generate rules with simple logical conditions based on equality or inequality relations. This feature limits their ability to discover complex dependencies that may exist in data. This article presents an extension to the sequential covering rule induction algorithm that allows it to generate complex and M-of-N conditions within the premises of rules. The proposed methodology uncovers complex patterns in data that are not adequately expressed by rules with simple conditions. The novel two-phase approach efficiently generates M-of-N conditions by analysing frequent sets in previously induced simple and complex rule conditions. The presented method allows rule induction for classification, regression and survival problems. Extensive experiments on various public datasets show that the proposed method often leads to more concise rulesets compared to those using only simple conditions. Importantly, the inclusion of complex conditions and M-of-N conditions has no statistically significant negative impact on the predictive ability of the ruleset. Experimental results and a ready-to-use implementation are available in the GitHub repository. The proposed algorithm can potentially serve as a valuable tool for knowledge discovery and facilitate the interpretation of rule-based models by making them more concise.
]]>Machine Learning and Knowledge Extraction doi: 10.3390/make6010025
Authors: Enqi Ma Zbigniew J. Kabala
Squash is a sport where referee decisions are essential to the game. However, these decisions are very subjective in nature. Disputes, both from the players and the audience, regularly occur because the referee made a controversial call. In this study, we propose automating the referee decision process through machine learning. We trained neural networks to predict such decisions using data from 400 referee decisions acquired through extensive video footage reviewing and labeling. Six positional values were extracted, including the attacking player’s position, the retreating player’s position, the ball’s position in the frame, the ball’s projected first bounce, the ball’s projected second bounce, and the attacking player’s racket head position. We calculated nine additional distance values, such as the distance between players and the distance from the attacking player’s racket head to the ball’s path. Models were trained on Wolfram Mathematica and Python using these values. The best Wolfram Mathematica model and the best Python model achieved accuracies of 86% ± 3.03% and 85.2% ± 5.1%, respectively. These accuracies surpass 85%, demonstrating near-human performance. Our model has great potential for improvement as it is currently trained with limited, unbalanced data (400 decisions) and lacks crucial data points such as time and speed. The performance of our model is almost surely going to improve significantly with a larger training dataset. Unlike human referees, machine learning models follow a consistent standard, have unlimited attention spans, and make decisions instantly. If the accuracy is improved in the future, the model can potentially serve as an extra refereeing official for both professional and amateur squash matches. Both the analysis of referee decisions in squash and the proposal to automate the process using machine learning is unique to this study.
]]>Machine Learning and Knowledge Extraction doi: 10.3390/make6010024
Authors: Mohammed G. Alsubaie Suhuai Luo Kamran Shaukat
Alzheimer’s disease (AD) is a pressing global issue, demanding effective diagnostic approaches. This systematic review surveys the recent literature (2018 onwards) to illuminate the current landscape of AD detection via deep learning. Focusing on neuroimaging, this study explores single- and multi-modality investigations, delving into biomarkers, features, and preprocessing techniques. Various deep models, including convolutional neural networks (CNNs), recurrent neural networks (RNNs), and generative models, are evaluated for their AD detection performance. Challenges such as limited datasets and training procedures persist. Emphasis is placed on the need to differentiate AD from similar brain patterns, necessitating discriminative feature representations. This review highlights deep learning’s potential and limitations in AD detection, underscoring dataset importance. Future directions involve benchmark platform development for streamlined comparisons. In conclusion, while deep learning holds promise for accurate AD detection, refining models and methods is crucial to tackle challenges and enhance diagnostic precision.
]]>Machine Learning and Knowledge Extraction doi: 10.3390/make6010023
Authors: Subhayu Dutta Subhrangshu Adhikary Ashutosh Dhar Dwivedi
Complex documents have text, figures, tables, and other elements. The classification of scanned copies of different categories of complex documents like memos, newspapers, letters, and more is essential for rapid digitization. However, this task is very challenging as most scanned complex documents look similar. This is because all documents have similar colors of the page and letters, similar textures for all papers, and very few contrasting features. Several attempts have been made in the state of the art to classify complex documents; however, only a few of these works have addressed the classification of complex documents with similar features, and among these, the performances could be more satisfactory. To overcome this, this paper presents a method to use an optical character reader to extract the texts. It proposes a multi-headed model to combine vision-based transfer learning and natural-language-based Transformers within the same network for simultaneous training for different inputs and optimizers in specific parts of the network. A subset of the Ryers Vision Lab Complex Document Information Processing dataset containing 16 different document classes was used to evaluate the performances. The proposed multi-headed VisFormers network classified the documents with up to 94.2% accuracy, while a regular natural-language-processing-based Transformer network achieved 83%, and vision-based VGG19 transfer learning could achieve only up to 90% accuracy. The model deployment can help sort the scanned copies of various documents into different categories.
]]>Machine Learning and Knowledge Extraction doi: 10.3390/make6010022
Authors: Sabrina Djeradi Tahar Dahame Mohamed Abdelilah Fadla Bachir Bentria Mohammed Benali Kanoun Souraya Goumri-Said
Perovskite materials have attracted much attention in recent years due to their high performance, especially in the field of photovoltaics. However, the dark side of these materials is their poor stability, which poses a huge challenge to their practical applications. Double perovskite compounds, on the other hand, can show more stability as a result of their specific structure. One of the key properties of both perovskite and double perovskite is their tunable band gap, which can be determined using different techniques. Density functional theory (DFT), for instance, offers the potential to intelligently direct experimental investigation activities and predict various properties, including band gap. In reality, however, it is still difficult to anticipate the energy band gap from first principles, and accurate results often require more expensive methods such as hybrid functional or GW methods. In this paper, we present our development of high-throughput supervised ensemble learning-based methods: random forest, XGBoost, and Light GBM using a database of 1306 double perovskites materials to predict the energy band gap. Based on elemental properties, characteristics have been vectorized from chemical compositions. Our findings demonstrate the efficiency of ensemble learning methods and imply that scientists would benefit from recently employed methods in materials informatics.
]]>Machine Learning and Knowledge Extraction doi: 10.3390/make6010021
Authors: Musacchio Nicoletta Rita Zilich Davide Masi Fabio Baccetti Besmir Nreu Carlo Bruno Giorda Giacomo Guaita Lelio Morviducci Marco Muselli Alessandro Ozzello Federico Pisani Paola Ponzani Antonio Rossi Pierluigi Santin Damiano Verda Graziano Di Cianni Riccardo Candido
Background: International guidelines for diabetes care emphasize the urgency of promptly achieving and sustaining adequate glycemic control to reduce the occurrence of micro/macrovascular complications in patients with type 2 diabetes mellitus (T2DM). However, data from the Italian Association of Medical Diabetologists (AMD) Annals reveal that only 47% of T2DM patients reach appropriate glycemic targets, with approximately 30% relying on insulin therapy, either solely or in combination. This artificial intelligence analysis seeks to assess the potential impact of timely insulin initiation in all eligible patients via a “what-if” scenario simulation, leveraging real-world data. Methods: This retrospective cohort study utilized the AMD Annals database, comprising 1,186,247 T2DM patients from 2005 to 2019. Employing the Logic Learning Machine (LLM), we simulated timely insulin use for all eligible patients, estimating its effect on glycemic control after 12 months within a cohort of 85,239 patients. Of these, 20,015 were employed for the machine learning phase and 65,224 for simulation. Results: Within the simulated scenario, the introduction of appropriate insulin therapy led to a noteworthy projected 17% increase in patients meeting the metabolic target after 12 months from therapy initiation within the cohort of 65,224 individuals. The LLM’s projection envisages 32,851 potential patients achieving the target (hemoglobin glycated < 7.5%) after 12 months, compared to 21,453 patients observed in real-world cases. The receiver operating characteristic (ROC) curve analysis for this model demonstrated modest performance, with an area under the curve (AUC) value of 70.4%. Conclusions: This study reaffirms the significance of combatting therapeutic inertia in managing T2DM patients. Early insulinization, when clinically appropriate, markedly enhances patients’ metabolic goals at the 12-month follow-up.
]]>Machine Learning and Knowledge Extraction doi: 10.3390/make6010020
Authors: Yuxin Cong Toshiyuki Motohashi Koki Nakao Shinya Inazumi
The objective of this study was to investigate the liquefaction resistance of chemically improved sandy soils in a straightforward and accurate manner. Using only the existing experimental databases and artificial intelligence, the goal was to predict the experimental results as supporting information before performing the physical experiments. Emphasis was placed on the significance of data from 20 loading cycles of cyclic undrained triaxial tests to determine the liquefaction resistance and the contribution of each explanatory variable. Different combinations of explanatory variables were considered. Regarding the predictive model, it was observed that a case with the liquefaction resistance ratio as the dependent variable and other parameters as explanatory variables yielded favorable results. In terms of exploring combinations of explanatory variables, it was found advantageous to include all the variables, as doing so consistently resulted in a high coefficient of determination. The inclusion of the liquefaction resistance ratio in the training data was found to improve the predictive accuracy. In addition, the results obtained when using a linear model for the prediction suggested the potential to accurately predict the liquefaction resistance using historical data.
]]>Machine Learning and Knowledge Extraction doi: 10.3390/make6010019
Authors: Shira Nemirovsky-Rotman Eyal Bercovich
DNN-based systems have demonstrated unprecedented performance in terms of accuracy and speed over the past decade. However, recent work has shown that such models may not be sufficiently robust during the inference process. Furthermore, due to the data-driven learning nature of DNNs, designing interpretable and generalizable networks is a major challenge, especially when considering critical applications such as medical computer-aided diagnostics (CAD) and other medical imaging tasks. Within this context, a line of approaches incorporating prior knowledge domain information into deep learning methods has recently emerged. In particular, many of these approaches utilize known physics-based forward imaging models, aimed at improving the stability and generalization ability of DNNs for medical imaging applications. In this paper, we review recent work focused on such physics-based or physics-prior-based learning for a variety of imaging modalities and medical applications. We discuss how the inclusion of such physics priors to the training process and/or network architecture supports their stability and generalization ability. Moreover, we propose a new physics-based approach, in which an explicit physics prior, which describes the relation between the input and output of the forward imaging model, is included as an additional input into the network architecture. Furthermore, we propose a tailored training process for this extended architecture, for which training data are generated with perturbed physical priors that are also integrated into the network. Within the scope of this approach, we offer a problem formulation for a regression task with a highly nonlinear forward model and highlight possible useful applications for this task. Finally, we briefly discuss future challenges for physics-informed deep learning in the context of medical imaging.
]]>Machine Learning and Knowledge Extraction doi: 10.3390/make6010018
Authors: Fouad Trad Ali Chehab
Large Language Models (LLMs) are reshaping the landscape of Machine Learning (ML) application development. The emergence of versatile LLMs capable of undertaking a wide array of tasks has reduced the necessity for intensive human involvement in training and maintaining ML models. Despite these advancements, a pivotal question emerges: can these generalized models negate the need for task-specific models? This study addresses this question by comparing the effectiveness of LLMs in detecting phishing URLs when utilized with prompt-engineering techniques versus when fine-tuned. Notably, we explore multiple prompt-engineering strategies for phishing URL detection and apply them to two chat models, GPT-3.5-turbo and Claude 2. In this context, the maximum result achieved was an F1-score of 92.74% by using a test set of 1000 samples. Following this, we fine-tune a range of base LLMs, including GPT-2, Bloom, Baby LLaMA, and DistilGPT-2—all primarily developed for text generation—exclusively for phishing URL detection. The fine-tuning approach culminated in a peak performance, achieving an F1-score of 97.29% and an AUC of 99.56% on the same test set, thereby outperforming existing state-of-the-art methods. These results highlight that while LLMs harnessed through prompt engineering can expedite application development processes, achieving a decent performance, they are not as effective as dedicated, task-specific LLMs.
]]>Machine Learning and Knowledge Extraction doi: 10.3390/make6010017
Authors: Ekaterina Novozhilova Kate Mays Sejin Paik James E. Katz
Modern AI applications have caused broad societal implications across key public domains. While previous research primarily focuses on individual user perspectives regarding AI systems, this study expands our understanding to encompass general public perceptions. Through a survey (N = 1506), we examined public trust across various tasks within education, healthcare, and creative arts domains. The results show that participants vary in their trust across domains. Notably, AI systems’ abilities were evaluated higher than their benevolence across all domains. Demographic traits had less influence on trust in AI abilities and benevolence compared to technology-related factors. Specifically, participants with greater technological competence, AI familiarity, and knowledge viewed AI as more capable in all domains. These participants also perceived greater systems’ benevolence in healthcare and creative arts but not in education. We discuss the importance of considering public trust and its determinants in AI adoption.
]]>Machine Learning and Knowledge Extraction doi: 10.3390/make6010016
Authors: Mailson Ribeiro Santos Affonso Guedes Ignacio Sanchez-Gendriz
This study introduces an efficient methodology for addressing fault detection, classification, and severity estimation in rolling element bearings. The methodology is structured into three sequential phases, each dedicated to generating distinct machine-learning-based models for the tasks of fault detection, classification, and severity estimation. To enhance the effectiveness of fault diagnosis, information acquired in one phase is leveraged in the subsequent phase. Additionally, in the pursuit of attaining models that are both compact and efficient, an explainable artificial intelligence (XAI) technique is incorporated to meticulously select optimal features for the machine learning (ML) models. The chosen ML technique for the tasks of fault detection, classification, and severity estimation is the support vector machine (SVM). To validate the approach, the widely recognized Case Western Reserve University benchmark is utilized. The results obtained emphasize the efficiency and efficacy of the proposal. Remarkably, even with a highly limited number of features, evaluation metrics consistently indicate an accuracy of over 90% in the majority of cases when employing this approach.
]]>Machine Learning and Knowledge Extraction doi: 10.3390/make6010015
Authors: Audris Arzovs Janis Judvaitis Krisjanis Nesenbergs Leo Selavo
The goal of the IoT–Edge–Cloud Continuum approach is to distribute computation and data loads across multiple types of devices taking advantage of the different strengths of each, such as proximity to the data source, data access, or computing power, while mitigating potential weaknesses. Most current machine learning operations are currently concentrated on remote high-performance computing devices, such as the cloud, which leads to challenges related to latency, privacy, and other inefficiencies. Distributed learning approaches can address these issues by enabling the distribution of machine learning operations throughout the IoT–Edge–Cloud Continuum by incorporating Edge and even IoT layers into machine learning operations more directly. Approaches like transfer learning could help to transfer the knowledge from more performant IoT–Edge–Cloud Continuum layers to more resource-constrained devices, e.g., IoT. The implementation of these methods in machine learning operations, including the related data handling security and privacy approaches, is challenging and actively being researched. In this article the distributed learning and transfer learning domains are researched, focusing on security, robustness, and privacy aspects, and their potential usage in the IoT–Edge–Cloud Continuum, including research on tools to use for implementing these methods. To achieve this, we have reviewed 145 sources and described the relevant methods as well as their relevant attack vectors and provided suggestions on mitigation.
]]>Machine Learning and Knowledge Extraction doi: 10.3390/make6010014
Authors: Nhut Huynh Kim-Doang Nguyen
Nozzles are ubiquitous in agriculture: they are used to spray and apply nutrients and pesticides to crops. The properties of droplets sprayed from nozzles are vital factors that determine the effectiveness of the spray. Droplet size and other characteristics affect spray retention and drift, which indicates how much of the spray adheres to the crop and how much becomes chemical runoff that pollutes the environment. There is a critical need to measure these droplet properties to improve the performance of crop spraying systems. This paper establishes a deep learning methodology to detect droplets moving across a camera frame to measure their size. This framework is compatible with embedded systems that have limited onboard resources and can operate in real time. The method leverages a combination of techniques including resizing, normalization, pruning, detection head, unified feature map extraction via a feature pyramid network, non-maximum suppression, and optimization-based training. The approach is designed with the capability of detecting droplets of various sizes, shapes, and orientations. The experimental results demonstrate that the model designed in this study, coupled with the right combination of dataset and augmentation, achieved a 97% precision and 96.8% recall in droplet detection. The proposed methodology outperformed previous models, marking a significant advancement in droplet detection for precision agriculture applications.
]]>Machine Learning and Knowledge Extraction doi: 10.3390/make6010013
Authors: Maximilian Lowin
Introduction: Due to the lack of labeled data, applying predictive maintenance algorithms for facility management is cumbersome. Most companies are unwilling to share data or do not have time for annotation. In addition, most available facility management data are text data. Thus, there is a need for an unsupervised predictive maintenance algorithm that is capable of handling textual data. Methodology: This paper proposes applying association rule mining on maintenance requests to identify upcoming needs in facility management. By coupling temporal association rule mining with the concept of semantic similarity derived from large language models, the proposed methodology can discover meaningful knowledge in the form of rules suitable for decision-making. Results: Relying on the large German language models works best for the presented case study. Introducing a temporal lift filter allows for reducing the created rules to the most important ones. Conclusions: Only a few maintenance requests are sufficient to mine association rules that show links between different infrastructural failures. Due to the unsupervised manner of the proposed algorithm, domain experts need to evaluate the relevance of the specific rules. Nevertheless, the algorithm enables companies to efficiently utilize their data stored in databases to create interpretable rules supporting decision-making.
]]>Machine Learning and Knowledge Extraction doi: 10.3390/make6010012
Authors: Liang Luo George K. Stylios
The structure of fibrous assemblies is highly complex, being both random and regular at the same time, which leads to the complexity of its mechanical behaviour. Using algorithms such as machine learning to process complex mechanical property data requires consideration and understanding of its information principles. There are many different methods and instruments for measuring flexible material mechanics, and many different mechanics models exist. There is a need for an evaluation method to determine how close the results they obtain are to the real material mechanical behaviours. This paper considers and investigates measurements, data, models and simulations of fabric’s low-stress mechanics from an information perspective. The simplification of measurements and models will lead to a loss of information and, ultimately, a loss of authenticity in the results. Kolmogorov complexity is used as a tool to analyse and evaluate the algorithmic information content of multivariate nonlinear relationships of fabric stress and strain. The loss of algorithmic information content resulting from simplified approaches to various material measurements, models and simulations is also evaluated. For example, ignoring the friction hysteresis component in the material mechanical data can cause the model and simulation to lose more than 50% of the algorithm information, whilst the average loss of information using uniaxial measurement data can be as high as 75%. The results of this evaluation can be used to determine the authenticity of measurements and models and to identify the direction for new measurement instrument development and material mechanics modelling. It has been shown that a vast number of models, which use unary relationships to describe fabric behaviour and ignore the presence of frictional hysteresis, are inaccurate because they hold less than 12% of real fabric mechanics data. The paper also explores the possibility of compressing the measurement data of fabric mechanical properties.
]]>Machine Learning and Knowledge Extraction doi: 10.3390/make6010011
Authors: Sophie Zentner Alberto Barradas Chacon Selina C. Wriessnegger
Understanding and detecting human emotions is crucial for enhancing mental health, cognitive performance and human–computer interactions. This field in affective computing is relatively unexplored, and gaining knowledge about which external factors impact emotions could enhance communication between users and machines. Furthermore, it could also help us to manage affective disorders or understand affective physiological responses to human spatial and digital environments. The main objective of the current study was to investigate the influence of external stimulation, specifically the influence of different light conditions, on brain activity while observing affect-eliciting pictures and their classification. In this context, a multichannel electroencephalography (EEG) was recorded in 30 participants as they observed images from the Nencki Affective Picture System (NAPS) database in an art-gallery-style Virtual Reality (VR) environment. The elicited affect states were classified into three affect classes within the two-dimensional valence–arousal plane. Valence (positive/negative) and arousal (high/low) values were reported by participants on continuous scales. The experiment was conducted in two experimental conditions: a warm light condition and a cold light condition. Thus, three classification tasks arose with regard to the recorded brain data: classification of an affect state within a warm-light condition, classification of an affect state within a cold light condition, and warm light vs. cold light classification during observation of affect-eliciting images. For all classification tasks, Linear Discriminant Analysis, a Spatial Filter Model, a Convolutional Neural Network, the EEGNet, and the SincNet were compared. The EEGNet architecture performed best in all tasks. It could significantly classify three affect states with 43.12% accuracy under the influence of warm light. Under the influence of cold light, no model could achieve significant results. The classification between visual stimulus with warm light vs. cold light could be classified significantly with 76.65% accuracy from the EEGNet, well above any other machine learning or deep learning model. No significant differences could be detected between affect recognition in different light conditions, but the results point towards the advantage of gradient-based learning methods for data-driven experimental designs for the problem of affect decoding from EEG, providing modern tools for affective computing in digital spaces. Moreover, the ability to discern externally driven affective states through deep learning not only advances our understanding of the human mind but also opens avenues for developing innovative therapeutic interventions and improving human–computer interaction.
]]>Machine Learning and Knowledge Extraction doi: 10.3390/make6010010
Authors: Mohammadali Fallahian Mohsen Dorodchi Kyle Kreth
In data-driven systems, data exploration is imperative for making real-time decisions. However, big data are stored in massive databases that are difficult to retrieve. Approximate Query Processing (AQP) is a technique for providing approximate answers to aggregate queries based on a summary of the data (synopsis) that closely replicates the behavior of the actual data; this can be useful when an approximate answer to queries is acceptable in a fraction of the real execution time. This study explores the novel utilization of a Generative Adversarial Network (GAN) for the generation of tabular data that can be employed in AQP for synopsis construction. We thoroughly investigate the unique challenges posed by the synopsis construction process, including maintaining data distribution characteristics, handling bounded continuous and categorical data, and preserving semantic relationships, and we then introduce the advancement of tabular GAN architectures that overcome these challenges. Furthermore, we propose and validate a suite of statistical metrics tailored for assessing the reliability of GAN-generated synopses. Our findings demonstrate that advanced GAN variations exhibit a promising capacity to generate high-fidelity synopses, potentially transforming the efficiency and effectiveness of AQP in data-driven systems.
]]>Machine Learning and Knowledge Extraction doi: 10.3390/make6010009
Authors: Abdulkarim Faraj Alqahtani Mohammad Ilyas
The impact of communication through social media is currently considered a significant social issue. This issue can lead to inappropriate behavior using social media, which is referred to as cyberbullying. Automated systems are capable of efficiently identifying cyberbullying and performing sentiment analysis on social media platforms. This study focuses on enhancing a system to detect six types of cyberbullying tweets. Employing multi-classification algorithms on a cyberbullying dataset, our approach achieved high accuracy, particularly with the TF-IDF (bigram) feature extraction. Our experiment achieved high performance compared with that stated for previous experiments on the same dataset. Two ensemble machine learning methods, employing the N-gram with TF-IDF feature-extraction technique, demonstrated superior performance in classification. Three popular multi-classification algorithms: Decision Trees, Random Forest, and XGBoost, were combined into two varied ensemble methods separately. These ensemble classifiers demonstrated superior performance compared to traditional machine learning classifier models. The stacking classifier reached 90.71% accuracy and the voting classifier 90.44%. The results of the experiments showed that the framework can detect six different types of cyberbullying more efficiently, with an accuracy rate of 0.9071.
]]>Machine Learning and Knowledge Extraction doi: 10.3390/make6010008
Authors: Mustafa Pamuk Matthias Schumann Robert C. Nickerson
The intended automation in the financial industry creates a proper area for artificial intelligence usage. However, complex and high regulatory standards and rapid technological developments pose significant challenges in developing and deploying AI-based services in the finance industry. The regulatory principles defined by financial authorities in Europe need to be structured in a fine-granular way to promote understanding and ensure customer safety and the quality of AI-based services in the financial industry. This will lead to a better understanding of regulators’ priorities and guide how AI-based services are built. This paper provides a classification pattern with a taxonomy that clarifies the existing European regulatory principles for researchers, regulatory authorities, and financial services companies. Our study can pave the way for developing compliant AI-based services by bringing out the thematic focus of regulatory principles.
]]>Machine Learning and Knowledge Extraction doi: 10.3390/make6010007
Authors: Didier Gohourou Kazuhiro Kuwabara
Network representation of data is key to a variety of fields and their applications including trading and business. A major source of data that can be used to build insightful networks is the abundant amount of unstructured text data available through the web. The efforts to turn unstructured text data into a network have spawned different research endeavors, including the simplification of the process. This study presents the design and implementation of TraCER, a pipeline that turns unstructured text data into a graph, targeting the business networking domain. It describes the application of natural language processing techniques used to process the text, as well as the heuristics and learning algorithms that categorize the nodes and the links. The study also presents some simple yet efficient methods for the entity-linking and relation classification steps of the pipeline.
]]>Machine Learning and Knowledge Extraction doi: 10.3390/make6010006
Authors: Jennifer Werner Dimitri Nowak Franziska Hunger Tomas Johnson Andreas Mark Alexander Gösta Fredrik Edelvik
Wind comfort is an important factor when new buildings in existing urban areas are planned. It is common practice to use computational fluid dynamics (CFD) simulations to model wind comfort. These simulations are usually time-consuming, making it impossible to explore a high number of different design choices for a new urban development with wind simulations. Data-driven approaches based on simulations have shown great promise, and have recently been used to predict wind comfort in urban areas. These surrogate models could be used in generative design software and would enable the planner to explore a large number of options for a new design. In this paper, we propose a novel machine learning workflow (MLW) for direct wind comfort prediction. The MLW incorporates a regression and a classification U-Net, trained based on CFD simulations. Furthermore, we present an augmentation strategy focusing on generating more training data independent of the underlying wind statistics needed to calculate the wind comfort criterion. We train the models based on different sets of training data and compare the results. All trained models (regression and classification) yield an F1-score greater than 80% and can be combined with any wind rose statistic.
]]>Machine Learning and Knowledge Extraction doi: 10.3390/make6010005
Authors: Jorge Blanco Prieto Marina Ferreras González Steven Van Vaerenbergh Oscar Jesús Cosido Cobos
Efficient planning and management of health transport services are crucial for improving accessibility and enhancing the quality of healthcare. This study focuses on the choice of determinant variables in the prediction of health transport demand using data mining and analysis techniques. Specifically, health transport services data from Asturias, spanning a seven-year period, are analyzed with the aim of developing accurate predictive models. The problem at hand requires the handling of large volumes of data and multiple predictor variables, leading to challenges in computational cost and interpretation of the results. Therefore, data mining techniques are applied to identify the most relevant variables in the design of predictive models. This approach allows for reducing the computational cost without sacrificing prediction accuracy. The findings of this study underscore that the selection of significant variables is essential for optimizing medical transport resources and improving the planning of emergency services. With the most relevant variables identified, a balance between prediction accuracy and computational efficiency is achieved. As a result, improved service management is observed to lead to increased accessibility to health services and better resource planning.
]]>Machine Learning and Knowledge Extraction doi: 10.3390/make6010004
Authors: Adnan Alagic Natasa Zivic Esad Kadusic Dzenan Hamzic Narcisa Hadzajlic Mejra Dizdarevic Elmedin Selmanovic
The number of loan requests is rapidly growing worldwide representing a multi-billion-dollar business in the credit approval industry. Large data volumes extracted from the banking transactions that represent customers’ behavior are available, but processing loan applications is a complex and time-consuming task for banking institutions. In 2022, over 20 million Americans had open loans, totaling USD 178 billion in debt, although over 20% of loan applications were rejected. Numerous statistical methods have been deployed to estimate loan risks opening the field to estimate whether machine learning techniques can better predict the potential risks. To study the machine learning paradigm in this sector, the mental health dataset and loan approval dataset presenting survey results from 1991 individuals are used as inputs to experiment with the credit risk prediction ability of the chosen machine learning algorithms. Giving a comprehensive comparative analysis, this paper shows how the chosen machine learning algorithms can distinguish between normal and risky loan customers who might never pay their debts back. The results from the tested algorithms show that XGBoost achieves the highest accuracy of 84% in the first dataset, surpassing gradient boost (83%) and KNN (83%). In the second dataset, random forest achieved the highest accuracy of 85%, followed by decision tree and KNN with 83%. Alongside accuracy, the precision, recall, and overall performance of the algorithms were tested and a confusion matrix analysis was performed producing numerical results that emphasized the superior performance of XGBoost and random forest in the classification tasks in the first dataset, and XGBoost and decision tree in the second dataset. Researchers and practitioners can rely on these findings to form their model selection process and enhance the accuracy and precision of their classification models.
]]>Machine Learning and Knowledge Extraction doi: 10.3390/make6010003
Authors: Kuangsheng Cai Zugang Chen Hengliang Guo Shaohua Wang Guoqing Li Jing Li Feng Chen Hang Feng
Semantic folding theory (SFT) is an emerging cognitive science theory that aims to explain how the human brain processes and organizes semantic information. The distribution of text into semantic grids is key to SFT. We propose a sentence-level semantic division baseline with 100 grids (SSDB-100), the only dataset we are currently aware of that performs a relevant validation of the sentence-level SFT algorithm, to evaluate the validity of text distribution in semantic grids and divide it using classical division algorithms on SSDB-100. In this article, we describe the construction of SSDB-100. First, a semantic division questionnaire with broad coverage was generated by limiting the uncertainty range of the topics and corpus. Subsequently, through an expert survey, 11 human experts provided feedback. Finally, we analyzed and processed the feedback; the average consistency index for the used feedback was 0.856 after eliminating the invalid feedback. SSDB-100 has 100 semantic grids with clear distinctions between the grids, allowing the dataset to be extended using semantic methods.
]]>Machine Learning and Knowledge Extraction doi: 10.3390/make6010002
Authors: Abhijeet Kumar Anirban Guha Sauvik Banerjee
While machine learning (ML) has been quite successful in the field of structural health monitoring (SHM), its practical implementation has been limited. This is because ML model training requires data containing a variety of distinct instances of damage captured from a real structure and the experimental generation of such data is challenging. One way to tackle this issue is by generating training data through numerical simulations. However, simulated data cannot capture the bias and variance of experimental uncertainty. To overcome this problem, this work proposes a deep-learning-based domain transformation method for transforming simulated data to the experimental domain. Use of this technique has been demonstrated for debonding location and size predictions of stiffened panels using a vibration-based method. The results are satisfactory for both debonding location and size prediction. This domain transformation method can be used in any field in which experimental data for training machine-learning models is scarce.
]]>Machine Learning and Knowledge Extraction doi: 10.3390/make6010001
Authors: Devang Mehta Noah Klarmann
Manufacturing industries require the efficient and voluminous production of high-quality finished goods. In the context of Industry 4.0, visual anomaly detection poses an optimistic solution for automatically controlled product quality with high precision. In general, automation based on computer vision is a promising solution to prevent bottlenecks at the product quality checkpoint. We considered recent advancements in machine learning to improve visual defect localization, but challenges persist in obtaining a balanced feature set and database of the wide variety of defects occurring in the production line. Hence, this paper proposes a defect localizing autoencoder with unsupervised class selection by clustering with k-means the features extracted from a pretrained VGG16 network. Moreover, the selected classes of defects are augmented with natural wild textures to simulate artificial defects. The study demonstrates the effectiveness of the defect localizing autoencoder with unsupervised class selection for improving defect detection in manufacturing industries. The proposed methodology shows promising results with precise and accurate localization of quality defects on melamine-faced boards for the furniture industry. Incorporating artificial defects into the training data shows significant potential for practical implementation in real-world quality control scenarios.
]]>Machine Learning and Knowledge Extraction doi: 10.3390/make5040095
Authors: Jose Dixon Md Rahman
The overall purpose of this paper is to demonstrate how data preprocessing, training size variation, and subsampling can dynamically change the performance metrics of imbalanced text classification. The methodology encompasses using two different supervised learning classification approaches of feature engineering and data preprocessing with the use of five machine learning classifiers, five imbalanced sampling techniques, specified intervals of training and subsampling sizes, statistical analysis using R and tidyverse on a dataset of 1000 portable document format files divided into five labels from the World Health Organization Coronavirus Research Downloadable Articles of COVID-19 papers and PubMed Central databases of non-COVID-19 papers for binary classification that affects the performance metrics of precision, recall, receiver operating characteristic area under the curve, and accuracy. One approach that involves labeling rows of sentences based on regular expressions significantly improved the performance of imbalanced sampling techniques verified by performing statistical analysis using a t-test documenting performance metrics of iterations versus another approach that automatically labels the sentences based on how the documents are organized into positive and negative classes. The study demonstrates the effectiveness of ML classifiers and sampling techniques in text classification datasets, with different performance levels and class imbalance issues observed in manual and automatic methods of data processing.
]]>Machine Learning and Knowledge Extraction doi: 10.3390/make5040094
Authors: Sakorn Mekruksavanich Anuchit Jitpattanakul
Epileptic seizures are a prevalent neurological condition that impacts a considerable portion of the global population. Timely and precise identification can result in as many as 70% of individuals achieving freedom from seizures. To achieve this, there is a pressing need for smart, automated systems to assist medical professionals in identifying neurological disorders correctly. Previous efforts have utilized raw electroencephalography (EEG) data and machine learning techniques to classify behaviors in patients with epilepsy. However, these studies required expertise in clinical domains like radiology and clinical procedures for feature extraction. Traditional machine learning for classification relied on manual feature engineering, limiting performance. Deep learning excels at automated feature learning directly from raw data sans human effort. For example, deep neural networks now show promise in analyzing raw EEG data to detect seizures, eliminating intensive clinical or engineering needs. Though still emerging, initial studies demonstrate practical applications across medical domains. In this work, we introduce a novel deep residual model called ResNet-BiGRU-ECA, analyzing brain activity through EEG data to accurately identify epileptic seizures. To evaluate our proposed deep learning model’s efficacy, we used a publicly available benchmark dataset on epilepsy. The results of our experiments demonstrated that our suggested model surpassed both the basic model and cutting-edge deep learning models, achieving an outstanding accuracy rate of 0.998 and the top F1-score of 0.998.
]]>Machine Learning and Knowledge Extraction doi: 10.3390/make5040093
Authors: Hossein Hassani Nadejda Komendantova Elena Rovenskaya Mohammad Reza Yeganegi
Social trend mining, situated at the confluence of data science and social research, provides a novel lens through which to examine societal dynamics and emerging trends. This paper explores the intricate landscape of social trend mining, with a specific emphasis on discerning leading and lagging trends. Within this context, our study employs social trend mining techniques to scrutinize X (formerly Twitter) data pertaining to risk management, earthquakes, and disasters. A comprehensive comprehension of how individuals perceive the significance of these pivotal facets within disaster risk management is essential for shaping policies that garner public acceptance. This paper sheds light on the intricacies of public sentiment and provides valuable insights for policymakers and researchers alike.
]]>Machine Learning and Knowledge Extraction doi: 10.3390/make5040092
Authors: Faraz Ahmad Massimo Ferri Patrizio Frosini
This paper is part of a line of research devoted to developing a compositional and geometric theory of Group Equivariant Non-Expansive Operators (GENEOs) for Geometric Deep Learning. It has two objectives. The first objective is to generalize the notions of permutants and permutant measures, originally defined for the identity of a single “perception pair”, to a map between two such pairs. The second and main objective is to extend the application domain of the whole theory, which arose in the set-theoretical and topological environments, to graphs. This is performed using classical methods of mathematical definitions and arguments. The theoretical outcome is that, both in the case of vertex-weighted and edge-weighted graphs, a coherent theory is developed. Several simple examples show what may be hoped from GENEOs and permutants in graph theory and how they can be built. Rather than being a competitor to other methods in Geometric Deep Learning, this theory is proposed as an approach that can be integrated with such methods.
]]>Machine Learning and Knowledge Extraction doi: 10.3390/make5040091
Authors: Julien Hautot Céline Teulière Nourddine Azzaoui
Visual Reinforcement Learning (RL) has been largely investigated in recent decades. Existing approaches are often composed of multiple networks requiring massive computational power to solve partially observable tasks from high-dimensional data such as images. Using State Representation Learning (SRL) has been shown to improve the performance of visual RL by reducing the high-dimensional data into compact representation, but still often relies on deep networks and on the environment. In contrast, we propose a lighter, more generic method to extract sparse and localized features from raw images without training. We achieve this using a Visual Radial Basis Function Network (VRBFN), which offers significant practical advantages, including efficient and accurate training with minimal complexity due to its two linear layers. For real-world applications, its scalability and resilience to noise are essential, as real sensors are subject to change and noise. Unlike CNNs, which may require extensive retraining, this network might only need minor fine-tuning. We test the efficiency of the VRBFN representation to solve different RL tasks using Proximal Policy Optimization (PPO). We present a large study and comparison of our extraction methods with five classical visual RL and SRL approaches on five different first-person partially observable scenarios. We show that this approach presents appealing features such as sparsity and robustness to noise and that the obtained results when training RL agents are better than other tested methods on four of the five proposed scenarios.
]]>Machine Learning and Knowledge Extraction doi: 10.3390/make5040090
Authors: Rafael Rodrigues Mendes Ribeiro Carlos Dias Maciel
A Bayesian network (BN) is a probabilistic graphical model that can model complex and nonlinear relationships. Its structural learning from data is an NP-hard problem because of its search-space size. One method to perform structural learning is a search and score approach, which uses a search algorithm and structural score. A study comparing 15 algorithms showed that hill climbing (HC) and tabu search (TABU) performed the best overall on the tests. This work performs a deeper analysis of the application of the adaptive genetic algorithm with varying population size (AGAVaPS) on the BN structural learning problem, which a preliminary test showed that it had the potential to perform well on. AGAVaPS is a genetic algorithm that uses the concept of life, where each solution is in the population for a number of iterations. Each individual also has its own mutation rate, and there is a small probability of undergoing mutation twice. Parameter analysis of AGAVaPS in BN structural leaning was performed. Also, AGAVaPS was compared to HC and TABU for six literature datasets considering F1 score, structural Hamming distance (SHD), balanced scoring function (BSF), Bayesian information criterion (BIC), and execution time. HC and TABU performed basically the same for all the tests made. AGAVaPS performed better than the other algorithms for F1 score, SHD, and BIC, showing that it can perform well and is a good choice for BN structural learning.
]]>Machine Learning and Knowledge Extraction doi: 10.3390/make5040089
Authors: Elie Neghawi Yan Liu
The rapid development of semi-supervised machine learning (SSML) algorithms has shown enhanced versatility, but pinpointing the primary influencing factors remains a challenge. Historically, deep neural networks (DNNs) have been used to underpin these algorithms, resulting in improved classification precision. This study aims to delve into the performance determinants of SSML models by employing post-hoc explainable artificial intelligence (XAI) methods. By analyzing the components of well-established SSML algorithms and comparing them to newer counterparts, this work redefines semi-supervised computation processes for both data preprocessing and classification. Integrating different types of DNNs, we evaluated the effects of parameter adjustments during training across varied labeled and unlabeled data proportions. Our analysis of 45 experiments showed a notable 8% drop in training loss and a 6.75% enhancement in learning precision when using the Shake-Shake26 classifier with the RemixMatch SSML algorithm. Additionally, our findings suggest a strong positive relationship between the amount of labeled data and training duration, indicating that more labeled data leads to extended training periods, which further influences parameter adjustments in learning processes.
]]>Machine Learning and Knowledge Extraction doi: 10.3390/make5040088
Authors: Horacio Rodriguez-Bazan Grigori Sidorov Ponciano Jorge Escamilla-Ambrosio
The proliferation of Android-based devices has brought about an unprecedented surge in mobile application usage, making the Android ecosystem a prime target for cybercriminals. In this paper, a new method for Android malware classification is proposed. The method implements a convolutional neural network for malware classification using images. The research presents a novel approach to transforming the Android Application Package (APK) into a grayscale image. The image creation utilizes natural language processing techniques for text cleaning, extraction, and fuzzy hashing to represent the decompiled code from the APK in a set of hashes after preprocessing, where the image is composed of n fuzzy hashes that represent an APK. The method was tested on an Android malware dataset with 15,493 samples of five malware types. The proposed method showed an increase in accuracy compared to others in the literature, achieving up to 98.24% in the classification task.
]]>Machine Learning and Knowledge Extraction doi: 10.3390/make5040087
Authors: Borna Feldsar Rudolf Mayer Andreas Rauber
Deep Learning has enabled significant progress towards more accurate predictions and is increasingly integrated into our everyday lives in real-world applications; this is true especially for Convolutional Neural Networks (CNNs) in the field of image analysis. Nevertheless, it has been shown that Deep Learning is vulnerable against well-crafted, small perturbations to the input, i.e., adversarial examples. Defending against such attacks is therefore crucial to ensure the proper functioning of these models—especially when autonomous decisions are taken in safety-critical applications, such as autonomous vehicles. In this work, shallow machine learning models, such as Logistic Regression and Support Vector Machine, are utilised as surrogates of a CNN based on the assumption that they would be differently affected by the minute modifications crafted for CNNs. We develop three detection strategies for adversarial examples by analysing differences in the prediction of the surrogate and the CNN model: namely, deviation in (i) the prediction, (ii) the distance of the predictions, and (iii) the confidence of the predictions. We consider three different feature spaces: raw images, extracted features, and the activations of the CNN model. Our evaluation shows that our methods achieve state-of-the-art performance compared to other approaches, such as Feature Squeezing, MagNet, PixelDefend, and Subset Scanning, on the MNIST, Fashion-MNIST, and CIFAR-10 datasets while being robust in the sense that they do not entirely fail against selected single attacks. Further, we evaluate our defence against an adaptive attacker in a grey-box setting.
]]>Machine Learning and Knowledge Extraction doi: 10.3390/make5040086
Authors: Gili Rosenberg John Kyle Brubaker Martin J. A. Schuetz Grant Salton Zhihuai Zhu Elton Yechao Zhu Serdar Kadıoğlu Sima E. Borujeni Helmut G. Katzgraber
We propose and implement an interpretable machine learning classification model for Explainable AI (XAI) based on expressive Boolean formulas. Potential applications include credit scoring and diagnosis of medical conditions. The Boolean formula defines a rule with tunable complexity (or interpretability) according to which input data are classified. Such a formula can include any operator that can be applied to one or more Boolean variables, thus providing higher expressivity compared to more rigid rule- and tree-based approaches. The classifier is trained using native local optimization techniques, efficiently searching the space of feasible formulas. Shallow rules can be determined by fast Integer Linear Programming (ILP) or Quadratic Unconstrained Binary Optimization (QUBO) solvers, potentially powered by special-purpose hardware or quantum devices. We combine the expressivity and efficiency of the native local optimizer with the fast operation of these devices by executing non-local moves that optimize over the subtrees of the full Boolean formula. We provide extensive numerical benchmarking results featuring several baselines on well-known public datasets. Based on the results, we find that the native local rule classifier is generally competitive with the other classifiers. The addition of non-local moves achieves similar results with fewer iterations. Therefore, using specialized or quantum hardware could lead to a significant speedup through the rapid proposal of non-local moves.
]]>Machine Learning and Knowledge Extraction doi: 10.3390/make5040085
Authors: Jonathan Plangger Mohamed Atia Hicham Chaoui
In this paper, we present a comparative study of modern semantic segmentation loss functions and their resultant impact when applied with state-of-the-art off-road datasets. Class imbalance, inherent in these datasets, presents a significant challenge to off-road terrain semantic segmentation systems. With numerous environment classes being extremely sparse and underrepresented, model training becomes inefficient and struggles to comprehend the infrequent minority classes. As a solution to this problem, loss functions have been configured to take class imbalance into account and counteract this issue. To this end, we present a novel loss function, Focal Class-based Intersection over Union (FCIoU), which directly targets performance imbalance through the optimization of class-based Intersection over Union (IoU). The new loss function results in a general increase in class-based performance when compared to state-of-the-art targeted loss functions.
]]>Machine Learning and Knowledge Extraction doi: 10.3390/make5040084
Authors: Leandro L. Cunha Miguel A. Brito Domingos F. Oliveira Ana P. Martins
The cryptocurrency market has grown significantly, and this quick growth has given rise to scams. It is necessary to put fraud detection mechanisms in place. The challenge of inadequate labeling is addressed in this work, which is a barrier to the training of high-performance supervised classifiers. It aims to lessen the necessity for laborious and time-consuming manual labeling. Some unlabeled data points have labels that are more pertinent and informative for the supervised model to learn from. The viability of utilizing unsupervised anomaly detection algorithms and active learning strategies to build an iterative process of acquiring labeled transactions in a cold start scenario, where there are no initial-labeled transactions, is being investigated. Investigating anomaly detection capabilities for a subset of data that maximizes supervised models’ learning potential is the goal. The anomaly detection algorithms under performed, according to the results. The findings underscore the need that anomaly detection algorithms be reserved for situations involving cold starts. As a result, using active learning techniques would produce better outcomes and supervised machine learning model performance.
]]>Machine Learning and Knowledge Extraction doi: 10.3390/make5040083
Authors: Juan Terven Diana-Margarita Córdova-Esparza Julio-Alejandro Romero-González
YOLO has become a central real-time object detection system for robotics, driverless cars, and video monitoring applications. We present a comprehensive analysis of YOLO’s evolution, examining the innovations and contributions in each iteration from the original YOLO up to YOLOv8, YOLO-NAS, and YOLO with transformers. We start by describing the standard metrics and postprocessing; then, we discuss the major changes in network architecture and training tricks for each model. Finally, we summarize the essential lessons from YOLO’s development and provide a perspective on its future, highlighting potential research directions to enhance real-time object detection systems.
]]>Machine Learning and Knowledge Extraction doi: 10.3390/make5040082
Authors: Samuel Corecco Giorgia Adorni Luca Maria Gambardella
In an era characterised by rapid technological advancement, the application of algorithmic approaches to address complex problems has become crucial across various disciplines. Within the realm of education, there is growing recognition of the pivotal role played by computational thinking (CT). This skill set has emerged as indispensable in our ever-evolving digital landscape, accompanied by an equal need for effective methods to assess and measure these skills. This research places its focus on the Cross Array Task (CAT), an educational activity designed within the Swiss educational system to assess students’ algorithmic skills. Its primary objective is to evaluate pupils’ ability to deconstruct complex problems into manageable steps and systematically formulate sequential strategies. The CAT has proven its effectiveness as an educational tool in tracking and monitoring the development of CT skills throughout compulsory education. Additionally, this task presents an enthralling avenue for algorithmic research, owing to its inherent complexity and the necessity to scrutinise the intricate interplay between different strategies and the structural aspects of this activity. This task, deeply rooted in logical reasoning and intricate problem solving, often poses a substantial challenge for human solvers striving for optimal solutions. Consequently, the exploration of computational power to unearth optimal solutions or uncover less intuitive strategies presents a captivating and promising endeavour. This paper explores two distinct algorithmic approaches to the CAT problem. The first approach combines clustering, random search, and move selection to find optimal solutions. The second approach employs reinforcement learning techniques focusing on the Proximal Policy Optimization (PPO) model. The findings of this research hold the potential to deepen our understanding of how machines can effectively tackle complex challenges like the CAT problem but also have broad implications, particularly in educational contexts, where these approaches can be seamlessly integrated into existing tools as a tutoring mechanism, offering assistance to students encountering difficulties. This can ultimately enhance students’ CT and problem-solving abilities, leading to an enriched educational experience.
]]>Machine Learning and Knowledge Extraction doi: 10.3390/make5040081
Authors: Esraa Samkari Muhammad Arif Manal Alghamdi Mohammed A. Al Ghamdi
Human Pose Estimation (HPE) is the task that aims to predict the location of human joints from images and videos. This task is used in many applications, such as sports analysis and surveillance systems. Recently, several studies have embraced deep learning to enhance the performance of HPE tasks. However, building an efficient HPE model is difficult; many challenges, like crowded scenes and occlusion, must be handled. This paper followed a systematic procedure to review different HPE models comprehensively. About 100 articles published since 2014 on HPE using deep learning were selected using several selection criteria. Both image and video data types of methods were investigated. Furthermore, both single and multiple HPE methods were reviewed. In addition, the available datasets, different loss functions used in HPE, and pretrained feature extraction models were all covered. Our analysis revealed that Convolutional Neural Networks (CNNs) and Recurrent Neural Networks (RNNs) are the most used in HPE. Moreover, occlusion and crowd scenes remain the main problems affecting models’ performance. Therefore, the paper presented various solutions to address these issues. Finally, this paper highlighted the potential opportunities for future work in this task.
]]>Machine Learning and Knowledge Extraction doi: 10.3390/make5040080
Authors: Manzoor Hussain Jang-Eui Hong
The perception system is a safety-critical component that directly impacts the overall safety of autonomous driving systems (ADSs). It is imperative to ensure the robustness of the deep-learning model used in the perception system. However, studies have shown that these models are highly vulnerable to the adversarial perturbation of input data. The existing works mainly focused on studying the impact of these adversarial attacks on classification rather than regression models. Therefore, this paper first introduces two generalized methods for perturbation-based attacks: (1) We used naturally occurring noises to create perturbations in the input data. (2) We introduce a modified square, HopSkipJump, and decision-based/boundary attack to attack the regression models used in ADSs. Then, we propose a deep-autoencoder-based adversarial attack detector. In addition to offline evaluation metrics (e.g., F1 score and precision, etc.), we introduce an online evaluation framework to evaluate the robustness of the model under attack. The framework considers the reconstruction loss of the deep autoencoder that validates the robustness of the models under attack in an end-to-end fashion at runtime. Our experimental results showed that the proposed adversarial attack detector could detect square, HopSkipJump, and decision-based/boundary attacks with a true positive rate (TPR) of 93%.
]]>Machine Learning and Knowledge Extraction doi: 10.3390/make5040079
Authors: Michael T. Mapundu Chodziwadziwa W. Kabudula Eustasius Musenge Victor Olago Turgay Celik
Verbal autopsies (VA) are commonly used in Low- and Medium-Income Countries (LMIC) to determine cause of death (CoD) where death occurs outside clinical settings, with the most commonly used international gold standard being physician medical certification. Interviewers elicit information from relatives of the deceased, regarding circumstances and events that might have led to death. This information is stored in textual format as VA narratives. The narratives entail detailed information that can be used to determine CoD. However, this approach still remains a manual task that is costly, inconsistent, time-consuming and subjective (prone to errors), amongst many drawbacks. As such, this negatively affects the VA reporting process, despite it being vital for strengthening health priorities and informing civil registration systems. Therefore, this study seeks to close this gap by applying novel deep learning (DL) interpretable approaches for reviewing VA narratives and generate CoD prediction in a timely, easily interpretable, cost-effective and error-free way. We validate our DL models using optimisation and performance accuracy machine learning (ML) curves as a function of training samples. We report on validation with training set accuracy (LSTM = 76.11%, CNN = 76.35%, and SEDL = 82.1%), validation accuracy (LSTM = 67.05%, CNN = 66.16%, and SEDL = 82%) and test set accuracy (LSTM = 67%, CNN = 66.2%, and SEDL = 82%) for our models. Furthermore, we also present Local Interpretable Model-agnostic Explanations (LIME) for ease of interpretability of the results, thereby building trust in the use of machines in healthcare. We presented robust deep learning methods to determine CoD from VAs, with the stacked ensemble deep learning (SEDL) approaches performing optimally and better than Long Short-Term Memory (LSTM) and Convolutional Neural Network (CNN). Our empirical results suggest that ensemble DL methods may be integrated in the CoD process to help experts get to a diagnosis. Ultimately, this will reduce the turnaround time needed by physicians to go through the narratives in order to be able to give an appropriate diagnosis, cut costs and minimise errors. This study was limited by the number of samples needed for training our models and the high levels of lexical variability in the words used in our textual information.
]]>Machine Learning and Knowledge Extraction doi: 10.3390/make5040078
Authors: Evaldo Jorge Alcántara Suárez Victor Monzon Baeza
Machine learning (ML) has become a critical technology in the defense sector, enabling the development of advanced systems for threat detection, decision making, and autonomous operations. However, the increasing ML use in defense systems has raised ethical concerns related to accountability, transparency, and bias. In this paper, we provide a comprehensive analysis of the impact of ML on the defense sector, including the benefits and drawbacks of using ML in various applications such as surveillance, target identification, and autonomous weapons systems. We also discuss the ethical implications of using ML in defense, focusing on privacy, accountability, and bias issues. Finally, we present recommendations for mitigating these ethical concerns, including increased transparency, accountability, and stakeholder involvement in designing and deploying ML systems in the defense sector.
]]>Machine Learning and Knowledge Extraction doi: 10.3390/make5040077
Authors: Amnon Bleich Antje Linnemann Benjamin Jaidi Björn H. Diem Tim O. F. Conrad
Implantable Cardiac Monitor (ICM) devices are demonstrating, as of today, the fastest-growing market for implantable cardiac devices. As such, they are becoming increasingly common in patients for measuring heart electrical activity. ICMs constantly monitor and record a patient’s heart rhythm, and when triggered, send it to a secure server where health care professionals (HCPs) can review it. These devices employ a relatively simplistic rule-based algorithm (due to energy consumption constraints) to make alerts for abnormal heart rhythms. This algorithm is usually parameterized to an over-sensitive mode in order to not miss a case (resulting in a relatively high false-positive rate), and this, combined with the device’s nature of constantly monitoring the heart rhythm and its growing popularity, results in HCPs having to analyze and diagnose an increasingly growing number of data. In order to reduce the load on the latter, automated methods for ECG analysis are nowadays becoming a great tool to assist HCPs in their analysis. While state-of-the-art algorithms are data-driven rather than rule-based, training data for ICMs often consist of specific characteristics that make their analysis unique and particularly challenging. This study presents the challenges and solutions in automatically analyzing ICM data and introduces a method for its classification that outperforms existing methods on such data. It carries this out by combining high-frequency noise detection (which often occurs in ICM data) with a semi-supervised learning pipeline that allows for the re-labeling of training episodes and by using segmentation and dimension-reduction techniques that are robust to morphology variations of the sECG signal (which are typical to ICM data). As a result, it performs better than state-of-the-art techniques on such data with, e.g., an F1 score of 0.51 vs. 0.38 of our baseline state-of-the-art technique in correctly calling atrial fibrillation in ICM data. As such, it could be used in numerous ways, such as aiding HCPs in the analysis of ECGs originating from ICMs by, e.g., suggesting a rhythm type.
]]>Machine Learning and Knowledge Extraction doi: 10.3390/make5040076
Authors: Louisa Heidrich Emanuel Slany Stephan Scheele Ute Schmid
The rise of machine-learning applications in domains with critical end-user impact has led to a growing concern about the fairness of learned models, with the goal of avoiding biases that negatively impact specific demographic groups. Most existing bias-mitigation strategies adapt the importance of data instances during pre-processing. Since fairness is a contextual concept, we advocate for an interactive machine-learning approach that enables users to provide iterative feedback for model adaptation. Specifically, we propose to adapt the explanatory interactive machine-learning approach Caipi for fair machine learning. FairCaipi incorporates human feedback in the loop on predictions and explanations to improve the fairness of the model. Experimental results demonstrate that FairCaipi outperforms a state-of-the-art pre-processing bias mitigation strategy in terms of the fairness and the predictive performance of the resulting machine-learning model. We show that FairCaipi can both uncover and reduce bias in machine-learning models and allows us to detect human bias.
]]>Machine Learning and Knowledge Extraction doi: 10.3390/make5040075
Authors: Ruchita Mehta Sara Sharifzadeh Vasile Palade Bo Tan Alireza Daneshkhah Yordanka Karayaneva
Human capability to perform routine tasks declines with age and age-related problems. Remote human activity recognition (HAR) is beneficial for regular monitoring of the elderly population. This paper addresses the problem of the continuous detection of daily human activities using a mm-wave Doppler radar. In this study, two strategies have been employed: the first method uses un-equalized series of activities, whereas the second method utilizes a gradient-based strategy for equalization of the series of activities. The dynamic time warping (DTW) algorithm and Long Short-term Memory (LSTM) techniques have been implemented for the classification of un-equalized and equalized series of activities, respectively. The input for DTW was provided using three strategies. The first approach uses the pixel-level data of frames (UnSup-PLevel). In the other two strategies, a convolutional variational autoencoder (CVAE) is used to extract Un-Supervised Encoded features (UnSup-EnLevel) and Supervised Encoded features (Sup-EnLevel) from the series of Doppler frames. The second approach for equalized data series involves the application of four distinct feature extraction methods: i.e., convolutional neural networks (CNN), supervised and unsupervised CVAE, and principal component Analysis (PCA). The extracted features were considered as an input to the LSTM. This paper presents a comparative analysis of a novel supervised feature extraction pipeline, employing Sup-ENLevel-DTW and Sup-EnLevel-LSTM, against several state-of-the-art unsupervised methods, including UnSUp-EnLevel-DTW, UnSup-EnLevel-LSTM, CNN-LSTM, and PCA-LSTM. The results demonstrate the superiority of the Sup-EnLevel-LSTM strategy. However, the UnSup-PLevel strategy worked surprisingly well without using annotations and frame equalization.
]]>Machine Learning and Knowledge Extraction doi: 10.3390/make5040074
Authors: Joel Arweiler Cihan Ates Jesus Cerquides Rainer Koch Hans-Jörg Bauer
The inherent dependency of deep learning models on labeled data is a well-known problem and one of the barriers that slows down the integration of such methods into different fields of applied sciences and engineering, in which experimental and numerical methods can easily generate a colossal amount of unlabeled data. This paper proposes an unsupervised domain adaptation methodology that mimics the peer review process to label new observations in a different domain from the training set. The approach evaluates the validity of a hypothesis using domain knowledge acquired from the training set through a similarity analysis, exploring the projected feature space to examine the class centroid shifts. The methodology is tested on a binary classification problem, where synthetic images of cubes and cylinders in different orientations are generated. The methodology improves the accuracy of the object classifier from 60% to around 90% in the case of a domain shift in physical feature space without human labeling.
]]>Machine Learning and Knowledge Extraction doi: 10.3390/make5040073
Authors: Miguel S. Soriano-Garcia Ricardo Sevilla-Escoboza Angel Garcia-Pedrero
Generative Adversarial Networks are powerful generative models that are used in different areas and with multiple applications. However, this type of model has a training problem called mode collapse. This problem causes the generator to not learn the complete distribution of the data with which it is trained. To force the network to learn the entire data distribution, MSSGAN is introduced. This model has multiple generators and distributes the training data in multiple subspaces, where each generator is enforced to learn only one of the groups with the help of a classifier. We demonstrate that our model performs better on the FID and Sample Distribution metrics compared to previous models to avoid mode collapse. Experimental results show how each of the generators learns different information and, in turn, generates satisfactory quality samples.
]]>Machine Learning and Knowledge Extraction doi: 10.3390/make5040072
Authors: Robert S. Sullivan Luca Longo
Reinforcement Learning (RL) has shown promise in optimizing complex control and decision-making processes but Deep Reinforcement Learning (DRL) lacks interpretability, limiting its adoption in regulated sectors like manufacturing, finance, and healthcare. Difficulties arise from DRL’s opaque decision-making, hindering efficiency and resource use, this issue is amplified with every advancement. While many seek to move from Experience Replay to A3C, the latter demands more resources. Despite efforts to improve Experience Replay selection strategies, there is a tendency to keep the capacity high. We investigate training a Deep Convolutional Q-learning agent across 20 Atari games intentionally reducing Experience Replay capacity from 1×106 to 5×102. We find that a reduction from 1×104 to 5×103 doesn’t significantly affect rewards, offering a practical path to resource-efficient DRL. To illuminate agent decisions and align them with game mechanics, we employ a novel method: visualizing Experience Replay via Deep SHAP Explainer. This approach fosters comprehension and transparent, interpretable explanations, though any capacity reduction must be cautious to avoid overfitting. Our study demonstrates the feasibility of reducing Experience Replay and advocates for transparent, interpretable decision explanations using the Deep SHAP Explainer to promote enhancing resource efficiency in Experience Replay.
]]>Machine Learning and Knowledge Extraction doi: 10.3390/make5040071
Authors: Veronika Smejkalová Radovan Šomplák Martin Rosecký Kristína Šramková
Analysis of data is crucial in waste management to improve effective planning from both short- and long-term perspectives. Real-world data often presents anomalies, but in the waste management sector, anomaly detection is seldom performed. The main goal and contribution of this paper is a proposal of a complex machine learning framework for changepoint detection in a large number of short time series from waste management. In such a case, it is not possible to use only an expert-based approach due to the time-consuming nature of this process and subjectivity. The proposed framework consists of two steps: (1) outlier detection via outlier test for trend-adjusted data, and (2) changepoints are identified via comparison of linear model parameters. In order to use the proposed method, it is necessary to have a sufficient number of experts’ assessments of the presence of anomalies in time series. The proposed framework is demonstrated on waste management data from the Czech Republic. It is observed that certain waste categories in specific regions frequently exhibit changepoints. On the micro-regional level, approximately 31.1% of time series contain at least one outlier and 16.4% exhibit changepoints. Certain groups of waste are more prone to the occurrence of anomalies. The results indicate that even in the case of aggregated data, anomalies are not rare, and their presence should always be checked.
]]>Machine Learning and Knowledge Extraction doi: 10.3390/make5040070
Authors: Mohammed Lansari Reda Bellafqira Katarzyna Kapusta Vincent Thouvenot Olivier Bettan Gouenou Coatrieux
Federated learning (FL) is a technique that allows multiple participants to collaboratively train a Deep Neural Network (DNN) without the need to centralize their data. Among other advantages, it comes with privacy-preserving properties, making it attractive for application in sensitive contexts, such as health care or the military. Although the data are not explicitly exchanged, the training procedure requires sharing information about participants’ models. This makes the individual models vulnerable to theft or unauthorized distribution by malicious actors. To address the issue of ownership rights protection in the context of machine learning (ML), DNN watermarking methods have been developed during the last five years. Most existing works have focused on watermarking in a centralized manner, but only a few methods have been designed for FL and its unique constraints. In this paper, we provide an overview of recent advancements in federated learning watermarking, shedding light on the new challenges and opportunities that arise in this field.
]]>Machine Learning and Knowledge Extraction doi: 10.3390/make5040069
Authors: Bahareh Najafi Saeedeh Parsaeefard Alberto Leon-Garcia
This paper addresses the problem of learning temporal graph representations, which capture the changing nature of complex evolving networks. Existing approaches mainly focus on adding new nodes and edges to capture dynamic graph structures. However, to achieve more accurate representation of graph evolution, we consider both the addition and deletion of nodes and edges as events. These events occur at irregular time scales and are modeled using temporal point processes. Our goal is to learn the conditional intensity function of the temporal point process to investigate the influence of deletion events on node representation learning for link-level prediction. We incorporate network entropy, a measure of node and edge significance, to capture the effect of node deletion and edge removal in our framework. Additionally, we leveraged the characteristics of a generalized temporal Hawkes process, which considers the inhibitory effects of events where past occurrences can reduce future intensity. This framework enables dynamic representation learning by effectively modeling both addition and deletion events in the temporal graph. To evaluate our approach, we utilize autonomous system graphs, a family of inhomogeneous sparse graphs with instances of node and edge additions and deletions, in a link prediction task. By integrating these enhancements into our framework, we improve the accuracy of dynamic link prediction and enable better understanding of the dynamic evolution of complex networks.
]]>Machine Learning and Knowledge Extraction doi: 10.3390/make5040068
Authors: Cristian Ubal Gustavo Di-Giorgi Javier E. Contreras-Reyes Rodrigo Salas
Long-term dependence is an essential feature for the predictability of time series. Estimating the parameter that describes long memory is essential to describing the behavior of time series models. However, most long memory estimation methods assume that this parameter has a constant value throughout the time series, and do not consider that the parameter may change over time. In this work, we propose an automated methodology that combines the estimation methodologies of the fractional differentiation parameter (and/or Hurst parameter) with its application to Recurrent Neural Networks (RNNs) in order for said networks to learn and predict long memory dependencies from information obtained in nonlinear time series. The proposal combines three methods that allow for better approximation in the prediction of the values of the parameters for each one of the windows obtained, using Recurrent Neural Networks as an adaptive method to learn and predict the dependencies of long memory in Time Series. For the RNNs, we have evaluated four different architectures: the Simple RNN, LSTM, the BiLSTM, and the GRU. These models are built from blocks with gates controlling the cell state and memory. We have evaluated the proposed approach using both synthetic and real-world data sets. We have simulated ARFIMA models for the synthetic data to generate several time series by varying the fractional differentiation parameter. We have evaluated the proposed approach using synthetic and real datasets using Whittle’s estimates of the Hurst parameter classically obtained in each window. We have simulated ARFIMA models in such a way that the synthetic data generate several time series by varying the fractional differentiation parameter. The real-world IPSA stock option index and Tree Ringtime series datasets were evaluated. All of the results show that the proposed approach can predict the Hurst exponent with good performance by selecting the optimal window size and overlap change.
]]>Machine Learning and Knowledge Extraction doi: 10.3390/make5040067
Authors: Saman Sarraf Milton Kabia
This study introduces an optimal topology of vision transformers for real-time video action recognition in a cloud-based solution. Although model performance is a key criterion for real-time video analysis use cases, inference latency plays a more crucial role in adopting such technology in real-world scenarios. Our objective is to reduce the inference latency of the solution while admissibly maintaining the vision transformer’s performance. Thus, we employed the optimal cloud components as the foundation of our machine learning pipeline and optimized the topology of vision transformers. We utilized UCF101, including more than one million action recognition video clips. The modeling pipeline consists of a preprocessing module to extract frames from video clips, training two-dimensional (2D) vision transformer models, and deep learning baselines. The pipeline also includes a postprocessing step to aggregate the frame-level predictions to generate the video-level predictions at inference. The results demonstrate that our optimal vision transformer model with an input dimension of 56 × 56 × 3 with eight attention heads produces an F1 score of 91.497% for the testing set. The optimized vision transformer reduces the inference latency by 40.70%, measured through a batch-processing approach, with a 55.63% faster training time than the baseline. Lastly, we developed an enhanced skip-frame approach to improve the inference latency by finding an optimal ratio of frames for prediction at inference, where we could further reduce the inference latency by 57.15%. This study reveals that the vision transformer model is highly optimizable for inference latency while maintaining the model performance.
]]>Machine Learning and Knowledge Extraction doi: 10.3390/make5040066
Authors: Yashwanth Karthik Kumar Mamidi Tarun Karthik Kumar Mamidi Md Wasi Ul Kabir Jiande Wu Md Tamjidul Hoque Chindo Hicks
A critical unmet medical need in prostate cancer (PCa) clinical management centers around distinguishing indolent from aggressive tumors. Traditionally, Gleason grading has been utilized for this purpose. However, tumor classification using Gleason Grade 7 is often ambiguous, as the clinical behavior of these tumors follows a variable clinical course. This study aimed to investigate the application of machine learning techniques (ML) to classify patients into indolent and aggressive PCas. We used gene expression data from The Cancer Genome Atlas and compared gene expression levels between indolent and aggressive tumors to identify features for developing and validating a range of ML and stacking algorithms. ML algorithms accurately distinguished indolent from aggressive PCas. With the accuracy of 96%, the stacking model was superior to individual ML algorithms when all samples with primary Gleason Grades 6 to 10 were used. Excluding samples with Gleason Grade 7 improved accuracy to 97%. This study shows that ML algorithms and stacking models are powerful approaches for the accurate classification of indolent versus aggressive PCas. Future implementation of this methodology may significantly impact clinical decision making and patient outcomes in the clinical management of prostate cancer.
]]>Machine Learning and Knowledge Extraction doi: 10.3390/make5040065
Authors: Franc Lavrič Andrej Škraba
A modification of the brainstorming process by the application of artificial intelligence (AI) was proposed. Here, we describe the design of the software system “kresilnik”, which enables hybrid work between a human group and AI. The proposed system integrates the Open AI-GPT-3.5–turbo model with the server side providing the results to clients. The proposed architecture provides the possibility to not only generate ideas but also categorize them and set priorities. With the developed prototype, 760 ideas were generated on the topic of the design of the Gorenjska region’s development plan with eight different temperatures with the OpenAI-GPT-3.5-turbo algorithm. For the set of generated ideas, the entropy was determined, as well as the time needed for their generation. The distributions of the entropy of the ideas generated by the human-generated and the AI-generated sets of ideas of the OpenAI-GPT-3.5–turbo algorithm at different temperatures are provided in the form of histograms. Ideas are presented as word clouds and histograms for the human group and the AI-generated sets. A comparison of the process of generating ideas between the human group and AI was conducted. The statistical Mann-Whitney U-test was performed, which confirmed the significant differences in the average entropy of the generated ideas. Correlations between the length of the generated ideas and the time needed were determined for the human group and AI. The distributions for the time needed and the length of the ideas were determined, which are possible indicators to distinguish between human and artificial processes of generating ideas.
]]>Machine Learning and Knowledge Extraction doi: 10.3390/make5040064
Authors: Oliver Lohaj Ján Paralič Peter Bednár Zuzana Paraličová Matúš Huba
Machine learning (ML) has been used in different ways in the fight against COVID-19 disease. ML models have been developed, e.g., for diagnostic or prognostic purposes and using various modalities of data (e.g., textual, visual, or structured). Due to the many specific aspects of this disease and its evolution over time, there is still not enough understanding of all relevant factors influencing the course of COVID-19 in particular patients. In all aspects of our work, there was a strong involvement of a medical expert following the human-in-the-loop principle. This is a very important but usually neglected part of the ML and knowledge extraction (KE) process. Our research shows that explainable artificial intelligence (XAI) may significantly support this part of ML and KE. Our research focused on using ML for knowledge extraction in two specific scenarios. In the first scenario, we aimed to discover whether adding information about the predominant COVID-19 variant impacts the performance of the ML models. In the second scenario, we focused on prognostic classification models concerning the need for an intensive care unit for a given patient in connection with different explainability AI (XAI) methods. We have used nine ML algorithms, namely XGBoost, CatBoost, LightGBM, logistic regression, Naive Bayes, random forest, SGD, SVM-linear, and SVM-RBF. We measured the performance of the resulting models using precision, accuracy, and AUC metrics. Subsequently, we focused on knowledge extraction from the best-performing models using two different approaches as follows: (a) features extracted automatically by forward stepwise selection (FSS); (b) attributes and their interactions discovered by model explainability methods. Both were compared with the attributes selected by the medical experts in advance based on the domain expertise. Our experiments showed that adding information about the COVID-19 variant did not influence the performance of the resulting ML models. It also turned out that medical experts were much more precise in the identification of significant attributes than FSS. Explainability methods identified almost the same attributes as a medical expert and interesting interactions among them, which the expert discussed from a medical point of view. The results of our research and their consequences are discussed.
]]>Machine Learning and Knowledge Extraction doi: 10.3390/make5040063
Authors: Nurudin Alvarez-Gonzalez Andreas Kaltenbrunner Vicenç Gómez
Identifying similar network structures is key to capturing graph isomorphisms and learning representations that exploit structural information encoded in graph data. This work shows that ego networks can produce a structural encoding scheme for arbitrary graphs with greater expressivity than the Weisfeiler–Lehman (1-WL) test. We introduce IGEL, a preprocessing step to produce features that augment node representations by encoding ego networks into sparse vectors that enrich message passing (MP) graph neural networks (GNNs) beyond 1-WL expressivity. We formally describe the relation between IGEL and 1-WL, and characterize its expressive power and limitations. Experiments show that IGEL matches the empirical expressivity of state-of-the-art methods on isomorphism detection while improving performance on nine GNN architectures and six graph machine learning tasks.
]]>Machine Learning and Knowledge Extraction doi: 10.3390/make5030062
Authors: Chandana Priya Nivarthi Stephan Vogt Bernhard Sick
Typically, renewable-power-generation forecasting using machine learning involves creating separate models for each photovoltaic or wind park, known as single-task learning models. However, transfer learning has gained popularity in recent years, as it allows for the transfer of knowledge from source parks to target parks. Nevertheless, determining the most similar source park(s) for transfer learning can be challenging, particularly when the target park has limited or no historical data samples. To address this issue, we propose a multi-task learning architecture that employs a Unified Autoencoder (UAE) to initially learn a common representation of input weather features among tasks and then utilizes a Task-Embedding layer in a Neural Network (TENN) to learn task-specific information. This proposed UAE-TENN architecture can be easily extended to new parks with or without historical data. We evaluate the performance of our proposed architecture and compare it to single-task learning models on six photovoltaic and wind farm datasets consisting of a total of 529 parks. Our results show that the UAE-TENN architecture significantly improves power-forecasting performance by 10 to 19% for photovoltaic parks and 5 to 15% for wind parks compared to baseline models. We also demonstrate that UAE-TENN improves forecast accuracy for a new park by 19% for photovoltaic parks, even in a zero-shot learning scenario where there is no historical data. Additionally, we propose variants of the Unified Autoencoder with convolutional and LSTM layers, compare their performance, and provide a comparison among architectures with different numbers of task-embedding dimensions. Finally, we demonstrate the utility of trained task embeddings for interpretation and visualization purposes.
]]>Machine Learning and Knowledge Extraction doi: 10.3390/make5030061
Authors: Mohammad H. Alshayeji
Thyroid disease is among the most prevalent endocrinopathies worldwide. As the thyroid gland controls human metabolism, thyroid illness is a matter of concern for human health. To save time and reduce error rates, an automatic, reliable, and accurate thyroid identification machine-learning (ML) system is essential. The proposed model aims to address existing work limitations such as the lack of detailed feature analysis, visualization, improvement in prediction accuracy, and reliability. Here, a public thyroid illness dataset containing 29 clinical features from the University of California, Irvine ML repository was used. The clinical features helped us to build an ML model that can predict thyroid illness by analyzing early symptoms and replacing the manual analysis of these attributes. Feature analysis and visualization facilitate an understanding of the role of features in thyroid prediction tasks. In addition, the overfitting problem was eliminated by 5-fold cross-validation and data balancing using the synthetic minority oversampling technique (SMOTE). Ensemble learning ensures prediction model reliability owing to the involvement of multiple classifiers in the prediction decisions. The proposed model achieved 99.5% accuracy, 99.39% sensitivity, and 99.59% specificity with the boosting method which is applicable to real-time computer-aided diagnosis (CAD) systems to ease diagnosis and promote early treatment.
]]>Machine Learning and Knowledge Extraction doi: 10.3390/make5030060
Authors: Sarwat Ali M. Arif Wani
One of the challenges in deep learning involves discovering the optimal architecture for a specific task. This is effectively tackled through Neural Architecture Search (NAS). Neural Architecture Search encompasses three prominent approaches—reinforcement learning, evolutionary algorithms, and gradient descent—that have demonstrated noteworthy potential in identifying good candidate architectures. However, approaches based on reinforcement learning and evolutionary algorithms often necessitate extensive computational resources, requiring hundreds of GPU days or more. Therefore, we confine this work to a gradient-based approach due to its lower computational resource demands. Our objective encompasses identifying the optimal gradient-based NAS method and pinpointing opportunities for future enhancements. To achieve this, a comprehensive evaluation of the use of four major Gradient descent-based architecture search methods for discovering the best neural architecture for image classification tasks is provided. An overview of these gradient-based methods, i.e., DARTS, PDARTS, Fair DARTS and Att-DARTS, is presented. A theoretical comparison, based on search spaces, continuous relaxation strategy and bi-level optimization, for deriving the best neural architecture is then provided. The strong and weak features of these methods are also listed. Experimental results for comparing the error rate and computational cost of these gradient-based methods are analyzed. These experiments involved using bench marking datasets CIFAR-10, CIFAR-100 and ImageNet. The results show that PDARTS is better and faster among the examined methods, making it a potent candidate for automating Neural Architecture Search. By effectively conducting a comparative analysis, our research provides valuable insights and future research directions to address the criticism and gaps in the literature.
]]>Machine Learning and Knowledge Extraction doi: 10.3390/make5030059
Authors: Taja Kuzman Igor Mozetič Nikola Ljubešić
Massive text collections are the backbone of large language models, the main ingredient of the current significant progress in artificial intelligence. However, as these collections are mostly collected using automatic methods, researchers have few insights into what types of texts they consist of. Automatic genre identification is a text classification task that enriches texts with genre labels, such as promotional and legal, providing meaningful insights into the composition of these large text collections. In this paper, we evaluate machine learning approaches for the genre identification task based on their generalizability across different datasets to assess which model is the most suitable for the downstream task of enriching large web corpora with genre information. We train and test multiple fine-tuned BERT-like Transformer-based models and show that merging different genre-annotated datasets yields superior results. Moreover, we explore the zero-shot capabilities of large GPT Transformer models in this task and discuss the advantages and disadvantages of the zero-shot approach. We also publish the best-performing fine-tuned model that enables automatic genre annotation in multiple languages. In addition, to promote further research in this area, we plan to share, upon request, a new benchmark for automatic genre annotation, ensuring the non-exposure of the latest large language models.
]]>Machine Learning and Knowledge Extraction doi: 10.3390/make5030058
Authors: Jorge E. Coyac-Torres Grigori Sidorov Eleazar Aguirre-Anaya Gerardo Hernández-Oregón
Social networks have captured the attention of many people worldwide. However, these services have also attracted a considerable number of malicious users whose aim is to compromise the digital assets of other users by using messages as an attack vector to execute different types of cyberattacks against them. This work presents an approach based on natural language processing tools and a convolutional neural network architecture to detect and classify four types of cyberattacks in social network messages, including malware, phishing, spam, and even one whose aim is to deceive a user into spreading malicious messages to other users, which, in this work, is identified as a bot attack. One notable feature of this work is that it analyzes textual content without depending on any characteristics from a specific social network, making its analysis independent of particular data sources. Finally, this work was tested on real data, demonstrating its results in two stages. The first stage detected the existence of any of the four types of cyberattacks within the message, achieving an accuracy value of 0.91. After detecting a message as a cyberattack, the next stage was to classify it as one of the four types of cyberattack, achieving an accuracy value of 0.82.
]]>Machine Learning and Knowledge Extraction doi: 10.3390/make5030057
Authors: Alexandre Hudon Kingsada Phraxayavong Stéphane Potvin Alexandre Dumais
(1) Background: Avatar Therapy (AT) is currently being studied to help patients suffering from treatment-resistant schizophrenia. Facilitating annotations of immersive verbatims in AT by using classification algorithms could be an interesting avenue to reduce the time and cost of conducting such analysis and adding objective quantitative data in the classification of the different interactions taking place during the therapy. The aim of this study is to compare the performance of machine learning algorithms in the automatic annotation of immersive session verbatims of AT. (2) Methods: Five machine learning algorithms were implemented over a dataset as per the Scikit-Learn library: Support vector classifier, Linear support vector classifier, Multinomial Naïve Bayes, Decision Tree, and Multi-layer perceptron classifier. The dataset consisted of the 27 different types of interactions taking place in AT for the Avatar and the patient for 35 patients who underwent eight immersive sessions as part of their treatment in AT. (3) Results: The Linear SVC performed best over the dataset as compared with the other algorithms with the highest accuracy score, recall score, and F1-Score. The regular SVC performed best for precision. (4) Conclusions: This study presented an objective method for classifying textual interactions based on immersive session verbatims and gave a first comparison of multiple machine learning algorithms on AT.
]]>Machine Learning and Knowledge Extraction doi: 10.3390/make5030056
Authors: Michael C. Thrun Julian Märte Quirin Stier
Dimensionality reduction methods can be used to project high-dimensional data into low-dimensional space. If the output space is restricted to two dimensions, the result is a scatter plot whose goal is to present insightful visualizations of distance- and density-based structures. The topological invariance of dimension indicates that the two-dimensional similarities in the scatter plot cannot coercively represent high-dimensional distances. In praxis, projections of several datasets with distance- and density-based structures show a misleading interpretation of the underlying structures. The examples outline that the evaluation of projections remains essential. Here, 19 unsupervised quality measurements (QM) are grouped into semantic classes with the aid of graph theory. We use three representative benchmark datasets to show that QMs fail to evaluate the projections of straightforward structures when common methods such as Principal Component Analysis (PCA), Uniform Manifold Approximation projection, or t-distributed stochastic neighbor embedding (t-SNE) are applied. This work shows that unsupervised QMs are biased towards assumed underlying structures. Based on insights gained from graph theory, we propose a new quality measurement called the Gabriel Classification Error (GCE). This work demonstrates that GCE can make an unbiased evaluation of projections. The GCE is accessible within the R package DR quality available on CRAN.
]]>Machine Learning and Knowledge Extraction doi: 10.3390/make5030055
Authors: Rebecca Potts Rick Hackney Georgios Leontidis
Predicting emissions for gas turbines is critical for monitoring harmful pollutants being released into the atmosphere. In this study, we evaluate the performance of machine learning models for predicting emissions for gas turbines. We compared an existing predictive emissions model, a first-principles-based Chemical Kinetics model, against two machine learning models we developed based on the Self-Attention and Intersample Attention Transformer (SAINT) and eXtreme Gradient Boosting (XGBoost), with the aim to demonstrate the improved predictive performance of nitrogen oxides (NOx) and carbon monoxide (CO) using machine learning techniques and determine whether XGBoost or a deep learning model performs the best on a specific real-life gas turbine dataset. Our analysis utilises a Siemens Energy gas turbine test bed tabular dataset to train and validate the machine learning models. Additionally, we explore the trade-off between incorporating more features to enhance the model complexity, and the resulting presence of increased missing values in the dataset.
]]>Machine Learning and Knowledge Extraction doi: 10.3390/make5030054
Authors: Frank Emmert-Streib
The concept of a digital twin (DT) has gained significant attention in academia and industry because of its perceived potential to address critical global challenges, such as climate change, healthcare, and economic crises. Originally introduced in manufacturing, many attempts have been made to present proper definitions of this concept. Unfortunately, there remains a great deal of confusion surrounding the underlying concept, with many scientists still uncertain about the distinction between a simulation, a mathematical model and a DT. The aim of this paper is to propose a formal definition of a digital twin. To achieve this goal, we utilize a data science framework that facilitates a functional representation of a DT and other components that can be combined together to form a larger entity we refer to as a digital twin system (DTS). In our framework, a DT is an open dynamical system with an updating mechanism, also referred to as complex adaptive system (CAS). Its primary function is to generate data via simulations, ideally, indistinguishable from its physical counterpart. On the other hand, a DTS provides techniques for analyzing data and decision-making based on the generated data. Interestingly, we find that a DTS shares similarities to the principles of general systems theory. This multi-faceted view of a DTS explains its versatility in adapting to a wide range of problems in various application domains such as engineering, manufacturing, urban planning, and personalized medicine.
]]>Machine Learning and Knowledge Extraction doi: 10.3390/make5030053
Authors: Mohammad Mohammad Amini Marcia Jesus Davood Fanaei Sheikholeslami Paulo Alves Aliakbar Hassanzadeh Benam Fatemeh Hariri
This study examines the ethical issues surrounding the use of Artificial Intelligence (AI) in healthcare, specifically nursing, under the European General Data Protection Regulation (GDPR). The analysis delves into how GDPR applies to healthcare AI projects, encompassing data collection and decision-making stages, to reveal the ethical implications at each step. A comprehensive review of the literature categorizes research investigations into three main categories: Ethical Considerations in AI; Practical Challenges and Solutions in AI Integration; and Legal and Policy Implications in AI. The analysis uncovers a significant research deficit in this field, with a particular focus on data owner rights and AI ethics within GDPR compliance. To address this gap, the study proposes new case studies that emphasize the importance of comprehending data owner rights and establishing ethical norms for AI use in medical applications, especially in nursing. This review makes a valuable contribution to the AI ethics debate and assists nursing and healthcare professionals in developing ethical AI practices. The insights provided help stakeholders navigate the intricate terrain of data protection, ethical considerations, and regulatory compliance in AI-driven healthcare. Lastly, the study introduces a case study of a real AI health-tech project named SENSOMATT, spotlighting GDPR and privacy issues.
]]>Machine Learning and Knowledge Extraction doi: 10.3390/make5030052
Authors: Paolo G. Cachi Sebastián Ventura Krzysztof J. Cios
The use of back propagation through the time learning rule enabled the supervised training of deep spiking neural networks to process temporal neuromorphic data. However, their performance is still below non-spiking neural networks. Previous work pointed out that one of the main causes is the limited number of neuromorphic data currently available, which are also difficult to generate. With the goal of overcoming this problem, we explore the usage of auxiliary learning as a means of helping spiking neural networks to identify more general features. Tests are performed on neuromorphic DVS-CIFAR10 and DVS128-Gesture datasets. The results indicate that training with auxiliary learning tasks improves their accuracy, albeit slightly. Different scenarios, including manual and automatic combination losses using implicit differentiation, are explored to analyze the usage of auxiliary tasks.
]]>Machine Learning and Knowledge Extraction doi: 10.3390/make5030051
Authors: Dylan Molinié Kurosh Madani Véronique Amarger Abdennasser Chebira
This paper introduces a non-parametric methodology based on classical unsupervised clustering techniques to automatically identify the main regions of a space, without requiring the objective number of clusters, so as to identify the major regular states of unknown industrial systems. Indeed, useful knowledge on real industrial processes entails the identification of their regular states, and their historically encountered anomalies. Since both should form compact and salient groups of data, unsupervised clustering generally performs this task fairly accurately; however, this often requires the number of clusters upstream, knowledge which is rarely available. As such, the proposed algorithm operates a first partitioning of the space, then it estimates the integrity of the clusters, and splits them again and again until every cluster obtains an acceptable integrity; finally, a step of merging based on the clusters’ empirical distributions is performed to refine the partitioning. Applied to real industrial data obtained in the scope of a European project, this methodology proved able to automatically identify the main regular states of the system. Results show the robustness of the proposed approach in the fully-automatic and non-parametric identification of the main regions of a space, knowledge which is useful to industrial anomaly detection and behavioral modeling.
]]>Machine Learning and Knowledge Extraction doi: 10.3390/make5030050
Authors: Phillip Czech Markus Braun Ulrich Kreßel Bin Yang
With the ongoing development of automated driving systems, the crucial task of predicting pedestrian behavior is attracting growing attention. The prediction of future pedestrian trajectories from the ego-vehicle camera perspective is particularly challenging due to the dynamically changing scene. Therefore, we present Behavior-Aware Pedestrian Trajectory Prediction (BA-PTP), a novel approach to pedestrian trajectory prediction for ego-centric camera views. It incorporates behavioral features extracted from real-world traffic scene observations such as the body and head orientation of pedestrians, as well as their pose, in addition to positional information from body and head bounding boxes. For each input modality, we employed independent encoding streams that are combined through a modality attention mechanism. To account for the ego-motion of the camera in an ego-centric view, we introduced Spatio-Temporal Ego-Motion Module (STEMM), a novel approach to ego-motion prediction. Compared to the related works, it utilizes spatial goal points of the ego-vehicle that are sampled from its intended route. We experimentally validated the effectiveness of our approach using two datasets for pedestrian behavior prediction in urban traffic scenes. Based on ablation studies, we show the advantages of incorporating different behavioral features for pedestrian trajectory prediction in the image plane. Moreover, we demonstrate the benefit of integrating STEMM into our pedestrian trajectory prediction method, BA-PTP. BA-PTP achieves state-of-the-art performance on the PIE dataset, outperforming prior work by 7% in MSE-1.5 s and CMSE as well as 9% in CFMSE.
]]>Machine Learning and Knowledge Extraction doi: 10.3390/make5030049
Authors: Litao Qiao Weijia Wang Bill Lin
This paper extends recent work on decision rule learning from neural networks for tabular data classification. We propose alternative formulations to trainable Boolean logic operators as neurons with continuous weights, including trainable NAND neurons. These alternative formulations provide uniform treatments to different trainable logic neurons so that they can be uniformly trained, which enables, for example, the direct application of existing sparsity-promoting neural net training techniques like reweighted L1 regularization to derive sparse networks that translate to simpler rules. In addition, we present an alternative network architecture based on trainable NAND neurons by applying De Morgan’s law to realize a NAND-NAND network instead of an AND-OR network, both of which can be readily mapped to decision rule sets. Our experimental results show that these alternative formulations can also generate accurate decision rule sets that achieve state-of-the-art performance in terms of accuracy in tabular learning applications.
]]>Machine Learning and Knowledge Extraction doi: 10.3390/make5030048
Authors: Hosein Barzekar Susan McRoy
Decision support systems based on machine learning models should be able to help users identify opportunities and threats. Popular model-agnostic explanation models can identify factors that support various predictions, answering questions such as “What factors affect sales?” or “Why did sales decline?”, but do not highlight what a person should or could do to get a more desirable outcome. Counterfactual explanation approaches address intervention, and some even consider feasibility, but none consider their suitability for real-time applications, such as question answering. Here, we address this gap by introducing a novel model-agnostic method that provides specific, feasible changes that would impact the outcomes of a complex Black Box AI model for a given instance and assess its real-world utility by measuring its real-time performance and ability to find achievable changes. The method uses the instance of concern to generate high-precision explanations and then applies a secondary method to find achievable minimally-contrastive counterfactual explanations (AMCC) while limiting the search to modifications that satisfy domain-specific constraints. Using a widely recognized dataset, we evaluated the classification task to ascertain the frequency and time required to identify successful counterfactuals. For a 90% accurate classifier, our algorithm identified AMCC explanations in 47% of cases (38 of 81), with an average discovery time of 80 ms. These findings verify the algorithm’s efficiency in swiftly producing AMCC explanations, suitable for real-time systems. The AMCC method enhances the transparency of Black Box AI models, aiding individuals in evaluating remedial strategies or assessing potential outcomes.
]]>Machine Learning and Knowledge Extraction doi: 10.3390/make5030047
Authors: Mahmood Ul Haq Muhammad Athar Javed Sethi Atiq Ur Rehman
Numerous advancements in various fields, including pattern recognition and image classification, have been made thanks to modern computer vision and machine learning methods. The capsule network is one of the advanced machine learning algorithms that encodes features based on their hierarchical relationships. Basically, a capsule network is a type of neural network that performs inverse graphics to represent the object in different parts and view the existing relationship between these parts, unlike CNNs, which lose most of the evidence related to spatial location and requires lots of training data. So, we present a comparative review of various capsule network architectures used in various applications. The paper’s main contribution is that it summarizes and explains the significant current published capsule network architectures with their advantages, limitations, modifications, and applications.
]]>Machine Learning and Knowledge Extraction doi: 10.3390/make5030046
Authors: Brian Lewandowski Randy Paffenroth
The proliferation of novel attacks and growing amounts of data has caused practitioners in the field of network intrusion detection to constantly work towards keeping up with this evolving adversarial landscape. Researchers have been seeking to harness deep learning techniques in efforts to detect zero-day attacks and allow network intrusion detection systems to more efficiently alert network operators. The technique outlined in this work uses a one-class training process to shape autoencoder feature residuals for the effective detection of network attacks. Compared to an original set of input features, we show that autoencoder feature residuals are a suitable replacement, and often perform at least as well as the original feature set. This quality allows autoencoder feature residuals to prevent the need for extensive feature engineering without reducing classification performance. Additionally, it is found that without generating new data compared to an original feature set, using autoencoder feature residuals often improves classifier performance. Practical side effects from using autoencoder feature residuals emerge by analyzing the potential data compression benefits they provide.
]]>Machine Learning and Knowledge Extraction doi: 10.3390/make5030045
Authors: Ala Alam Falaki Robin Gras
This paper presents a technique to reduce the number of parameters in a transformer-based encoder–decoder architecture by incorporating autoencoders. To discover the optimal compression, we trained different autoencoders on the embedding space (encoder’s output) of several pre-trained models. The experiments reveal that reducing the embedding size has the potential to dramatically decrease the GPU memory usage while speeding up the inference process. The proposed architecture was included in the BART model and tested for summarization, translation, and classification tasks. The summarization results show that a 60% decoder size reduction (from 96 M to 40 M parameters) will make the inference twice as fast and use less than half of GPU memory during fine-tuning process with only a 4.5% drop in R-1 score. The same trend is visible for translation and partially for classification tasks. Our approach reduces the GPU memory usage and processing time of large-scale sequence-to-sequence models for fine-tuning and inference. The implementation and checkpoints are available on GitHub.
]]>Machine Learning and Knowledge Extraction doi: 10.3390/make5030044
Authors: Daniel Klosa Christof Büskens
Traffic forecasting is an important task for transportation engineering as it helps authorities to plan and control traffic flow, detect congestion, and reduce environmental impact. Deep learning techniques have gained traction in handling such complex datasets, but require expertise in neural architecture engineering, often beyond the scope of traffic management decision-makers. Our study aims to address this challenge by using neural architecture search (NAS) methods. These methods, which simplify neural architecture engineering by discovering task-specific neural architectures, are only recently applied to traffic prediction. We specifically focus on the performance estimation of neural architectures, a computationally demanding sub-problem of NAS, that often hinders the real-world application of these methods. Extending prior work on evolutionary NAS (ENAS), our work evaluates the utility of zero-cost (ZC) proxies, recently emerged cost-effective evaluators of network architectures. These proxies operate without necessitating training, thereby circumventing the computational bottleneck, albeit at a slight cost to accuracy. Our findings indicate that, when integrated into the ENAS framework, ZC proxies can accelerate the search process by two orders of magnitude at a small cost of accuracy. These results establish the viability of ZC proxies as a practical solution to accelerate NAS methods while maintaining model accuracy. Our research contributes to the domain by showcasing how ZC proxies can enhance the accessibility and usability of NAS methods for traffic forecasting, despite potential limitations in neural architecture engineering expertise. This novel approach significantly aids in the efficient application of deep learning techniques in real-world traffic management scenarios.
]]>Machine Learning and Knowledge Extraction doi: 10.3390/make5030043
Authors: Peter Salamon David Salamon V. Adrian Cantu Michelle An Tyler Perry Robert A. Edwards Anca M. Segall
This paper investigates the post-hoc calibration of confidence for “exploratory” machine learning classification problems. The difficulty in these problems stems from the continuing desire to push the boundaries of which categories have enough examples to generalize from when curating datasets, and confusion regarding the validity of those categories. We argue that for such problems the “one-versus-all” approach (top-label calibration) must be used rather than the “calibrate-the-full-response-matrix” approach advocated elsewhere in the literature. We introduce and test four new algorithms designed to handle the idiosyncrasies of category-specific confidence estimation using only the test set and the final model. Chief among these methods is the use of kernel density ratios for confidence calibration including a novel algorithm for choosing the bandwidth. We test our claims and explore the limits of calibration on a bioinformatics application (PhANNs) as well as the classic MNIST benchmark. Finally, our analysis argues that post-hoc calibration should always be performed, may be performed using only the test dataset, and should be sanity-checked visually.
]]>Machine Learning and Knowledge Extraction doi: 10.3390/make5030042
Authors: David Muhr Michael Affenzeller Josef Küng
The scores of distance-based outlier detection methods are difficult to interpret, and it is challenging to determine a suitable cut-off threshold between normal and outlier data points without additional context. We describe a generic transformation of distance-based outlier scores into interpretable, probabilistic estimates. The transformation is ranking-stable and increases the contrast between normal and outlier data points. Determining distance relationships between data points is necessary to identify the nearest-neighbor relationships in the data, yet most of the computed distances are typically discarded. We show that the distances to other data points can be used to model distance probability distributions and, subsequently, use the distributions to turn distance-based outlier scores into outlier probabilities. Over a variety of tabular and image benchmark datasets, we show that the probabilistic transformation does not impact outlier ranking (ROC AUC) or detection performance (AP, F1), and increases the contrast between normal and outlier score distributions (statistical distance). The experimental findings indicate that it is possible to transform distance-based outlier scores into interpretable probabilities with increased contrast between normal and outlier samples. Our work generalizes to a wide range of distance-based outlier detection methods, and, because existing distance computations are used, it adds no significant computational overhead.
]]>Machine Learning and Knowledge Extraction doi: 10.3390/make5030041
Authors: Fábio Eid Morooka Adalberto Manoel Junior Tiago F. A. C. Sigahi Jefferson de Souza Pinto Izabela Simon Rampasso Rosley Anholon
Applications of deep learning (DL) in autonomous vehicle (AV) projects have gained increasing interest from both researchers and companies. This has caused a rapid expansion of scientific production on DL-AV in recent years, encouraging researchers to conduct systematic literature reviews (SLRs) to organize knowledge on the topic. However, a critical analysis of the existing SLRs on DL-AV reveals some methodological gaps, particularly regarding the use of bibliometric software, which are powerful tools for analyzing large amounts of data and for providing a holistic understanding on the structure of knowledge of a particular field. This study aims to identify the strategic themes and trends in DL-AV research using the Science Mapping Analysis Tool (SciMAT) and content analysis. Strategic diagrams and cluster networks were developed using SciMAT, allowing the identification of motor themes and research opportunities. The content analysis allowed categorization of the contribution of the academic literature on DL applications in AV project design; neural networks and AI models used in AVs; and transdisciplinary themes in DL-AV research, including energy, legislation, ethics, and cybersecurity. Potential research avenues are discussed for each of these categories. The findings presented in this study can benefit both experienced scholars who can gain access to condensed information about the literature on DL-AV and new researchers who may be attracted to topics related to technological development and other issues with social and environmental impacts.
]]>Machine Learning and Knowledge Extraction doi: 10.3390/make5030040
Authors: Kristian Miok Padraig Corcoran Irena Spasić
Clinical text often includes numbers of various types and formats. However, most current text classification approaches do not take advantage of these numbers. This study aims to demonstrate that using numbers as features can significantly improve the performance of text classification models. This study also demonstrates the feasibility of extracting such features from clinical text. Unsupervised learning was used to identify patterns of number usage in clinical text. These patterns were analyzed manually and converted into pattern-matching rules. Information extraction was used to incorporate numbers as features into a document representation model. We evaluated text classification models trained on such representation. Our experiments were performed with two document representation models (vector space model and word embedding model) and two classification models (support vector machines and neural networks). The results showed that even a handful of numerical features can significantly improve text classification performance. We conclude that commonly used document representations do not represent numbers in a way that machine learning algorithms can effectively utilize them as features. Although we demonstrated that traditional information extraction can be effective in converting numbers into features, further community-wide research is required to systematically incorporate number representation into the word embedding process.
]]>Machine Learning and Knowledge Extraction doi: 10.3390/make5030039
Authors: Jianfeng Li Xiaoqin Lian
Forest fires are one of the world’s deadliest natural disasters. Early detection of forest fires can help minimize the damage to ecosystems and forest life. In this paper, we propose an improved fire detection method YOLOv5-IFFDM for YOLOv5. Firstly, the fire and smoke detection accuracy and the network perception accuracy of small targets are improved by adding an attention mechanism to the backbone network. Secondly, the loss function is improved and the SoftPool pyramid pooling structure is used to improve the regression accuracy and detection performance of the model and the robustness of the model. In addition, a random mosaic augmentation technique is used to enhance the data to increase the generalization ability of the model, and re-clustering of flame and smoke detection a priori frames are used to improve the accuracy and speed. Finally, the parameters of the convolutional and normalization layers of the trained model are homogeneously merged to further reduce the model processing load and to improve the detection speed. Experimental results on self-built forest-fire and smoke datasets show that this algorithm has high detection accuracy and fast detection speed, with average accuracy of fire up to 90.5% and smoke up to 84.3%, and detection speed up to 75 FPS (frames per second transmission), which can meet the requirements of real-time and efficient fire detection.
]]>Machine Learning and Knowledge Extraction doi: 10.3390/make5030038
Authors: Angel Pina Corbin Petersheim Josh Cherian Joanna Nicole Lahey Gerianne Alexander Tracy Hammond
When job seekers are unsuccessful in getting a position, they often do not get feedback to inform them on how to develop a better application in the future. Therefore, there is a critical need to understand what qualifications recruiters value in order to help applicants. To address this need, we utilized eye-trackers to measure and record visual data of recruiters screening resumes to gain insight into which Areas of Interest (AOIs) influenced recruiters’ decisions the most. Using just this eye-tracking data, we trained a machine learning classifier to predict whether or not a recruiter would move a resume on to the next level of the hiring process with an AUC of 0.767. We found that features associated with recruiters looking outside the content of a resume were most predictive of their decision as well as total time viewing the resume and time spent on the Experience and Education sections. We hypothesize that this behavior is indicative of the recruiter reflecting on the content of the resume. These initial results show that applicants should focus on designing clear and concise resumes that are easy for recruiters to absorb and think about, with additional attention given to the Experience and Education sections.
]]>Machine Learning and Knowledge Extraction doi: 10.3390/make5030037
Authors: Hanruo Zhu Ziquan Zhu Shuihua Wang Yudong Zhang
Since the COVID-19 pandemic outbreak, over 760 million confirmed cases and over 6.8 million deaths have been reported globally, according to the World Health Organization. While the SARS-CoV-2 virus carried by COVID-19 patients can be identified though the reverse transcription–polymerase chain reaction (RT-PCR) test with high accuracy, clinical misdiagnosis between COVID-19 and pneumonia patients remains a challenge. Therefore, we developed a novel CovC-ReDRNet model to distinguish COVID-19 patients from pneumonia patients as well as normal cases. ResNet-18 was introduced as the backbone model and tailored for the feature representation afterward. In our feature-based randomized neural network (RNN) framework, the feature representation automatically pairs with the deep random vector function link network (dRVFL) as the optimal classifier, producing a CovC-ReDRNet model for the classification task. Results based on five-fold cross-validation reveal that our method achieved 94.94%, 97.01%, 97.56%, 96.81%, and 95.84% MA sensitivity, MA specificity, MA accuracy, MA precision, and MA F1-score, respectively. Ablation studies evidence the superiority of ResNet-18 over different backbone networks, RNNs over traditional classifiers, and deep RNNs over shallow RNNs. Moreover, our proposed model achieved a better MA accuracy than the state-of-the-art (SOTA) methods, the highest score of which was 95.57%. To conclude, our CovC-ReDRNet model could be perceived as an advanced computer-aided diagnostic model with high speed and high accuracy for classifying and predicting COVID-19 diseases.
]]>Machine Learning and Knowledge Extraction doi: 10.3390/make5020036
Authors: Maryam KafiKang Abdeltawab Hendawi
In the context of pharmaceuticals, drug-drug interactions (DDIs) occur when two or more drugs interact, potentially altering the intended effects of the drugs and resulting in adverse patient health outcomes. Therefore, it is essential to identify and comprehend these interactions. In recent years, an increasing number of novel compounds have been discovered, resulting in the discovery of numerous new DDIs. There is a need for effective methods to extract and analyze DDIs, as the majority of this information is still predominantly located in biomedical articles and sources. Despite the development of various techniques, accurately predicting DDIs remains a significant challenge. This paper proposes a novel solution to this problem by leveraging the power of Relation BioBERT (R-BioBERT) to detect and classify DDIs and the Bidirectional Long Short-Term Memory (BLSTM) to improve the accuracy of predictions. In addition to determining whether two drugs interact, the proposed method also identifies the specific types of interactions between them. Results show that the use of BLSTM leads to significantly higher F-scores compared to our baseline model, as demonstrated on three well-known DDI extraction datasets that includes SemEval 2013, TAC 2018, and TAC 2019.
]]>Machine Learning and Knowledge Extraction doi: 10.3390/make5020035
Authors: Qinghua Zhou Jiaji Wang Xiang Yu Shuihua Wang Yudong Zhang
Alzheimer’s and related diseases are significant health issues of this era. The interdisciplinary use of deep learning in this field has shown great promise and gathered considerable interest. This paper surveys deep learning literature related to Alzheimer’s disease, mild cognitive impairment, and related diseases from 2010 to early 2023. We identify the major types of unsupervised, supervised, and semi-supervised methods developed for various tasks in this field, including the most recent developments, such as the application of recurrent neural networks, graph-neural networks, and generative models. We also provide a summary of data sources, data processing, training protocols, and evaluation methods as a guide for future deep learning research into Alzheimer’s disease. Although deep learning has shown promising performance across various studies and tasks, it is limited by interpretation and generalization challenges. The survey also provides a brief insight into these challenges and the possible pathways for future studies.
]]>Machine Learning and Knowledge Extraction doi: 10.3390/make5020034
Authors: Andrée C. Ehresmann Mathias Béjean Jean-Paul Vanbremeersch
This paper presents a conceptual mathematical framework for developing rich human–machine interactions in order to improve decision-making in a social organisation, S. The idea is to model how S can create a “multi-level artificial cognitive system”, called a data analyser (DA), to collaborate with humans in collecting and learning how to analyse data, to anticipate situations, and to develop new responses, thus improving decision-making. In this model, the DA is “processed” to not only gather data and extend existing knowledge, but also to learn how to act autonomously with its own specific procedures or even to create new ones. An application is given in cases where such rich human–machine interactions are expected to allow the DA+S partnership to acquire deep anticipation capabilities for possible future changes, e.g., to prevent risks or seize opportunities. The way the social organization S operates over time, including the construction of DA, is described using the conceptual framework comprising “memory evolutive systems” (MES), a mathematical theoretical approach introduced by Ehresmann and Vanbremeersch for evolutionary multi-scale, multi-agent and multi-temporality systems. This leads to the definition of a “data analyser–MES”.
]]>Machine Learning and Knowledge Extraction doi: 10.3390/make5020033
Authors: Shrooq Algarni Frederick Sheldon
Course recommender systems play an increasingly pivotal role in the educational landscape, driving personalization and informed decision-making for students. However, these systems face significant challenges, including managing a large and dynamic decision space and addressing the cold start problem for new students. This article endeavors to provide a comprehensive review and background to fully understand recent research on course recommender systems and their impact on learning. We present a detailed summary of empirical data supporting the use of these systems in educational strategic planning. We examined case studies conducted over the previous six years (2017–2022), with a focus on 35 key studies selected from 1938 academic papers found using the CADIMA tool. This systematic literature review (SLR) assesses various recommender system methodologies used to suggest course selection tracks, aiming to determine the most effective evidence-based approach.
]]>Machine Learning and Knowledge Extraction doi: 10.3390/make5020032
Authors: Maresa Schröder Alireza Zamanian Narges Ahmidi
Saliency methods are designed to provide explainability for deep image processing models by assigning feature-wise importance scores and thus detecting informative regions in the input images. Recently, these methods have been widely adapted to the time series domain, aiming to identify important temporal regions in a time series. This paper extends our former work on identifying the systematic failure of such methods in the time series domain to produce relevant results when informative patterns are based on underlying latent information rather than temporal regions. First, we both visually and quantitatively assess the quality of explanations provided by multiple state-of-the-art saliency methods, including Integrated Gradients, Deep-Lift, Kernel SHAP, and Lime using univariate simulated time series data with temporal or latent patterns. In addition, to emphasize the severity of the latent feature saliency detection problem, we also run experiments on a real-world predictive maintenance dataset with known latent patterns. We identify Integrated Gradients, Deep-Lift, and the input-cell attention mechanism as potential candidates for refinement to yield latent saliency scores. Finally, we provide recommendations on using saliency methods for time series classification and suggest a guideline for developing latent saliency methods for time series.
]]>Machine Learning and Knowledge Extraction doi: 10.3390/make5020031
Authors: Amar Shukla Rajeev Tiwari Shamik Tiwari
Alzheimer’s disease (AD) is an old-age disease that comes in different stages and directly affects the different regions of the brain. The research into the detection of AD and its stages has new advancements in terms of single-modality and multimodality approaches. However, sustainable techniques for the detection of AD and its stages still require a greater extent of research. In this study, a multimodal image-fusion method is initially proposed for the fusion of two different modalities, i.e., PET (Positron Emission Tomography) and MRI (Magnetic Resonance Imaging). Further, the features obtained from fused and non-fused biomarkers are passed to the ensemble classifier with a Random Forest-based feature selection strategy. Three classes of Alzheimer’s disease are used in this work, namely AD, MCI (Mild Cognitive Impairment) and CN (Cognitive Normal). In the resulting analysis, the Binary classifications, i.e., AD vs. CN and MCI vs. CN, attained an accuracy (Acc) of 99% in both cases. The class AD vs. MCI detection achieved an adequate accuracy (Acc) of 91%. Furthermore, the Multi Class classification, i.e., AD vs. MCI vs. CN, achieved 96% (Acc).
]]>Machine Learning and Knowledge Extraction doi: 10.3390/make5020030
Authors: Karthik Santhanaraj Dinakaran Devaraj Ramya MM Joshuva Dhanraj Kuppan Ramanathan
Recent technological advancements have fostered human–robot coexistence in work and residential environments. The assistive robot must exhibit humane behavior and consistent care to become an integral part of the human habitat. Furthermore, the robot requires an adaptive unsupervised learning model to explore unfamiliar conditions and collaborate seamlessly. This paper introduces variants of the growing hierarchical self-organizing map (GHSOM)-based computational models for assistive robots, which constructs knowledge from unsupervised exploration-based learning. Traditional self-organizing map (SOM) algorithms have shortcomings, including finite neuron structure, user-defined parameters, and non-hierarchical adaptive architecture. The proposed models overcome these limitations and dynamically grow to form problem-dependent hierarchical feature clusters, thereby allowing associative learning and symbol grounding. Infants can learn from their surroundings through exploration and experience, developing new neuronal connections as they learn. They can also apply their prior knowledge to solve unfamiliar problems. With infant-like emergent behavior, the presented models can operate on different problems without modifications, producing new patterns not present in the input vectors and allowing interactive result visualization. The proposed models are applied to the color, handwritten digits clustering, finger identification, and image classification problems to evaluate their adaptiveness and infant-like knowledge building. The results show that the proposed models are the preferred generalized models for assistive robots.
]]>Machine Learning and Knowledge Extraction doi: 10.3390/make5020029
Authors: Gaurav Nanda Aparajita Jaiswal Hugo Castellanos Yuzhe Zhou Alex Choi Alejandra J. Magana
Fields in the social sciences, such as education research, have started to expand the use of computer-based research methods to supplement traditional research approaches. Natural language processing techniques, such as topic modeling, may support qualitative data analysis by providing early categories that researchers may interpret and refine. This study contributes to this body of research and answers the following research questions: (RQ1) What is the relative coverage of the latent Dirichlet allocation (LDA) topic model and human coding in terms of the breadth of the topics/themes extracted from the text collection? (RQ2) What is the relative depth or level of detail among identified topics using LDA topic models and human coding approaches? A dataset of student reflections was qualitatively analyzed using LDA topic modeling and human coding approaches, and the results were compared. The findings suggest that topic models can provide reliable coverage and depth of themes present in a textual collection comparable to human coding but require manual interpretation of topics. The breadth and depth of human coding output is heavily dependent on the expertise of coders and the size of the collection; these factors are better handled in the topic modeling approach.
]]>