AI, Volume 5, Issue 2 (June 2024) – 24 articles

Cover Story: Understanding the individual and coordinated movements of athletes in team sports is crucial to the effective design of playing strategies. The on-field position of every athlete during matches must be known to paint a complete picture of opposition strategies. Currently, the geolocation (GPS) data generated by opposing teams are not available to professional Australian Rules Football (AFL) teams. However, animations of the GPS data from all teams are made commercially available by the official statistics provider of the AFL. This paper leverages multiple object detection neural networks to detect and track athletes in on-field animations of GPS data during professional Australian Rules Football matches. The novel method developed in this paper unlocks the ability for tacticians to analyse the strategic patterns of their direct opposition.
  • Issues are regarded as officially published after their release is announced to the table of contents alert mailing list.
  • You may sign up for e-mail alerts to receive table of contents of newly released issues.
  • PDF is the official format for papers published in both HTML and PDF forms. To view the papers in PDF format, click on the "PDF Full-text" link, and use the free Adobe Reader to open them.
16 pages, 4233 KiB  
Article
Minimally Distorted Adversarial Images with a Step-Adaptive Iterative Fast Gradient Sign Method
by Ning Ding and Knut Möller
AI 2024, 5(2), 922-937; https://doi.org/10.3390/ai5020046 - 18 Jun 2024
Viewed by 530
Abstract
The safety and robustness of convolutional neural networks (CNNs) have raised increasing concerns, especially in safety-critical areas such as medical applications. Although CNNs are efficient in image classification, their predictions are often sensitive to minor modifications of the image that are invisible to human observers. Thus, a modified, corrupted image can be visually indistinguishable from the legitimate image for humans but fool the CNN into making a wrong prediction. Such modified images are called adversarial images throughout this paper. A popular method to generate adversarial images is backpropagating the loss gradient to modify the input image. Usually, only the direction of the gradient and a given step size are used to determine the perturbations (FGSM, fast gradient sign method), or the FGSM is applied multiple times to craft stronger perturbations that change the model classification (i-FGSM). However, if the step size is too large, the minimum perturbation of the image may be missed during the gradient search. To seek exact and minimal input images for a classification change, in this paper, we suggest starting the FGSM with a small step size and adapting the step size with iterations. A few decay algorithms were taken from the literature for comparison with a novel approach based on an index tracking the loss status. In total, three tracking functions were applied for comparison. The experiments show our loss adaptive decay algorithms could find adversaries with more than a 90% success rate while generating fewer perturbations to fool the CNNs.
(This article belongs to the Section AI Systems: Theory and Applications)
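Editorial illustration (not from the paper): the abstract above describes iterating the fast gradient sign method with a step size that is adapted over the iterations. The sketch below shows the general shape of that idea only. It does not reproduce the authors' loss-tracking decay; the toy logistic classifier, the geometric decay factor, and every function and parameter name are assumptions chosen for illustration.

```python
import math

def predict(w, b, x):
    """Probability of class 1 under a toy logistic classifier (hypothetical model)."""
    z = sum(wi * xi for wi, xi in zip(w, x)) + b
    return 1.0 / (1.0 + math.exp(-z))

def loss_gradient(w, b, x, y):
    """Gradient of the cross-entropy loss with respect to the input x."""
    p = predict(w, b, x)
    return [(p - y) * wi for wi in w]

def sign(v):
    """Return -1, 0, or 1 depending on the sign of v."""
    return (v > 0) - (v < 0)

def adaptive_ifgsm(w, b, x, y, step=0.05, decay=0.9, max_iter=200):
    """Iterative FGSM with a geometrically decaying step (assumed schedule).

    Returns the first perturbed input whose predicted label differs from y,
    or None if no label change occurs within max_iter iterations.
    """
    x_adv = list(x)
    for _ in range(max_iter):
        predicted_label = 1 if predict(w, b, x_adv) >= 0.5 else 0
        if predicted_label != y:
            return x_adv
        grad = loss_gradient(w, b, x_adv, y)
        # Step in the direction that increases the loss, then shrink the step.
        x_adv = [xi + step * sign(g) for xi, g in zip(x_adv, grad)]
        step *= decay
    return None

# Hypothetical two-feature example.
weights, bias = [2.0, -1.0], 0.0
original = [0.3, 0.1]
adversarial = adaptive_ifgsm(weights, bias, original, y=1)
```

Because the step shrinks geometrically, the total change per feature in this sketch is bounded by step / (1 - decay), which is one simple way to keep the perturbation small once the search gets close to the decision boundary.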

24 pages, 5034 KiB  
Perspective
AI Detection of Human Understanding in a Gen-AI Tutor
by Earl Woodruff
AI 2024, 5(2), 898-921; https://doi.org/10.3390/ai5020045 - 18 Jun 2024
Viewed by 625
Abstract
Subjective understanding is a complex process that involves the interplay of feelings and cognition. This paper explores how computers can monitor a user’s sympathetic and parasympathetic nervous system activity in real-time to detect the nature of the understanding the user is experiencing as they engage with study materials. By leveraging advancements in facial expression analysis, transdermal optical imaging, and voice analysis, I demonstrate how one can identify the physiological feelings that indicate a user’s mental state and level of understanding. The mental state model, which views understandings as composed of assembled beliefs, values, emotions, and feelings, provides a framework for understanding the multifaceted nature of the emotion–cognition relationship. As learners progress through the phases of nascent understanding, misunderstanding, confusion, emergent understanding, and deep understanding, they experience a range of cognitive processes, emotions, and physiological responses that can be detected and analyzed by AI-driven assessments. Based on the above approach, I further propose the development of Abel Tutor. This AI-driven system uses real-time monitoring of physiological feelings to provide individualized, adaptive tutoring support designed to guide learners toward deep understanding. By identifying the feelings associated with each phase of understanding, Abel Tutor can offer targeted interventions, such as clarifying explanations, guiding questions, or additional resources, to help students navigate the challenges they encounter and promote engagement. The ability to detect and respond to a student’s emotional state in real-time can revolutionize the learning experience, creating emotionally resonant learning environments that adapt to individual needs and optimize educational outcomes. 
As we continue to explore the potential of AI-driven assessments of subjective understanding, it is crucial to ensure that these technologies are grounded in sound pedagogical principles and ethical considerations, ultimately empowering learners and facilitating the attainment of deep understanding and lifelong learning for advantaged and disadvantaged students.

11 pages, 1910 KiB  
Article
Use of Yolo Detection for 3D Pose Tracking of Cardiac Catheters Using Bi-Plane Fluoroscopy
by Sara Hashemi, Mohsen Annabestani, Mahdie Aghasizade, Amir Kiyoumarsioskouei, S. Chiu Wong and Bobak Mosadegh
AI 2024, 5(2), 887-897; https://doi.org/10.3390/ai5020044 - 13 Jun 2024
Viewed by 484
Abstract
The increasing rate of minimally invasive procedures and the growing prevalence of cardiovascular disease have led to a demand for higher-quality guidance systems for catheter tracking. Traditional methods for catheter tracking, such as detection based on single points and applying masking techniques, have been limited in their ability to provide accurate pose information. In this paper, we propose a novel deep learning-based method for catheter tracking and pose detection. Our method uses a Yolov5 bounding box neural network with postprocessing to perform landmark detection in four regions of the catheter: the tip, radio-opaque marker, bend, and entry point. This allows us to track the catheter’s position and orientation in real time, without the need for additional masking or segmentation techniques. We evaluated our method on fluoroscopic images from two distinct datasets and achieved state-of-the-art results in terms of accuracy and robustness. Our model was able to detect all four landmark features (tip, marker, bend, and entry) used to generate a pose for a catheter with accuracies of 0.285 ± 0.143 mm, 0.261 ± 0.138 mm, 0.424 ± 0.361 mm, and 0.235 ± 0.085 mm, respectively. We believe that our method has the potential to significantly improve the accuracy and efficiency of catheter tracking in medical procedures that utilize bi-plane fluoroscopy guidance.
(This article belongs to the Section Medical & Healthcare AI)

14 pages, 943 KiB  
Article
Inside Production Data Science: Exploring the Main Tasks of Data Scientists in Production Environments
by Arno Schmetz and Achim Kampker
AI 2024, 5(2), 873-886; https://doi.org/10.3390/ai5020043 - 12 Jun 2024
Viewed by 589
Abstract
Modern production relies on data-based analytics for the prediction and optimization of production processes. Specialized data scientists perform tasks at companies and research institutions, dealing with real data from actual production environments. The roles of data preprocessing and data quality are crucial in data science, and an active research field deals with methodologies and technologies for this. While anecdotes and generalized surveys indicate preprocessing is the major operational task for data scientists, a detailed view of the subtasks and the domain of production data is missing. In this paper, we present a multi-stage survey on data science tasks in practice in the field of production. Using expert knowledge and insights, we found data preprocessing to be the major part of the tasks of data scientists. In detail, we found that tackling missing values, finding data point meanings, and synchronization of multiple time-series were often the most time-consuming preprocessing tasks.

31 pages, 13435 KiB  
Article
Real-Time Camera Operator Segmentation with YOLOv8 in Football Video Broadcasts
by Serhii Postupaiev, Robertas Damaševičius and Rytis Maskeliūnas
AI 2024, 5(2), 842-872; https://doi.org/10.3390/ai5020042 - 6 Jun 2024
Viewed by 849
Abstract
Using instance segmentation and video inpainting provides a significant leap in real-time football video broadcast enhancements by removing potential visual distractions, such as an occasional person or another object accidentally occupying the frame. Despite its relevance and importance in the media industry, this area remains challenging and relatively understudied, thus offering potential for research. Specifically, the segmentation and inpainting of camera operator instances from video remains an underexplored research area. To address this challenge, this paper proposes a framework designed to accurately detect and remove camera operators while seamlessly hallucinating the background in real-time football broadcasts. The approach aims to enhance the quality of the broadcast by maintaining its consistency and level of engagement to retain and attract users during the game. To implement the inpainting task, a camera operator instance segmentation method must first be developed. We used a YOLOv8 model for accurate real-time operator instance segmentation. The resulting model produces masked frames, which are used for further camera operator inpainting. Moreover, this paper presents an extensive “Cameramen Instances” dataset with more than 7500 samples, which serves as a solid foundation for future investigations in this area. The experimental results show that the YOLOv8 model performs better than other baseline algorithms in different scenarios. The precision of 95.5%, recall of 92.7%, mAP50-95 of 79.6, and a high FPS rate of 87 in a low-volume environment demonstrate the solution's efficacy for real-time applications.
(This article belongs to the Special Issue Artificial Intelligence-Based Image Processing and Computer Vision)

39 pages, 1962 KiB  
Review
Error Correction and Adaptation in Conversational AI: A Review of Techniques and Applications in Chatbots
by Saadat Izadi and Mohamad Forouzanfar
AI 2024, 5(2), 803-841; https://doi.org/10.3390/ai5020041 - 4 Jun 2024
Viewed by 790
Abstract
This study explores the progress of chatbot technology, focusing on the aspect of error correction to enhance these smart conversational tools. Chatbots, powered by artificial intelligence (AI), are increasingly prevalent across industries such as customer service, healthcare, e-commerce, and education. Despite their use and increasing complexity, chatbots are prone to errors like misunderstandings, inappropriate responses, and factual inaccuracies. These issues can have an impact on user satisfaction and trust. This research provides an overview of chatbots, conducts an analysis of errors they encounter, and examines different approaches to rectifying these errors. These approaches include using data-driven feedback loops, involving humans in the learning process, and adjusting through learning methods like reinforcement learning, supervised learning, unsupervised learning, semi-supervised learning, and meta-learning. Through real-life examples and case studies in different fields, we explore how these strategies are implemented. Looking ahead, we explore the different challenges faced by AI-powered chatbots, including ethical considerations and biases during implementation. Furthermore, we explore the transformative potential of new technological advancements, such as explainable AI models, autonomous content generation algorithms (e.g., generative adversarial networks), and quantum computing to enhance chatbot training. Our research provides information for developers and researchers looking to improve chatbot capabilities, which can be applied in service and support industries to effectively address user requirements.

13 pages, 3061 KiB  
Article
Quantifying Visual Differences in Drought-Stressed Maize through Reflectance and Data-Driven Analysis
by Sanjana Banerjee, James Reynolds, Matthew Taggart, Michael Daniele, Alper Bozkurt and Edgar Lobaton
AI 2024, 5(2), 790-802; https://doi.org/10.3390/ai5020040 - 4 Jun 2024
Viewed by 425
Abstract
Environmental factors, such as drought stress, significantly impact maize growth and productivity worldwide. To improve yield and quality, effective strategies for early detection and mitigation of drought stress in maize are essential. This paper presents a detailed analysis of three imaging trials conducted to detect drought stress in maize plants using an existing, custom-developed, low-cost, high-throughput phenotyping platform. A pipeline is proposed for early detection of water stress in maize plants using a Vision Transformer classifier and analysis of distributions of near-infrared (NIR) reflectance from the plants. A classification accuracy of 85% was achieved in one of our trials, using hold-out trials for testing. Suitable regions on the plant that are more sensitive to drought stress were explored, and it was shown that the region surrounding the youngest expanding leaf (YEL) and the stem can be used as a more consistent alternative to analysis involving just the YEL. Experiments in search of an ideal window size showed that small bounding boxes surrounding the YEL and the stem area of the plant perform better in separating drought-stressed and well-watered plants than larger window sizes enclosing most of the plant. The results presented in this work show good separation between well-watered and drought-stressed categories for two out of the three imaging trials, both in terms of classification accuracy from data-driven features as well as through analysis of histograms of NIR reflectance.
(This article belongs to the Special Issue Artificial Intelligence-Based Image Processing and Computer Vision)

44 pages, 11002 KiB  
Article
A Logical–Algebraic Approach to Revising Formal Ontologies: Application in Mereotopology
by Gonzalo A. Aranda-Corral, Joaquín Borrego-Díaz, Antonia M. Chávez-González and Nataliya M. Gulayeva
AI 2024, 5(2), 746-789; https://doi.org/10.3390/ai5020039 - 29 May 2024
Viewed by 384
Abstract
In ontology engineering, reusing (or extending) ontologies poses a significant challenge, requiring revising their ontological commitments and ensuring accurate representation and coherent reasoning. This study aims to address two main objectives. Firstly, it seeks to develop a methodological approach supporting ontology extension practices. Secondly, it aims to demonstrate its feasibility by applying the approach to the case of extending qualitative spatial reasoning (QSR) theories. Key questions involve effectively interpreting spatial extensions while maintaining consistency. The framework systematically analyzes extensions of formal ontologies, providing a reconstruction of a qualitative calculus. The reconstructed qualitative calculus demonstrates improved interpretative capabilities and reasoning accuracy. The research underscores the importance of methodological approaches when extending formal ontologies, with spatial interpretation serving as a valuable case study.
(This article belongs to the Section AI Systems: Theory and Applications)

13 pages, 9978 KiB  
Article
The Eye in the Sky—A Method to Obtain On-Field Locations of Australian Rules Football Athletes
by Zachery Born, Marion Mundt, Ajmal Mian, Jason Weber and Jacqueline Alderson
AI 2024, 5(2), 733-745; https://doi.org/10.3390/ai5020038 - 16 May 2024
Viewed by 872
Abstract
The ability to overcome an opposition in team sports is reliant upon an understanding of the tactical behaviour of the opposing team members. Recent research is limited to a performance analyst’s own team members, as the required opposing team athletes’ geolocation (GPS) data are unavailable. However, in professional Australian Rules Football (AF), animations of athlete GPS data from all teams are commercially available. The purpose of this technical study was to obtain the on-field location of AF athletes from animations of the 2019 Australian Football League season to enable the examination of the tactical behaviour of any team. The pre-trained object detection model YOLOv4 was fine-tuned to detect players, and a custom convolutional neural network was trained to track numbers in the animations. The object detection and the athlete tracking achieved an accuracy of 0.94 and 0.98, respectively. Subsequent scaling and translation coefficients were determined through solving an optimisation problem to transform the pixel coordinate positions of a tracked player number to field-relative Cartesian coordinates. The derived equations achieved an average Euclidean distance from the athletes’ raw GPS data of 2.63 m. The proposed athlete detection and tracking approach is a novel methodology to obtain the on-field positions of AF athletes in the absence of direct measures, which may be used for the analysis of opposition collective team behaviour and in the development of interactive play sketching AF tools.
(This article belongs to the Special Issue Artificial Intelligence-Based Image Processing and Computer Vision)
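Editorial illustration (not from the paper): the abstract above mentions scaling and translation coefficients that map pixel positions to field-relative coordinates. The abstract does not say how the optimisation problem was posed, so the sketch below assumes the simplest formulation, an independent ordinary least-squares fit per axis. The function names and the sample reference points are hypothetical.

```python
def fit_scale_translation(pixels, metres):
    """Least-squares fit of metres ~ scale * pixels + offset for a single axis."""
    n = len(pixels)
    mean_p = sum(pixels) / n
    mean_m = sum(metres) / n
    covariance = sum((p - mean_p) * (m - mean_m) for p, m in zip(pixels, metres))
    variance = sum((p - mean_p) ** 2 for p in pixels)
    scale = covariance / variance
    offset = mean_m - scale * mean_p
    return scale, offset

def pixel_to_field(point, x_coeffs, y_coeffs):
    """Apply the fitted per-axis (scale, offset) pairs to one pixel position."""
    px, py = point
    return (x_coeffs[0] * px + x_coeffs[1], y_coeffs[0] * py + y_coeffs[1])

# Hypothetical reference points: pixel positions paired with known field positions in metres.
x_coeffs = fit_scale_translation([100, 500, 900], [-80.0, 0.0, 80.0])
y_coeffs = fit_scale_translation([50, 350, 650], [60.0, 0.0, -60.0])

# Convert a tracked player position from pixel to field coordinates.
field_xy = pixel_to_field((500, 350), x_coeffs, y_coeffs)
```

Fitting each axis separately keeps the example short; a real animation might also need rotation or lens correction, which this sketch deliberately ignores.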

29 pages, 1051 KiB  
Review
Navigating the Cyber Threat Landscape: An In-Depth Analysis of Attack Detection within IoT Ecosystems
by Samar AboulEla, Nourhan Ibrahim, Sarama Shehmir, Aman Yadav and Rasha Kashef
AI 2024, 5(2), 704-732; https://doi.org/10.3390/ai5020037 - 15 May 2024
Viewed by 1218
Abstract
The Internet of Things (IoT) is seeing significant growth, as the quantity of interconnected devices in communication networks is on the rise. The increased connectivity of devices has heightened their susceptibility to hackers, underscoring the need to safeguard IoT devices. This research investigates cybersecurity in the context of the Internet of Medical Things (IoMT), which encompasses the cybersecurity mechanisms used for various healthcare devices connected to the system. This study seeks to provide a concise overview of several artificial intelligence (AI)-based methodologies and techniques, as well as examining the associated solution approaches used in cybersecurity for healthcare systems. The analyzed methodologies are further categorized into four groups: machine learning (ML) techniques, deep learning (DL) techniques, a combination of ML and DL techniques, Transformer-based techniques, and other state-of-the-art techniques, including graph-based methods and blockchain methods. In addition, this article presents a detailed description of the benchmark datasets that are recommended for use in intrusion detection systems (IDS) for both IoT and IoMT networks. Moreover, a detailed description of the primary evaluation metrics used in the analysis of the discussed models is provided. Ultimately, this study thoroughly examines and analyzes the features and practicality of several cybersecurity models, while also emphasizing recent research directions.

18 pages, 1201 KiB  
Article
Efficient Paddy Grain Quality Assessment Approach Utilizing Affordable Sensors
by Aditya Singh, Kislay Raj, Teerath Meghwar and Arunabha M. Roy
AI 2024, 5(2), 686-703; https://doi.org/10.3390/ai5020036 - 14 May 2024
Viewed by 489
Abstract
Paddy (Oryza sativa) is one of the most consumed food grains in the world. The process from sowing to consumption, via harvesting, processing, storage, and management, requires much effort and expertise. The grain quality of the product is heavily affected by the weather conditions, irrigation frequency, and many other factors. However, quality control is of immense importance, and thus, the evaluation of grain quality is necessary. Since it is necessary and arduous, we try to overcome the limitations and shortcomings of grain quality evaluation using image processing and machine learning (ML) techniques. Most existing methods are designed for rice grain quality assessment, noting that the key characteristics of paddy and rice are different. In addition, they have complex and expensive setups and utilize black-box ML models. To handle these issues, in this paper, we propose a reliable ML-based IoT paddy grain quality assessment system utilizing affordable sensors. It involves a specific data collection procedure followed by image processing with an ML-based model to predict the quality. Different explainable features are used for classifying the grain quality of paddy grain, like the shape, size, moisture, and maturity of the grain. The precision of the system was tested in real-world scenarios. To our knowledge, it is the first automated system to precisely provide an overall quality metric. The main feature of our system is its explainability in terms of utilized features and fuzzy rules, which increases the confidence and trustworthiness of the public toward its use. The grain varieties used for the experiments mostly belonged to the Indian subcontinent, but they covered significant variation in the shape and size of the grain.
(This article belongs to the Special Issue Artificial Intelligence-Based Image Processing and Computer Vision)

19 pages, 598 KiB  
Article
Generative Adversarial Networks for Synthetic Data Generation in Finance: Evaluating Statistical Similarities and Quality Assessment
by Faisal Ramzan, Claudio Sartori, Sergio Consoli and Diego Reforgiato Recupero
AI 2024, 5(2), 667-685; https://doi.org/10.3390/ai5020035 - 13 May 2024
Viewed by 1297
Abstract
Generating synthetic data is a complex task that necessitates accurately replicating the statistical and mathematical properties of the original data elements. In sectors such as finance, utilizing and disseminating real data for research or model development can pose substantial privacy risks owing to the inclusion of sensitive information. Additionally, authentic data may be scarce, particularly in specialized domains where acquiring ample, varied, and high-quality data is difficult or costly. This scarcity or limited data availability can limit the training and testing of machine-learning models. In this paper, we address this challenge. In particular, our task is to synthesize a dataset with similar properties to an input dataset about the stock market. The input dataset is anonymized and consists of very few columns and rows, contains many inconsistencies, such as missing rows and duplicates, and its values are not normalized, scaled, or balanced. We explore the utilization of generative adversarial networks, a deep-learning technique, to generate synthetic data and evaluate its quality compared to the input stock dataset. Our innovation involves generating artificial datasets that mimic the statistical properties of the input elements without revealing complete information. For example, synthetic datasets can capture the distribution of stock prices, trading volumes, and market trends observed in the original dataset. The generated datasets cover a wider range of scenarios and variations, enabling researchers and practitioners to explore different market conditions and investment strategies. This diversity can enhance the robustness and generalization of machine-learning models. We evaluate our synthetic data in terms of the mean, similarities, and correlations.
(This article belongs to the Special Issue AI in Finance: Leveraging AI to Transform Financial Services)

32 pages, 4863 KiB  
Article
From Eye Movements to Personality Traits: A Machine Learning Approach in Blood Donation Advertising
by Stefanos Balaskas, Maria Koutroumani, Maria Rigou and Spiros Sirmakessis
AI 2024, 5(2), 635-666; https://doi.org/10.3390/ai5020034 - 10 May 2024
Viewed by 815
Abstract
Blood donation heavily depends on voluntary involvement, but the problem of motivating and retaining potential blood donors remains. Understanding the personality traits of donors can assist in this case, bridging communication gaps and increasing participation and retention. To this end, an eye-tracking experiment was designed to examine the viewing behavior of 75 participants as they viewed various blood donation-related advertisements. The purpose of these stimuli was to elicit various types of emotions (positive/negative) and message framings (altruistic/egoistic) to investigate cognitive reactions that arise from donating blood using eye-tracking parameters such as the fixation duration, fixation count, saccade duration, and saccade amplitude. The results indicated significant differences among the eye-tracking metrics, suggesting that visual engagement varies considerably in response to different types of advertisements. The fixation duration also revealed substantial differences in emotions, logo types, and emotional arousal, suggesting that the nature of stimuli can affect how viewers disperse their attention. The saccade amplitude and saccade duration were also affected by the message framings, thus indicating their relevance to eye movement behavior. Generalised linear models (GLMs) showed significant influences of personality trait effects on eye-tracking metrics, including a negative association between honesty–humility and fixation duration and a positive link between openness and both the saccade duration and fixation count. These results indicate that personality traits can significantly impact visual attention processes. The present study broadens the current research frontier by employing machine learning techniques on the collected eye-tracking data to identify personality traits that can influence donation decisions and experiences. 
Participants’ eye movements were analysed to categorize their dominant personality traits using hierarchical clustering, while machine learning algorithms, including Support Vector Machine (SVM), Random Forest, and k-Nearest Neighbours (KNN), were employed to predict personality traits. Among the models, SVM and KNN exhibited high accuracy (86.67%), while Random Forest scored considerably lower (66.67%). This investigation reveals that computational models can infer personality traits from eye movements, which shows great potential for psychological profiling and human–computer interaction. This study integrates psychology research and machine learning, paving the way for further studies on personality assessment by eye tracking.
(This article belongs to the Special Issue Machine Learning for HCI: Cases, Trends and Challenges)

17 pages, 3166 KiB  
Article
Remote Sensing Crop Water Stress Determination Using CNN-ViT Architecture
by Kawtar Lehouel, Chaima Saber, Mourad Bouziani and Reda Yaagoubi
AI 2024, 5(2), 618-634; https://doi.org/10.3390/ai5020033 - 9 May 2024
Viewed by 718
Abstract
Efficiently determining crop water stress is vital for optimising irrigation practices and enhancing agricultural productivity. In this realm, the synergy of deep learning with remote sensing technologies offers a significant opportunity. This study introduces an innovative end-to-end deep learning pipeline for within-field crop water stress determination. The pipeline involves (1) creating an annotated dataset for crop water stress using Landsat 8 imagery, (2) deploying a standalone vision transformer (ViT) model, and (3) implementing a proposed CNN-ViT model. This approach allows for a comparative analysis between the two architectures, ViT and CNN-ViT, in accurately determining crop water stress. The results demonstrate the effectiveness of the CNN-ViT framework compared to the standalone vision transformer model: the CNN-ViT approach exhibits superior performance, with enhanced accuracy and generalisation capabilities. The findings underscore the significance of an integrated deep learning pipeline combined with remote sensing data in the determination of crop water stress, providing a reliable and scalable tool for real-time monitoring and resource management that contributes to sustainable agricultural practices. Full article
(This article belongs to the Section AI Systems: Theory and Applications)
16 pages, 3532 KiB  
Article
Robotics Perception: Intention Recognition to Determine the Handball Occurrence during a Football or Soccer Match
by Mohammad Mehedi Hassan, Stephen Karungaru and Kenji Terada
AI 2024, 5(2), 602-617; https://doi.org/10.3390/ai5020032 - 8 May 2024
Viewed by 706
Abstract
In football or soccer, a referee controls the game based on the set rules. The decisions made by the referee are final and cannot be appealed. Some decisions, especially whether to award a penalty kick or a yellow/red card after a handball event, can greatly affect the final result of a game. It is therefore necessary that the referee does not make an error. The objective is to create a system that can accurately recognize such events and make the correct decision. This study focuses on handball, an event that occurs in a football game (not to be confused with the game of handball). We define a handball event using object detection and robotic perception and decide whether it is intentional or not. Intention recognition is a robotic perception of emotion recognition. To define handball, we trained a model to detect the hand and the ball, which are the primary objects. We then determined the intention using gaze recognition and finally combined the results to recognize a handball event. On our dataset, the results of the hand and the ball object detection were 96% and 100%, respectively. With gaze recognition at 100%, intention and handball event recognition also reached 100% whenever all objects were recognized. Full article
(This article belongs to the Section AI in Autonomous Systems)
8 pages, 215 KiB  
Communication
Ethical Considerations for Artificial Intelligence Applications for HIV
by Renee Garett, Seungjun Kim and Sean D. Young
AI 2024, 5(2), 594-601; https://doi.org/10.3390/ai5020031 - 7 May 2024
Viewed by 886
Abstract
Human Immunodeficiency Virus (HIV) is a stigmatizing disease that disproportionately affects African Americans and Latinos among people living with HIV (PLWH). Researchers are increasingly utilizing artificial intelligence (AI) to analyze large amounts of data such as social media data and electronic health records (EHR) for various HIV-related tasks, from prevention and surveillance to treatment and counseling. This paper explores the ethical considerations surrounding the use of AI for HIV with a focus on acceptability, trust, fairness, and transparency. To improve acceptability and trust towards AI systems for HIV, informed consent and a Federated Learning (FL) approach are suggested. In regard to unfairness, stakeholders should be wary of AI systems for HIV further stigmatizing or even being used as grounds to criminalize PLWH. To prevent criminalization, in particular, the application of differential privacy on HIV data generated by data linkage should be studied. Participatory design is crucial in designing the AI systems for HIV to be more transparent and inclusive. To this end, the formation of a data ethics committee and the construction of relevant frameworks and principles may need to be concurrently implemented. Lastly, the question of whether the amount of transparency beyond a certain threshold may overwhelm patients, thereby unexpectedly triggering negative consequences, is posed. Full article
(This article belongs to the Special Issue Standards and Ethics in AI)
18 pages, 6698 KiB  
Article
Investigating Training Datasets of Real and Synthetic Images for Outdoor Swimmer Localisation with YOLO
by Mohsen Khan Mohammadi, Toni Schneidereit, Ashkan Mansouri Yarahmadi and Michael Breuß
AI 2024, 5(2), 576-593; https://doi.org/10.3390/ai5020030 - 1 May 2024
Viewed by 743
Abstract
In this study, we developed and explored a methodical image augmentation technique for swimmer localisation in northern German outdoor lake environments. When it comes to enhancing swimmer safety, a main issue we have to deal with is the lack of real-world training data of such outdoor environments. Natural lighting changes, dynamic water textures, and barely visible swimming persons are key issues to address. We account for these difficulties by adopting an effective background removal technique with available training data. This allows us to edit swimmers into natural environment backgrounds for use in subsequent image augmentation. We created 17 training datasets with real images, synthetic images, and a mixture of both to investigate different aspects and characteristics of the proposed approach. The datasets were used to train YOLO architectures for possible future applications in real-time detection. The trained frameworks were then tested and evaluated on outdoor environment imagery acquired using a safety drone to investigate and confirm their usefulness for outdoor swimmer localisation. Full article
21 pages, 14728 KiB  
Article
Development of an Attention Mechanism for Task-Adaptive Heterogeneous Robot Teaming
by Yibei Guo, Chao Huang and Rui Liu
AI 2024, 5(2), 555-575; https://doi.org/10.3390/ai5020029 - 23 Apr 2024
Viewed by 958
Abstract
The allure of team scale and functional diversity has led to the promising adoption of heterogeneous multi-robot systems (HMRS) in complex, large-scale operations such as disaster search and rescue, site surveillance, and social security. These systems, which coordinate multiple robots of varying functions and quantities, face the significant challenge of accurately assembling robot teams that meet the dynamic needs of tasks with respect to size and functionality, all while maintaining minimal resource expenditure. This paper introduces a pioneering adaptive cooperation method named inner attention (innerATT), crafted to dynamically configure teams of heterogeneous robots in response to evolving task types and environmental conditions. The innerATT method is articulated through the integration of an innovative attention mechanism within a multi-agent actor–critic reinforcement learning framework, enabling the strategic analysis of robot capabilities to efficiently form teams that fulfill specific task demands. To demonstrate the efficacy of innerATT in facilitating cooperation, experimental scenarios encompassing variations in task type (“Single Task”, “Double Task”, and “Mixed Task”) and robot availability are constructed under the themes of “task variety” and “robot availability variety.” The findings affirm that innerATT significantly enhances flexible cooperation, diminishes resource usage, and bolsters robustness in task fulfillment. Full article
5 pages, 174 KiB  
Editorial
Artificial Intelligence in Healthcare: ChatGPT and Beyond
by Tim Hulsen
AI 2024, 5(2), 550-554; https://doi.org/10.3390/ai5020028 - 19 Apr 2024
Viewed by 1611
Abstract
Artificial intelligence (AI), the simulation of human intelligence processes by machines, is having a growing impact on healthcare [...] Full article
17 pages, 8939 KiB  
Article
ANNs Predicting Noisy Signals in Electronic Circuits: A Model Predicting the Signal Trend in Amplification Systems
by Alessandro Massaro
AI 2024, 5(2), 533-549; https://doi.org/10.3390/ai5020027 - 17 Apr 2024
Viewed by 975
Abstract
In this paper, an artificial neural network (ANN) algorithm is applied to predict the voltage output signals of electronic circuits in Industry 4.0/5.0 scenarios. This approach is suitable to predict possible uncorrected behavior of control circuits affected by unknown noise, and to reproduce a testbed method simulating the effect of noise on the amplification of an input sinusoidal voltage signal, which is a basic and fundamental signal for controlled manufacturing systems. The simulations take into account different noise signals with varying time-domain trends and frequency behavior to show that voltage outputs can be predicted when complex signals, including additive disturbances and noise, are present at the control circuit input. The results highlight that it is possible to construct a good ANN training model by processing only the recorded voltage output signals without considering the noise profile (which is typically unknown). The proposed model behaves as an electronic black box for Industry 5.0 manufacturing processes, automating circuit and machine tuning procedures. By analyzing state-of-the-art ANNs, the study offers an innovative, versatile ANN-based solution that is able to process various noise profiles without requiring prior knowledge of the noise characteristics. Full article
17 pages, 255 KiB  
Review
Fetal Hypoxia Detection Using Machine Learning: A Narrative Review
by Nawaf Alharbi, Mustafa Youldash, Duha Alotaibi, Haya Aldossary, Reema Albrahim, Reham Alzahrani, Wahbia Ahmed Saleh, Sunday O. Olatunji and May Issa Aldossary
AI 2024, 5(2), 516-532; https://doi.org/10.3390/ai5020026 - 13 Apr 2024
Viewed by 953
Abstract
Fetal hypoxia is a condition characterized by a lack of oxygen supply in a developing fetus in the womb. It can lead to abnormalities, birth defects, and even mortality. Cardiotocograph (CTG) monitoring is among the techniques that can detect signs of fetal distress, including hypoxia. Because interpreting the results of this test is critically important, it is essential to accompany these tests with the evolving available technology to classify cases into three classes: normal, suspicious, or pathological. Furthermore, Machine Learning (ML) is a rapidly developing technique that aids medical studies, particularly fetal health prediction. Notwithstanding the past endeavors of health providers to detect hypoxia in fetuses, implementing ML and Deep Learning (DL) techniques ensures more timely and precise detection of fetal hypoxia by efficiently and accurately processing complex patterns in large datasets. Correspondingly, this review paper aims to explore the application of artificial intelligence models using cardiotocographic test data. The anticipated outcome of this review is to introduce guidance for future studies to enhance accuracy in detecting cases categorized within the suspicious class, an aspect that has proved challenging in previous studies and that holds significant implications for obstetricians in effectively monitoring fetal health and making informed decisions. Full article
(This article belongs to the Section Medical & Healthcare AI)
12 pages, 814 KiB  
Article
Towards an ELSA Curriculum for Data Scientists
by Maria Christoforaki and Oya Deniz Beyan
AI 2024, 5(2), 504-515; https://doi.org/10.3390/ai5020025 - 11 Apr 2024
Cited by 1 | Viewed by 874
Abstract
The use of artificial intelligence (AI) applications in a growing number of domains in recent years has put into focus the ethical, legal, and societal aspects (ELSA) of these technologies and the relevant challenges they pose. In this paper, we propose an ELSA curriculum for data scientists that aims to raise awareness about ELSA challenges in their work, provide them with a common language with the relevant domain experts so that they can cooperate to find appropriate solutions, and, finally, incorporate ELSA in the data science workflow. ELSA should not be seen as an impediment or a superfluous artefact but rather as an integral part of the Data Science Project Lifecycle. The proposed curriculum uses the CRISP-DM (CRoss-Industry Standard Process for Data Mining) model as a backbone to define a vertical partition expressed in modules corresponding to the CRISP-DM phases. The horizontal partition includes knowledge units (KUs) belonging to three strands that run through the phases, namely ethical and societal, legal, and technical rendering KUs. In addition to the detailed description of the aforementioned KUs, we also discuss their implementation, issues such as duration, form, and evaluation of participants, as well as the variance of the knowledge level and needs of the target audience. Full article
(This article belongs to the Special Issue Standards and Ethics in AI)
22 pages, 5272 KiB  
Article
ECARRNet: An Efficient LSTM-Based Ensembled Deep Neural Network Architecture for Railway Fault Detection
by Salman Ibne Eunus, Shahriar Hossain, A. E. M. Ridwan, Ashik Adnan, Md. Saiful Islam, Dewan Ziaul Karim, Golam Rabiul Alam and Jia Uddin
AI 2024, 5(2), 482-503; https://doi.org/10.3390/ai5020024 - 8 Apr 2024
Viewed by 1898
Abstract
Accidents due to defective railway lines and derailments are common disasters that are observed frequently in Southeast Asian countries. It is imperative to diagnose and detect such faults properly to prevent these accidents. However, periodic manual detection of such faults can be both time-consuming and costly. In this paper, we propose a Deep Learning (DL)-based algorithm for automatic fault detection in railway tracks, which we term an Ensembled Convolutional Autoencoder ResNet-based Recurrent Neural Network (ECARRNet). We compared its output with several existing pre-trained DL models to inspect railway tracks and determine whether they are defective, considering commonly prevalent faults such as defects in rails and fasteners. Moreover, we manually collected images from different railway tracks situated in Bangladesh to build our own dataset. After comparing our proposed model with the existing models, we found that our proposed architecture produced the highest accuracy among all the existing state-of-the-art (SOTA) architectures, with an accuracy of 93.28% on the full dataset. Additionally, we split our dataset into two parts by fault type, fasteners and rails. We ran the models on those two separate datasets, obtaining accuracies of 98.59% and 92.06% on rail and fastener, respectively. Model explainability techniques such as Grad-CAM and LIME were used to validate the results of the models; our proposed model, ECARRNet, was seen to classify and detect the regions of faulty railways more effectively than the existing transfer learning models. Full article
(This article belongs to the Section AI Systems: Theory and Applications)
17 pages, 4056 KiB  
Article
Visual Analytics in Explaining Neural Networks with Neuron Clustering
by Gulsum Alicioglu and Bo Sun
AI 2024, 5(2), 465-481; https://doi.org/10.3390/ai5020023 - 5 Apr 2024
Viewed by 991
Abstract
Deep learning (DL) models have achieved state-of-the-art performance in many domains. The interpretation of their working mechanisms and decision-making process is essential because of their complex structure and black-box nature, especially for sensitive domains such as healthcare. Visual analytics (VA) combined with DL methods has been widely used to discover data insights, but it often encounters visual clutter (VC) issues. This study presents a compact neural network (NN) view design to reduce the visual clutter in explaining the DL model components for domain experts and end users. We utilized clustering algorithms to group hidden neurons based on their activation similarities. This design supports both overall and detailed views of the neuron clusters. We used a tabular healthcare dataset as a case study. The clustered design reduced visual clutter among neuron representations by 54% and among connections by 88.7%, and helped to reveal similar neuron activations learned during the training process. Full article
(This article belongs to the Special Issue Machine Learning for HCI: Cases, Trends and Challenges)