
Machine/Deep Learning: Applications, Technologies and Algorithms

A special issue of Applied Sciences (ISSN 2076-3417). This special issue belongs to the section "Computing and Artificial Intelligence".

Deadline for manuscript submissions: closed (30 September 2023) | Viewed by 74659

Special Issue Editor


Prof. Dr. Emilio Soria-Olivas
Guest Editor
IDAL, Electronic Engineering Department, University of Valencia, Av. Universitat, SN, Burjassot, 46100 Valencia, Spain
Interests: deep learning applications

Special Issue Information

Dear Colleagues,

This Special Issue focuses on new machine/deep learning algorithms as well as their applications in NLP, computer vision, and video processing in different fields of knowledge (medicine, industry, economy, society, environment, etc.).

Prof. Dr. Emilio Soria-Olivas
Guest Editor

Manuscript Submission Information

Manuscripts should be submitted online at www.mdpi.com by registering and logging in to this website. Once you are registered, click here to go to the submission form. Manuscripts can be submitted until the deadline. All submissions that pass pre-check are peer-reviewed. Accepted papers will be published continuously in the journal (as soon as accepted) and will be listed together on the special issue website. Research articles, review articles as well as short communications are invited. For planned papers, a title and short abstract (about 100 words) can be sent to the Editorial Office for announcement on this website.

Submitted manuscripts should not have been published previously, nor be under consideration for publication elsewhere (except conference proceedings papers). All manuscripts are thoroughly refereed through a single-blind peer-review process. A guide for authors and other relevant information for submission of manuscripts is available on the Instructions for Authors page. Applied Sciences is an international peer-reviewed open access semimonthly journal published by MDPI.

Please visit the Instructions for Authors page before submitting a manuscript. The Article Processing Charge (APC) for publication in this open access journal is 2400 CHF (Swiss Francs). Submitted papers should be well formatted and use good English. Authors may use MDPI's English editing service prior to publication or during author revisions.

Keywords

  • deep learning
  • machine learning
  • NLP
  • artificial intelligence

Benefits of Publishing in a Special Issue

  • Ease of navigation: Grouping papers by topic helps scholars navigate broad scope journals more efficiently.
  • Greater discoverability: Special Issues support the reach and impact of scientific research. Articles in Special Issues are more discoverable and cited more frequently.
  • Expansion of research network: Special Issues facilitate connections among authors, fostering scientific collaborations.
  • External promotion: Articles in Special Issues are often promoted through the journal's social media, increasing their visibility.
  • e-Book format: Special Issues with more than 10 articles can be published as dedicated e-books, ensuring wide and rapid dissemination.

Further information on MDPI's Special Issue policies can be found here.

Published Papers (26 papers)


Research


21 pages, 5146 KiB  
Article
Construction Project Cost Prediction Method Based on Improved BiLSTM
by Chaoxue Wang and Jiale Qiao
Appl. Sci. 2024, 14(3), 978; https://doi.org/10.3390/app14030978 - 23 Jan 2024
Cited by 4 | Viewed by 1221
Abstract
In construction project management, accurate cost forecasting is critical for ensuring informed decision making. In this article, a construction cost prediction method based on an improved bidirectional long short-term memory (BiLSTM) network is proposed to address the high interactivity among construction cost data and the difficulty of feature extraction. Firstly, the correlation between cost-influencing factors and the unilateral cost is calculated via grey correlation analysis to select the characteristic index. Secondly, a BiLSTM network is used to capture the temporal interactions in the cost data at a deep level, and a hybrid attention mechanism is incorporated to enhance the model’s feature extraction capability and comprehensively capture the interactions among the features in the cost data. Finally, a hyperparameter optimisation method based on an improved particle swarm optimisation algorithm is proposed, using the prediction accuracy as the fitness function of the algorithm. The MAE, RMSE, MPE, MAPE, and coefficient of determination of the simulated prediction results of the proposed method on the dataset are 7.487, 8.936, 0.236, 0.393, and 0.996%, respectively, where MPE is a positive coefficient, which avoids the serious consequences of underestimating the cost. Compared with the unimproved BiLSTM, the MAE, RMSE, and MAPE are reduced by 15.271, 18.193, and 0.784%, respectively, which reflects the superiority and effectiveness of the method and can provide technical support for project cost estimation in the construction field. Full article
(This article belongs to the Special Issue Machine/Deep Learning: Applications, Technologies and Algorithms)
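To make the modelling idea above concrete, the following minimal Python sketch builds a BiLSTM regressor whose time steps are re-weighted by a simple additive attention layer before the cost estimate is produced. The layer sizes, the attention form, and the random toy data are illustrative assumptions; the paper's grey correlation feature selection and improved particle swarm hyperparameter search are not reproduced here.

import numpy as np
import tensorflow as tf
from tensorflow.keras import layers, Model

def build_bilstm_attention(seq_len, n_features):
    inputs = layers.Input(shape=(seq_len, n_features))
    x = layers.Bidirectional(layers.LSTM(64, return_sequences=True))(inputs)
    # Additive attention over time steps (a stand-in for the paper's hybrid attention).
    scores = layers.Dense(1, activation="tanh")(x)            # (batch, seq_len, 1)
    weights = layers.Softmax(axis=1)(scores)                  # attention weight per step
    context = layers.Lambda(lambda t: tf.reduce_sum(t[0] * t[1], axis=1))([x, weights])
    outputs = layers.Dense(1)(context)                        # predicted unilateral cost
    model = Model(inputs, outputs)
    model.compile(optimizer="adam", loss="mae")
    return model

# Toy usage with random data standing in for sequences of selected cost indicators.
X = np.random.rand(32, 10, 8).astype("float32")
y = np.random.rand(32, 1).astype("float32")
model = build_bilstm_attention(seq_len=10, n_features=8)
model.fit(X, y, epochs=2, verbose=0)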

12 pages, 302 KiB  
Article
Kernel Geometric Mean Metric Learning
by Zixin Feng, Teligeng Yun, Yu Zhou, Ruirui Zheng and Jianjun He
Appl. Sci. 2023, 13(21), 12047; https://doi.org/10.3390/app132112047 - 6 Nov 2023
Viewed by 1093
Abstract
The geometric mean metric learning (GMML) algorithm is a recently proposed metric learning approach. It has many advantages over other existing metric learning techniques, such as an unconstrained convex objective function, a closed-form solution, fast computation, and interpretability. However, it does not handle nonlinear problems effectively. The kernel method is an effective way to solve nonlinear problems. Therefore, a kernel geometric mean metric learning (KGMML) algorithm is proposed. The basic idea is to transform the input space into a high-dimensional feature space through a nonlinear transformation, use the integral representation of the weighted geometric mean and the Woodbury matrix identity in the new feature space to generalize the analytical solution obtained in the GMML algorithm into a form represented by a kernel matrix, and then derive the KGMML algorithm from these operations. Experimental results on 15 datasets show that the proposed algorithm can effectively improve upon the accuracy of the GMML algorithm and other metric learning algorithms. Full article
(This article belongs to the Special Issue Machine/Deep Learning: Applications, Technologies and Algorithms)
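For readers unfamiliar with GMML, the closed-form solution mentioned above can be sketched in a few lines for the linear case: the learned Mahalanobis matrix is the geodesic midpoint between the inverse of the similarity scatter matrix and the dissimilarity scatter matrix. The sketch below is a hedged illustration of that base formula only; the kernelized derivation via the Woodbury identity from the paper is not reproduced, and the regularization and toy index pairs are assumptions.

import numpy as np
from scipy.linalg import sqrtm, inv

def gmml_metric(X, similar_pairs, dissimilar_pairs, reg=1e-6):
    """Mahalanobis matrix A = S^{-1/2} (S^{1/2} D S^{1/2})^{1/2} S^{-1/2} (geodesic midpoint)."""
    d = X.shape[1]
    S = np.zeros((d, d))                     # scatter of similar pairs
    D = np.zeros((d, d))                     # scatter of dissimilar pairs
    for i, j in similar_pairs:
        diff = (X[i] - X[j])[:, None]
        S += diff @ diff.T
    for i, j in dissimilar_pairs:
        diff = (X[i] - X[j])[:, None]
        D += diff @ diff.T
    S += reg * np.eye(d)                     # keep both matrices positive definite
    D += reg * np.eye(d)
    S_half = sqrtm(S)
    S_inv_half = inv(S_half)
    return np.real(S_inv_half @ sqrtm(S_half @ D @ S_half) @ S_inv_half)

# Toy usage on random data with hand-picked index pairs.
X = np.random.rand(6, 3)
A = gmml_metric(X, similar_pairs=[(0, 1), (2, 3)], dissimilar_pairs=[(0, 4), (1, 5)])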

15 pages, 3138 KiB  
Article
An Intelligent Facial Expression Recognition System Using a Hybrid Deep Convolutional Neural Network for Multimedia Applications
by Ahmed J. Obaid and Hassanain K. Alrammahi
Appl. Sci. 2023, 13(21), 12049; https://doi.org/10.3390/app132112049 - 5 Nov 2023
Cited by 2 | Viewed by 2455
Abstract
Recognizing facial expressions plays a crucial role in various multimedia applications, such as human–computer interactions and the functioning of autonomous vehicles. This paper introduces a hybrid feature extraction network model to bolster the discriminative capacity of emotional features for multimedia applications. The proposed model comprises a convolutional neural network (CNN) and deep belief network (DBN) series. First, a spatial CNN network processed static facial images, followed by a temporal CNN network. The CNNs were fine-tuned based on facial expression recognition (FER) datasets. A deep belief network (DBN) model was then applied to integrate the segment-level spatial and temporal features. Deep fusion networks were jointly used to learn spatiotemporal features for discrimination purposes. Due to its generalization capabilities, we used a multi-class support vector machine classifier to classify the seven basic emotions in the proposed model. The proposed model exhibited 98.14% recognition performance for the JaFFE database, 95.29% for the KDEF database, and 98.86% for the RaFD database. It is shown that the proposed method is effective for all three databases, compared with the previous schemes for JAFFE, KDEF, and RaFD databases. Full article
(This article belongs to the Special Issue Machine/Deep Learning: Applications, Technologies and Algorithms)

21 pages, 7943 KiB  
Article
Multi-Scale Residual Depthwise Separable Convolution for Metro Passenger Flow Prediction
by Taoying Li, Lu Liu and Meng Li
Appl. Sci. 2023, 13(20), 11272; https://doi.org/10.3390/app132011272 - 13 Oct 2023
Cited by 1 | Viewed by 1187
Abstract
Accurate prediction of metro passenger flow helps operating departments optimize scheduling plans, alleviate passenger flow pressure, and improve service quality. However, existing passenger flow prediction models tend to only consider the historical passenger flow of a single station while ignoring the spatial relationships between different stations and correlations between passenger flows, resulting in low prediction accuracy. Therefore, a multi-scale residual depthwise separable convolution network (MRDSCNN) is proposed for metro passenger flow prediction, which consists of three pivotal components, including residual depthwise separable convolution (RDSC), multi-scale depthwise separable convolution (MDSC), and attention bidirectional gated recurrent unit (AttBiGRU). The RDSC module is designed to capture local spatial and temporal correlations leveraging the diverse temporal patterns of passenger flows, and then the MDSC module is specialized in obtaining the inter-station correlations between the target station and other heterogeneous stations throughout the metro network. Subsequently, these correlations are fed into AttBiGRU to extract global interaction features and obtain passenger flow prediction results. Finally, the Hangzhou metro passenger inflow and outflow data are employed to assess the model performance, and the results show that the proposed model outperforms other models. Full article
(This article belongs to the Special Issue Machine/Deep Learning: Applications, Technologies and Algorithms)
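A hedged sketch of the building blocks named above follows: a residual depthwise-separable 1D convolution block and a multi-scale combination of such blocks feeding a recurrent layer. Kernel sizes, channel counts, the 48-step input window, and the use of a plain bidirectional GRU in place of the paper's attention BiGRU are all illustrative assumptions.

from tensorflow.keras import layers, Model

def rdsc_block(x, filters, kernel_size=3):
    """Residual depthwise-separable convolution block (illustrative)."""
    shortcut = x
    y = layers.SeparableConv1D(filters, kernel_size, padding="same", activation="relu")(x)
    y = layers.SeparableConv1D(filters, kernel_size, padding="same")(y)
    if shortcut.shape[-1] != filters:                  # match channels for the residual add
        shortcut = layers.Conv1D(filters, 1, padding="same")(shortcut)
    return layers.Activation("relu")(layers.Add()([y, shortcut]))

# Multi-scale variant: branches with different kernel sizes, fused and passed to a GRU.
inp = layers.Input(shape=(48, 1))                      # 48 historical passenger-flow steps
branches = [rdsc_block(inp, 32, k) for k in (3, 5, 7)]
fused = layers.Concatenate()(branches)
out = layers.Dense(1)(layers.Bidirectional(layers.GRU(32))(fused))
model = Model(inp, out)
model.compile(optimizer="adam", loss="mse")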

16 pages, 2902 KiB  
Article
Generating Structurally Complete Stylish Chinese Font Based on Semi-Supervised Model
by Xianqing Tian, Fang Yang and Fan Tang
Appl. Sci. 2023, 13(19), 10650; https://doi.org/10.3390/app131910650 - 25 Sep 2023
Cited by 1 | Viewed by 2405
Abstract
Stylish fonts are widely utilized in social networks, driving research interest in stylish Chinese font generation. Most existing methods for generating Chinese characters rely on GAN-based deep learning approaches. However, persistent issues of mode collapse and missing structure in CycleGAN-generated results make font generation a formidable task. This paper introduces a unique semi-supervised model specifically designed for generating stylish Chinese fonts. By incorporating a small amount of paired data and stroke encoding as information into CycleGAN, our model effectively captures both the global structural features and local characteristics of Chinese characters. Additionally, an attention module has been added and refined to improve the connection between strokes and character structures. To enhance the model’s understanding of the global structure of Chinese characters and stroke encoding reconstruction, two loss functions are included. The integration of these components successfully alleviates mode collapse and structural errors. To assess the effectiveness of our approach, comprehensive visual and quantitative assessments are conducted, comparing our method with benchmark approaches on six diverse datasets. The results clearly demonstrate the superior performance of our method in generating stylish Chinese fonts. Full article
(This article belongs to the Special Issue Machine/Deep Learning: Applications, Technologies and Algorithms)

15 pages, 2951 KiB  
Article
HPC Platform for Railway Safety-Critical Functionalities Based on Artificial Intelligence
by Mikel Labayen, Laura Medina, Fernando Eizaguirre, José Flich and Naiara Aginako
Appl. Sci. 2023, 13(15), 9017; https://doi.org/10.3390/app13159017 - 7 Aug 2023
Cited by 2 | Viewed by 1641
Abstract
The automation of railroad operations is a rapidly growing industry. In 2023, a new European standard for the automated Grade of Automation (GoA) 2 over European Train Control System (ETCS) driving is anticipated. Meanwhile, railway stakeholders are already planning their research initiatives for driverless and unattended autonomous driving systems. As a result, the industry is particularly active in research regarding perception technologies based on Computer Vision (CV) and Artificial Intelligence (AI), with outstanding results at the application level. However, executing high-performance and safety-critical applications on embedded systems and in real-time is a challenge. There are not many commercially available solutions, since High-Performance Computing (HPC) platforms are typically seen as being beyond the business of safety-critical systems. This work proposes a novel safety-critical and high-performance computing platform for CV- and AI-enhanced technology execution used for automatic accurate stopping and safe passenger transfer railway functionalities. The resulting computing platform is compatible with the majority of widely-used AI inference methodologies, AI model architectures, and AI model formats thanks to its design, which enables process separation, redundant execution, and HW acceleration in a transparent manner. The proposed technology increases the portability of railway applications into embedded systems, isolates crucial operations, and effectively and securely maintains system resources. Full article
(This article belongs to the Special Issue Machine/Deep Learning: Applications, Technologies and Algorithms)

12 pages, 400 KiB  
Article
ATOSE: Audio Tagging with One-Sided Joint Embedding
by Jaehwan Lee, Daekyeong Moon, Jik-Soo Kim and Minkyoung Cho
Appl. Sci. 2023, 13(15), 9002; https://doi.org/10.3390/app13159002 - 6 Aug 2023
Viewed by 1320
Abstract
Audio auto-tagging is the process of assigning labels to audio clips for better categorization and management of audio file databases. With the advent of advanced artificial intelligence technologies, there has been increasing interest in directly using raw audio data as input for deep learning models in order to perform tagging and eliminate the need for preprocessing. Unfortunately, most current studies of audio auto-tagging cannot effectively reflect the semantic relationships between tags—for instance, the connection between “classical music” and “cello”. In this paper, we propose a novel method that can enhance audio auto-tagging performance via joint embedding. Our model has been carefully designed and architected to recognize the semantic information within the tag domains. In our experiments using the MagnaTagATune (MTAT) dataset, which has high inter-tag correlations, and the Speech Commands dataset, which has no inter-tag correlations, we showed that our approach improves the performance of existing models when there are strong inter-tag correlations. Full article
(This article belongs to the Special Issue Machine/Deep Learning: Applications, Technologies and Algorithms)

22 pages, 1215 KiB  
Article
Anomaly Detection in Microservice-Based Systems
by João Nobre, E. J. Solteiro Pires and Arsénio Reis
Appl. Sci. 2023, 13(13), 7891; https://doi.org/10.3390/app13137891 - 5 Jul 2023
Cited by 2 | Viewed by 4041
Abstract
Currently, distributed software systems have evolved at an unprecedented pace. Modern software-quality requirements are high and require significant staff support and effort. This study investigates the use of a supervised machine learning model, a Multi-Layer Perceptron (MLP), for anomaly detection in microservices. The study covers the creation of a microservices infrastructure, the development of a fault injection module that simulates application-level and service-level anomalies, the creation of a system monitoring dataset, and the creation and validation of the MLP model to detect anomalies. The results indicate that the MLP model effectively detects anomalies in both domains, with higher accuracy, precision, recall, and F1 score on the service-level anomaly dataset. The potential for more effective distributed system monitoring and management automation is highlighted in this study by focusing on service-level metrics such as service response times. This study provides valuable information about the effectiveness of supervised machine learning models in detecting anomalies across distributed software systems. Full article
(This article belongs to the Special Issue Machine/Deep Learning: Applications, Technologies and Algorithms)
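As a hedged illustration of the supervised set-up described above, the following sketch trains a scikit-learn Multi-Layer Perceptron on synthetic monitoring metrics and reports precision, recall, and F1. The feature count, hidden-layer sizes, and the synthetic labels are assumptions standing in for the paper's fault-injection dataset.

import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.metrics import classification_report

rng = np.random.default_rng(0)
X = rng.normal(size=(2000, 6))                       # stand-in monitoring metrics (e.g., response time, CPU, memory)
y = (X[:, 0] + 0.5 * X[:, 3] > 1.2).astype(int)      # stand-in anomaly label

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.2, random_state=0)
model = make_pipeline(StandardScaler(), MLPClassifier(hidden_layer_sizes=(32, 16), max_iter=500))
model.fit(X_tr, y_tr)
print(classification_report(y_te, model.predict(X_te)))  # precision, recall, F1 per class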

22 pages, 9943 KiB  
Article
Machine Learning Algorithms Combining Slope Deceleration and Fetal Heart Rate Features to Predict Acidemia
by Luis Mariano Esteban, Berta Castán, Javier Esteban-Escaño, Gerardo Sanz-Enguita, Antonio R. Laliena, Ana Cristina Lou-Mercadé, Marta Chóliz-Ezquerro, Sergio Castán and Ricardo Savirón-Cornudella
Appl. Sci. 2023, 13(13), 7478; https://doi.org/10.3390/app13137478 - 25 Jun 2023
Cited by 2 | Viewed by 1679
Abstract
Electronic fetal monitoring (EFM) is widely used in intrapartum care as the standard method for monitoring fetal well-being. Our objective was to employ machine learning algorithms to predict acidemia by analyzing specific features extracted from the fetal heart signal within a 30 min window, with a focus on the last deceleration occurring closest to delivery. To achieve this, we conducted a case–control study involving 502 infants born at Miguel Servet University Hospital in Spain, maintaining a 1:1 ratio between cases and controls. Neonatal acidemia was defined as a pH level below 7.10 in the umbilical arterial blood. We constructed logistic regression, classification tree, random forest, and neural network models by combining EFM features to predict acidemia. Model validation included assessments of discrimination, calibration, and clinical utility. Our findings revealed that the random forest model achieved the highest area under the receiver operating characteristic curve (AUC) of 0.971, but logistic regression had the best specificity, 0.879, for a sensitivity of 0.95. In terms of clinical utility, implementing a cutoff point of 31% in the logistic regression model would prevent unnecessary cesarean sections in 51% of cases while missing only 5% of acidotic cases. By combining the extracted variables from EFM recordings, we provide a practical tool to assist in avoiding unnecessary cesarean sections. Full article
(This article belongs to the Special Issue Machine/Deep Learning: Applications, Technologies and Algorithms)
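The clinical-utility step described above amounts to thresholding the predicted probability of a classifier. The sketch below, on synthetic data, shows how a 31% cutoff would be applied to a logistic regression; the feature set and labels are placeholders, not the EFM deceleration variables from the study.

import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(1)
X = rng.normal(size=(502, 5))                        # stand-in fetal heart rate / deceleration features
y = rng.integers(0, 2, size=502)                     # stand-in acidemia labels (pH < 7.10)

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=1)
clf = LogisticRegression(max_iter=1000).fit(X_tr, y_tr)

proba = clf.predict_proba(X_te)[:, 1]
flagged = proba >= 0.31        # cases above the 31% cutoff would prompt closer surveillance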

33 pages, 26785 KiB  
Article
A Comparison Study of Generative Adversarial Network Architectures for Malicious Cyber-Attack Data Generation
by Nikolaos Peppes, Theodoros Alexakis, Konstantinos Demestichas and Evgenia Adamopoulou
Appl. Sci. 2023, 13(12), 7106; https://doi.org/10.3390/app13127106 - 14 Jun 2023
Cited by 4 | Viewed by 1772
Abstract
The digitization trend that prevails nowadays has led to increased vulnerabilities of tools and technologies of everyday life. One of the many different types of software vulnerabilities and attacks is botnets. Botnets enable attackers to gain remote control of the infected machines, often leading to disastrous consequences. Cybersecurity experts engage machine learning (ML) and deep learning (DL) technologies for designing and developing smart and proactive cybersecurity systems in order to tackle such infections. The development of such systems is, often, hindered by the lack of data that can be used to train them. Aiming to address this problem, this study proposes and describes a methodology for the generation of botnet-type data in tabular format. This methodology involves the design and development of two generative adversarial network (GAN) models, one with six layers and the other with eight layers, to identify the most efficient and reliable one in terms of the similarity of the generated data to the real ones. The two GAN models produce data in loops of 25, 50, 100, 250, 500 and 1000 epochs. The results are quite encouraging as, for both models, the similarity between the synthetic and the real data is around 80%. The eight-layer solution is slightly better as, after running for 1000 epochs, it achieved a similarity degree of 82%, outperforming the six-layer one, which achieved 77%. These results indicate that such solutions of data augmentation in the cybersecurity domain are feasible and reliable and can lead to new standards for developing and training trustworthy ML and DL solutions for detecting and tackling botnet attacks. Full article
(This article belongs to the Special Issue Machine/Deep Learning: Applications, Technologies and Algorithms)

19 pages, 3314 KiB  
Article
The Applicability of Machine Learning Methods to the Characterization of Fibrous Gas Diffusion Layers
by Dieter Froning, Eugen Hoppe and Ralf Peters
Appl. Sci. 2023, 13(12), 6981; https://doi.org/10.3390/app13126981 - 9 Jun 2023
Viewed by 1176
Abstract
Porous materials can be characterized by well-trained neural networks. In this study, networks for fibrous paper-type gas diffusion layers were trained with artificial data created by a stochastic geometry model. The features of the data were calculated by means of transport simulations using the Lattice–Boltzmann method based on stochastic micro-structures. A convolutional neural network was developed that can predict the permeability and tortuosity of the material, both through-plane and in-plane. The characteristics of real data, both uncompressed and compressed, were predicted. The data were represented by reconstructed images of different sizes and image resolutions. Image artifacts are also a source of potential errors in the prediction. The Kozeny–Carman trend was used to evaluate the prediction of permeability and tortuosity of compressed real data. Using this method, it was possible to decide whether the predictions on compressed data were appropriate. Full article
(This article belongs to the Special Issue Machine/Deep Learning: Applications, Technologies and Algorithms)

19 pages, 10099 KiB  
Article
An Adaptive Multitask Network for Detecting the Region of Water Leakage in Tunnels
by Liang Zhao, Jiawei Wang, Shipeng Liu and Xiaoyan Yang
Appl. Sci. 2023, 13(10), 6231; https://doi.org/10.3390/app13106231 - 19 May 2023
Cited by 4 | Viewed by 1389
Abstract
Water leakage detection in tunnels in complex environments is difficult because edge information is hard to extract due to the structural similarity between regions of water seepage and wet stains. To address this issue, this study proposes a model comprising a multilevel transformer encoder and an adaptive multitask decoder. The multilevel transformer encoder is a layered transformer that extracts the multilevel characteristics of water leakage information, and the adaptive multitask decoder comprises adaptive network branches. The adaptive network branches generate the ground truths of wet stains and water seepage through a threshold value and transmit them to the network for training. The converged network, a U-Net, fuses the coarse images from the adaptive multitask decoder, and the fused images are the final segmentation results of water leakage in tunnels. The experimental results indicate that the proposed model achieves 95.1% Dice and 90.4% MIoU. The proposed model demonstrates a superior level of precision and generalization when compared to other related models. Full article
(This article belongs to the Special Issue Machine/Deep Learning: Applications, Technologies and Algorithms)

12 pages, 364 KiB  
Article
Comparison between Machine Learning and Deep Learning Approaches for the Detection of Toxic Comments on Social Networks
by Andrea Bonetti, Marcelino Martínez-Sober, Julio C. Torres, Jose M. Vega, Sebastien Pellerin and Joan Vila-Francés
Appl. Sci. 2023, 13(10), 6038; https://doi.org/10.3390/app13106038 - 14 May 2023
Cited by 11 | Viewed by 2716
Abstract
The way we communicate has been revolutionised by the widespread use of social networks. Any kind of online message can reach anyone in the world almost instantly. The speed with which information spreads is undoubtedly the strength of social networks, but at the same time, any user of these platforms can see how toxic messages spread in parallel with likes, comments and ratings about any person or entity. In such cases, the victim feels even more helpless and defenceless as a result of the rapid spread. For this reason, we have implemented an automatic detector of toxic messages on social media. This allows us to stop toxicity in its tracks and protect victims. The aim of the study is to demonstrate how traditional Machine Learning methods of Natural Language Processing (NLP) work on equal terms with Deep Learning methods represented by a Transformer architecture and characterised by a higher computational cost. In particular, the paper describes the results obtained by testing different supervised Machine Learning classifiers (Logistic Regression, Random Forest and Support Vector Machine) combined with two NLP topic-modelling techniques (Latent Semantic Analysis and Latent Dirichlet Allocation). A pre-trained Transformer named BERTweet was also tested. All models performed well in this task, so much so that values close to or above 90% were achieved in terms of the F1 score evaluation metric. The best result, achieved by the Transformer BERTweet, 91.40%, was therefore not impressive in this context, as the performance gains are too small compared to the computational overhead. Full article
(This article belongs to the Special Issue Machine/Deep Learning: Applications, Technologies and Algorithms)
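One of the classical pipelines compared above — TF-IDF features projected into a Latent Semantic Analysis topic space and classified with Logistic Regression — can be sketched as follows. The toy comments, labels, and component counts are illustrative; the paper's corpus and tuned settings are not reproduced.

from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.decomposition import TruncatedSVD
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

comments = ["you are awesome", "I hate you, idiot", "great point, thanks", "shut up loser"]
labels = [0, 1, 0, 1]                                 # 1 = toxic, 0 = non-toxic

model = make_pipeline(
    TfidfVectorizer(ngram_range=(1, 2)),
    TruncatedSVD(n_components=2),                     # LSA topic space (tiny for the toy data)
    LogisticRegression(),
)
model.fit(comments, labels)
print(model.predict(["thanks, that was helpful", "you idiot"]))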

16 pages, 784 KiB  
Article
A Contrastive Learning Framework for Detecting Anomalous Behavior in Commodity Trading Platforms
by Yihao Li and Ping Yi
Appl. Sci. 2023, 13(9), 5709; https://doi.org/10.3390/app13095709 - 5 May 2023
Cited by 1 | Viewed by 1505
Abstract
For bulk commodity, stock, and e-commerce platforms, it is necessary to detect anomalous behavior for the security of users and platforms. Anomaly-detection methods currently used on these platforms train a model for each user since different users have different habits. However, the model cannot be trained adequately due to insufficient individual user behavior data. In this study, to utilize information between users and avoid underfitting, we propose a contrastive learning framework to train a complete global model (GM) for anomaly detection in a trading platform. By confusing the data between different users to generate negative samples, the model can learn the differences between users by contrastive learning. To reduce the need for individual user behavior data, this framework uses a GM instead of a model for each user to learn similarities between users. Experiments on four datasets show that models trained using our framework achieve better area-under-the-curve (AUC) scores than do the original models, proving that contrastive learning and GM are useful for anomaly detection in trading platforms. Full article
(This article belongs to the Special Issue Machine/Deep Learning: Applications, Technologies and Algorithms)

13 pages, 2320 KiB  
Article
Exploring Promising Biomarkers for Alzheimer’s Disease through the Computational Analysis of Peripheral Blood Single-Cell RNA Sequencing Data
by Marios G. Krokidis, Aristidis G. Vrahatis, Konstantinos Lazaros and Panagiotis Vlamos
Appl. Sci. 2023, 13(9), 5553; https://doi.org/10.3390/app13095553 - 29 Apr 2023
Cited by 4 | Viewed by 2443
Abstract
Alzheimer’s disease (AD) represents one of the most important healthcare challenges of the current century, characterized as an expanding, “silent pandemic”. Recent studies suggest that the peripheral immune system may participate in AD development; however, the molecular components of these cells in AD remain poorly understood. Although single-cell RNA sequencing (scRNA-seq) offers a sufficient exploration of various biological processes at the cellular level, the number of existing works is limited, and no comprehensive machine learning (ML) analysis has yet been conducted to identify effective biomarkers in AD. Herein, we introduced a computational workflow using both deep learning and ML processes examining scRNA-seq data obtained from the peripheral blood of both Alzheimer’s disease patients with an amyloid-positive status and healthy controls with an amyloid-negative status, totaling 36,849 cells. The output of our pipeline contained transcripts ranked by their level of significance, which could serve as reliable genetic signatures of AD pathophysiology. The comprehensive functional analysis of the most dominant genes in terms of biological relevance to AD demonstrates that the proposed methodology has great potential for discovering blood-based fingerprints of the disease. Furthermore, the present approach paves the way for the application of ML techniques to scRNA-seq data from complex disorders, providing new challenges to identify key biological processes from a molecular perspective. Full article
(This article belongs to the Special Issue Machine/Deep Learning: Applications, Technologies and Algorithms)

22 pages, 3802 KiB  
Article
Training Spiking Neural Networks with Metaheuristic Algorithms
by Amirhossein Javanshir, Thanh Thi Nguyen, M. A. Parvez Mahmud and Abbas Z. Kouzani
Appl. Sci. 2023, 13(8), 4809; https://doi.org/10.3390/app13084809 - 11 Apr 2023
Cited by 4 | Viewed by 2821
Abstract
Taking inspiration from the brain, spiking neural networks (SNNs) have been proposed to understand and diminish the gap between machine learning and neuromorphic computing. Supervised learning is the most commonly used learning algorithm in traditional ANNs. However, directly training SNNs with backpropagation-based supervised learning methods is challenging due to the discontinuous and non-differentiable nature of the spiking neuron. To overcome these problems, this paper proposes a novel metaheuristic-based supervised learning method for SNNs by adapting the temporal error function. We investigated seven well-known metaheuristic algorithms called Harmony Search (HS), Cuckoo Search (CS), Differential Evolution (DE), Particle Swarm Optimization (PSO), Genetic Algorithm (GA), Artificial Bee Colony (ABC), and Grammatical Evolution (GE) as search methods for carrying out network training. Relative target firing times were used instead of fixed and predetermined ones, making the computation of the error function simpler. The performance of our proposed approach was evaluated using five benchmark databases collected in the UCI Machine Learning Repository. The experimental results showed that the proposed algorithm had a competitive advantage in solving the four classification benchmark datasets compared to the other experimental algorithms, with accuracy levels of 0.9858, 0.9768, 0.7752, and 0.6871 for iris, cancer, diabetes, and liver datasets, respectively. Among the seven metaheuristic algorithms, CS reported the best performance. Full article
(This article belongs to the Special Issue Machine/Deep Learning: Applications, Technologies and Algorithms)

14 pages, 5476 KiB  
Article
Deep Learning Architectures for Diagnosis of Diabetic Retinopathy
by Alberto Solano, Kevin N. Dietrich, Marcelino Martínez-Sober, Regino Barranquero-Cardeñosa, Jorge Vila-Tomás and Pablo Hernández-Cámara
Appl. Sci. 2023, 13(7), 4445; https://doi.org/10.3390/app13074445 - 31 Mar 2023
Cited by 8 | Viewed by 2517
Abstract
For many years, convolutional neural networks dominated the field of computer vision, not least in the medical field, where problems such as image segmentation were addressed by such networks as the U-Net. The arrival of self-attention-based networks to the field of computer vision through ViTs seems to have changed the trend of using standard convolutions. Throughout this work, we apply different architectures such as U-Net, ViTs and ConvMixer, to compare their performance on a medical semantic segmentation problem. All the models have been trained from scratch on the DRIVE dataset and evaluated on their private counterparts to assess which of the models performed better in the segmentation problem. Our major contribution is showing that the best-performing model (ConvMixer) is the one that shares the approach from the ViT (processing images as patches) while maintaining the foundational blocks (convolutions) from the U-Net. This mixture does not only produce better results (DICE=0.83) than both ViTs (0.80/0.077 for UNETR/SWIN-Unet) and the U-Net (0.82) on their own but reduces considerably the number of parameters (2.97M against 104M/27M and 31M, respectively), showing that there is no need to systematically use large models for solving image problems where smaller architectures with the optimal pieces can get better results. Full article
(This article belongs to the Special Issue Machine/Deep Learning: Applications, Technologies and Algorithms)
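For reference, a hedged sketch of a ConvMixer-style stack — patch embedding followed by residual depthwise convolutions (spatial mixing) and pointwise convolutions (channel mixing) — is given below as a classifier. The dimensions are illustrative, and the paper adapts this idea to retinal-vessel segmentation on DRIVE rather than to classification.

from tensorflow.keras import layers, Model

def conv_mixer(input_shape=(64, 64, 3), dim=64, depth=4, patch_size=4, kernel_size=9, n_classes=2):
    inputs = layers.Input(shape=input_shape)
    x = layers.Conv2D(dim, patch_size, strides=patch_size)(inputs)   # patch embedding
    x = layers.Activation("gelu")(x)
    x = layers.BatchNormalization()(x)
    for _ in range(depth):
        residual = x
        x = layers.DepthwiseConv2D(kernel_size, padding="same")(x)   # spatial mixing
        x = layers.Activation("gelu")(x)
        x = layers.BatchNormalization()(x)
        x = layers.Add()([x, residual])
        x = layers.Conv2D(dim, 1)(x)                                  # channel (pointwise) mixing
        x = layers.Activation("gelu")(x)
        x = layers.BatchNormalization()(x)
    x = layers.GlobalAveragePooling2D()(x)
    return Model(inputs, layers.Dense(n_classes, activation="softmax")(x))

model = conv_mixer()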

27 pages, 1427 KiB  
Article
An Empirical Analysis of State-of-Art Classification Models in an IT Incident Severity Prediction Framework
by Salman Ahmed, Muskaan Singh, Brendan Doherty, Effirul Ramlan, Kathryn Harkin, Magda Bucholc and Damien Coyle
Appl. Sci. 2023, 13(6), 3843; https://doi.org/10.3390/app13063843 - 17 Mar 2023
Cited by 7 | Viewed by 4161
Abstract
Large-scale companies across various sectors maintain substantial IT infrastructure to support their operations and provide quality services for their customers and employees. These IT operations are managed by teams who deal directly with incident reports (i.e., those generated automatically through autonomous systems or human operators). (1) Background: Early identification of major incidents can provide a significant advantage for reducing the disruption to normal business operations, especially for preventing catastrophic disruptions, such as a complete system shutdown. (2) Methods: This study conducted an empirical analysis of eleven (11) state-of-the-art models to predict the severity of these incidents using an industry-led use-case composed of 500,000 records collected over one year. (3) Results: The datasets were generated from three stakeholders (i.e., agency, customer, and employee). Separately, the bidirectional encoder representations from transformers (BERT), the robustly optimized BERT pre-training approach (RoBERTa), the enhanced representation through knowledge integration (ERNIE 2.0), and the extreme gradient boosting (XGBoost) methods performed the best for the agency records (93% AUC), while the convolutional neural network (CNN) was the best model for the rest (employee records at 95% AUC and customer records at 74% AUC, respectively). The average prediction horizon was approximately 150 min, which was significant for real-time deployment. (4) Conclusions: The study provided a comprehensive analysis that supported the deployment of artificial intelligence for IT operations (AIOps), specifically for incident management within large-scale organizations. Full article
(This article belongs to the Special Issue Machine/Deep Learning: Applications, Technologies and Algorithms)

18 pages, 6596 KiB  
Article
Fingerprint-Based Localization Approach for WSN Using Machine Learning Models
by Tareq Alhmiedat
Appl. Sci. 2023, 13(5), 3037; https://doi.org/10.3390/app13053037 - 27 Feb 2023
Cited by 19 | Viewed by 2141
Abstract
The area of localization in wireless sensor networks (WSNs) has received considerable attention recently, driven by the need to develop an accurate localization system with the minimum possible cost and energy consumption. On the other hand, machine learning (ML) algorithms have been employed widely in several WSN-based applications (data gathering, clustering, energy harvesting, and node localization) and have shown an enhancement in the obtained results. In this paper, an efficient WSN-based fingerprinting localization system for indoor environments, based on a low-cost sensor architecture, is presented by establishing an indoor fingerprinting dataset and adopting four tailored ML models. The proposed system was validated by real experiments conducted in complex indoor environments with several obstacles and walls, and it achieves an average localization accuracy of 1.4 m. In addition, through real experiments, we analyze and discuss the impact of reference point density on localization accuracy. Full article
(This article belongs to the Special Issue Machine/Deep Learning: Applications, Technologies and Algorithms)
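The fingerprinting idea above — mapping received signal strength vectors collected at known reference points to coordinates — can be illustrated with a short regression sketch. The number of anchor nodes, the synthetic RSSI values, and the choice of a k-nearest-neighbours regressor (a common fingerprinting baseline, not necessarily one of the four tailored models in the paper) are assumptions.

import numpy as np
from sklearn.neighbors import KNeighborsRegressor

rng = np.random.default_rng(2)
rssi = rng.uniform(-90, -30, size=(200, 4))           # RSSI fingerprints from 4 anchor nodes (dBm)
positions = rng.uniform(0, 20, size=(200, 2))         # (x, y) of reference points in metres

model = KNeighborsRegressor(n_neighbors=3).fit(rssi, positions)
estimate = model.predict(rssi[:1])                     # estimated coordinates for one new reading
error_m = np.linalg.norm(estimate - positions[:1])     # localization error in metres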

27 pages, 6119 KiB  
Article
Optimizing the Parameters of Long Short-Term Memory Networks Using the Bees Algorithm
by Nawaf Mohammad H. Alamri, Michael Packianather and Samuel Bigot
Appl. Sci. 2023, 13(4), 2536; https://doi.org/10.3390/app13042536 - 16 Feb 2023
Cited by 7 | Viewed by 2837
Abstract
Improving the performance of Deep Learning (DL) algorithms is a challenging problem. However, DL is applied to different types of Deep Neural Networks, and Long Short-Term Memory (LSTM) is one of them that deals with time series or sequential data. This paper attempts to overcome this problem by optimizing LSTM parameters using the Bees Algorithm (BA), which is a nature-inspired algorithm that mimics the foraging behavior of honey bees. In particular, it was used to optimize the adjustment factors of the learning rate in the forget, input, and output gates, in addition to cell candidate, in both forward and backward sides. Furthermore, the BA was used to optimize the learning rate factor in the fully connected layer. In this study, artificial porosity images were used for testing the algorithms; since the input data were images, a Convolutional Neural Network (CNN) was added in order to extract the features in the images to feed into the LSTM for predicting the percentage of porosity in the sequential layers of artificial porosity images that mimic real CT scan images of products manufactured by the Selective Laser Melting (SLM) process. Applying a Convolutional Neural Network Long Short-Term Memory (CNN-LSTM) yielded a porosity prediction accuracy of 93.17%. Although using Bayesian Optimization (BO) to optimize the LSTM parameters mentioned previously did not improve the performance of the LSTM, as the prediction accuracy was 93%, adding the BA to optimize the same LSTM parameters did improve its performance in predicting the porosity, with an accuracy of 95.17% where a hybrid Bees Algorithm Convolutional Neural Network Long Short-Term Memory (BA-CNN-LSTM) was used. Furthermore, the hybrid BA-CNN-LSTM algorithm was capable of dealing with classification problems as well. This was shown by applying it to Electrocardiogram (ECG) benchmark images, which improved the test set classification accuracy, which was 92.50% for the CNN-LSTM algorithm and 95% for both the BO-CNN-LSTM and BA-CNN-LSTM algorithms. In addition, the turbofan engine degradation simulation numerical dataset was used to predict the Remaining Useful Life (RUL) of the engines using the LSTM network. A CNN was not needed in this case, as there was no feature extraction for the images. However, adding the BA to optimize the LSTM parameters improved the prediction accuracy in the testing set for the LSTM and BO-LSTM, which increased from 74% to 77% for the hybrid BA-LSTM algorithm. Full article
(This article belongs to the Special Issue Machine/Deep Learning: Applications, Technologies and Algorithms)

14 pages, 2424 KiB  
Article
Semi-Global Stereo Matching Algorithm Based on Multi-Scale Information Fusion
by Changgen Deng, Deyuan Liu, Haodong Zhang, Jinrong Li and Baojun Shi
Appl. Sci. 2023, 13(2), 1027; https://doi.org/10.3390/app13021027 - 12 Jan 2023
Cited by 11 | Viewed by 3929
Abstract
Semi-global matching (SGM) has been widely used in binocular vision. In spite of its good efficiency, SGM still has difficulties in dealing with low-texture regions. In this paper, an SGM algorithm based on multi-scale information fusion (MSIF), named SGM-MSIF, is proposed by combining multi-path cost aggregation and cross-scale cost aggregation (CSCA). Firstly, the stereo pairs at different scales are obtained by Gaussian pyramid down-sampling. The initial matching cost volumes at different scales are computed by combining census transform and color information. Then, the multi-path cost aggregation in SGM is introduced into the cost aggregation at each scale and the aggregated cost volumes are fused by CSCA. Thirdly, the disparity map is optimized by internal left-right consistency check and median filter. Finally, experiments are conducted on Middlebury datasets to evaluate the proposed algorithm. Experimental results show that the average error matching rate (EMR) of the proposed SGM-MSIF algorithm reduced by 1.96% compared with SGM. Compared with classical cross-scale stereo matching algorithm, the average EMR of SGM-MSIF algorithm reduced by 0.92%, while the processing efficiency increased by 58.7%. In terms of overall performance, the proposed algorithm outperforms the classic SGM and CSCA algorithms. It can achieve high matching accuracy and high processing efficiency for binocular vision applications, especially for those with low-texture regions. Full article
(This article belongs to the Special Issue Machine/Deep Learning: Applications, Technologies and Algorithms)
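As background for the initial matching cost mentioned above, the following hedged sketch computes a census transform and a Hamming-distance cost between a left and a right image at a given disparity. The window radius and the brute-force loops are illustrative; the paper additionally fuses colour information and aggregates costs across scales and paths, which is not shown.

import numpy as np

def census(img, radius=2):
    """Boolean census descriptor (neighbour < centre) for each pixel of a grayscale image."""
    h, w = img.shape
    size = 2 * radius + 1
    pad = np.pad(img, radius, mode="edge")
    desc = np.zeros((h, w, size * size - 1), dtype=bool)
    k = 0
    for dy in range(-radius, radius + 1):
        for dx in range(-radius, radius + 1):
            if dy == 0 and dx == 0:
                continue
            shifted = pad[radius + dy: radius + dy + h, radius + dx: radius + dx + w]
            desc[:, :, k] = shifted < img
            k += 1
    return desc

def census_cost(left, right, disparity):
    """Hamming distance between left descriptors and right descriptors shifted by the disparity."""
    dl, dr = census(left), census(right)
    dr_shift = np.roll(dr, disparity, axis=1)          # crude shift; border columns are invalid
    return np.count_nonzero(dl != dr_shift, axis=2)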

34 pages, 8281 KiB  
Article
Hybrid Decision Models of Leasing Business for Thailand Using Neural Network
by Nachapong Jiamahasap and Sakgasem Ramingwong
Appl. Sci. 2022, 12(22), 11730; https://doi.org/10.3390/app122211730 - 18 Nov 2022
Viewed by 2197
Abstract
The research aims to improve the effectiveness of financial lending business decision-making by developing dynamic models involved in the money-lending business. The objectives of this study are to identify preference factors that affect a customer’s decision to choose a particular financial institution, to determine the important approval factors that providers need to take into consideration while approving loans, and to identify any relationships between and among the factors. The data are taken from a case study of a lending company in northern Thailand. The first model is the preference model, comprising 68 input factors, which are used to determine the reasons why a customer chooses a service provider, which can be either a commercial or a non-commercial bank. The model is developed using a neural network (NN) with historical data of 2973 records and comprises four sub-models. The model is improved by varying the NN structure and the number of epochs. The best model provides an accuracy rate of 100%. The second model is the approval model, comprising 55 input factors for predicting the result of loan requests, which can determine if the loan should be approved with the full amount of the request, approved with a lesser amount, or another outcome. The model is developed using a neural network with historical data of 787 records. This model is composed of three sub-models, the best of which gives an accuracy rate of 55%. The third model is the hybrid decision model, linking preference factors and approval factors with external factors. The model is constructed using system dynamic modeling with the preference factors, approval factors, and financial institutions, and it can simulate the result if the input is changed. Full article
(This article belongs to the Special Issue Machine/Deep Learning: Applications, Technologies and Algorithms)

19 pages, 6457 KiB  
Article
Research on PPP Enterprise Credit Dynamic Prediction Model
by Likun Zhao, Shaotang Yang, Shouqing Wang and Jianxiong Shen
Appl. Sci. 2022, 12(20), 10362; https://doi.org/10.3390/app122010362 - 14 Oct 2022
Cited by 5 | Viewed by 2364
Abstract
The debt default risk of local government financing vehicles (LGFVs) has become a potential trigger for systemic financial risks. How to effectively prevent hidden debt risk has always been a hot issue in public-private partnership (PPP) financing management research. In recent years, machine learning has become more and more popular in the study of enterprise credit evaluation. However, most scholars only focus on the output of the model, and do not explain in detail the extent to which variables affect the model and the decision-making process of the model. In this paper, we aim to apply a better credit rating method to the key factors and analysis of LGFV’s default risk, and analyze the decision-making process of the model in a visual form. Firstly, this paper analyzes the financial data of LGFVs. Secondly, the XGBoost-logistic combination algorithm is introduced to integrate the typical characteristics of PPP projects and construct the credit evaluation model of LGFVs. Finally, we verify the feasibility of the model by K-fold cross validation and performance evaluation. The results show that: (1) net worth, total assets, operating income, and return on equity are the most critical factors affecting the credit risk of LGFVs, asset-liability ratio and tax revenue are also potentially important factors; (2) the XGBoost-logistic model can identify the key factors affecting the credit risk of LGFVs, and has better classification performance and predictive ability. (3) The influence of each characteristic variable on model decision can be quantified by the SHAP value, and the classification decision visualization of the model improves the interpretability of the model. Full article
(This article belongs to the Special Issue Machine/Deep Learning: Applications, Technologies and Algorithms)
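The interpretability step described in result (3) can be sketched with the xgboost and shap libraries: a gradient-boosted classifier is fitted on financial indicators and SHAP values quantify each variable's contribution to individual decisions. The feature names, the synthetic data, and the omission of the logistic component of the combined XGBoost-logistic model are assumptions made for this illustration.

import numpy as np
import pandas as pd
import xgboost as xgb
import shap

rng = np.random.default_rng(3)
X = pd.DataFrame(rng.normal(size=(300, 4)),
                 columns=["net_worth", "total_assets", "operating_income", "return_on_equity"])
y = (X["net_worth"] + X["return_on_equity"] > 0).astype(int)   # stand-in default label

model = xgb.XGBClassifier(n_estimators=50, max_depth=3).fit(X, y)
explainer = shap.TreeExplainer(model)
shap_values = explainer.shap_values(X)          # per-feature contribution for each sample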

19 pages, 364 KiB  
Article
A New Sentiment-Enhanced Word Embedding Method for Sentiment Analysis
by Qizhi Li, Xianyong Li, Yajun Du, Yongquan Fan and Xiaoliang Chen
Appl. Sci. 2022, 12(20), 10236; https://doi.org/10.3390/app122010236 - 11 Oct 2022
Cited by 5 | Viewed by 3035
Abstract
Since some sentiment words have similar syntactic and semantic features in the corpus, existing pre-trained word embeddings always perform poorly in sentiment analysis tasks. This paper proposes a new sentiment-enhanced word embedding (S-EWE) method to improve the effectiveness of sentence-level sentiment classification. This sentiment enhancement method takes full advantage of the mapping relationship between word embeddings and their corresponding sentiment orientations. This method first converts words to word embeddings and assigns sentiment mapping vectors to all word embeddings. Then, word embeddings and their corresponding sentiment mapping vectors are fused to S-EWEs. After reducing the dimensions of S-EWEs through a fully connected layer, the predicted sentiment orientations are obtained. The S-EWE method adopts the cross-entropy function to calculate the loss between predicted and true sentiment orientations, and backpropagates the loss to train the sentiment mapping vectors. Experiments show that the accuracy and macro-F1 values of six sentiment classification models using Word2Vec and GloVe with the S-EWEs are on average 1.07% and 1.58% higher than those without the S-EWEs on the SemEval-2013 dataset, and on average 1.23% and 1.26% higher than those without the S-EWEs on the SST-2 dataset. In all baseline models with S-EWEs, the convergence time of the attention-based bidirectional CNN-RNN deep model (ABCDM) with S-EWEs was significantly decreased by 51.21% of ABCDM on the SemEval-2013 dataset. The convergence time of CNN-LSTM with S-EWEs was vastly reduced by 41.34% of CNN-LSTM on the SST-2 dataset. In addition, the S-EWE method is not valid for contextualized word embedding models. The main reasons are that the S-EWE method only enhances the embedding layer of the models and has no effect on the models themselves. Full article
(This article belongs to the Special Issue Machine/Deep Learning: Applications, Technologies and Algorithms)

Review


13 pages, 1366 KiB  
Review
Advanced CT Imaging, Radiomics, and Artificial Intelligence to Evaluate Immune Checkpoint Inhibitors’ Effects on Metastatic Renal Cell Carcinoma
by Federico Greco, Bruno Beomonte Zobel, Gianfranco Di Gennaro and Carlo Augusto Mallio
Appl. Sci. 2023, 13(6), 3779; https://doi.org/10.3390/app13063779 - 16 Mar 2023
Cited by 1 | Viewed by 1796
Abstract
Advances in the knowledge of renal cell carcinoma (RCC)’s oncogenesis have led to the development of new therapeutic approaches, such as immune checkpoint inhibitors (ICIs), which have improved the clinical outcomes of metastatic RCC (mRCC) patients. Our literature search led to a series of studies that were divided into four subcategories: RECIST criteria, radiomics and artificial intelligence, atypical response patterns, and body composition. These studies provide novel and promising data aimed at improving patient management and clinical outcomes, further strengthening the concept of precision medicine. Radiomics and artificial intelligence allow us to obtain—in a non-invasive fashion—a multitude of data that cannot be detected with the naked eye, offering potential advantages that might help to predict the response to treatments and possibly improve patients’ outcomes through a personalized therapeutic approach. The purpose of this literature review is to describe the available evidence on the role of computed tomography (CT) in evaluating and predicting ICIs’ effects on mRCC patients by applying radiomics and artificial intelligence. Full article
(This article belongs to the Special Issue Machine/Deep Learning: Applications, Technologies and Algorithms)

17 pages, 1041 KiB  
Review
Recent Advances in Artificial Intelligence-Assisted Ultrasound Scanning
by Rebeca Tenajas, David Miraut, Carlos I. Illana, Rodrigo Alonso-Gonzalez, Fernando Arias-Valcayo and Joaquin L. Herraiz
Appl. Sci. 2023, 13(6), 3693; https://doi.org/10.3390/app13063693 - 14 Mar 2023
Cited by 20 | Viewed by 15378
Abstract
Ultrasound (US) is a flexible imaging modality used globally as a first-line medical exam procedure in many different clinical cases. It benefits from the continued evolution of ultrasonic technologies and a well-established US-based digital health system. Nevertheless, its diagnostic performance still presents challenges due to the inherent characteristics of US imaging, such as manual operation and significant operator dependence. Artificial intelligence (AI) has proven to recognize complicated scan patterns and provide quantitative assessments for imaging data. Therefore, AI technology has the potential to help physicians get more accurate and repeatable outcomes in the US. In this article, we review the recent advances in AI-assisted US scanning. We have identified the main areas where AI is being used to facilitate US scanning, such as standard plane recognition and organ identification, the extraction of standard clinical planes from 3D US volumes, and the scanning guidance of US acquisitions performed by humans or robots. In general, the lack of standardization and reference datasets in this field makes it difficult to perform comparative studies among the different proposed methods. More open-access repositories of large US datasets with detailed information about the acquisition are needed to facilitate the development of this very active research field, which is expected to have a very positive impact on US imaging. Full article
(This article belongs to the Special Issue Machine/Deep Learning: Applications, Technologies and Algorithms)
