Search Results (20)

Search Parameters:
Keywords = video extrapolation

41 pages, 19770 KB  
Article
Vision-Based Dual-Mode Collision Risk-Warning for Aircraft Apron Monitoring
by Emre Can Bingol, Hamed Al-Raweshidy and Konstantinos Banitsas
Drones 2026, 10(3), 173; https://doi.org/10.3390/drones10030173 - 2 Mar 2026
Viewed by 518
Abstract
Ground incidents on airport aprons can cause substantial operational disruption and economic loss, while conventional surveillance (e.g., Surface Movement Radar (SMR), Closed-Circuit Television (CCTV)) often lacks the resolution and proactive decision support required for close-proximity operations. This study proposes a UAV-deployable, camera-agnostic Computer Vision (CV) framework for collision-risk warning from elevated viewpoints. An optimised YOLOv8-Seg backbone performs multi-class aircraft segmentation (airplane, wing, nose, tail, and fuselage) and is integrated with four MOT algorithms under identical evaluation settings. For quantitative tracker benchmarking, DeepSORT provides the strongest overall performance on the airplane-only MOTChallenge-format ground truth (MOTA 92.77%, recall 93.27%). To mitigate the scarcity of annotated apron-incident data, a labelled 997-frame MOT dataset is created via an MSFS simulation-based reenactment inspired by the 2018 Asiana–Turkish Airlines wing-to-tail event at Istanbul Ataturk Airport. The framework further introduces a dual-module warning mechanism that can operate independently: (i) a reactive module using image-plane proximity derived from segmentation masks, and (ii) a proactive module that predicts short-horizon conflicts via trajectory extrapolation and IoU-based future overlap analysis. The approach is evaluated on multiple simulated incident scenarios and assessed on a real apron video from Hong Kong International Airport; additionally, laboratory-scale UAV experiments using diecast aircraft models provide end-to-end feasibility evidence on unmanned-platform imagery. Overall, the results indicate timely warnings and practical feasibility for low-overhead UAV-enabled apron monitoring. Full article
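The proactive module's idea of extrapolating each aircraft's track a short horizon ahead and testing the predicted bounding boxes for overlap can be sketched as follows. This is a minimal constant-velocity illustration; the function names and the IoU threshold are placeholders, not values from the paper.

```python
def extrapolate_box(box, velocity, horizon):
    """Shift an axis-aligned box (x1, y1, x2, y2) by velocity * horizon."""
    dx, dy = velocity[0] * horizon, velocity[1] * horizon
    return (box[0] + dx, box[1] + dy, box[2] + dx, box[3] + dy)

def iou(a, b):
    """Intersection-over-union of two boxes in (x1, y1, x2, y2) form."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter) if inter else 0.0

def future_conflict(box_a, vel_a, box_b, vel_b, horizon, iou_thresh=0.05):
    """Flag a short-horizon conflict if the extrapolated boxes overlap."""
    fa = extrapolate_box(box_a, vel_a, horizon)
    fb = extrapolate_box(box_b, vel_b, horizon)
    return iou(fa, fb) >= iou_thresh
```

In practice the velocities would come from the tracker (e.g., DeepSORT state estimates) rather than being supplied by hand.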

21 pages, 1192 KB  
Article
Video Stabilization Algorithm Based on View Boundary Synthesis
by Wenchao Shan, Hejing Zhao, Xin Li, Qian Huang, Chuanxu Jiang, Yiming Wang, Ziqi Chen and Yao Tong
Symmetry 2025, 17(8), 1351; https://doi.org/10.3390/sym17081351 - 19 Aug 2025
Cited by 1 | Viewed by 1761
Abstract
Video stabilization is a critical technology for enhancing visual content quality in dynamic shooting scenarios, especially with the widespread adoption of mobile photography devices and Unmanned Aerial Vehicle (UAV) platforms. While traditional digital stabilization algorithms can improve frame stability by modeling global motion trajectories, they often suffer from excessive cropping or boundary distortion, leading to a significant loss of valid image regions. To address this persistent challenge, we propose the View Out-boundary Synthesis Algorithm (VOSA), a symmetry-aware spatio-temporal consistency framework. By leveraging rotational and translational symmetry principles in motion dynamics, VOSA realizes optical flow field extrapolation through an encoder–decoder architecture and an iterative boundary extension strategy. Experimental results demonstrate that VOSA enhances conventional stabilization by increasing content retention by 6.3% while maintaining a 0.943 distortion score, outperforming mainstream methods in dynamic environments. The symmetry-informed design resolves stability–content conflicts, establishing a new paradigm for full-frame stabilization. Full article
(This article belongs to the Special Issue Symmetry/Asymmetry in Image Processing and Computer Vision)
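VOSA's iterative boundary extension operates on the optical flow field. As a crude stand-in for its learned encoder–decoder extrapolation, the sketch below simply grows a dense flow field outward one ring at a time by edge replication; the replication rule is an assumption for illustration, not the paper's method.

```python
import numpy as np

def extend_flow(flow, margin):
    """Extrapolate a dense flow field of shape (H, W, 2) outward by
    `margin` pixels, one ring per iteration, by replicating edge values.
    A learned model (as in VOSA) would synthesize these rings instead."""
    out = flow
    for _ in range(margin):
        out = np.pad(out, ((1, 1), (1, 1), (0, 0)), mode="edge")
    return out
```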

23 pages, 1894 KB  
Article
ViViT-Prob: A Radar Echo Extrapolation Model Based on Video Vision Transformer and Spatiotemporal Sparse Attention
by Yunan Qiu, Bingjian Lu, Wenrui Xiong, Zhenyu Lu, Le Sun and Yingjie Cui
Remote Sens. 2025, 17(12), 1966; https://doi.org/10.3390/rs17121966 - 6 Jun 2025
Cited by 1 | Viewed by 1724
Abstract
Weather radar, as a crucial component of remote sensing data, plays a vital role in convective weather forecasting through radar echo extrapolation techniques. To address the limitations of existing deep learning methods in radar echo extrapolation, this paper proposes a radar echo extrapolation model based on video vision transformer and spatiotemporal sparse attention (ViViT-Prob). The model takes historical sequences as input and initially maps them into a fixed-dimensional vector space through 3D convolutional patch encoding. Subsequently, a multi-head spatiotemporal fusion module with sparse attention encodes these vectors, effectively capturing spatiotemporal relationships between different regions in the sequences. The sparse constraint enables better utilization of data structural information, enhanced focus on critical regions, and reduced computational complexity. Finally, a parallel output decoder generates all time step predictions simultaneously and maps them back to the prediction space through a deconvolution module to reconstruct high-resolution images. Our experimental results on the Moving MNIST and a real radar echo dataset demonstrate that the proposed model achieves superior performance in spatiotemporal sequence prediction and improves prediction accuracy while maintaining structural consistency in radar echo extrapolation tasks, providing an effective solution for short-term precipitation forecasting. Full article
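The 3D convolutional patch encoding step amounts to cutting the input sequence into spatiotemporal tubelets and projecting each one into a fixed-dimensional vector space. A minimal sketch with a fixed (untrained) projection matrix, assuming non-overlapping patches; the function name and shapes are illustrative, not from the paper:

```python
import numpy as np

def tubelet_embed(video, pt, ph, pw, proj):
    """Encode a video of shape (T, H, W) into tokens by splitting it into
    pt x ph x pw spatiotemporal patches (the role a 3D convolutional patch
    encoder plays) and projecting each flattened patch with a fixed matrix
    `proj` of shape (pt*ph*pw, d_model)."""
    T, H, W = video.shape
    assert T % pt == 0 and H % ph == 0 and W % pw == 0
    v = video.reshape(T // pt, pt, H // ph, ph, W // pw, pw)
    # Bring the three patch-index axes to the front, then flatten each patch.
    v = v.transpose(0, 2, 4, 1, 3, 5).reshape(-1, pt * ph * pw)
    return v @ proj  # (num_tokens, d_model)
```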

23 pages, 2382 KB  
Systematic Review
Video Head Impulse Test in Children—A Systematic Review of Literature
by Soumit Dasgupta, Aditya Lal Mukherjee, Rosa Crunkhorn, Safaa Dawabah, Nesibe Gul Aslier, Sudhira Ratnayake and Leonardo Manzari
J. Clin. Med. 2025, 14(2), 369; https://doi.org/10.3390/jcm14020369 - 9 Jan 2025
Cited by 3 | Viewed by 4005
Abstract
Background and Objectives: The video head impulse test is a landmark in vestibular diagnostic methods to assess the high-frequency semicircular canal system. This test is well established in the adult population, with immense research since its discovery. The usefulness and feasibility of the test in children are not well defined, as research has been limited. This systematic review investigated and analysed the existing evidence regarding the test. The objectives were to derive meaningful inferences in terms of the feasibility, implementation, and normative vestibulo-ocular reflex (VOR) gain in normal children and in children with vestibular hypofunction. Methods: Research repositories were searched with keywords, along with inclusion and exclusion criteria, to select publications that investigated the vHIT both in a normative population of children and in pathological cohorts. The average normal VOR gain was then calculated in all semicircular canals for both the normal and the vestibular hypofunction groups. For the case–control studies, a meta-analysis was performed to assess the heterogeneity and pooled effect sizes. Results and Discussion: The review analysed 26 articles, including six case–control studies, fulfilling the study selection criteria, out of more than 6000 articles that have been published on the vHIT. The described technique suggested 10–15 head impulses at 100–200°/s head velocity and 10–20° displacement, fixating on a wall target 1 to 1.5 m away. The average VOR gain in the lateral semicircular canals combining all studies was 0.96 ± 0.07; in the anterior semicircular canals, it was 0.89 ± 0.13, and in the posterior semicircular canals, it was 0.90 ± 0.12. The normal VOR gains measured with individual equipment (ICS Impulse, EyeSeeCam and Synapsys) in the lateral semicircular canals were largely similar (p > 0.05 when ICS Impulse and EyeSeeCam were compared). The pooled effect size in the control group was 1, and the heterogeneity was high. It was also observed that test implementation differs from that in adults and requires considerable practice with children, factoring in the issue of peripheral and central vestibular maturation. Special considerations were suggested in terms of pupillary calibration, goggle fitting and slippage, and play techniques. Conclusions: The vHIT as a diagnostic test is possible in children, with important caveats, practice, and knowledge regarding a developing vestibular system. It yields meaningful inferences about high-frequency semicircular canal function in children. Adult norms should not be extrapolated to children, as the VOR gain is different in children. Full article
(This article belongs to the Special Issue Recent Advances in Audio-Vestibular Medicine)
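For reference, VOR gain is conventionally the ratio of compensatory eye velocity to head velocity over an impulse. A toy calculation consistent with the normative values above might look like the sketch below; this uses a simplified area-ratio definition, whereas real vHIT devices apply device-specific, desaccaded algorithms.

```python
def vor_gain(eye_vel, head_vel):
    """Estimate VOR gain for one impulse as the ratio of the areas under
    the (rectified) eye- and head-velocity curves; values near 1.0
    indicate a compensatory reflex."""
    eye_area = sum(abs(v) for v in eye_vel)
    head_area = sum(abs(v) for v in head_vel)
    return eye_area / head_area

def mean_gain(impulses):
    """Average the gain over a series of impulses, e.g. the 10-15 impulses
    the reviewed protocols suggest."""
    gains = [vor_gain(e, h) for e, h in impulses]
    return sum(gains) / len(gains)
```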

18 pages, 8573 KB  
Article
ResTUnet: A Novel Neural Network Model for Nowcasting Using Radar Echo Sequences by Ground-Based Remote Sensing
by Lei Zhang, Ruoyang Zhang, Yu Wu, Yadong Wang, Yanfeng Zhang, Lijuan Zheng, Chongbin Xu, Xin Zuo and Zeyu Wang
Remote Sens. 2024, 16(24), 4792; https://doi.org/10.3390/rs16244792 - 23 Dec 2024
Cited by 1 | Viewed by 1961
Abstract
Radar echo extrapolation by ground-based remote sensing is essential for weather prediction and flight guidance. Existing radar echo extrapolation methods can hardly capture complex spatiotemporal features, resulting in low prediction accuracy, which severely restricts their use in extreme weather situations. A deep learning method was recently applied to extrapolating radar echoes; however, its accuracy declines too quickly over a short time. In this study, we introduce a solution: Residual Transformer and Unet (ResTUnet), a novel model that improves prediction accuracy and exhibits good stability with a slow rate of accuracy decline. The ResTUnet model addresses declining prediction accuracy by integrating a 1×1 convolution to reduce the number of neural network parameters. We constructed an observational dataset from Zhengzhou East Airport radar observations from July 2022 to August 2022 and performed 90 min experiments covering five aspects: extrapolation images, the Probability of Detection (POD) index, the Critical Success Index (CSI), the False Alarm Rate (FAR) index, and the Heidke Skill Score (HSS) index. The experimental results show that the ResTUnet model improved the CSI, HSS, and POD indices by 17.20%, 11.97%, and 11.35%, respectively, compared to current models, including Convolutional Long Short-Term Memory (convLSTM), the Convolutional Gated Recurrent Unit (convGRU), the Trajectory Gated Recurrent Unit (TrajGRU), and the improved recurrent network for video predictive learning, the Predictive Recurrent Neural Network++ (predRNN++). In addition, the mean squared error of the ResTUnet model remains stable at 15% between 0 and 60 min and starts to increase after 60–90 min, which is 12% better than the current models. This enhancement in prediction accuracy has practical applications in meteorological services and decision making. Full article
(This article belongs to the Special Issue Advance of Radar Meteorology and Hydrology II)
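The four categorical verification indices reported above all derive from one 2×2 contingency table of forecast versus observed echoes above a reflectivity threshold. A sketch of their standard definitions (standard meteorological formulas, not code from the paper):

```python
def forecast_scores(hits, misses, false_alarms, correct_negatives):
    """Categorical verification scores from a 2x2 contingency table of
    forecast vs. observed events."""
    pod = hits / (hits + misses)                 # Probability of Detection
    far = false_alarms / (hits + false_alarms)   # False Alarm Rate (ratio)
    csi = hits / (hits + misses + false_alarms)  # Critical Success Index
    n = hits + misses + false_alarms + correct_negatives
    # Heidke Skill Score: accuracy relative to the number of correct
    # forecasts expected by chance.
    expected = ((hits + misses) * (hits + false_alarms)
                + (correct_negatives + misses)
                * (correct_negatives + false_alarms)) / n
    hss = (hits + correct_negatives - expected) / (n - expected)
    return pod, far, csi, hss
```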

16 pages, 4926 KB  
Article
Architectonic Design Supported by Visual Environmental Simulation—A Comparison of Displays and Formats
by Juan Luis Higuera-Trujillo, Juan López-Tarruella Maldonado, Nuria Castilla and Carmen Llinares
Buildings 2024, 14(1), 216; https://doi.org/10.3390/buildings14010216 - 13 Jan 2024
Cited by 7 | Viewed by 2584
Abstract
Visual environmental simulations are fundamental in understanding the relationship between the built environment and psychological perception. The remarkable evolution of virtual immersion displays over recent years has provided a series of advantages to the architectural discipline, one of which is that non-specialists now have the potential to better understand architectural spaces. This work aimed to analyse the adequacy of the main displays and formats currently used in environmental simulations. As the objective was twofold, two experimental studies were carried out (with a sample of 100 participants). The studies evaluated users’ responses to different environmental representations of two environments, using differential semantic scales to measure key underlying factors (utility, credibility, realism, accuracy, abstraction). The first study examined simulation displays: a PC, an HTC Vive Pro 2 head-mounted display, a PowerWall Screen and a CAVE. In the second, formats were analysed: normal image, 360° image, video and 360° video. The results of this work revealed that users perceived the space differently depending on the representation displays and formats used. Such comparisons of these new means of representing architectural spaces can be helpful to researchers, architects and urban planning professionals and might provoke debate in, and be extrapolated into, the design field. Full article

17 pages, 1223 KB  
Article
Endoscopic Retrieval of Esophageal and Gastric Foreign Bodies in Cats and Dogs: A Retrospective Study of 92 Cases
by Giulia Maggi, Mattia Tessadori, Maria Luisa Marenzoni, Francesco Porciello, Domenico Caivano and Maria Chiara Marchesi
Vet. Sci. 2023, 10(9), 560; https://doi.org/10.3390/vetsci10090560 - 5 Sep 2023
Cited by 4 | Viewed by 6796
Abstract
Esophageal and gastric foreign bodies (FBs) commonly occur in small animal practice, and their endoscopic removal has been previously reported. However, few studies have reported the endoscopic instruments used for the retrieval attempt and the time spent on endoscopic removal. Therefore, the aim of this study is to evaluate the factors that can influence the success rate and timing of the endoscopic retrieval of FBs. The medical records of 92 animals undergoing endoscopic removal of esophageal (n = 12) and gastric (n = 84) FBs were reviewed. Two dogs had FBs in both the esophagus and stomach. Data on signalment, clinical signs, endoscopic devices used, success of retrieval, and duration of endoscopy were extrapolated from the medical records and video recordings. Endoscopic removal of FBs was successful in 88% of cases, and the mean extraction time was 59.74 min (range, 10–120 min). The success rate and timing of the removal of endoscopic foreign bodies (EFBs) were influenced by several factors in our population: medium-breed dogs, adult animals, and localization of FBs in the body of the stomach increased the probability of failure during the endoscopic retrieval attempt. Conversely, the success rate and speed of EFB retrieval were higher in puppies and with increasing operator experience. Moreover, the use of combination devices such as a polypectomy snare and grasping forceps negatively influenced the success of FB extraction. Further prospective and comparative studies in a large, multicentric patient population could be useful to create interventional endoscopic guidelines, as in human medicine. Full article

18 pages, 2785 KB  
Article
Spatio-Temporal Coherence of mmWave/THz Channel Characteristics and Their Forecasting Using Video Frame Prediction Techniques
by Vladislav Prosvirov, Amjad Ali, Abdukodir Khakimov and Yevgeni Koucheryavy
Mathematics 2023, 11(17), 3634; https://doi.org/10.3390/math11173634 - 23 Aug 2023
Cited by 2 | Viewed by 2474
Abstract
Channel state information in millimeter wave (mmWave) and terahertz (THz) communications systems is vital for various tasks ranging from planning the optimal locations of BSs to efficient beam tracking mechanisms to handover design. Due to the use of large-scale phased antenna arrays and high sensitivity to environmental geometry and materials, precise propagation models for these bands are obtained via ray-tracing modeling. However, the propagation conditions in mmWave/THz systems may theoretically change at very small distances, that is, 1 mm–1 μm, which requires extreme computational effort for modeling. In this paper, we first assess the effective correlation distances in mmWave/THz systems for different outdoor scenarios, user mobility patterns, and line-of-sight (LoS) and non-LoS (nLoS) conditions. As the metrics of interest, we utilize the angle of arrival/departure (AoA/AoD) and path loss of the first few strongest rays. Then, to reduce the computational effort required for the ray-tracing procedure, we propose a methodology for the extrapolation and interpolation of these metrics based on the convolutional long short-term memory (ConvLSTM) model. The proposed methodology is based on a special representation of the channel state information in a form suitable for state-of-the-art video enhancement machine learning (ML) techniques, which allows for the use of their powerful prediction capabilities. To assess the prediction performance of the ConvLSTM model, we utilize precision and recall as the main metrics of interest. Our numerical results demonstrate that the channel state correlation in AoA/AoD parameters is preserved up until approximately 0.3–0.6 m, which is 300–600 times larger than the wavelength at 300 GHz. The ConvLSTM model allows us to accurately predict AoA and AoD angles up to a distance of 0.6 m, with AoA characterized by a higher mean squared error (MSE). Our results can be utilized to speed up ray-tracing simulations by selecting a grid step size that achieves the desired trade-off between modeling accuracy and computational time. They can also be utilized to improve beam tracking in mmWave/THz systems via selection of the time step between beam realignment procedures. Full article
(This article belongs to the Special Issue Applications of Mathematical Analysis in Telecommunications-II)
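The practical payoff of a 0.3–0.6 m correlation distance is that ray tracing can run on a coarse spatial grid and the channel metrics can be filled in between samples. A deliberately simple 1-D linear interpolation sketch of that idea (the paper uses a ConvLSTM predictor, not linear interpolation; names here are illustrative):

```python
def interpolate_metric(samples, step, query):
    """Linearly interpolate a channel metric (e.g., AoA in degrees)
    sampled by ray tracing on a 1-D trajectory grid of spacing `step`
    (chosen below the correlation distance) at position `query`."""
    i = min(int(query // step), len(samples) - 2)  # left grid index
    t = (query - i * step) / step                  # fractional offset in [0, 1]
    return (1 - t) * samples[i] + t * samples[i + 1]
```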

17 pages, 5941 KB  
Article
Motion Vector Extrapolation for Video Object Detection
by Julian True and Naimul Khan
J. Imaging 2023, 9(7), 132; https://doi.org/10.3390/jimaging9070132 - 29 Jun 2023
Cited by 3 | Viewed by 4059
Abstract
Despite the continued successes of computationally efficient deep neural network architectures for video object detection, performance continually arrives at the great trilemma of speed versus accuracy versus computational resources (pick two). Current attempts to exploit temporal information in video data to overcome this trilemma are bottlenecked by the state of the art in object detection models. This work presents motion vector extrapolation (MOVEX), a technique which performs video object detection through the use of off-the-shelf object detectors alongside existing optical flow-based motion estimation techniques in parallel. This work demonstrates that the approach significantly reduces the baseline latency of any given object detector without sacrificing accuracy. Latency reductions of up to 24 times relative to the original latency can be achieved with minimal accuracy loss. MOVEX enables low-latency video object detection on common CPU-based systems, thus allowing for high-performance video object detection beyond the domain of GPU computing. Full article
(This article belongs to the Topic Visual Object Tracking: Challenges and Applications)
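The core of motion-vector-based extrapolation is propagating the last frame's detections with the motion estimated inside each box instead of re-running the detector. A minimal sketch; the mean-vector rule and names are illustrative assumptions, not MOVEX's exact procedure.

```python
def shift_box(box, motion_vectors):
    """Propagate a detection (x1, y1, x2, y2) to the next frame using the
    mean of the motion vectors (x, y, dx, dy) whose origins fall inside it.
    If no vectors fall inside, the box is left unchanged."""
    inside = [(dx, dy) for (x, y, dx, dy) in motion_vectors
              if box[0] <= x <= box[2] and box[1] <= y <= box[3]]
    if not inside:
        return box
    mdx = sum(d[0] for d in inside) / len(inside)
    mdy = sum(d[1] for d in inside) / len(inside)
    return (box[0] + mdx, box[1] + mdy, box[2] + mdx, box[3] + mdy)
```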

16 pages, 2115 KB  
Article
Camera- and Viewpoint-Agnostic Evaluation of Axial Postural Abnormalities in People with Parkinson’s Disease through Augmented Human Pose Estimation
by Stefano Aldegheri, Carlo Alberto Artusi, Serena Camozzi, Roberto Di Marco, Christian Geroin, Gabriele Imbalzano, Leonardo Lopiano, Michele Tinazzi and Nicola Bombieri
Sensors 2023, 23(6), 3193; https://doi.org/10.3390/s23063193 - 16 Mar 2023
Cited by 15 | Viewed by 3911
Abstract
Axial postural abnormalities (aPA) are common features of Parkinson’s disease (PD) and manifest in over 20% of patients during the course of the disease. aPA form a spectrum of functional trunk misalignment, ranging from a typical Parkinsonian stooped posture to progressively greater degrees of spine deviation. Current research has not yet led to a sufficient understanding of the pathophysiology and management of aPA in PD, partially due to a lack of agreement on validated, user-friendly, automatic tools for measuring and analysing the differences in the degree of aPA according to patients’ therapeutic conditions and tasks. In this context, human pose estimation (HPE) software based on deep learning could be a valid support, as it automatically extrapolates the spatial coordinates of human skeleton keypoints from images or videos. Nevertheless, standard HPE platforms have two limitations that prevent their adoption in such clinical practice. First, standard HPE keypoints are inconsistent with the keypoints needed to assess aPA (degrees and fulcrum). Second, aPA assessment either requires advanced RGB-D sensors or, when based on the processing of RGB images, is most likely sensitive to the adopted camera and to the scene (e.g., sensor–subject distance, lighting, background–subject clothing contrast). This article presents software that augments the human skeleton extrapolated by state-of-the-art HPE software from RGB pictures with exact bone points for posture evaluation through computer vision post-processing primitives. The article demonstrates the software’s robustness and accuracy in the processing of 76 RGB images, with different resolutions and sensor–subject distances, from 55 PD patients with different degrees of anterior and lateral trunk flexion. Full article
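Once augmented bone points are available, grading flexion reduces to measuring an angle at a fulcrum keypoint. A generic sketch of that geometry; the keypoint semantics are assumed for illustration and are not taken from the described software.

```python
import math

def flexion_angle(fulcrum, upper, lower):
    """Angle in degrees at a fulcrum keypoint between the vectors to an
    upper and a lower bone point; points are (x, y) image coordinates.
    180 degrees means the three points are collinear (no flexion at the
    fulcrum); smaller angles mean greater flexion."""
    v1 = (upper[0] - fulcrum[0], upper[1] - fulcrum[1])
    v2 = (lower[0] - fulcrum[0], lower[1] - fulcrum[1])
    dot = v1[0] * v2[0] + v1[1] * v2[1]
    n1 = math.hypot(*v1)
    n2 = math.hypot(*v2)
    return math.degrees(math.acos(dot / (n1 * n2)))
```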

13 pages, 1106 KB  
Article
A DSC Test for the Early Detection of Neoplastic Gastric Lesions in a Medium-Risk Gastric Cancer Area
by Valli De Re, Stefano Realdon, Roberto Vettori, Alice Zaramella, Stefania Maiero, Ombretta Repetto, Vincenzo Canzonieri, Agostino Steffan and Renato Cannizzaro
Int. J. Mol. Sci. 2023, 24(4), 3290; https://doi.org/10.3390/ijms24043290 - 7 Feb 2023
Cited by 4 | Viewed by 3118
Abstract
In this study, we aimed to assess the accuracy of the proposed novel, noninvasive serum DSC test in predicting the risk of gastric cancer before the use of upper endoscopy. To validate the DSC test, we enrolled two series of individuals living in Veneto and Friuli-Venezia Giulia, Italy (n = 53 and n = 113, respectively), who were referred for an endoscopy. The classification used for the DSC test to predict gastric cancer risk combines the coefficient of the patient’s age and sex and serum pepsinogen I and II, gastrin 17, and anti-Helicobacter pylori immunoglobulin G concentrations in two equations: Y1 and Y2. The coefficient of variables and the Y1 and Y2 cutoff points (>0.385 and >0.294, respectively) were extrapolated using regression analysis and an ROC curve analysis of two retrospective datasets (300 cases for the Y1 equation and 200 cases for the Y2 equation). The first dataset included individuals with autoimmune atrophic gastritis and first-degree relatives with gastric cancer; the second dataset included blood donors. Demographic data were collected; serum pepsinogen, gastrin G17, and anti-Helicobacter pylori IgG concentrations were assayed using an automatic Maglumi system. Gastroscopies were performed by gastroenterologists using an Olympus video endoscope with detailed photographic documentation during examinations. Biopsies were taken at five standardized mucosa sites and were assessed by a pathologist for diagnosis. The accuracy of the DSC test in predicting neoplastic gastric lesions was estimated to be 74.657% (65%CI; 67.333% to 81.079%). The DSC test was found to be a useful, noninvasive, and simple approach to predicting gastric cancer risk in a population with a medium risk of developing gastric cancer. Full article
(This article belongs to the Special Issue Cancer Prevention with Molecular Target Therapies 3.0)
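The Y1/Y2 classification reduces to a weighted combination of the demographic and serological variables compared against a cutoff. A schematic sketch using the published Y1 cutoff (0.385) but hypothetical placeholder coefficients, since the fitted regression coefficients are not reproduced in this listing:

```python
def dsc_positive(age, sex, pg1, pg2, g17, hp_igg, coef, cutoff=0.385):
    """Evaluate a DSC-style discriminant score: a weighted sum of an
    intercept, age, sex, pepsinogen I/II, gastrin-17 and anti-H. pylori
    IgG, flagged positive when it exceeds the cutoff. The weights in
    `coef` are placeholders, not the published values."""
    features = (1.0, age, sex, pg1, pg2, g17, hp_igg)  # leading 1.0 = intercept
    y = sum(c * f for c, f in zip(coef, features))
    return y > cutoff
```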

15 pages, 3456 KB  
Article
D-STGCN: Dynamic Pedestrian Trajectory Prediction Using Spatio-Temporal Graph Convolutional Networks
by Bogdan Ilie Sighencea, Ion Rareș Stanciu and Cătălin Daniel Căleanu
Electronics 2023, 12(3), 611; https://doi.org/10.3390/electronics12030611 - 26 Jan 2023
Cited by 21 | Viewed by 6949
Abstract
Predicting pedestrian trajectories in urban scenarios is a challenging task that has a wide range of applications, from video surveillance to autonomous driving. The task is difficult since pedestrian behavior is affected by the pedestrians’ individual path histories, their interactions with others, and their interactions with the environment. This paper introduces an attention-based, interaction-aware spatio-temporal graph neural network for predicting pedestrian trajectories, built from two components: a spatial graph neural network (SGNN) for interaction modeling and a temporal graph neural network (TGNN) for motion feature extraction. The SGNN uses an attention method to periodically collect spatial interactions between all pedestrians. The TGNN employs an attention method as well, this time to collect each pedestrian’s temporal motion pattern. Finally, on the graph’s temporal dimension characteristics, a time-extrapolator convolutional neural network (CNN) is employed to predict the trajectories. With a smaller data and model size and better accuracy, the proposed method is more compact and efficient than social-STGCNN. Moreover, on three video surveillance datasets (ETH, UCY, and SDD), D-STGCN achieves better experimental results on the average displacement error (ADE) and final displacement error (FDE) metrics, in addition to predicting more social trajectories. Full article
(This article belongs to the Special Issue Deep Perception in Autonomous Driving)
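ADE and FDE, the two metrics used in the comparison above, have simple standard definitions: the mean point-wise displacement over the predicted horizon, and the displacement at the final time step. A short sketch:

```python
import math

def ade_fde(pred, truth):
    """Average and final displacement errors between a predicted and a
    ground-truth trajectory, given as equal-length lists of (x, y)
    points. Returns (ADE, FDE)."""
    dists = [math.hypot(p[0] - t[0], p[1] - t[1])
             for p, t in zip(pred, truth)]
    return sum(dists) / len(dists), dists[-1]
```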

19 pages, 4993 KB  
Article
Experimental Study of Cloud-to-Ground Lightning Nowcasting with Multisource Data Based on a Video Prediction Method
by Shuchang Guo, Jinyan Wang, Ruhui Gan, Zhida Yang and Yi Yang
Remote Sens. 2022, 14(3), 604; https://doi.org/10.3390/rs14030604 - 27 Jan 2022
Cited by 14 | Viewed by 4649
Abstract
The evolution of lightning generation and extinction is a nonlinear and complex process, and nowcasting results based on extrapolation and numerical models largely differ from the real situation. In this study, a multiple-input and multiple-output lightning nowcasting model, namely the Convolutional Long- and Short-Term Memory Lightning Forecast Net (CLSTM-LFN), is constructed to improve 0–3 h lightning nowcasting based on video prediction methods in deep learning. The inputs to CLSTM-LFN include the historical lightning occurrence frequency and physical variables significantly related to lightning occurrence from numerical model products, which are merged with each other to provide effective information for lightning nowcasting in time and space. The results of batch forecasting tests show that CLSTM-LFN can achieve effective forecasts of 0–3 h lightning occurrence areas, and the nowcasting results are better than those of a traditional lightning parameterization scheme or of the model fed with a single data source. After analyzing the importance of the input variables, the results show that the role of numerical model products increases significantly with increasing forecast time, and the relative importance of convective available potential energy is significantly larger than that of the other physical variables. Full article

18 pages, 7145 KB  
Article
Supervised Learning Based Peripheral Vision System for Immersive Visual Experiences for Extended Display
by Muhammad Ayaz Shirazi, Riaz Uddin and Min-Young Kim
Appl. Sci. 2021, 11(11), 4726; https://doi.org/10.3390/app11114726 - 21 May 2021
Cited by 1 | Viewed by 3611
Abstract
Video display content can be extended to the walls of the living room around the TV using projection. Automatically generating appropriate projection content is a hard problem, which we address with a deep neural network. We propose a peripheral vision system that provides immersive visual experiences to the user by extending the video content using deep learning and projecting that content around the TV screen. A user could manually create appropriate content for an existing TV screen, but doing so is too expensive. The PCE (Pixel Context Encoder) network takes the center of the video frame as input and the outside area as output, extending the content using supervised learning. The proposed system is expected to pave a new road for the home appliance industry, transforming the living room into a new immersive experience platform. Full article
(This article belongs to the Special Issue Deep Image Semantic Segmentation and Recognition)
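The supervised setup described above implies training pairs in which the frame centre is the input and the periphery is the prediction target. A sketch of how such a pair and its target mask could be built; margin-based cropping is my assumption for illustration, not the paper's exact preprocessing.

```python
import numpy as np

def center_periphery_split(frame, margin):
    """Split a frame (H, W, ...) into the centre crop (network input) and
    a boolean mask marking the periphery the network must predict; the
    centre is defined by trimming `margin` pixels from every side."""
    h, w = frame.shape[:2]
    center = frame[margin:h - margin, margin:w - margin]
    mask = np.ones((h, w), dtype=bool)
    mask[margin:h - margin, margin:w - margin] = False  # False = given centre
    return center, mask
```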

17 pages, 1216 KB  
Article
Feasibility Study on the Role of Personality, Emotion, and Engagement in Socially Assistive Robotics: A Cognitive Assessment Scenario
by Alessandra Sorrentino, Gianmaria Mancioppi, Luigi Coviello, Filippo Cavallo and Laura Fiorini
Informatics 2021, 8(2), 23; https://doi.org/10.3390/informatics8020023 - 26 Mar 2021
Cited by 16 | Viewed by 5319
Abstract
This study aims to investigate the role of several aspects that may influence human–robot interaction in assistive scenarios. Among them, we focused on semi-permanent qualities (i.e., personality and cognitive state) and temporal traits (i.e., emotion and engagement) of the user profile. To this end, we organized an experimental session with 11 elderly users who performed a cognitive assessment with the non-humanoid ASTRO robot. The ASTRO robot administered the Mini Mental State Examination test in a Wizard of Oz setup. Temporal and long-term qualities of each user profile were assessed by self-report questionnaires and by behavioral features extrapolated from the recorded videos. The results highlighted that the quality of the interaction did not depend on the cognitive state of the participants. On the contrary, the cognitive assessment with the robot significantly reduced the anxiety of the users by enhancing their trust in the robotic entity. This suggests that the personality and affect traits of the interacting user have a fundamental influence on the quality of the interaction, also in the socially assistive context. Full article
(This article belongs to the Special Issue Feature Paper in Informatics)
