Article

Energy-Guided Temporal Segmentation Network for Multimodal Human Action Recognition

by Qiang Liu, Enqing Chen, Lei Gao, Chengwu Liang and Hao Liu
1 School of Information Engineering, Zhengzhou University, Zhengzhou 450000, China
2 Department of Electrical and Computer Engineering, Ryerson University, Toronto, ON M5B 2K3, Canada
* Author to whom correspondence should be addressed.
Sensors 2020, 20(17), 4673; https://doi.org/10.3390/s20174673
Submission received: 20 July 2020 / Revised: 12 August 2020 / Accepted: 17 August 2020 / Published: 19 August 2020

Abstract

To achieve satisfactory performance in human action recognition, a central task is to address the sub-action sharing problem, especially among similar action classes. Nevertheless, most existing convolutional neural network (CNN)-based action recognition algorithms uniformly divide a video into frames and then randomly select frames as inputs, ignoring the distinct characteristics of different frames. In recent years, depth videos have been increasingly used for action recognition, but most methods merely exploit the spatial information of different actions without utilizing temporal information. To address these issues, a novel energy-guided temporal segmentation method is proposed here, and a multimodal fusion strategy is combined with the proposed segmentation method to construct an energy-guided temporal segmentation network (EGTSN). Specifically, the EGTSN has two parts: energy-guided video segmentation and a multimodal fusion heterogeneous CNN. The proposed solution was evaluated on the public large-scale NTU RGB+D dataset. Comparisons with state-of-the-art methods demonstrate the effectiveness of the proposed network.
Keywords: multimodal action recognition; motion energy; temporal segmentation network; heterogeneous convolutional neural networks
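To make the abstract's sampling idea concrete, the following is a minimal NumPy sketch of one plausible reading of energy-guided temporal segmentation: per-frame motion energy estimated from frame differences, segment boundaries placed at equal quantiles of cumulative energy, and one representative frame kept per segment. The function names and the frame-difference energy measure are illustrative assumptions, not the paper's exact formulation.

import numpy as np

def motion_energy(frames):
    # Per-frame motion energy as the mean absolute difference between
    # consecutive frames -- a common proxy; the paper's exact energy
    # measure may differ.  frames: (T, H, W) or (T, H, W, C) array.
    diffs = np.abs(np.diff(frames.astype(np.float32), axis=0))
    energy = diffs.reshape(diffs.shape[0], -1).mean(axis=1)
    # Give the first frame the first diff's energy so all T frames score.
    return np.concatenate(([energy[0]], energy))

def energy_guided_sample(frames, num_segments):
    # Split the video into num_segments spans of roughly equal cumulative
    # motion energy (an energy-weighted variant of TSN's uniform snippet
    # sampling), then keep the highest-energy frame of each span.
    # Assumes num_segments is much smaller than the number of frames.
    energy = motion_energy(frames) + 1e-6          # avoid all-zero energy
    cum = np.cumsum(energy)
    cum /= cum[-1]                                  # normalize to (0, 1]
    quantiles = np.linspace(0.0, 1.0, num_segments + 1)[1:-1]
    bounds = np.searchsorted(cum, quantiles)        # segment boundaries
    spans = np.split(np.arange(len(frames)), bounds)
    return [int(s[np.argmax(energy[s])]) for s in spans]

Frames chosen this way concentrate in high-motion sub-actions rather than being spread uniformly over the clip, which is the non-uniform sampling the abstract contrasts with standard CNN pipelines; the selected RGB and depth frames would then feed the multimodal fusion heterogeneous CNN.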

Share and Cite

MDPI and ACS Style

Liu, Q.; Chen, E.; Gao, L.; Liang, C.; Liu, H. Energy-Guided Temporal Segmentation Network for Multimodal Human Action Recognition. Sensors 2020, 20, 4673. https://doi.org/10.3390/s20174673

AMA Style

Liu Q, Chen E, Gao L, Liang C, Liu H. Energy-Guided Temporal Segmentation Network for Multimodal Human Action Recognition. Sensors. 2020; 20(17):4673. https://doi.org/10.3390/s20174673

Chicago/Turabian Style

Liu, Qiang, Enqing Chen, Lei Gao, Chengwu Liang, and Hao Liu. 2020. "Energy-Guided Temporal Segmentation Network for Multimodal Human Action Recognition" Sensors 20, no. 17: 4673. https://doi.org/10.3390/s20174673

APA Style

Liu, Q., Chen, E., Gao, L., Liang, C., & Liu, H. (2020). Energy-Guided Temporal Segmentation Network for Multimodal Human Action Recognition. Sensors, 20(17), 4673. https://doi.org/10.3390/s20174673

Note that from the first issue of 2016, this journal uses article numbers instead of page numbers.
