
Affective and Immersive Human Computer Interaction via Effective Sensor and Sensing (AI-HCIs)

A special issue of Sensors (ISSN 1424-8220). This special issue belongs to the section "Intelligent Sensors".

Deadline for manuscript submissions: closed (31 January 2019) | Viewed by 50788

Special Issue Editors


Dr. Pourya Shamsolmoali
Guest Editor
Advanced Scientific Computing Division, Euro-Mediterranean Centre on Climate Change (CMCC Foundation), Lecce, Italy; Institute of Pattern Recognition and Image Processing, Shanghai Jiao Tong University, Shanghai, China
Interests: data mining; machine learning; image processing; big data; cloud computing

Dr. Maher Assaad
Guest Editor
Department of Electrical and Computer Engineering, Ajman University, Ajman, United Arab Emirates
Interests: light/image sensors; temperature sensors; computing devices/systems

Dr. Meijun Sun
Guest Editor
School of Computer Science and Technology, Tianjin University, Tianjin, China
Interests: image processing; computer graphics; computer interfacing; machine learning

Special Issue Information

Dear Colleagues,

Human–computer interaction (HCI) is crucial for user-friendly interaction between human users and computer systems, where the system may differ from a conventional computer and appear as a (portable) hardware device or a software package. As such, HCI is not only required to provide effective input/output; it is also expected to understand the intentions of users and the surrounding environment so as to deliver better service-oriented interactions. These requirements raise new challenges beyond conventional multimodal HCI, which covers audio, image, video and graphics as well as the keyboard and mouse. To this end, AI-guided intelligent recognition of speech instructions and visual signs, such as gestures and gaze, has been widely adopted as a natural way to communicate in HCI.

Recently, thanks to emerging sensors and sensing techniques, HCI has been further developed for immersive and affective communication between human users and computer systems. Examples can be found in virtual-reality-based experiences, electroencephalogram-enabled brain–computer interfaces, and smart interactions between humans and robots. In addition to auditory and visual cues, touch, taste and smell have also been explored in this context. How to effectively use these individual sources of information, and how to fuse several of them for tasks at different levels, still needs to be explored.

In this Special Issue, we aim to provide a forum for colleagues to report the most up-to-date results of developed models, algorithms, approaches and techniques, as well as comprehensive surveys of the state of the art in relevant fields. Both original contributions with theoretical novelty and practical solutions addressing particular problems in HCI are solicited. Rather than merely reporting the results of HCI in particular applications, submissions should answer questions such as “how, why and when” relevant HCI techniques should be applied in a specific context.

Dr. Jinchang Ren
Dr. Pourya Shamsolmoali
Dr. Maher Assaad
Dr. Meijun Sun
Guest Editors

Manuscript Submission Information

Manuscripts should be submitted online at www.mdpi.com by registering and logging in to this website. Once you are registered, click here to go to the submission form. Manuscripts can be submitted until the deadline. All submissions that pass pre-check are peer-reviewed. Accepted papers will be published continuously in the journal (as soon as accepted) and will be listed together on the special issue website. Research articles, review articles, and short communications are invited. For planned papers, a title and short abstract (about 100 words) can be sent to the Editorial Office for announcement on this website.

Submitted manuscripts should not have been published previously, nor be under consideration for publication elsewhere (except conference proceedings papers). All manuscripts are thoroughly refereed through a single-blind peer-review process. A guide for authors and other relevant information for submission of manuscripts is available on the Instructions for Authors page. Sensors is an international peer-reviewed open access semimonthly journal published by MDPI.

Please visit the Instructions for Authors page before submitting a manuscript. The Article Processing Charge (APC) for publication in this open access journal is 2600 CHF (Swiss Francs). Submitted papers should be well formatted and use good English. Authors may use MDPI's English editing service prior to publication or during author revisions.

Keywords

  • Effective models and algorithms for HCI
  • Emerging sensing techniques for HCI
  • Systematic design and solutions for multimodal fusion in HCI
  • Brain–computer interaction
  • HCI for human–robot interactions
  • Novel applications and case studies for gaming, education, healthcare, etc.

Published Papers (8 papers)


Research


15 pages, 851 KiB  
Article
Speech Emotion Recognition with Heterogeneous Feature Unification of Deep Neural Network
by Wei Jiang, Zheng Wang, Jesse S. Jin, Xianfeng Han and Chunguang Li
Sensors 2019, 19(12), 2730; https://doi.org/10.3390/s19122730 - 18 Jun 2019
Cited by 57 | Viewed by 7722
Abstract
Automatic speech emotion recognition is a challenging task because of the gap between acoustic features and human emotions, and it relies strongly on the discriminative acoustic features extracted for a given recognition task. In this work, we propose a novel deep neural architecture to extract informative feature representations from heterogeneous acoustic feature groups, which may contain redundant and unrelated information that lowers emotion recognition performance. After obtaining the informative features, a fusion network is trained to jointly learn the discriminative acoustic feature representation, and a Support Vector Machine (SVM) is used as the final classifier for the recognition task. Experimental results on the IEMOCAP dataset demonstrate that the proposed architecture improves recognition performance over existing state-of-the-art approaches, achieving an accuracy of 64%.
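
As a rough illustration of the final classification stage described above, the sketch below trains an SVM on feature vectors assumed to come from an upstream fusion network; the synthetic features, the 128-dimensional size, and the four-class label set are placeholders, not the authors' released code or data.

    # Minimal sketch: SVM as the final classifier on fused deep features.
    # `fused_features` stands in for the output of a separately trained
    # fusion network; all values below are synthetic placeholders.
    import numpy as np
    from sklearn.model_selection import train_test_split
    from sklearn.pipeline import make_pipeline
    from sklearn.preprocessing import StandardScaler
    from sklearn.svm import SVC

    rng = np.random.default_rng(0)
    fused_features = rng.normal(size=(500, 128))   # stand-in network output
    labels = rng.integers(0, 4, size=500)          # e.g., four emotion classes

    X_train, X_test, y_train, y_test = train_test_split(
        fused_features, labels, test_size=0.2, random_state=0)
    clf = make_pipeline(StandardScaler(), SVC(kernel="rbf", C=1.0))
    clf.fit(X_train, y_train)
    print("held-out accuracy:", clf.score(X_test, y_test))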

23 pages, 8484 KiB  
Article
Smart Sensing and Adaptive Reasoning for Enabling Industrial Robots with Interactive Human-Robot Capabilities in Dynamic Environments—A Case Study
by Jaime Zabalza, Zixiang Fei, Cuebong Wong, Yijun Yan, Carmelo Mineo, Erfu Yang, Tony Rodden, Jorn Mehnen, Quang-Cuong Pham and Jinchang Ren
Sensors 2019, 19(6), 1354; https://doi.org/10.3390/s19061354 - 18 Mar 2019
Cited by 17 | Viewed by 7055
Abstract
Traditional industry is seeing an increasing demand for more autonomous and flexible manufacturing in unstructured settings, a shift away from the fixed, isolated workspaces where robots perform predefined actions repetitively. This work presents a case study in which a robotic manipulator, namely a KUKA KR90 R3100, is provided with smart sensing capabilities, such as vision and adaptive reasoning, for real-time collision avoidance and online path planning in dynamically changing environments. A machine vision module based on low-cost cameras and color detection in the hue, saturation, value (HSV) space is developed to make the robot aware of its changing environment, allowing the detection and localization of a randomly moving obstacle. Path correction to avoid collisions with such obstacles is achieved by an adaptive path-planning module working alongside a dedicated robot control module, with the three modules running simultaneously. These smart sensing capabilities allow smooth interactions between the robot and its dynamic environment, where the robot reacts to dynamic changes through autonomous thinking and reasoning, with reaction times below the average human reaction time. The experimental results demonstrate that effective human-robot and robot-robot interactions can be realized through the innovative integration of emerging sensing techniques, efficient planning algorithms and systematic designs.
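
As a hedged sketch of what such an HSV-based vision module can look like in OpenCV, the snippet below thresholds one camera frame and reports the pixel centre of the largest detected blob; the hue bounds, camera index, and minimum blob area are illustrative assumptions, not the paper's parameters.

    # Minimal sketch: locate a colored obstacle via HSV thresholding.
    # The hue range targets a red marker; all thresholds are assumptions.
    import cv2
    import numpy as np

    cap = cv2.VideoCapture(0)                      # assumed camera index
    ok, frame = cap.read()
    cap.release()
    if ok:
        hsv = cv2.cvtColor(frame, cv2.COLOR_BGR2HSV)
        mask = cv2.inRange(hsv, np.array([0, 120, 70]),
                           np.array([10, 255, 255]))
        contours, _ = cv2.findContours(mask, cv2.RETR_EXTERNAL,
                                       cv2.CHAIN_APPROX_SIMPLE)
        blobs = [c for c in contours if cv2.contourArea(c) > 500]
        if blobs:
            x, y, w, h = cv2.boundingRect(max(blobs, key=cv2.contourArea))
            print("obstacle centre (px):", (x + w // 2, y + h // 2))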

16 pages, 2327 KiB  
Article
Riemannian Spatio-Temporal Features of Locomotion for Individual Recognition
by Jianhai Zhang, Zhiyong Feng, Yong Su, Meng Xing and Wanli Xue
Sensors 2019, 19(1), 56; https://doi.org/10.3390/s19010056 - 23 Dec 2018
Cited by 1 | Viewed by 2851
Abstract
Individual recognition based on skeletal sequences is a challenging computer vision task with multiple important applications, such as public security, human–computer interaction, and surveillance. However, much of the existing work fails to provide any explicit quantitative differences between individuals. In this paper, we propose a novel 3D spatio-temporal geometric feature representation of locomotion on a Riemannian manifold, which explicitly reveals the intrinsic differences between individuals. To this end, we construct a mean sequence by aligning related motion sequences on the Riemannian manifold. The differences with respect to this mean sequence are modeled as spatial state descriptors. Subsequently, a temporal hierarchy of covariance is imposed on the state descriptors, yielding a higher-order statistical spatio-temporal feature representation that shows unique biometric characteristics for individuals. Finally, we introduce a kernel metric learning method to improve the classification accuracy. We evaluated our method on two public databases: the CMU Mocap database and the UPCV Gait database. Furthermore, we constructed a new database for evaluating running and for analyzing two major factors that influence walking. The proposed approach achieves promising results in all experiments.
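
To give a concrete feel for the covariance part of such a representation, the sketch below computes a log-covariance descriptor over one window of flattened 3D joint coordinates; the window length, joint count, and the use of a plain matrix logarithm in place of the paper's full Riemannian machinery are simplifying assumptions.

    # Minimal sketch: log-covariance descriptor of a skeletal window.
    # Mapping an SPD covariance matrix through the matrix logarithm is a
    # common flat approximation of a Riemannian treatment, not the
    # paper's exact pipeline.
    import numpy as np

    n_frames, n_joints = 30, 20                    # assumed window / skeleton
    window = np.random.rand(n_frames, n_joints * 3)

    cov = np.cov(window, rowvar=False) + 1e-6 * np.eye(n_joints * 3)
    w, v = np.linalg.eigh(cov)                     # SPD eigendecomposition
    log_cov = (v * np.log(w)) @ v.T                # matrix logarithm
    descriptor = log_cov[np.triu_indices(n_joints * 3)]
    print("descriptor length:", descriptor.size)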

19 pages, 552 KiB  
Article
A Novel Instantaneous Phase Detection Approach and Its Application in SSVEP-Based Brain-Computer Interfaces
by Xiangdong Huang, Jingwen Xu and Zheng Wang
Sensors 2018, 18(12), 4334; https://doi.org/10.3390/s18124334 - 07 Dec 2018
Cited by 4 | Viewed by 2185
Abstract
This paper proposes a novel phase estimator based on the fully-traversed Discrete Fourier Transform (DFT), which takes all possible truncated DFT spectra into account and thereby offers two merits: 'direct phase extraction' (namely, accurate instantaneous phase information can be extracted without any correction) and suppression of spectral leakage. The paper also proves that the proposed phase estimator complies with the 2-parameter joint estimation model rather than the conventional 3-parameter joint model. Numerical results verify the above two merits and demonstrate that the proposed estimator can extract phase information from noisy multi-tone signals. Finally, real data analysis shows that the fully-traversed DFT achieves better phase classification in a steady-state visual evoked potential (SSVEP) brain-computer interface (BCI) than the conventional DFT estimator. Moreover, the proposed estimator imposes no restrictions on the relationship between the sampling rate and the stimulus frequencies, so it is applicable to a wider range of phase-coded SSVEP BCIs than existing estimators.
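
For orientation, the snippet below extracts the phase of a target frequency from a single DFT bin and then averages that estimate over a sweep of truncation lengths, loosely in the spirit of "traversing" the truncated spectra; it is an illustrative approximation, not the estimator derived in the paper.

    # Minimal sketch: single-bin DFT phase of a cosine at f0, averaged
    # over several truncation lengths (a crude nod to the fully-traversed
    # idea). Sampling rate, frequency and noise level are assumptions.
    import numpy as np

    fs, f0, true_phase = 250.0, 15.0, 0.7          # Hz, Hz, rad (assumed)
    t = np.arange(1000) / fs
    x = np.cos(2 * np.pi * f0 * t + true_phase) + 0.1 * np.random.randn(t.size)

    phases = []
    for n in range(250, t.size + 1, 25):           # sweep truncation lengths
        k = np.arange(n)
        bin_val = np.exp(-2j * np.pi * f0 * k / fs) @ x[:n]
        phases.append(np.angle(bin_val))
    estimate = np.angle(np.mean(np.exp(1j * np.array(phases))))
    print("phase estimate (rad):", estimate)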

17 pages, 1410 KiB  
Article
Two-Way Affective Modeling for Hidden Movie Highlights’ Extraction
by Zheng Wang, Xinyu Yan, Wei Jiang and Meijun Sun
Sensors 2018, 18(12), 4241; https://doi.org/10.3390/s18124241 - 03 Dec 2018
Viewed by 3234
Abstract
Movie highlights are composed of video segments that induce a steady increase in the audience's excitement. Automatic movie highlight extraction plays an important role in content analysis, ranking, indexing, and trailer production. To address this challenging problem, previous work suggested a direct mapping from low-level features to high-level perceptual categories. However, such work only considered highlights as intense scenes, like fighting, shooting, and explosions; many hidden highlights are ignored because their low-level feature values are too low. Driven by cognitive psychology analysis, combined top-down and bottom-up processing is utilized to derive the proposed two-way excitement model. Under the criteria of global sensitivity and local abnormality, middle-level features are extracted in excitement modeling to bridge the gap between the feature space and the high-level perceptual space. To validate the proposed approach, a group of well-known movies covering several typical genres is employed. Quantitative assessment using the determined excitement levels indicates that the proposed method produces promising results in movie highlight extraction, even when the response in the low-level audio-visual feature space is low.
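
As a toy illustration of scoring under "global sensitivity" and "local abnormality", the sketch below combines a globally normalized per-segment feature with its local deviation from a sliding-window mean; the stand-in feature, window size, and equal weighting are assumptions made for illustration only.

    # Minimal sketch: combine a global score (normalized feature value)
    # with a local score (z-score against a sliding window). The feature
    # itself and all weights are placeholders.
    import numpy as np

    feat = np.random.rand(200)                     # stand-in per-segment feature
    global_score = (feat - feat.min()) / (np.ptp(feat) + 1e-9)

    w = 15                                         # assumed half-window size
    pad = np.pad(feat, w, mode="edge")
    local_mean = np.convolve(pad, np.ones(2 * w + 1) / (2 * w + 1), "valid")
    local_std = np.array([pad[i:i + 2 * w + 1].std() for i in range(feat.size)])
    local_score = np.abs(feat - local_mean) / (local_std + 1e-9)

    excitement = 0.5 * global_score + 0.5 * local_score / (local_score.max() + 1e-9)
    print("top candidate segments:", np.argsort(excitement)[-5:])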

16 pages, 3623 KiB  
Article
Exploring the Consequences of Crowd Compression Through Physics-Based Simulation
by Libo Sun and Norman I. Badler
Sensors 2018, 18(12), 4149; https://doi.org/10.3390/s18124149 - 27 Nov 2018
Cited by 1 | Viewed by 3194
Abstract
Statistical analysis of accidents in recent years shows that crowd crushes have become significant non-combat, non-environmental public disasters. Unlike common accidents such as fires, crowd crushes may occur without obvious external causes and may arise quickly and unexpectedly in otherwise normal surroundings. We use physics-based simulations to understand the processes and consequences of compressive forces on high-density static crowds of up to 400 agents in a restricted space characterized by barriers to free movement. According to empirical observation and experimentation by others, local high packing density is an important factor leading to crowd crushes and consequent injuries. We computationally verify our hypothesis that compressive forces create high local crowd densities which exceed human tolerance. Affected agents may thus be unable to move or escape and will present additional movement obstacles to others. Any high-density crowd simulation should therefore take these possible negative effects on crowd mobility and behavior into account. Such physics-based simulations may assist in the design of crowded spaces so as to reduce the possibility of crushes and their consequences.
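
A toy version of the core mechanism might model agents as 2D disks and accumulate spring-like contact forces wherever bodies overlap, flagging agents whose total compressive load exceeds a tolerance; the agent count matches the paper's 400, but positions, radii, stiffness, and the tolerance value are assumptions.

    # Minimal sketch: per-agent compressive load in a packed 2D crowd.
    # Spring-like contact model; all numeric parameters are assumptions.
    import numpy as np

    rng = np.random.default_rng(1)
    pos = rng.uniform(0, 10, size=(400, 2))        # 400 agents in a 10 m box
    radius, stiffness, tolerance = 0.25, 500.0, 300.0

    diff = pos[:, None, :] - pos[None, :, :]
    dist = np.linalg.norm(diff, axis=-1)
    np.fill_diagonal(dist, np.inf)                 # ignore self-contact
    overlap = np.maximum(2 * radius - dist, 0.0)   # body-to-body compression
    load = (stiffness * overlap).sum(axis=1)       # total contact force per agent
    print("agents over tolerance:", int((load > tolerance).sum()))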

16 pages, 3705 KiB  
Article
Emotion Recognition Based on Multichannel Physiological Signals with Comprehensive Nonlinear Processing
by Xingxing Zhang, Chao Xu, Wanli Xue, Jing Hu, Yongchuan He and Mengxin Gao
Sensors 2018, 18(11), 3886; https://doi.org/10.3390/s18113886 - 11 Nov 2018
Cited by 19 | Viewed by 4222
Abstract
In the field of emotion recognition, multichannel physiological datasets are usually nonlinear and separable. Many researchers have applied linear or partially nonlinear processing in feature reduction and classification, but these applications have not worked well. Therefore, this paper proposes a comprehensive nonlinear method to solve this problem. On the one hand, as traditional feature reduction may cause the loss of significant amounts of feature information, Kernel Principal Component Analysis (KPCA) based on the radial basis function (RBF) kernel is introduced to map the data into a high-dimensional space, extract the nonlinear information of the features, and then reduce the dimension. This method can provide many features carrying information about the structure of the physiological dataset. On the other hand, considering its predictive power and its ability to select from a large number of features, the Gradient Boosting Decision Tree (GBDT) is used as a nonlinear ensemble classifier to improve recognition accuracy. The comprehensive nonlinear processing method performed well on our physiological dataset: the classification accuracy for four emotions across 29 participants reached 93.42%.
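
The KPCA-plus-GBDT pipeline maps directly onto standard scikit-learn components, as in the sketch below; the synthetic data, component count, and hyperparameters are placeholders rather than the settings used in the paper.

    # Minimal sketch: RBF-kernel KPCA for nonlinear dimensionality
    # reduction, followed by gradient-boosted trees as the classifier.
    from sklearn.datasets import make_classification
    from sklearn.decomposition import KernelPCA
    from sklearn.ensemble import GradientBoostingClassifier
    from sklearn.model_selection import train_test_split
    from sklearn.pipeline import make_pipeline

    X, y = make_classification(n_samples=600, n_features=64,
                               n_informative=20, n_classes=4, random_state=0)
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.25, random_state=0)

    model = make_pipeline(
        KernelPCA(n_components=30, kernel="rbf", gamma=0.01),
        GradientBoostingClassifier(n_estimators=200, random_state=0))
    model.fit(X_train, y_train)
    print("held-out accuracy:", model.score(X_test, y_test))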

Review


34 pages, 1959 KiB  
Review
EEG-Based Brain-Computer Interfaces Using Motor-Imagery: Techniques and Challenges
by Natasha Padfield, Jaime Zabalza, Huimin Zhao, Valentin Masero and Jinchang Ren
Sensors 2019, 19(6), 1423; https://doi.org/10.3390/s19061423 - 22 Mar 2019
Cited by 330 | Viewed by 19268
Abstract
Electroencephalography (EEG)-based brain-computer interfaces (BCIs), particularly those using motor-imagery (MI) data, have the potential to become groundbreaking technologies in both clinical and entertainment settings. MI data is generated when a subject imagines the movement of a limb. This paper reviews state-of-the-art signal processing techniques for MI EEG-based BCIs, with a particular focus on the feature extraction, feature selection and classification techniques used. It also summarizes the main applications of EEG-based BCIs, particularly those based on MI data, and finally presents a detailed discussion of the most prevalent challenges impeding the development and commercialization of EEG-based BCIs.
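
A classic baseline from this literature, band-power features with a linear discriminant classifier, is sketched below on synthetic signals; the channel count, the 8-30 Hz mu/beta band, and the train/test split are typical choices assumed here, not taken from any specific reviewed paper.

    # Minimal sketch: log band-power features + LDA, a standard MI-BCI
    # baseline. All "EEG" below is synthetic noise used as a stand-in.
    import numpy as np
    from scipy.signal import butter, filtfilt
    from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

    fs, n_trials, n_channels, n_samples = 250, 120, 8, 500
    rng = np.random.default_rng(0)
    trials = rng.standard_normal((n_trials, n_channels, n_samples))
    labels = rng.integers(0, 2, n_trials)          # e.g., left vs. right hand

    b, a = butter(4, [8 / (fs / 2), 30 / (fs / 2)], btype="band")
    filtered = filtfilt(b, a, trials, axis=-1)     # isolate mu/beta rhythms
    features = np.log(np.var(filtered, axis=-1))   # log band power per channel

    clf = LinearDiscriminantAnalysis().fit(features[:80], labels[:80])
    print("held-out accuracy:", clf.score(features[80:], labels[80:]))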
