Deep 3D Convolutional Neural Network for Facial Micro-Expression Analysis from Video Images
Abstract
1. Introduction
- We developed a 3D CNN architecture for micro-expression recognition that extracts spatial and temporal features simultaneously;
- A novel pre-processing technique selects the apex frame sequence from the entire video, so that the timestamp of the most pronounced emotion is centered within this sequence;
- Stratified K-fold cross-validation was applied for model evaluation because it suits small datasets with an imbalanced class distribution, as in our case;
- Comprehensive experimental validation was performed by comparing the proposed model with two reimplemented state-of-the-art methods in intra-dataset as well as cross-dataset evaluations, covering a total of eight different scenarios. To the best of our knowledge, no comparably extensive evaluation has been conducted for micro-expression recognition so far.
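The apex-centered selection described in the second contribution can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function name `apex_centered_window`, the window length of 36 frames (chosen only because the first layer of the layer table has a temporal depth of 36), and the boundary handling (shifting the window so it stays inside the clip) are all assumptions.

```python
def apex_centered_window(num_frames, apex_index, window_length=36):
    """Return the frame indices of a fixed-length window centered on the apex.

    Hypothetical helper: the window length and the clamping at the clip
    boundaries are assumptions made for this sketch.
    """
    if not 0 <= apex_index < num_frames:
        raise ValueError("apex_index must lie inside the clip")
    if window_length > num_frames:
        raise ValueError("clip is shorter than the requested window")

    # Ideal start so that the apex sits in the middle of the window.
    start = apex_index - window_length // 2
    # Shift the window back inside the clip if it would overrun either end.
    start = max(0, min(start, num_frames - window_length))
    return list(range(start, start + window_length))


# Apex in the middle of a 100-frame clip: the window is frames 32..67.
window = apex_centered_window(num_frames=100, apex_index=50)
print(window[0], window[-1], window.index(50))
```

When the apex lies close to the start or end of a clip, this sketch keeps the window length fixed and gives up exact centering; other choices (padding or temporal interpolation) are equally plausible.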
2. Related Works
2.1. Handcrafted Methods
2.2. Deep Learning-Based Methods
3. Datasets
3.1. CASME II
3.2. SMIC
3.3. SAMM
4. Pre-Processing
4.1. Face Detection and Alignment
4.2. Facial Landmark Detection
4.3. Apex Frame Spotting
4.4. Selection of Apex Frame Sequence
5. Network Architectures
5.1. Model-A (Proposed 3D CNN Model)
5.2. Split-Model
5.3. Model-B
5.4. Model-C
6. Model Training Parameters
6.1. Model-A
6.2. Model-B
6.3. Model-C
7. Experimental Analysis
8. Results And Discussions
8.1. Train–Test Split
8.2. Stratified K-Fold
8.2.1. Scenario-1
8.2.2. Scenario-2
8.2.3. Scenario-3
8.2.4. Scenario-4
8.2.5. Scenario-5
8.2.6. Scenario-6
8.2.7. Scenario-7
8.2.8. Scenario-8
9. Applications and Use Cases
10. Conclusions
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Conflicts of Interest
Abbreviations
Abbreviation | Meaning |
---|---|
3D CNN | 3D Convolutional Neural Networks |
3DHOG | 3D Histogram of Oriented Gradients |
ANN | Artificial Neural Networks |
AU | Action Unit |
CASME | Chinese Academy of Sciences Micro-Expression |
CASME II | Chinese Academy of Sciences Micro-Expression II |
CNN | Convolutional Neural Networks |
EVM | Eulerian Video Magnification |
FACS | Facial Action Coding System |
FPS | Frames per Second |
HOG | Histogram of Oriented Gradients |
LBP | Local Binary Pattern |
LBP-TOP | Local Binary Pattern histograms from Three Orthogonal Planes |
LSTM | Long Short-Term Memory |
MMOD | Max-Margin Object Detection |
ReLU | Rectified Linear Unit |
ROI | Region of Interest |
SAM | Self-Assessment Manikins |
SAMM | Spontaneous Actions and Micro-Movements |
SGD | Stochastic Gradient Descent |
SMIC | Spontaneous Micro-Expression Corpus |
SVM | Support Vector Machine |
TIM | Temporal Interpolation Model |
Emotions | CASME II | SMIC | SAMM |
---|---|---|---|
Happy | 32 | 51 | 26 |
Disgust | 63 | 70 | 15 |
Surprise | 28 | 43 | 9 |
Total | 123 | 164 | 50 |
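The class counts above are small and imbalanced, which is the situation stratified K-fold is meant for: every fold keeps roughly the same class proportions as the full dataset. The sketch below shows the idea in plain Python using the CASME II counts from the table. The function name `stratified_k_fold`, the choice of `k=5`, and the round-robin assignment are assumptions for illustration; the number of folds used in the study is not given in this section.

```python
import random
from collections import defaultdict


def stratified_k_fold(labels, k, seed=0):
    """Yield (train_indices, test_indices) with class proportions preserved.

    Illustrative sketch: samples of each class are shuffled and dealt
    round-robin into k folds, so per-fold class counts differ by at most one.
    """
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    rng = random.Random(seed)
    folds = [[] for _ in range(k)]
    for indices in by_class.values():
        rng.shuffle(indices)
        for pos, idx in enumerate(indices):
            folds[pos % k].append(idx)

    for i in range(k):
        test = sorted(folds[i])
        train = sorted(idx for j, fold in enumerate(folds) if j != i for idx in fold)
        yield train, test


# CASME II class counts taken from the table above.
casme2_labels = ["happy"] * 32 + ["disgust"] * 63 + ["surprise"] * 28

for train, test in stratified_k_fold(casme2_labels, k=5):  # k=5 is an assumption
    counts = {c: sum(casme2_labels[i] == c for i in test)
              for c in ("happy", "disgust", "surprise")}
    print(len(train), len(test), counts)
```

With a plain (non-stratified) split of so few samples, a fold could by chance contain very few Surprise clips; dealing each class separately rules that out.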
Layer Type | Filter Size | Output Shape |
---|---|---|
Conv3D-1 | 3 × 3 × 3 | 36 × 128 × 128 × 16 |
BatchNorm-1 | - | 36 × 128 × 128 × 16 |
3D-MaxPooling-1 | 3 × 3 × 3 | 12 × 42 × 42 × 16 |
Dropout-1 | - | 12 × 42 × 42 × 16 |
Conv3D-2 | 3 × 3 × 3 | 12 × 42 × 42 × 16 |
BatchNorm-2 | - | 12 × 42 × 42 × 16 |
3D-MaxPooling-2 | 3 × 3 × 3 | 4 × 14 × 14 × 16 |
Dropout-2 | - | 4 × 14 × 14 × 16 |
Flatten | - | 12,544 |
Dense-1 | - | 128 |
Dropout-3 | - | 128 |
Dense-2 | - | 3 |
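The output shapes in the table can be checked with simple arithmetic. The sketch below traces them under assumptions that are not stated in this section but reproduce every row: convolutions use stride 1 with "same" padding, pooling uses a stride equal to the pool size without padding (so sizes are floor-divided by 3), and the input is a 36 × 128 × 128 clip with one channel. The helper names are hypothetical.

```python
from math import prod


def conv3d_same(shape, filters):
    """Conv3D with stride 1 and 'same' padding (assumed): only channels change."""
    depth, height, width, _ = shape
    return (depth, height, width, filters)


def max_pool3d(shape, pool=3):
    """Max pooling with stride equal to the pool size and no padding (assumed)."""
    depth, height, width, channels = shape
    return (depth // pool, height // pool, width // pool, channels)


def model_a_shapes(input_shape=(36, 128, 128, 1)):
    """Trace output shapes through the layers listed in the table."""
    trace = []
    shape = input_shape
    for block in (1, 2):
        shape = conv3d_same(shape, filters=16)
        trace.append((f"Conv3D-{block}", shape))
        trace.append((f"BatchNorm-{block}", shape))  # shape-preserving
        shape = max_pool3d(shape, pool=3)
        trace.append((f"3D-MaxPooling-{block}", shape))
        trace.append((f"Dropout-{block}", shape))  # shape-preserving
    trace.append(("Flatten", (prod(shape),)))
    trace.append(("Dense-1", (128,)))
    trace.append(("Dropout-3", (128,)))
    trace.append(("Dense-2", (3,)))  # one output per emotion class
    return trace


for name, shape in model_a_shapes():
    print(f"{name:16s} {' x '.join(str(n) for n in shape)}")
```

For example, the flattened size follows from the last pooled volume: 4 × 14 × 14 × 16 = 12,544, matching the Flatten row.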
Dataset | Initial Frame Sequence | Apex Frame Sequence |
---|---|---|
CASME II | 46.9% | 56.5% |
SMIC | 34.3% | 43.7% |
Scenario | Train | Test | Model-A | Model-B | Model-C |
---|---|---|---|---|---|
01 | SMIC | SMIC | 43.7% | 33.5% | 37.3% |
02 | CASME II | CASME II | 56.5% | 45.4% | 48.1% |
03 | Combined | Combined | 88.2% | 85.4% | 80.4% |
04 | SMIC | SAMM | 44.3% | 31.1% | 42.0% |
05 | CASME II | SAMM | 24.8% | 24.3% | 23.1% |
06 | SMIC | CASME II | 44.7% | 43.7% | 39.1% |
07 | CASME II | SMIC | 37.7% | 35.4% | 36.5% |
08 | Combined | SAMM | 27.1% | 23.1% | 36.9% |
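To read the eight scenarios at a glance, the scores in the table can be tallied per scenario. The snippet below is only a reading aid over the values printed above; the dictionary layout and variable names are assumptions of this sketch.

```python
from collections import Counter

# Scores (in percent) copied from the table above, keyed by scenario number.
results = {
    1: {"Model-A": 43.7, "Model-B": 33.5, "Model-C": 37.3},
    2: {"Model-A": 56.5, "Model-B": 45.4, "Model-C": 48.1},
    3: {"Model-A": 88.2, "Model-B": 85.4, "Model-C": 80.4},
    4: {"Model-A": 44.3, "Model-B": 31.1, "Model-C": 42.0},
    5: {"Model-A": 24.8, "Model-B": 24.3, "Model-C": 23.1},
    6: {"Model-A": 44.7, "Model-B": 43.7, "Model-C": 39.1},
    7: {"Model-A": 37.7, "Model-B": 35.4, "Model-C": 36.5},
    8: {"Model-A": 27.1, "Model-B": 23.1, "Model-C": 36.9},
}

# Best-scoring model in each scenario, then the number of scenarios won.
best = {scenario: max(scores, key=scores.get) for scenario, scores in results.items()}
wins = Counter(best.values())

for scenario, model in best.items():
    print(f"Scenario {scenario:02d}: {model} ({results[scenario][model]}%)")
print(dict(wins))
```

On these numbers Model-A has the highest score in scenarios 01 to 07, and Model-C has the highest score in scenario 08 (trained on the combined data, tested on SAMM).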
Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.
© 2022 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Talluri, K.K.; Fiedler, M.-A.; Al-Hamadi, A. Deep 3D Convolutional Neural Network for Facial Micro-Expression Analysis from Video Images. Appl. Sci. 2022, 12, 11078. https://doi.org/10.3390/app122111078