A Multi-Layer Fusion-Based Facial Expression Recognition Approach with Optimal Weighted AUs
Abstract
1. Introduction
2. Related Work
2.1. Features for Facial Expression Recognition
2.2. Fusion Strategy for Facial Expression Recognition
2.3. Facial Partition Strategy for Expression Recognition
3. Overview of Our Proposed Multi-Layer Fusing Facial Recognition Approach
4. Selection of Affective Relative AUs and Feature Extraction
4.1. Choosing Affective Relative AUs Based on FACS
4.2. AU Feature Extracting Solution
- Gabor feature: Selecting a two-dimensional Gabor wavelet over a set of scales and directions. Combined with PCA, a dimension-reduced Gabor feature is extracted to represent each AU.
- Geometry feature: Utilizing the AAM shape model to calibrate feature points and generate the corresponding geometry features for AU regions, taking the coordinates of the feature points as the geometry feature vector.
- LBP feature: Choosing a flexible circular operator to extract the local texture of each AU, and using a histogram to record the LBP feature vectors of AU regions.
- WLD feature: Using an emerging method that synthesizes multiple orientation and excitation operators of sub-image sets, and forming a histogram to record WLD feature vectors of each AU region.
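As a rough illustration of the LBP item in the list above, the following sketch computes a basic 3 × 3 LBP code for every interior pixel of a small grayscale block and records the codes in a 256-bin histogram. It assumes the simplest fixed 8-neighbour operator rather than the flexible circular operator named in the text; the function name `lbp_histogram` and the toy block are hypothetical and serve illustration only.

```python
def lbp_histogram(block):
    """Return a 256-bin histogram of basic 3x3 LBP codes for a 2D list of gray levels."""
    # Neighbour offsets in clockwise order, starting from the top-left pixel.
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1), (1, 1), (1, 0), (1, -1), (0, -1)]
    hist = [0] * 256
    rows, cols = len(block), len(block[0])
    for r in range(1, rows - 1):
        for c in range(1, cols - 1):
            center = block[r][c]
            code = 0
            for bit, (dr, dc) in enumerate(offsets):
                # A neighbour at least as bright as the centre sets its bit.
                if block[r + dr][c + dc] >= center:
                    code |= 1 << bit
            hist[code] += 1
    return hist


# Hypothetical 4x4 AU block of gray levels (not data from the paper).
toy_block = [
    [10, 10, 10, 10],
    [10, 50, 50, 10],
    [10, 50, 50, 10],
    [10, 10, 10, 10],
]
hist = lbp_histogram(toy_block)
```

The histogram, not the per-pixel codes, is what would be passed on as the feature vector of the AU region, which keeps the descriptor length fixed regardless of the block size.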
5. Analyzing Feature Fusion Performance According to Kappa-Error Diagrams
5.1. Theory of Kappa-Error Diagram Evaluation
5.2. Analyzing Feature Fusion Performance for AU Prediction Based on Kappa-Error Diagram Evaluation
6. Multi-Layer Fusion with Weighted AU for Facial Expression Recognition
6.1. Multi-Layer AUs’ Fusion with Stacking Framework
Stacking Procedure | |
Step 1 | i = 1. |
Step 2 | If i ≤ n, go to Step 3. Otherwise, go to Step 4. |
Step 3 | Classify with the base classifier Fi. For each sample S, the probability vector Pi = {Pi1, …, Pij, …, Pim} under the base classifier Fi is derived, where Pij indicates the probability that sample S is assigned to class Cj (j = 1, 2, …, m). Then set i = i + 1 and go to Step 2. |
Step 4 | The classification results of all base classifiers are obtained, here marked as P = {P1, P2, …, Pn}. Then go to Step 5. |
Step 5 | The meta-classifier M processes the input data, viz. the matrix P from base classifiers, and outputs the ultimate recognition result. |
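The five steps of the stacking procedure can be sketched as follows. This is a minimal sketch under stated assumptions, not the authors' implementation: the base classifiers `f1`, `f2`, `f3` and the averaging meta-classifier `mean_meta` are hypothetical stand-ins for trained models, and a real meta-classifier M would be learned from the matrix P rather than averaging it.

```python
def stack_predict(base_classifiers, meta_classifier, sample):
    """Run Steps 1-5: collect P = [P1, ..., Pn] and pass it to the meta-classifier."""
    P = []
    i = 0                                  # Step 1 (0-based index here)
    while i < len(base_classifiers):       # Step 2: loop while classifiers remain
        P_i = base_classifiers[i](sample)  # Step 3: probability vector over m classes
        P.append(P_i)
        i += 1
    # Step 4: P now holds one probability vector per base classifier.
    return meta_classifier(P)              # Step 5: meta-classifier gives the result


# Hypothetical stand-ins for trained base classifiers (m = 3 classes).
def f1(sample):
    return [0.7, 0.2, 0.1]


def f2(sample):
    return [0.5, 0.4, 0.1]


def f3(sample):
    return [0.2, 0.6, 0.2]


def mean_meta(P):
    """Toy meta-classifier: average the base outputs and return the arg-max class index."""
    m = len(P[0])
    avg = [sum(p[j] for p in P) / len(P) for j in range(m)]
    return max(range(m), key=lambda j: avg[j])


predicted = stack_predict([f1, f2, f3], mean_meta, sample=None)
```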
6.2. AUs’ Weight Learning by Association Rules
6.3. AU Weighted Fusion Algorithm for Facial Expression Recognition
Abbreviations and notation: AU: action unit, active for each instance/image; AUi: action unit block, e.g., AU (2, 4, 10, 13, 15, 16, 24, 27); G, A, L, W: features (Gabor, AAM, LBP, WLD); PP: prediction probability; Ei: facial expressions (angry, disgust, fear, happy, sad, surprise) |
Fusion level 1 |
A. AU segmentation: divide the facial image into eight AU blocks that are related to the six basic expressions. AUi ranges over AU2, AU4, AU10, AU13, AU15, AU16, AU24, AU27; |
B. Feature extraction: select the G, A, L, W feature descriptors and extract them from each AU region AUi, i ∈ {1, 2, ..., 8}; |
C. Base-classifier recognition: treat SMO, MLP, and IBK as base classifiers to recognize AUs; calculate the PP of each AUi for each classifier j ∈ {SMO, MLP, IBK}; |
Fusion level 2 |
D. AUi PP combination: for each AU region AUi, i ∈ {1, 2, ..., 8}, integrate the PPs from the three classifiers applied to the four features into a 12-dimensional (3 × 4) vector with components m ∈ {1, 2, 3, ..., 12}; |
E. AUi weight optimization: employ the Apriori algorithm to mine the strong relationships between AUi and expressions; produce the AU tuning matrix W6×8. Note: each row is a weighted AU feature input for one basic expression classification. |
F. Meta-classifier recognition: use one type of classifier (MLP, SMO, IBK, or NB) to recognize the facial expression. Note: there are six binary classifiers for determining the probability of each expression. The final expression is the one with the highest prediction probability. |
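Fusion level 2 (steps D to F) can be sketched as below. This is only an illustration of the data flow under stated assumptions: the PP values, the weight matrix `W`, and the scoring function `toy_clf` are hypothetical placeholders, and how the weights are derived from the mined rules is not reproduced here.

```python
EXPRESSIONS = ["angry", "disgust", "fear", "happy", "sad", "surprise"]
NUM_AUS = 8   # AU2, AU4, AU10, AU13, AU15, AU16, AU24, AU27
PP_DIM = 12   # 3 base classifiers x 4 features (step D)


def weight_inputs(pp, weights_row):
    """Scale each AU's 12-dim PP vector by that AU's weight and flatten to 96 values."""
    return [w * v for w, vec in zip(weights_row, pp) for v in vec]


def fuse(pp, W, binary_classifiers):
    """Score every expression with its own binary classifier; return the best label."""
    scores = [clf(weight_inputs(pp, W[e])) for e, clf in enumerate(binary_classifiers)]
    best = max(range(len(scores)), key=lambda e: scores[e])
    return EXPRESSIONS[best], scores


# Hypothetical inputs: uniform PP vectors and a 6 x 8 weight matrix (step E).
pp = [[0.5] * PP_DIM for _ in range(NUM_AUS)]
W = [[1.0] * NUM_AUS for _ in EXPRESSIONS]
W[3][3] = 2.0  # emphasise AU13 for "happy"; the value is illustrative only


def toy_clf(x):
    """Stand-in for a trained binary classifier (step F): mean of the weighted inputs."""
    return sum(x) / len(x)


label, scores = fuse(pp, W, [toy_clf] * len(EXPRESSIONS))
```

Each row of `W` feeds exactly one binary classifier, which mirrors the note in step E that every row is the weighted AU input for one basic expression.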
7. Experimental Results
7.1. Image Datasets and Experimental Setup
7.2. Comparison on Multi Features or Classifiers Fusion Performance Analysis
7.3. Performance Analysis with a Variety of Single Classifier Performance
7.4. Experiment on AU Weighting Multi-Layer Fusion Expression Recognition Performance Analysis
7.5. Comparison with Several State-Of-The-Art Expression Recognition Methods
8. Conclusions and Future Work
Acknowledgments
Author Contributions
Conflicts of Interest
References
1. Mehrabian, A. Communication without words. Psychol. Today 1968, 2, 53–56.
2. Bartlett, M.S.; Littlewort, G.; Frank, M.; Lainscsek, C.; Fasel, I.; Movellan, J. Recognizing facial expression: Machine learning and application to spontaneous behavior. In Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR), San Diego, CA, USA, 20–25 June 2005; pp. 568–573.
3. Tian, Y.; Kanade, T.; Cohn, J.F. Evaluation of Gabor-wavelet-based facial action unit recognition in image sequences of increasing complexity. In Proceedings of the Fifth IEEE International Conference on Automatic Face and Gesture Recognition, Washington, DC, USA, 20–21 May 2002; pp. 229–234.
4. Lu, H.; Wang, Z.; Liu, X. Facial expression recognition using NKFDA method with Gabor features. In Proceedings of the Sixth World Congress on Intelligent Control and Automation (WCICA), Dalian, China, 21–23 June 2006; pp. 9902–9906.
5. Zhang, L.; Tjondronegoro, D. Selecting, optimizing and fusing ‘salient’ Gabor features for facial expression recognition. In Proceedings of the 16th International Conference, ICONIP 2009, Bangkok, Thailand, 1–5 December 2009.
6. Lyons, M.; Akamatsu, S.; Kamachi, M.; Gyoba, J. Coding facial expressions with Gabor wavelets. In Proceedings of the Third IEEE International Conference on Automatic Face and Gesture Recognition, Nara, Japan, 14–16 April 1998; pp. 200–205.
7. Hu, Y.; Zeng, Z.; Yin, L.; Wei, X.; Zhou, X.; Huang, T.S. Multiview facial expression recognition. In Proceedings of the 8th IEEE International Conference on Automatic Face & Gesture Recognition, Amsterdam, The Netherlands, 17–19 September 2008; pp. 1–6.
8. Dahmane, M.; Meunier, J. Emotion recognition using dynamic grid-based HoG features. In Proceedings of the 2011 IEEE International Conference on Automatic Face & Gesture Recognition and Workshops (FG 2011), Santa Barbara, CA, USA, 21–25 March 2011.
9. Zhao, G.; Pietikainen, M. Dynamic texture recognition using local binary patterns with an application to facial expressions. IEEE Trans. Pattern Anal. Mach. Intell. 2007, 29, 915–928.
10. Valstar, M.F.; Mehu, M.; Jiang, B.; Pantic, M.; Scherer, K. Meta-analysis of the first facial expression recognition challenge. IEEE Trans. Syst. Man Cybern. B Cybern. 2012, 42, 966–979.
11. Jain, S.; Hu, C.; Aggarwal, J.K. Facial expression recognition with temporal modeling of shapes. In Proceedings of the 2011 IEEE International Conference on Computer Vision Workshops (ICCV Workshops), Barcelona, Spain, 6–13 November 2011; pp. 1642–1649.
12. Chen, J.; Shan, S.; He, C.; Zhao, G.; Pietikainen, M.; Chen, X.; Gao, W. WLD: A robust local image descriptor. IEEE Trans. Pattern Anal. Mach. Intell. 2010, 32, 1705–1720.
13. Gong, D.; Li, S.; Xiang, Y. Face recognition using the Weber Local Descriptor. Neurocomputing 2011, 122, 589–592.
14. Whitehill, J.; Bartlett, M.S.; Littlewort, G.; Fasel, I.; Movellan, J.R. Towards practical smile detection. IEEE Trans. Pattern Anal. Mach. Intell. 2009, 31, 2106–2111.
15. Yang, P.; Liu, Q.; Metaxas, D.N. Boosting coded dynamic features for facial action units and facial expression recognition. In Proceedings of the 2007 IEEE Conference on Computer Vision and Pattern Recognition, Minneapolis, MN, USA, 17–22 June 2007; pp. 1–6.
16. Hui, J.; Wen, G. Analysis and application of the facial expression motions based on Eigen-flow. J. Softw. 2003, 14, 2098–2105.
17. Cohn, J.F.; Zlochower, A.J.; Lien, J.; Kanade, T. Automated face analysis by feature point tracking has high concurrent validity with manual FACS coding. Psychophysiology 1999, 36, 35–43.
18. Senechal, T.; Rapp, V.; Salam, H.; Seguier, R.; Bailly, K.; Prevost, L. Combining AAM coefficients with LGBP histograms in the multi-kernel SVM framework to detect facial action units. In Proceedings of the Automatic Face & Gesture Recognition and Workshops (FG 2011), Santa Barbara, CA, USA, 21–25 March 2011; pp. 860–865.
19. Eckert, M.; Gil, A.; Zapatero, D.; Meneses, J.; Martínez Ortega, J.F. Fast facial expression recognition for emotion awareness disposal. In Proceedings of the 6th International Conference on Consumer Electronics, Berlin, Germany, 5–7 September 2016.
20. Vezzetti, E.; Marcolin, F.; Fracastoro, G. 3D face recognition: An automatic strategy based on geometrical descriptors and landmarks. Robot. Auton. Syst. 2014, 62, 1768–1776.
21. Vezzetti, E.; Marcolin, F. 3D Landmarking in multiexpression face analysis: A preliminary study on eyebrows and mouth. Aesthet. Plast. Surg. 2014, 38, 796–811.
22. Essa, I.A.; Pentland, A.P. Coding, analysis, interpretation, and recognition of facial expressions. IEEE Trans. Pattern Anal. Mach. Intell. 1997, 19, 757–763.
23. Ghimire, D.; Lee, J. Geometric Feature-Based Facial Expression Recognition in Image Sequences Using Multi-Class AdaBoost and Support Vector Machines. Sensors 2013, 13, 7714–7734.
24. Li, J.; Lam, E.Y. Facial expression recognition using deep neural networks. In Proceedings of the 2015 IEEE International Conference on Imaging Systems and Techniques (IST), Macau, China, 16–18 September 2015.
25. Kuncheva, L.I. A bound on kappa-error diagrams for analysis of classifier ensembles. IEEE Trans. Knowl. Data Eng. 2013, 25, 494–501.
26. Zavaschi, T.H.H.; Britto, A.S., Jr.; Oliveira, L.E.S.; Koerich, A.L. Fusion of feature sets and classifiers for facial expression recognition. Expert Syst. Appl. 2013, 40, 646–655.
27. Koelstra, S.; Pantic, M.; Patras, I. A dynamic texture-based approach to recognition of facial actions and their temporal models. IEEE Trans. Pattern Anal. Mach. Intell. 2010, 32, 1940–1954.
28. Valstar, M.; Pantic, M. Fully automatic facial action unit detection and temporal analysis. In Proceedings of the 2006 Conference on Computer Vision and Pattern Recognition, New York, NY, USA, 17–22 June 2006; p. 149.
29. Jia, X.; Zhang, Y.; Powers, D.; Ali, H.B. Multi-classifier fusion based facial expression recognition approach. KSII Trans. Internet Inf. Syst. 2014, 8, 196–212.
30. Cunningham, D.W.; Kleiner, M.; Wallraven, C.; Bülthoff, H.H. Manipulating video sequences to determine the components of conversational facial expressions. ACM Trans. Appl. Percept. 2005, 2, 251–269.
31. Hsieh, C.C.; Hsih, M.H.; Jiang, M.K.; Cheng, Y.M.; Liang, E.H. Effective semantic features for facial expressions recognition using SVM. Multimed. Tools Appl. 2016, 75, 6663–6682.
32. Ekman, P.; Friesen, W.V. Facial Action Coding System; Consulting Psychologists Press: Palo Alto, CA, USA, 1978.
33. Margineantu, D.D.; Dietterich, T.G. Pruning adaptive boosting. In Proceedings of the 14th International Conference on Machine Learning, Nashville, TN, USA, 8–12 July 1997; Volume 97, pp. 211–218.
34. Kittler, J.; Hatef, M.; Duin, R.P.W.; Matas, J. On combining classifiers. IEEE Trans. Pattern Anal. Mach. Intell. 1998, 20, 226–239.
35. Piatetsky-Shapiro, G. Discovery, analysis, and presentation of strong rules. In Knowledge Discovery in Databases; AAAI Press: Menlo Park, CA, USA, 1991.
36. Agrawal, R.; Srikant, R. Fast algorithms for mining association rules in large databases. In Proceedings of the 20th International Conference on Very Large Data Bases (VLDB), Santiago, Chile, 12–15 September 1994; pp. 487–499.
37. John, G.H.; Langley, P. Estimating continuous distributions in Bayesian classifiers. In Proceedings of the Eleventh Conference on Uncertainty in Artificial Intelligence, Montréal, QC, Canada, 18–20 August 1995.
38. Cohen, I.; Sebe, N.; Garg, A.; Chen, L.; Huang, T.S. Facial expression recognition from video sequences: Temporal and static modeling. Comput. Vis. Image Underst. 2003, 91, 160–187.
39. Shan, C.; Gong, S.; McOwan, P.W. Facial expression recognition based on local binary patterns: A comprehensive study. Image Vis. Comput. 2009, 27, 803–816.
40. Xu, W.; Sun, Z.X. Facial expression recognition from image sequences with LSVM. J. Comput. Aided Des. Comput. Gr. 2009, 21, 542–548.
41. Liu, P.; Han, S.; Meng, Z.; Tong, Y. Facial Expression Recognition via a Boosted Deep Belief Network. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Columbus, OH, USA, 23–28 June 2014; pp. 1805–1812.
42. Bashyal, S.; Venayagamoorthy, G.K. Recognition of facial expressions using Gabor wavelets and learning vector quantization. Eng. Appl. Artif. Intell. 2008, 21, 1056–1064.
43. Koutlas, A.; Fotiadis, D.I. An automatic region based methodology for facial expression recognition. In Proceedings of the IEEE Conference on Systems, Man and Cybernetics, Singapore, 12–15 October 2008; pp. 662–666.
44. Yu, K.; Wang, Z.; Zhuo, L.; Wang, J.; Chi, Z.; Feng, D. Learning realistic facial expressions from web images. Pattern Recognit. 2013, 46, 2144–2155.
45. Kyperountas, M.; Tefas, A.; Pitas, I. Salient feature and reliable classifier selection for facial expression classification. Pattern Recognit. 2010, 43, 972–986.
46. Fu, X.F. Research on Binary Pattern-Based Face Recognition and Expression Recognition; Zhejiang University: Hangzhou, China, 2008.
AU | FACS Name |
---|---|
AU2 | Outer Brow Raiser |
AU4 | Brow Lowerer |
AU10 | Upper Lip Raiser |
AU13 | Sharp Lip Puller |
AU15 | Lip Corner Depressor |
AU16 | Lower Lip Depressor |
AU24 | Lip Pressor |
AU27 | Mouth Stretch |
Image | Gabor | Geometry | LBP | WLD |
---|---|---|---|---|
Predict Result | Correct | Wrong |
---|---|---|
Correct | a | b |
Wrong | c | d |
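For a pair of classifiers, the counts a, b, c, d of the contingency table above determine one point of a kappa-error diagram. The sketch below assumes the usual reading of the table (a: both classifiers correct, d: both wrong, b and c: exactly one of them correct) and uses the standard pairwise formulation of Cohen's kappa on correct/wrong outputs; the function name and the example counts are hypothetical.

```python
def kappa_error(a, b, c, d):
    """Return (kappa, averaged error) for one classifier pair from contingency counts."""
    n = a + b + c + d
    denom = (a + b) * (b + d) + (a + c) * (c + d)
    # Kappa is undefined when both classifiers are always right or always wrong.
    kappa = 2.0 * (a * d - b * c) / denom if denom else float("nan")
    # One classifier errs on c + d samples, the other on b + d; average the two rates.
    error = (b + c + 2 * d) / (2.0 * n)
    return kappa, error


# Hypothetical counts for illustration only.
kappa, error = kappa_error(a=40, b=10, c=10, d=40)
```

A low kappa means the two classifiers disagree often (high diversity), so pairs that combine low kappa with low averaged error are the attractive candidates for fusion.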
Num | Prediction (AU2) | Prediction (AU4) | Prediction (AU10) | Prediction (AU13) | Prediction (AU15) | Prediction (AU16) | Prediction (AU24) | Prediction (AU27) | Real Label of Expression |
---|---|---|---|---|---|---|---|---|---|
1 | angry | disgust | angry | angry | angry | angry | angry | sad | angry |
2 | angry | happy | angry | angry | disgust | angry | angry | angry | angry |
3 | angry | angry | angry | angry | sad | angry | angry | angry | angry |
··· | ··· | ··· | ··· | ··· | ··· | ··· | ··· | ··· | ··· |
Expression | Association Rules of AU |
---|---|
E1 (39) | AU24 = angry ⇒ Expression = angry 39 |
E2 (43) | AU10 = disgust ⇒ Expression = disgust 41; AU27 = disgust ⇒ Expression = disgust 39 |
E3 (55) | AU16 = fear ⇒ Expression = fear 54 |
E4 (81) | AU13 = happy ⇒ Expression = happy 81; AU2 = happy ⇒ Expression = happy 75; AU4 = happy ⇒ Expression = happy 74 |
E5 (62) | AU15 = sad ⇒ Expression = sad 60; AU2 = sad ⇒ Expression = sad 56; AU16 = sad ⇒ Expression = sad 53 |
E6 (75) | AU27 = surprise ⇒ Expression = surprise 75; AU2 = surprise ⇒ Expression = surprise 70 |
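The counts attached to rules of this form can be reproduced by direct counting over the per-AU prediction labels. The sketch below is a brute-force illustration, not the Apriori algorithm itself (it does no candidate generation or pruning), and it uses only the three example rows shown in the prediction-label table; `rule_stats` is a hypothetical helper name.

```python
AUS = ["AU2", "AU4", "AU10", "AU13", "AU15", "AU16", "AU24", "AU27"]

# The three example rows from the prediction-label table: (AU predictions, real label).
rows = [
    (["angry", "disgust", "angry", "angry", "angry", "angry", "angry", "sad"], "angry"),
    (["angry", "happy", "angry", "angry", "disgust", "angry", "angry", "angry"], "angry"),
    (["angry", "angry", "angry", "angry", "sad", "angry", "angry", "angry"], "angry"),
]


def rule_stats(rows, au_index, expression):
    """Support count and confidence of the rule 'AU = e implies Expression = e'."""
    antecedent = sum(1 for preds, _ in rows if preds[au_index] == expression)
    both = sum(
        1 for preds, real in rows
        if preds[au_index] == expression and real == expression
    )
    confidence = both / antecedent if antecedent else 0.0
    return both, confidence


# Evaluate the "angry" rule for every AU block.
stats = {au: rule_stats(rows, i, "angry") for i, au in enumerate(AUS)}
```

On these three rows AU24 predicts "angry" every time, whereas AU4 and AU15 do so only once, which is the kind of difference the rule counts in the table capture on the full dataset.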
Emotions | CK | JAFFE |
---|---|---|
1: angry | 39 | 30 |
2: disgust | 43 | 29 |
3: fear | 55 | 32 |
4: happy | 81 | 32 |
5: sad | 62 | 30 |
6: surprise | 75 | 30 |
Total images | 355 | 183 |
Database | Feature Description | MLP (%) | SMO (%) | IBK (%) | NB (%) |
---|---|---|---|---|---|
JAFFE | Gabor feature | 63.19 | 59.89 | 62.09 | 53.30 |
JAFFE | Geometry feature | 83.52 | 79.67 | 84.07 | 78.02 |
JAFFE | LBP feature | 46.15 | 45.60 | 47.25 | 47.80 |
JAFFE | WLD feature | 78.02 | 71.98 | 75.82 | 70.33 |
JAFFE | Gab + Geo + LBP + WLD | 89.07 | 90.16 | 81.42 | 83.61 |
CK | Gabor feature | 64.97 | 55.65 | 63.84 | 62.99 |
CK | Geometry feature | 88.98 | 84.18 | 86.44 | 87.29 |
CK | LBP feature | 62.43 | 53.67 | 59.32 | 62.71 |
CK | WLD feature | 72.32 | 65.82 | 72.32 | 73.73 |
CK | Gab + Geo + LBP + WLD | 94.65 | 91.83 | 85.92 | 93.52 |
Database | Base Classifiers | Gabor Feature (%) | Geometry Feature (%) | LBP (%) | WLD (%) |
---|---|---|---|---|---|
JAFFE | IBK | 53.30 | 64.29 | 38.87 | 60.85 |
JAFFE | MLP | 50.27 | 74.45 | 48.764 | 64.42 |
JAFFE | SMO | 50.27 | 60.16 | 42.99 | 60.99 |
JAFFE | SMO + IBK + MLP | 59.62 | 81.32 | 46.70 | 74.04 |
CK | IBK | 53.74 | 74.79 | 50.49 | 58.55 |
CK | MLP | 57.98 | 87.64 | 54.94 | 64.55 |
CK | SMO | 58.90 | 80.00 | 58.05 | 64.97 |
CK | SMO + IBK + MLP | 61.86 | 89.22 | 59.53 | 71.04 |
CK | Linear Support Vector Machine | Polynomial Kernel Support Vector Machine | Radial Basis Function Kernel Support Vector Machine |
---|---|---|---|
Accuracy (%) | 91.83% | 92.39% | 93.24% |
Database | Mean Accuracy (%) | Angry | Disgust | Fear | Happy | Sad | Surprise |
---|---|---|---|---|---|---|---|
CK | 96.62% | 95.49% (SMO) | 97.18% (NB) | 94.65% (SMO) | 97.46% (NB) | 94.93% (SMO) | 100.00% (NB) |
JAFFE | 94.63% | 89.61% (NB) | 96.72% (SMO) | 92.90% (NB) | 96.72% (SMO/NB) | 94.54% (MLP) | 97.26% (SMO) |
Reference | Accuracy | Method |
---|---|---|
Cohen et al. (2003) [38] | 73.2% | Geometric feature + tree-augmented-NB |
Bartlett et al. (2005) [2] | 89.1% | Gabor filter + Adaboost + SVM |
Shan et al. (2009) [39] | 88.9% | Boosted-LBP + SVM |
Zavaschi et al. (2013) [26] | 88.9% | Ensemble Gabor and LBP |
Xu et al. (2009) [40] | 89.1% | ASM + geometry feature + LSVM classifier |
Jia et al. (2014) [29] | 90.7% | Dynamic geometry feature + stacking |
Tian et al. (2002) [3] | 92.7% | Gabor wavelets + geometry with FACS |
Hsieh et al. (2016) [31] | 94.7% | Dynamic face regions + multi-class SVM |
Ghimire and Lee (2013) [23] | 95.17% | Feature selective multi-class AdaBoost |
Our proposed approach | 96.62% | Multi-layer of optimally weighted AU |
Liu et al. (2014) [41] | 96.7% | Boosted deep belief network |
Ghimire and Lee (2013) [23] | 97.35% | SVM on boosted features |
Reference | Accuracy | Method |
---|---|---|
Bashyal and Venayagamoorthy (2008) [42] | 90.2% | Gabor filters + Learning Vector Quantization (LVQ) |
Koutlas and Fotiadis (2008) [43] | 92.3% | Gabor filters + artificial neural networks |
Yu et al. (2013) [44] | 85.7% | WLD + pool + SVM |
Kyperountas et al. (2010) [45] | 85.9% | Gabor filters + Salient feature and reliable classifier selection (SFRCS) |
Fu (2008) [46] | 87.5% | Convolutional Neural Network (CNN) |
Jia et al. (2014) [29] | 92.3% | Gabor filters + stacking |
Liu et al. (2014) [41] | 93.0% | Boosted deep belief network |
Our proposed approach | 94.63% | Multi-layer of optimally weighted AU |
© 2017 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license ( http://creativecommons.org/licenses/by/4.0/).
Share and Cite
Jia, X.; Liu, S.; Powers, D.; Cardiff, B. A Multi-Layer Fusion-Based Facial Expression Recognition Approach with Optimal Weighted AUs. Appl. Sci. 2017, 7, 112. https://doi.org/10.3390/app7020112