Emotion Recognition in Individuals with Down Syndrome: A Convolutional Neural Network-Based Algorithm Proposal
Abstract
1. Introduction
2. Materials and Methods
2.1. Dataset
2.2. Analysis of Techniques for Recognizing Emotions in People with DS
2.3. Improving the Recognition of Emotions for People with DS
- Feature Extraction: OpenFace employs a convolutional neural network (CNN) to extract facial features from an image. This CNN is explicitly trained for face recognition and can accurately detect and locate key facial landmarks such as the eyes, nose, and mouth;
- Feature Embedding: Once the facial features are extracted, OpenFace maps them into a high-dimensional vector space. Each face is represented as a numeric vector, commonly called an “embedding,” which captures the unique facial characteristics;
- Feature Comparison: To perform facial recognition, OpenFace compares the feature vector of an individual’s face with the vectors stored in a database. It uses similarity measures, such as the Euclidean distance or the dot product, to quantify how similar two vectors are and thus decide whether the faces belong to the same person (a minimal sketch of this step follows this list).
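To make the comparison step concrete, the following is a minimal sketch in Python of matching a query embedding against a database of stored embeddings using the Euclidean distance. The helper names (`euclidean_distance`, `identify`) and the acceptance threshold are illustrative assumptions, not part of the OpenFace API.

```python
import numpy as np

def euclidean_distance(a, b):
    """L2 distance between two face embedding vectors."""
    return float(np.linalg.norm(np.asarray(a) - np.asarray(b)))

def identify(query, database, threshold=1.0):
    """Return the identity whose stored embedding is closest to the query,
    provided the distance falls below the acceptance threshold.
    `database` maps identity names to embedding vectors."""
    best_name, best_dist = None, float("inf")
    for name, stored in database.items():
        dist = euclidean_distance(query, stored)
        if dist < best_dist:
            best_name, best_dist = name, dist
    return best_name if best_dist < threshold else None

# Toy usage with random 128-dimensional embeddings
rng = np.random.default_rng(0)
db = {"alice": rng.normal(size=128), "bob": rng.normal(size=128)}
print(identify(db["alice"] + 0.01, db))  # -> "alice"
```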
- Input Blocks: The network begins with a convolutional layer that processes the input image and extracts initial features. It utilizes a small filter size and a low number of channels;
- Parallel Convolution Blocks: The architecture employs a series of parallel convolution blocks. These blocks consist of depthwise-separable convolution layers followed by normalization and activation layers. This configuration enables a more efficient feature representation and reduces the number of parameters compared to standard convolutions;
- Reduce Blocks: Following the parallel convolution blocks, reduce blocks decrease the spatial resolution of the features and reduce their dimensionality. These blocks typically consist of a convolutional layer followed by spatial subsampling, such as a pooling operation;
- Global Pooling Layer: A global pooling layer is applied, further reducing the dimensionality of the features;
- Fully Connected Layers: The architecture incorporates fully connected layers specific to the task, such as classification. These layers usually combine convolutional and activation layers, culminating in an output layer whose units correspond to the output classes or categories (a Keras-style sketch of this block layout follows the list).
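As an illustration of how these blocks fit together, below is a minimal Keras sketch patterned after the mini-Xception design cited in the references. The input shape (48 × 48 × 1), the filter counts, and the number of blocks are illustrative assumptions, not the exact architecture used in the paper.

```python
from tensorflow.keras import layers, Model

def separable_block(x, filters):
    """Parallel convolution block: two depthwise-separable convolutions with
    batch normalization, a pooling-based reduce step, and a strided 1x1
    convolution as the parallel (residual) branch."""
    residual = layers.Conv2D(filters, 1, strides=2, padding="same")(x)
    residual = layers.BatchNormalization()(residual)

    x = layers.SeparableConv2D(filters, 3, padding="same")(x)
    x = layers.BatchNormalization()(x)
    x = layers.Activation("relu")(x)
    x = layers.SeparableConv2D(filters, 3, padding="same")(x)
    x = layers.BatchNormalization()(x)
    x = layers.MaxPooling2D(3, strides=2, padding="same")(x)  # reduce block
    return layers.add([x, residual])

def build_model(input_shape=(48, 48, 1), num_classes=5):
    inputs = layers.Input(shape=input_shape)
    # Input block: small filter size, few channels
    x = layers.Conv2D(8, 3, padding="same", activation="relu")(inputs)
    x = layers.BatchNormalization()(x)
    # Stacked parallel convolution + reduce blocks
    for filters in (16, 32, 64, 128):
        x = separable_block(x, filters)
    # Task-specific head: convolution + global pooling + softmax output
    x = layers.Conv2D(num_classes, 3, padding="same")(x)
    x = layers.GlobalAveragePooling2D()(x)
    outputs = layers.Activation("softmax")(x)
    return Model(inputs, outputs)

model = build_model()
model.summary()
```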
2.4. Analysis of Ablation in the Proposed Neural Network
3. Results
Hyperparameter Tuning
- Filter size: The filter size is an essential parameter in a one-dimensional convolutional neural network. It determines the width of the filter used for convolutions and is a positive integer; the standard options are two, three, and five. In Figure 6, which shows the non-optimized proposal, this hyperparameter was set to five, whereas the optimized system uses a filter size of two. A filter of size two captures local features and short patterns in the input data, extracting more precise information than larger filters, especially those wider than three. This gives the system greater sensitivity to local variations and finer details, a granularity that is crucial for tasks where short patterns or specific local features drive accurate predictions. Overall, a filter size of two improves the optimized system’s ability to extract and interpret local features, leading to better performance in capturing relevant patterns in the input data [35].
- Number of filters: This hyperparameter determines how many filters the convolutional layer uses. Each filter detects specific patterns in the input data, producing an output feature map through the convolution operation. Typical options are 8, 16, 32, 64, or 128 filters. In Figure 6, which shows the non-optimized proposal, 64 filters were used, while the optimized system uses 128. With more filters, the network can capture more patterns and details within the input data, which improves differentiation among the emotion classes. A larger number of filters also makes it easier for the network to adapt to the unique characteristics of each dataset [35].
- Number of input channels: In our approach, the optimized model uses the ‘auto’ setting, which determines the number of input channels automatically from the architecture and the specific characteristics of the model or data. This gives the network flexibility, as it can adapt seamlessly to various datasets without manual adjustment of the number of input channels, enhancing both the performance and the efficiency of the network [35].
- Dilation factor: This parameter controls the spacing between the elements the filter considers during the convolution operation. Expanding the filter by inserting zeros between its elements allows patterns to be detected on a larger scale. We experimented with dilation factors of one, two, four, and eight: a factor of one indicates no spacing, while factors of two, four, and eight skip one, three, and seven elements, respectively, enlarging the filter’s receptive field [35].
- Optimizer: The optimizer is the algorithm that adjusts the network weights during training. Standard optimization algorithms include stochastic gradient descent (SGD), Adam, RMSprop, and Adagrad. In our study, we employed the Adam optimizer because of its optimization efficiency, adaptive learning rates, and lower sensitivity to hyperparameter selection, which simplifies tuning [34]. By carefully selecting and tuning these hyperparameters, we aim to enhance the performance of the CNN architecture in terms of pattern recognition, accuracy, and efficiency. A configuration sketch combining the tuned settings follows this list.
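The sketch below assembles the tuned values discussed above (filter size two, 128 filters, dilation factors 1, 2, 4, and 8, and the Adam optimizer) into a one-dimensional CNN in Keras. The number of input features per time step (`NUM_FEATURES = 20`) and the exact layer layout are illustrative assumptions; the paper’s network is not reproduced here.

```python
import tensorflow as tf
from tensorflow.keras import layers, Model

NUM_CLASSES = 5    # anger, happiness, neutral, sadness, surprise
NUM_FEATURES = 20  # assumed number of facial features (e.g., AUs) per step

inputs = layers.Input(shape=(None, NUM_FEATURES))  # variable-length sequences
x = inputs
# One dilated convolution per factor: filter size 2, 128 filters
for dilation in (1, 2, 4, 8):
    x = layers.Conv1D(filters=128, kernel_size=2, padding="causal",
                      dilation_rate=dilation, activation="relu")(x)
    x = layers.LayerNormalization()(x)
x = layers.GlobalAveragePooling1D()(x)
outputs = layers.Dense(NUM_CLASSES, activation="softmax")(x)

model = Model(inputs, outputs)
model.compile(optimizer=tf.keras.optimizers.Adam(),  # Adam, per the tuning results
              loss="categorical_crossentropy",
              metrics=["accuracy"])
```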
4. Conclusions
Author Contributions
Funding
Data Availability Statement
Conflicts of Interest
References
1. Carvalho, P.; Menezes, P. Classification of FACS-Action Units with CNN Trained from Emotion Labelled Data Sets. In Proceedings of the 2019 IEEE International Conference on Systems, Man and Cybernetics (SMC), Bari, Italy, 6–9 October 2019; pp. 3766–3770.
2. Matsumoto, D.; Hwang, H.S. Lectura de la Expresión Facial de las Emociones: Investigación Básica en la Mejora del Reconocimiento de Emociones. Ansiedad Estres 2013, 19, 121–129.
3. Ruiz, E. Temas de Interés: Evaluación de la Capacidad Intelectual en Personas con Síndrome de Down. 2012. Available online: http://wwww.centrodocumentaciondown.com/uploads/documentos/27dcb0a3430e95ea8358a7baca4b423404c386e2.pdf (accessed on 3 July 2023).
4. Ruiz, E.; Álvarez, R.; Arce, A.; Palazuelos, I.; Schelstraete, G. Programa de educación emocional. Aplicación práctica en niños con síndrome de Down. Rev. Sindr. de Down 2009, 103, 126–139.
5. Soler Ruiz, V. Lógica Difusa Aplicada a Conjuntos Imbalanceados: Aplicación a la Detección del Síndrome de Down. 2007. Available online: https://www.tesisenred.net/handle/10803/5777?locale-attribute=ca (accessed on 1 July 2023).
6. Agbolade, O.; Nazri, A.; Yaakob, R.; Ghani, A.A.; Cheah, Y.K. Down syndrome face recognition: A review. Symmetry 2020, 12, 1182.
7. Cornejo, J.Y.R.; Pedrini, H.; Lima, A.M.; Nunes, F.D.L.D.S. Down syndrome detection based on facial features using a geometric descriptor. J. Med. Imaging 2017, 4, 044008.
8. Lucey, P.; Cohn, J.F.; Kanade, T.; Saragih, J.; Ambadar, Z.; Matthews, I. The extended Cohn-Kanade dataset (CK+): A complete dataset for action unit and emotion-specified expression. In Proceedings of the 2010 IEEE Computer Society Conference on Computer Vision and Pattern Recognition, San Francisco, CA, USA, 13–18 June 2010; pp. 94–101.
9. Eroğul, O.; Sipahi, M.E.; Tunca, Y.; Vurucu, S. Recognition of Down syndromes using image analysis. In Proceedings of the 14th National Biomedical Engineering Meeting, Izmir, Turkey, 20–22 May 2009; pp. 1–4.
10. Zhao, Q.; Okada, K.; Rosenbaum, K.; Kehoe, L.; Zand, D.J.; Sze, R.; Summar, M.; Linguraru, M.G. Digital facial dysmorphology for genetic screening: Hierarchical constrained local model using ICA. Med. Image Anal. 2014, 18, 699–710.
11. Alsharekh, M.F. Facial Emotion Recognition in Verbal Communication Based on Deep Learning. Sensors 2022, 22, 6105.
12. Atabansi, C.C.; Chen, T.; Cao, R.; Xu, X. Transfer Learning Technique with VGG-16 for Near-Infrared Facial Expression Recognition. J. Phys. Conf. Ser. 2021, 1873, 012033.
13. Bodapati, J.D.; Naik, D.S.B.; Suvarna, B.; Naralasetti, V. A Deep Learning Framework with Cross Pooled Soft Attention for Facial Expression Recognition. J. Inst. Eng. Ser. B 2022, 103, 1395–1405.
14. Yang, G.; Ortoneda, J.S.Y.; Saniie, J. Emotion Recognition Using Deep Neural Network with Vectorized Facial Features. In Proceedings of the 2018 IEEE International Conference on Electro/Information Technology (EIT), Rochester, MI, USA, 3–5 May 2018; pp. 318–322.
15. Liu, L. Human face expression recognition based on deep learning-deep convolutional neural network. In Proceedings of the 2019 International Conference on Smart Grid and Electrical Automation (ICSGEA), Xiangtan, China, 10–11 August 2019; pp. 221–224.
16. Pranav, E.; Suraj, K.; Satheesh, C.; Supriya, M.H. Facial Emotion Recognition Using Deep Convolutional Neural Network. In Proceedings of the 6th International Conference on Advanced Computing and Communication Systems (ICACCS), Coimbatore, India, 6–7 April 2020; pp. 317–320.
17. Bishay, M.; Ghoneim, A.; Ashraf, M.; Mavadati, M. Which CNNs and Training Settings to Choose for Action Unit Detection? A Study Based on a Large-Scale Dataset. In Proceedings of the 16th IEEE International Conference on Automatic Face and Gesture Recognition (FG 2021), Jodhpur, India, 15–18 December 2021.
18. Hammal, Z.; Chu, W.S.; Cohn, J.F.; Heike, C.; Speltz, M.L. Automatic Action Unit Detection in Infants Using Convolutional Neural Network. In Proceedings of the 2017 Seventh International Conference on Affective Computing and Intelligent Interaction (ACII), San Antonio, TX, USA, 23–26 October 2017; pp. 216–221.
19. Yen, C.T.; Li, K.H. Discussions of Different Deep Transfer Learning Models for Emotion Recognitions. IEEE Access 2022, 10, 102860–102875.
20. Tan, C.; Sun, F.; Kong, T.; Zhang, W.; Yang, C.; Liu, C. A Survey on Deep Transfer Learning. 2018. Available online: http://arxiv.org/abs/1808.01974 (accessed on 29 June 2023).
21. Paredes, N.; Bravo, E.C.; Cortes, B.B. Experimental Analysis Using Action Units as Feature Descriptor for Emotion in People with Down Syndrome. Lect. Notes Electr. Eng. 2021, 762, 253–265.
22. Paredes, N.; Caicedo-Bravo, E.F.; Bacca, B.; Olmedo, G. Emotion Recognition of Down Syndrome People Based on the Evaluation of Artificial Intelligence and Statistical Analysis Methods. Symmetry 2022, 14, 2492.
23. Doctor, F.; Karyotis, C.; Iqbal, R.; James, A. An intelligent framework for emotion aware e-healthcare support systems. In Proceedings of the 2016 IEEE Symposium Series on Computational Intelligence (SSCI), Athens, Greece, 6–9 December 2016.
24. Baltrusaitis, T.; Zadeh, A.; Lim, Y.C.; Morency, L.P. OpenFace 2.0: Facial behavior analysis toolkit. In Proceedings of the 13th IEEE International Conference on Automatic Face and Gesture Recognition (FG 2018), Xi’an, China, 15–19 May 2018; pp. 59–66.
25. Amos, B.; Ludwiczuk, B.; Satyanarayanan, M. OpenFace: A General-Purpose Face Recognition Library with Mobile Applications; Carnegie Mellon University: Pittsburgh, PA, USA, 2016.
26. Santoso, K.; Kusuma, G.P. Face Recognition Using Modified OpenFace. Procedia Comput. Sci. 2018, 135, 510–517.
27. Fatima, S.A.; Kumar, A.; Raoof, S.S. Real Time Emotion Detection of Humans Using Mini-Xception Algorithm. IOP Conf. Ser. Mater. Sci. Eng. 2021, 1042, 012027.
28. Sun, L.; Ge, C.; Zhong, Y. Design and implementation of face emotion recognition system based on CNN Mini_Xception frameworks. J. Phys. Conf. Ser. 2021, 2010, 012123.
29. Behera, B.; Prakash, A.; Gupta, U.; Semwal, V.B.; Chauhan, A. Statistical Prediction of Facial Emotions Using Mini Xception CNN and Time Series Analysis; Springer: Berlin/Heidelberg, Germany, 2021; pp. 397–410.
30. Arriaga, O.; Plöger, P.G.; Valdenegro, M. Real-time Convolutional Neural Networks for Emotion and Gender Classification. arXiv 2017, arXiv:1710.07557.
31. Williams, K.R.; Wishart, J.G.; Pitcairn, T.K.; Willis, D.S. Emotion Recognition by Children with Down Syndrome: Investigation of Specific Impairments and Error Patterns. Am. J. Ment. Retard. 2005, 110, 378.
32. Bai, S.; Kolter, J.Z.; Koltun, V. An Empirical Evaluation of Generic Convolutional and Recurrent Networks for Sequence Modeling. 2018. Available online: http://github.com/locuslab/TCN (accessed on 29 June 2023).
33. Echeverri, C.J.O.; Cordoba, D.A.L.; Quintero, E.A. Ajuste de hiperparámetros de una red neuronal convolucional para el reconocimiento de lengua de señas. Con-Cienc. Técnica 2021, 5, 48–55. Available online: https://revistas.sena.edu.co/index.php/conciencia/article/view/3926 (accessed on 9 April 2023).
34. Zhou, S.; Song, W. Deep learning-based roadway crack classification using laser-scanned range images: A comparative study on hyperparameter selection. Autom. Constr. 2020, 114, 103171.
35. 1-D Convolutional Layer—MATLAB—MathWorks América Latina. Available online: https://la.mathworks.com/help/deeplearning/ref/nnet.cnn.layer.convolution1dlayer.html (accessed on 4 June 2023).
36. Batta, M. Machine Learning Algorithms—A Review. Int. J. Sci. Res. (IJSR) 2019, 9, 381–386.
37. Sarker, I.H. Machine Learning: Algorithms, Real-World Applications and Research Directions. SN Comput. Sci. 2021, 2, 3.
Table: Action units (AUs) associated with each facial expression.

| Expression | AUs |
| --- | --- |
| Anger | 4, 9, 15, 17 |
| Happiness | 6, 7, 10, 12, 14, 20 |
| Sadness | 1, 4, 6, 7, 9, 12, 15, 17, 20 |
| Surprise | 1, 2, 5, 25, 26 |
| Neutral | 2, 5 |
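For illustration, the mapping in the table can be expressed directly in code. The scorer below is a hypothetical rule of thumb (Jaccard overlap between the active AUs and each expression’s characteristic AU set), not the classifier used in the paper.

```python
# AU sets per expression, taken from the table above
EXPRESSION_AUS = {
    "anger":     {4, 9, 15, 17},
    "happiness": {6, 7, 10, 12, 14, 20},
    "sadness":   {1, 4, 6, 7, 9, 12, 15, 17, 20},
    "surprise":  {1, 2, 5, 25, 26},
    "neutral":   {2, 5},
}

def match_expression(active_aus):
    """Pick the expression whose AU set best overlaps the active AUs
    (Jaccard similarity); a toy illustration only."""
    def jaccard(a, b):
        return len(a & b) / len(a | b) if (a | b) else 0.0
    return max(EXPRESSION_AUS,
               key=lambda e: jaccard(EXPRESSION_AUS[e], set(active_aus)))

print(match_expression({1, 2, 5, 26}))  # -> "surprise"
```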
Table: Layer sequence of the OpenFace CNN architecture.

| Type |
| --- |
| conv1 (7 × 7 × 3, 2) |
| max pool + norm |
| inception (2) |
| norm + max pool |
| inception (3a) |
| inception (3b) |
| inception (3c) |
| inception (4a) |
| inception (4e) |
| inception (5a) |
| inception (5b) |
| avg pool |
| linear |
| ℓ2 normalization |
Table: Hyperparameter combinations evaluated and the resulting mean accuracy.

| Filter Size | Num Channels | Num Filters | Dilation Factor | Optimizer | Mean Accuracy |
| --- | --- | --- | --- | --- | --- |
| 3 | auto | 64 | 1, 2, 4, 8 | Adam | 0.89 |
| 3 | auto | 16 | 1, 2, 4, 8 | Adam | 0.77 |
| 3 | auto | 8 | 1, 2, 4, 8 | Adam | 0.61 |
| 3 | auto | 64 | 1 | Adam | 0.87 |
| 3 | auto | 128 | 2 | Adam | 0.88 |
| 2 | auto | 128 | 1, 2, 4, 8 | Adam | 0.91 |
| 3 | auto | 128 | 1, 2, 4, 8 | RMSprop | 0.89 |
| 2 | auto | 128 | 1, 2, 4, 8 | RMSprop | 0.90 |
| 2 | auto | 128 | 1, 2, 4, 8 | SGDM | 0.37 |
Table: True positive rates (%) per emotion and overall accuracy for each technique applied.

| Approach | Technique | Anger | Happiness | Neutral | Sadness | Surprise | Accuracy |
| --- | --- | --- | --- | --- | --- | --- | --- |
| Machine Learning [36,37] | KNN | 53.8 | 91.3 | 76.1 | 13.8 | 60.3 | 64.9 |
| Machine Learning [36,37] | Ensemble Subspace Discriminant | 46.2 | 89.4 | 71.6 | 26.2 | 60.3 | 64.7 |
| Machine Learning [36,37] | SVM | 42.3 | 82.7 | 78.9 | 30.8 | 64.1 | 66.2 |
| Transfer Learning | Mini-Xception | 66.7 | 99.0 | 88.4 | 41.0 | 62.2 | 74.8 |
| Proposed | CNN (OpenFace), CNN (transfer learning), CNN | 99.5 | 93.5 | 93.5 | 90.5 | 85.2 | 91.4 |