Leveraging the Sensitivity of Plants with Deep Learning to Recognize Human Emotions
Abstract
1. Introduction
2. State of the Art
2.1. Emotion Recognition
2.2. Using Plants as Sensors
3. Method
3.1. Experimental Setup
- First, the participant’s consent was collected, and the observer in charge of running the session answered any open questions.
- Then, the observer briefly described the task that the participant would be asked to perform: watching a video sequence designed to elicit strong emotional responses. The video sequence was compiled based on previous work by Gloor et al. [32]; the individual videos are described in Table 1.
- The participant then sat in the experimental room, and the sensors (plant and camera) were activated. Figure 2 is a photograph of the experimental setup. A screen displayed the videos used to elicit the participants’ emotional reactions (see Table 1). These reactions were filmed by a wide-angle Logitech Meetup camera placed just below the screen, set up to obtain a zoomed image of the participant’s face. Each participant also wore a wired headset microphone connected to a hand-held voice recorder that captured their voice; the camera recorded audio as well and served as a backup data source for this modality. Finally, a basil plant (Ocimum basilicum) equipped with a SpikerBox sensor [31] was positioned in front of the participant.
- Then, the observer started the video sequence and left the room to allow the participant to watch the video.
- Once the video sequence was finished, all sensors were deactivated, and the data collected by the plant sensor and the camera were saved.
3.2. Analysis
3.3. Data Preparation
3.3.1. Initial Data Preparation Approach—MFCC Extraction with Windowing
- The first dataset consists of the downsampled short plant signals. Since the plant sensor has a high sampling rate (10 kHz), the signals must be downsampled before being fed into the different training models. Downsampling reduces the complexity of the signal while retaining the relevant information [11]. The downsampling rate is treated as a hyperparameter whose value is determined by trial and error.
- The second dataset consists of Mel-frequency cepstral coefficient (MFCC) features computed from each window of the signal. The result is a 2D matrix [time steps × number of MFCCs] that can be processed by various deep learning algorithms, such as LSTMs or CNNs. A minimal sketch of both preparation steps follows this list.
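The sketch below illustrates the two preparation paths under stated assumptions: SciPy [76] is used for anti-aliased downsampling and librosa for MFCC extraction (the paper does not name the MFCC library), the downsampling factor of 10 is purely illustrative, and the 20 s window with 10 s hop corresponds to the best values found in the hyperparameter search.

```python
# Hypothetical sketch of the two data-preparation variants; the downsampling
# factor and the use of librosa for MFCCs are assumptions, not the authors' code.
import numpy as np
import librosa
from scipy.signal import decimate

SR = 10_000               # plant SpikerBox sampling rate (10 kHz)
WINDOW_S, HOP_S = 20, 10  # best window/hop from the hyperparameter search
N_MFCC = 20               # one of the searched values (20, 40, 60)

def downsample(signal: np.ndarray, factor: int = 10) -> np.ndarray:
    """Dataset 1: anti-aliased downsampling of the raw plant signal."""
    return decimate(signal, factor)

def window_mfccs(signal: np.ndarray, sr: int = SR) -> np.ndarray:
    """Dataset 2: one [time steps, N_MFCC] matrix per 20 s window with a 10 s hop."""
    win, hop = WINDOW_S * sr, HOP_S * sr
    features = []
    for start in range(0, len(signal) - win + 1, hop):
        chunk = signal[start:start + win].astype(np.float32)
        mfcc = librosa.feature.mfcc(y=chunk, sr=sr, n_mfcc=N_MFCC)
        features.append(mfcc.T)  # transpose to [time steps, N_MFCC]
    return np.stack(features)
```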
3.3.2. Alternative Data Preparation Approach—Raw Electrical Signal Analysis
3.4. Model Generation
3.5. Model Training
3.6. Classification
4. Results
4.1. Data Collection
4.2. Analysis
4.3. Evaluation
5. Discussion
Limitations and Further Research
6. Conclusions
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Conflicts of Interest
References
1. Lerner, J.S.; Li, Y.; Valdesolo, P.; Kassam, K.S. Emotion and decision making. Annu. Rev. Psychol. 2015, 66, 799–823.
2. Ekman, P.; Friesen, W. Constants across cultures in the face and emotion. J. Personal. Soc. Psychol. 1971, 17, 124.
3. Ko, B.C. A brief review of facial emotion recognition based on visual information. Sensors 2018, 18, 401.
4. Li, I.H. Technical report for valence-arousal estimation on affwild2 dataset. arXiv 2021, arXiv:2105.01502.
5. Verma, G.; Tiwary, U. Affect representation and recognition in 3D continuous valence–arousal–dominance space. Multimed. Tools Appl. 2017, 76, 2159–2183.
6. El Ayadi, M.; Kamel, M.S.; Karray, F. Survey on speech emotion recognition: Features, classification schemes, and databases. Pattern Recognit. 2011, 44, 572–587.
7. Khalil, R.A.; Jones, E.; Babar, M.I.; Jan, T.; Zafar, M.H.; Alhussain, T. Speech emotion recognition using deep learning techniques: A review. IEEE Access 2019, 7, 117327–117345.
8. Bi, J. Stock market prediction based on financial news text mining and investor sentiment recognition. Math. Probl. Eng. 2022, 2022, 2427389.
9. Kusal, S.; Patil, S.; Choudrie, J.; Kotecha, K.; Vora, D.; Pappas, I. A review on text-based emotion detection—Techniques, applications, datasets, and future directions. arXiv 2022, arXiv:2205.03235.
10. Alswaidan, N.; Menai, M. A survey of state-of-the-art approaches for emotion recognition in text. Knowl. Inf. Syst. 2020, 62, 2937–2987.
11. Oezkaya, B.; Gloor, P.A. Recognizing individuals and their emotions using plants as bio-sensors through electro-static discharge. arXiv 2020, arXiv:2005.04591.
12. Relf, P.D. People-plant relationship. In Horticulture as Therapy: Principles and Practice; Sharon, P.S., Martha, C.S., Eds.; Haworth Press: Philadelphia, PA, USA, 1998; pp. 21–42.
13. Peter, P.K. Do Plants Sense Music? An Evaluation of the Sensorial Abilities of the Codariocalyx motorius. Ph.D. Thesis, Universität zu Köln, Cologne, Germany, 2021.
14. Paul Ekman Group. Universal Emotions. 2022. Available online: https://www.paulekman.com/universal-emotions/ (accessed on 26 May 2022).
15. Ekman, P. Emotions Revealed: Recognizing Faces and Feelings to Improve Communication and Emotional Life; Henry Holt and Company: New York, NY, USA, 2003.
16. Izard, C.E. Human Emotions; Springer Science & Business Media: Berlin, Germany; New York, NY, USA, 2013.
17. Thanapattheerakul, T.; Mao, K.; Amoranto, J.; Chan, J.H. Emotion in a Century: A Review of Emotion Recognition; ACM: New York, NY, USA, 2018.
18. Darwin, C. The Expression of the Emotions in Man and Animals; John Murray: London, UK, 1872.
19. Ekman, P. Expression and the nature of emotion. In Approaches to Emotion; Routledge: London, UK, 1984.
20. Russell, J. A circumplex model of affect. J. Personal. Soc. Psychol. 1980, 39, 1161–1178.
21. Ekman, P.; Friesen, W. Facial action coding system (FACS). APA PsycTests 1978.
22. Mühler, V. JavaScript Face Recognition API for the Browser and Nodejs Implemented on Top of tensorflow.js Core. Version v0.22.2. 2022. Available online: https://github.com/justadudewhohacks/face-api.js (accessed on 24 May 2023).
23. Tao, H.; Duan, Q. Hierarchical attention network with progressive feature fusion for facial expression recognition. Neural Netw. 2024, 170, 337–348.
24. Shu, L.; Xie, J.; Yang, M.; Li, Z.; Li, Z.; Liao, D.; Xu, X.; Yang, X. A review of emotion recognition using physiological signals. Sensors 2018, 18, 2074.
25. Kruse, J. Comparing Unimodal and Multimodal Emotion Classification Systems on Cohesive Data. Master’s Thesis, Technical University Munich, Munich, Germany, 2022.
26. Volkov, A.G.; Ranatunga, D.R.A. Plants as environmental biosensors. Plant Signal. Behav. 2006, 1, 105–115.
27. Volkov, A.G.; Courtney, L.B. Electrochemistry of plant life. In Plant Electrophysiology; Volkov, A.G., Ed.; Springer: Berlin/Heidelberg, Germany, 2006; pp. 437–457.
28. Volkov, A.G. Plant Electrophysiology; Volkov, A.G., Ed.; Springer: Berlin/Heidelberg, Germany, 2006.
29. Chatterjee, S. An Approach towards Plant Electrical Signal Based External Stimuli Monitoring System. Ph.D. Thesis, University of Southampton, Southampton, UK, 2017.
30. Chatterjee, S.K.; Das, S.; Maharatna, K.; Masi, E.; Santopolo, L.; Mancuso, S.; Vitaletti, A. Exploring strategies for classification of external stimuli using statistical features of the plant electrical response. J. R. Soc. Interface 2015, 12, 20141225.
31. Backyard Brains. The Plant SpikerBox. 2022. Available online: https://backyardbrains.com/products/plantspikerbox (accessed on 24 May 2023).
32. Gloor, P.A.; Fronzetti Colladon, A.; Altuntas, E.; Cetinkaya, C.; Kaiser, M.F.; Ripperger, L.; Schaefer, T. Your face mirrors your deepest beliefs: Predicting personality and morals through facial emotion recognition. Future Internet 2022, 14, 5.
33. Kit, N.C.; Ooi, C.P.; Tan, W.H.; Tan, Y.F.; Cheong, S.N. Facial emotion recognition using deep learning detector and classifier. Int. J. Elect. Comput. Syst. Eng. 2023, 13, 3375–3383.
34. Guo, Y.; Wünsche, B.C. Comparison of Face Detection Algorithms on Mobile Devices. In Proceedings of the 2020 35th International Conference on Image and Vision Computing New Zealand (IVCNZ), Wellington, New Zealand, 25–27 November 2020; pp. 1–6.
35. He, K.; Zhang, X.; Ren, S.; Sun, J. Deep residual learning for image recognition. arXiv 2015, arXiv:1512.03385.
36. Deng, J.; Dong, W.; Socher, R.; Li, L.J.; Li, K.; Li, F.F. ImageNet: A large-scale hierarchical image database. In Proceedings of the 2009 IEEE Conference on Computer Vision and Pattern Recognition, Miami, FL, USA, 20–25 June 2009; pp. 248–255.
37. Qin, Z.; Kim, D.; Gedeon, T. Rethinking softmax with cross-entropy: Neural network classifier as mutual information estimator. arXiv 2020, arXiv:1911.10688.
38. Kingma, D.P.; Ba, J. Adam: A method for stochastic optimization. arXiv 2017, arXiv:1412.6980.
39. Chollet, F. Keras. 2015. Available online: https://keras.io/ (accessed on 13 March 2024).
40. Virtanen, P.; Gommers, R.; Oliphant, T.E.; Haberland, M.; Reddy, T.; Cournapeau, D.; Burovski, E.; Peterson, P.; Weckesser, W.; Bright, J.; et al. SciPy 1.0: Fundamental algorithms for scientific computing in Python. Nat. Methods 2020, 17, 261–272.
41. Pedregosa, F.; Varoquaux, G.; Gramfort, A.; Michel, V.; Thirion, B.; Grisel, O.; Blondel, M.; Prettenhofer, P.; Weiss, R.; Dubourg, V.; et al. Scikit-learn: Machine learning in Python. J. Mach. Learn. Res. 2011, 12, 2825–2830.
42. Harris, C.R.; Millman, K.J.; van der Walt, S.J.; Gommers, R.; Virtanen, P.; Cournapeau, D.; Wieser, E.; Taylor, J.; Berg, S.; Smith, N.J.; et al. Array programming with NumPy. Nature 2020, 585, 357–362.
43. Reuther, A.; Kepner, J.; Byun, C.; Samsi, S.; Arcand, W.; Bestor, D.; Bergeron, B.; Gadepally, V.; Houle, M.; Hubbell, M.; et al. Interactive supercomputing on 40,000 cores for machine learning and data analysis. In Proceedings of the 2018 IEEE High Performance Extreme Computing Conference (HPEC), Waltham, MA, USA, 25–27 September 2018; pp. 1–6.
44. Rooney, B.; Benson, C.; Hennessy, E. The apparent reality of movies and emotional arousal: A study using physiological and self-report measures. Poetics 2012, 40, 405–422.
45. Shirai, M.; Suzuki, N. Is sadness only one emotion? Psychological and physiological responses to sadness induced by two different situations: “loss of someone” and “failure to achieve a goal”. Front. Psychol. 2017, 8, 288.
46. Yu, D.; Sun, S. A systematic exploration of deep neural networks for EDA-based emotion recognition. Information 2020, 11, 212.
47. Ramm, T.M.M.W.; Otto, T.; Gloor, P.A.; Salingaros, N.A. Artificial Intelligence Evaluates How Humans Connect to the Built Environment: A Pilot Study of Two Experiments in Biophilia. Sustainability 2024, 16, 868.
Video ID | Name | Short Description | Expected Emotion | Duration (s) |
---|---|---|---|---|
1 | Puppies | Cute puppies running | Happiness | 13 |
2 | Avocado | A toddler holding an avocado | Happiness | 8 |
3 | Runner | Competitive runners supporting a girl from another team over the finish line | Happiness | 24 |
4 | Maggot | A man eating a maggot | Disgust | 37 |
5 | Raccoon | Man beating raccoon to death | Anger | 16 |
6 | Trump | Donald Trump talking about foreigners | Anger | 52 |
7 | Mountain bike | Mountain biker riding down a rock bridge | Surprise | 29
8 | Roof run | Runner almost falling off a skyscraper | Surprise | 18
9 | Abandoned | Social worker feeding a starved toddler | Sadness | 64 |
10 | Waste | Residents collecting electronic waste in the slums of Accra | Sadness | 31 |
11 | Dog | Sad dog on the gravestone of his master | Sadness | 11 |
12 | Roof bike | Person biking on top of a skyscraper | Fear | 28 |
13 | Monster | A man discovering a monster through his camera | Fear | 156 |
14 | Condom ad | Child throwing a tantrum in a supermarket | Multiple | 38 |
15 | Soldier | Soldiers in battle | Multiple | 35 |
Model Name | Utility | Input | Architecture
---|---|---|---
MLP | Baseline | Downsampled plant signal | Alternation of ReLU-activated densely connected layers with dropout layers to limit overfitting. The last layer is a softmax-activated dense layer with one neuron per emotion class.
biLSTM | Considers the temporal dependencies of the plant signal | Downsampled plant signal | Two-block model.
MFCC-CNN | Specialized in 2D or 3D inputs, such as multi-featured time series | MFCC features | Two-block model.
MFCC-ResNet | Pretrained deep CNN to emphasize the importance of network depth | MFCC features | ResNet architecture slightly modified to fit the emotion detection task. The top dense layers used for classification are replaced by a dense layer of 1024 neurons, followed by a dropout layer. The last layer is a softmax-activated dense layer with one neuron per emotion class.
Random Forest, not windowed | Effective for diverse datasets; good overall robustness | Raw plant signal, normalized, not windowed | Ensemble of decision trees with 300 estimators, a maximum tree depth of 20, and no class balancing (None). This configuration is aimed at handling complex classification tasks, balancing bias and variance.
1-Dimensional CNN, not windowed | Suitable for time-series analysis | Raw plant signal, normalized, not windowed | Sequential model with a 1D convolutional layer (64 filters, kernel size 3, swish activation, input shape (10,000, 1)), followed by a max-pooling layer (pool size 2), a flatten layer, a dense layer (100 neurons, swish activation), and a softmax-activated output dense layer with one neuron per emotion class. Compiled with the Adam optimizer, a cross-entropy loss, and an accuracy metric.
biLSTM, not windowed | Considers the temporal dependencies of the plant signal | Raw plant signal, normalized, not windowed | Sequential model with a bidirectional LSTM layer (1024 units, return sequences enabled, input shape based on the reshaped training data), followed by another bidirectional LSTM layer (1024 units), a dense layer (100 neurons, swish activation), and a softmax-activated output dense layer with one neuron per emotion class. Optimized with Adam (learning rate 0.0003), using a cross-entropy loss and an accuracy metric.
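As a concrete illustration, the following is a minimal tf.keras sketch of the “1-Dimensional CNN, not windowed” row above. The exact loss string is not legible in the source, so sparse categorical cross-entropy is assumed, and n_classes is a placeholder for the number of emotion labels.

```python
# Sketch of the non-windowed 1D CNN described in the table; the loss choice
# is an assumption, and n_classes must be set to the number of emotion labels.
import tensorflow as tf

def build_1d_cnn(n_classes: int, input_len: int = 10_000) -> tf.keras.Model:
    model = tf.keras.Sequential([
        tf.keras.layers.Conv1D(64, kernel_size=3, activation="swish",
                               input_shape=(input_len, 1)),
        tf.keras.layers.MaxPooling1D(pool_size=2),
        tf.keras.layers.Flatten(),
        tf.keras.layers.Dense(100, activation="swish"),
        tf.keras.layers.Dense(n_classes, activation="softmax"),
    ])
    model.compile(optimizer="adam",
                  loss="sparse_categorical_crossentropy",  # assumed
                  metrics=["accuracy"])
    return model
```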
Model Name | Parameter | Values | Number of Configurations
---|---|---|---
MLP | Dense Units | 1024, 4096 | 288
 | Dense Layers | 2, 4 |
 | Dropout Rate | 0, 0.2 |
 | Learning Rate | 3 × 10⁻⁴, 1 × 10⁻³ |
 | Balancing | Balance, Weights, None |
 | Window (s) | 5, 10, 20 |
 | Hop (s) | 5, 10 |
biLSTM | LSTM Units | 64, 256, 1024 | 648
 | LSTM Layers | 1, 2, 3 |
 | Dropout Rate | 0, 0.2 |
 | Learning Rate | 3 × 10⁻⁴, 1 × 10⁻³ |
 | Balancing | Balance, Weights, None |
 | Window (s) | 5, 10, 20 |
 | Hop (s) | 5, 10 |
MFCC-CNN | Conv Filters | 64, 128 | 864
 | Conv Layers | 2, 3 |
 | Conv Kernel Size | 3, 5, 7 |
 | Dropout Rate | 0, 0.2 |
 | Learning Rate | 3 × 10⁻⁴, 1 × 10⁻³ |
 | Balancing | Balance, Weights, None |
 | Window (s) | 5, 10, 20 |
 | Hop (s) | 5, 10 |
MFCC-ResNet | Pretrained | Yes, No | 432
 | Number of MFCCs | 20, 40, 60 |
 | Dropout Rate | 0, 0.2 |
 | Learning Rate | 3 × 10⁻⁴, 1 × 10⁻³ |
 | Balancing | Balance, Weights, None |
 | Window (s) | 5, 10, 20 |
 | Hop (s) | 5, 10 |
RF no windowing | Number of Estimators | 100, 200, 300, 500, 700 | 60
 | Max Depth | None, 10, 20, 30 |
 | Balancing | Balance, Weights, None |
1D CNN no windowing | Conv Filters | 64, 128 | 144
 | Conv Layers | 2, 3 |
 | Conv Kernel Size | 3, 5, 7 |
 | Dropout Rate | 0, 0.2 |
 | Learning Rate | 3 × 10⁻⁴, 1 × 10⁻³ |
 | Balancing | Balance, Weights, None |
biLSTM no windowing | LSTM Units | 64, 256, 1024 | 108
 | LSTM Layers | 1, 2, 3 |
 | Dropout Rate | 0, 0.2 |
 | Learning Rate | 3 × 10⁻⁴, 1 × 10⁻³ |
 | Balancing | Balance, Weights, None |
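For clarity, the number of configurations per model is simply the size of the Cartesian product of the searched values. The sketch below reproduces the MLP count of 288; the two learning-rate values are reconstructed from the best values reported in the next table (0.0003 and 0.001) and are therefore an assumption.

```python
# Enumerating the MLP hyperparameter grid; the learning-rate values are assumed.
from itertools import product

mlp_grid = {
    "dense_units":   [1024, 4096],
    "dense_layers":  [2, 4],
    "dropout_rate":  [0, 0.2],
    "learning_rate": [3e-4, 1e-3],          # assumption, see lead-in
    "balancing":     ["balance", "weights", None],
    "window_s":      [5, 10, 20],
    "hop_s":         [5, 10],
}

configs = [dict(zip(mlp_grid, values)) for values in product(*mlp_grid.values())]
print(len(configs))  # 288 = 2 * 2 * 2 * 2 * 3 * 3 * 2
```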
Model Name | Parameters | Values
---|---|---
MLP | Dense Units | 4096
 | Dense Layers | 2
 | Dropout Rate | 0.2
 | Learning Rate | 0.001
 | Balancing | Balanced
 | Window | 20 s
 | Hop | 10 s
biLSTM | LSTM Units | 1024
 | LSTM Layers | 2
 | Dropout Rate | 0
 | Learning Rate | 0.0003
 | Balancing | Balanced
 | Window | 20 s
 | Hop | 10 s
MFCC-CNN | Conv Filters | 96
 | Conv Layers | 2
 | Conv Kernel Size | 7
 | Dropout Rate | 0.2
 | Learning Rate | 0.0003
 | Balancing | Balanced
 | Window | 20 s
 | Hop | 10 s
MFCC-ResNet | Pretrained | No
 | Number of MFCCs | 60
 | Dropout Rate | 0.2
 | Learning Rate | 0.001
 | Balancing | Balanced
 | Window | 20 s
 | Hop | 10 s
RF no windowing | Number of Estimators | 300
 | Max Depth | 20
 | Balancing | None
1D CNN no windowing | Conv Filters | 96
 | Conv Layers | 2
 | Conv Kernel Size | 7
 | Dropout Rate | 0.2
 | Learning Rate | 0.0003
 | Balancing | None
biLSTM no windowing | LSTM Units | 1024
 | LSTM Layers | 2
 | Dropout Rate | 0
 | Learning Rate | 0.0003
 | Balancing | None
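A minimal scikit-learn sketch of the best-performing configuration (Random Forest without windowing) using the selected hyperparameters above; X_train and y_train are placeholders for the normalized, non-windowed plant signals and the emotion labels.

```python
# Hypothetical reconstruction of the selected Random Forest configuration;
# random_state is not specified in the paper and is fixed here only for repeatability.
from sklearn.ensemble import RandomForestClassifier

rf = RandomForestClassifier(
    n_estimators=300,   # Number of Estimators
    max_depth=20,       # Max Depth
    class_weight=None,  # Balancing: None
    random_state=0,
)
# rf.fit(X_train, y_train)
# y_pred = rf.predict(X_test)
```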
Model | Test Set Accuracy | Test Set Recall |
---|---|---|
MLP | 0.399 | 0.220 |
biLSTM | 0.260 | 0.351 |
MFCC-CNN | 0.377 | 0.275 |
MFCC-ResNet | 0.318 | 0.324
RF (no windowing) | 0.552 | 0.552 |
1D CNN (no windowing) | 0.461 | 0.514 |
biLSTM (no windowing) | 0.448 | 0.380 |
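For reference, the reported test-set metrics could be computed as shown below; the paper does not state the averaging scheme for the multi-class recall, so macro averaging is assumed here.

```python
# Assumed metric computation for the results table above.
from sklearn.metrics import accuracy_score, recall_score

def evaluate(y_true, y_pred):
    return {
        "accuracy": accuracy_score(y_true, y_pred),
        "recall": recall_score(y_true, y_pred, average="macro"),  # assumption
    }
```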
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
© 2024 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).