ApeTI: A Thermal Image Dataset for Face and Nose Segmentation with Apes
Abstract
1. Introduction
2. ApeTI Dataset
2.1. Acquisition
2.2. Dataset Annotation
2.3. Evaluation Strategy
3. Proposed Methods
3.1. Face Detection
3.2. Landmark Regression
4. Results
4.1. Face Detection
4.2. Landmark Regression
5. Application in Studies
5.1. The Apparatus
5.2. Physiological Signal Retrieval
6. Conclusions
- Metal mesh segmentation and removal from thermal images;
- Heart rate and breath rate estimation from thermal videos; and
- Cognitive load estimation and monitoring from thermal videos.
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Acknowledgments
Conflicts of Interest
Abbreviations
TI | thermal image |
ApeTI | Ape Thermal Image (dataset) |
mAP | mean average precision |
WKPRC | Wolfgang Köhler Primate Research Center |
MPI EVA | Max Planck Institute for Evolutionary Anthropology |
IoU | intersection over union |
OKS | object keypoint similarity |
AP | average precision |
Tifa | Thermal Image Face Ape |
Tina | Thermal Image Nose Ape |
ROI | region of interest |
EAZA | European Association of Zoos and Aquaria |
WAZA | World Association of Zoos and Aquariums |
ASAB | Association for the Study of Animal Behaviour |
ABS | Animal Behavior Society |
IACUC | Institutional Animal Care and Use Committee |
AR | average recall |
Appendix A. Results
Appendix A.1. Face Detection
- Average Precision (AP):
  - mAP at IoU = 0.50:0.05:0.95 (primary metric)
  - AP50 at IoU = 0.50 (loose metric)
  - AP75 at IoU = 0.75 (strict metric)
- AP Across Scales:
  - APsmall (small objects: area < 32² px)
  - APmedium (medium objects: 32² px < area < 96² px)
  - APlarge (large objects: area > 96² px)
- Average Recall (AR):
  - AR1 (AR given 1 detection per image)
  - AR10 (AR given 10 detections per image)
  - AR100 (AR given 100 detections per image)
- AR Across Scales:
  - ARsmall (small objects: area < 32² px)
  - ARmedium (medium objects: 32² px < area < 96² px)
  - ARlarge (large objects: area > 96² px)
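The detection metrics above appear to follow the COCO evaluation conventions. As a minimal sketch of the quantities they rely on, the snippet below computes the IoU of two boxes, builds the 0.50:0.05:0.95 threshold sweep, and assigns an object to a scale bucket. The function names, the `[x, y, width, height]` box format, and the 32² and 96² pixel bounds are assumptions taken from the COCO convention, not from the authors' code, and the snippet does not compute AP or AR itself.

```python
# Illustrative helpers only; names and box format are assumptions (COCO-style).

def box_iou(box_a, box_b):
    """Intersection over union of two boxes given as [x, y, width, height]."""
    ax, ay, aw, ah = box_a
    bx, by, bw, bh = box_b
    # Width and height of the overlap rectangle (zero if the boxes are disjoint).
    inter_w = max(0.0, min(ax + aw, bx + bw) - max(ax, bx))
    inter_h = max(0.0, min(ay + ah, by + bh) - max(ay, by))
    inter = inter_w * inter_h
    union = aw * ah + bw * bh - inter
    return inter / union if union > 0 else 0.0


# IoU thresholds 0.50, 0.55, ..., 0.95 used for the primary mAP metric.
IOU_THRESHOLDS = [round(0.50 + 0.05 * i, 2) for i in range(10)]


def thresholds_passed(iou):
    """Thresholds at which a detection with this IoU counts as a true positive."""
    return [t for t in IOU_THRESHOLDS if iou >= t]


def size_bucket(area):
    """Scale bucket of an object from its pixel area (assumed COCO bounds)."""
    if area < 32 ** 2:
        return "small"
    if area < 96 ** 2:
        return "medium"
    return "large"
```

For instance, a detection whose IoU with the ground truth is 0.8 is a true positive for AP50 and AP75, but a false positive at the 0.85 to 0.95 thresholds, which is why mAP is stricter than either single-threshold score.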
Appendix A.2. Landmark Regression
- Average Precision (AP):
  - mAP at OKS = 0.50:0.05:0.95 (primary metric)
  - AP50 at OKS = 0.50 (loose metric)
  - AP75 at OKS = 0.75 (strict metric)
- AP Across Scales:
  - APmedium (medium objects: 32² px < area < 96² px)
  - APlarge (large objects: area > 96² px)
- Average Recall (AR):
  - mAR at OKS = 0.50:0.05:0.95
  - AR50 at OKS = 0.50
  - AR75 at OKS = 0.75
- AR Across Scales:
  - ARmedium (medium objects: 32² px < area < 96² px)
  - ARlarge (large objects: area > 96² px)
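The landmark metrics above threshold the object keypoint similarity (OKS) in the same way the detection metrics threshold IoU. The sketch below shows how OKS is commonly computed under the COCO keypoint convention: each labelled landmark contributes exp(-d² / (2·s²·k²)), where d is the pixel distance between prediction and ground truth, s² is the object area, and k = 2σ is a per-landmark constant. The function name and the σ values in the usage note are hypothetical; this section does not state the constants used for the ape nose landmarks.

```python
import math

# Illustrative OKS computation (COCO-style); not the authors' implementation.

def oks(pred, gt, visible, area, sigmas):
    """Object keypoint similarity between predicted and ground-truth landmarks.

    pred, gt : sequences of (x, y) pixel coordinates, one pair per landmark
    visible  : sequence of flags; landmarks with a flag <= 0 are ignored
    area     : ground-truth object area in pixels (must be positive)
    sigmas   : per-landmark falloff constants (hypothetical values here)
    """
    total, count = 0.0, 0
    for (px, py), (gx, gy), v, sigma in zip(pred, gt, visible, sigmas):
        if v <= 0:
            continue  # unlabelled landmarks do not contribute
        d2 = (px - gx) ** 2 + (py - gy) ** 2
        k2 = (2.0 * sigma) ** 2
        total += math.exp(-d2 / (2.0 * area * k2))
        count += 1
    return total / count if count else 0.0
```

A perfect prediction yields an OKS of 1, and the score decays smoothly with distance, more slowly for large objects and for landmarks with a larger σ. For example, `oks([(10, 10)], [(10, 10)], [1], 100.0, [0.5])` returns 1.0.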
Appendix B. Application in Studies
Appendix B.1. The Apparatus
Appendix B.2. Physiological Signal Retrieval

Method | Train | Validation | Test | Overall |
---|---|---|---|---|
| | 0.692 | 0.7 | 0.7 | 0.693 |
| | 0.692 | 0.7 | 0.7 | 0.693 |
Tifa | 0.550 | 0.685 | 0.744 | 0.622 |
no ROI | 0 | 0 | 0 | 0 |
Thresh35.6 | 0.001 | 0 | 0 | 0.001 |
BlazeFace | 0.024 | 0.017 | 0.007 | 0.017 |
Thresh36.5 + BlazeFace | 0.155 | 0.155 | 0.136 | 0.155 |

Method | mAP | AP50 | AP75 | mIoU |
---|---|---|---|---|
| | 0.7 | 1 | 1 | 0.836 |
| | 0.7 | 1 | 1 | 0.821 |
Tifa | 0.744 | 0.980 | 0.902 | 0.868 |
no ROI | 0 | 0 | 0 | 0.128 |
Thresh35.6 | 0 | 0 | 0 | 0.183 |
BlazeFace | 0.007 | 0.037 | 0 | 0.398 |
Thresh36.5 + BlazeFace | 0.136 | 0.512 | 0.025 | 0.576 |

Method | Train | Validation | Test | Overall |
---|---|---|---|---|
GT + Tina | 0.940 | 1 | 0.989 | 0.965 |
+ Tina | 0.957 | 1 | 0.989 | 0.971 |
+ Tina | 0.926 | 0.993 | 0.995 | 0.956 |
no ROI + Tina | 0.926 | 1 | 0.987 | 0.953 |
Thresh29.8 + Tina | 0.965 | 1 | 0.989 | 0.981 |
Tifa + Tina | 0.919 | 0.990 | 0.980 | 0.950 |
+ Tina | 0.949 | 0.999 | 0.980 | 0.968 |
+ Tina | 0.903 | 0.978 | 0.980 | 0.939 |
Thresh29.8 + Tifa + Tina | 0.919 | 0.990 | 0.952 | 0.950 |
BlazeFace + FaceMesh | 0.312 | 0.363 | 0.336 | 0.336 |
Thresh36.5 + BlazeFace + FaceMesh | 0.557 | 0.557 | 0.566 | 0.557 |

Method | mAP | AP50 | AP75 | mOKS |
---|---|---|---|---|
GT + Tina | 0.989 | 0.989 | 0.989 | 0.524 |
+ Tina | 0.989 | 0.989 | 0.989 | 0.523 |
+ Tina | 0.995 | 1 | 1 | 0.524 |
no ROI + Tina | 0.987 | 0.988 | 0.988 | 0.532 |
Thresh29.8 + Tina | 0.989 | 0.990 | 0.990 | 0.532 |
Tifa + Tina | 0.980 | 0.980 | 0.980 | 0.524 |
+ Tina | 0.980 | 0.980 | 0.980 | 0.523 |
+ Tina | 0.980 | 0.980 | 0.980 | 0.524 |
Thresh29.8 + Tifa + Tina | 0.952 | 0.953 | 0.953 | 0.518 |
BlazeFace + FaceMesh | 0.336 | 0.652 | 0.312 | 0.398 |
Thresh36.5 + BlazeFace + FaceMesh | 0.566 | 0.819 | 0.617 | 0.41 |
© 2024 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Martin, P.-E.; Kachel, G.; Wieg, N.; Eckert, J.; Haun, D.B.M. ApeTI: A Thermal Image Dataset for Face and Nose Segmentation with Apes. Signals 2024, 5, 147-164. https://doi.org/10.3390/signals5010008