Article

Assessment of ROI Selection for Facial Video-Based rPPG

1 Tvstorm, Sunghyun Building, 255 Hyorung-to, Secho-gu, Seoul 13875, Korea
2 Department of Electronics and Communications Engineering, Kwangwoon University, Seoul 01897, Korea
* Authors to whom correspondence should be addressed.
Sensors 2021, 21(23), 7923; https://doi.org/10.3390/s21237923
Submission received: 20 October 2021 / Revised: 21 November 2021 / Accepted: 25 November 2021 / Published: 27 November 2021

Abstract

In general, facial image-based remote photoplethysmography (rPPG) methods use color-based and patch-based region-of-interest (ROI) selection methods to estimate the blood volume pulse (BVP) and beats per minute (BPM). Anatomically, the thickness of the skin is not uniform in all areas of the face, so the same diffuse reflection information cannot be obtained in each area. In recent years, various studies have presented experimental results for their ROIs but did not provide a valid rationale for the proposed regions. In this paper, to see the effect of skin thickness on the accuracy of the rPPG algorithm, we conducted an experiment on 39 anatomically divided facial regions. Experiments were performed with seven algorithms (CHROM, GREEN, ICA, PBV, POS, SSR, and LGI) using the UBFC-rPPG and LGI-PPGI datasets considering 29 selected regions and two adjusted regions out of 39 anatomically classified regions. We proposed a BVP similarity evaluation metric to find a region with high accuracy. We conducted additional experiments on the TOP-5 regions and BOT-5 regions and presented the validity of the proposed ROIs. The TOP-5 regions showed relatively high accuracy compared to the previous algorithm’s ROI, suggesting that the anatomical characteristics of the ROI should be considered when developing a facial image-based rPPG algorithm.

1. Introduction

Cardiovascular disease (CVD) is a disease that can affect the heart and the body’s vascular system. Most cardiovascular diseases exist as long-lasting chronic diseases, and there is a lack of appropriate measures to continuously monitor and prevent them [1]. To prevent CVD, vital signs such as the electrocardiogram, heartbeat, and blood pressure must be monitored continuously, and professional instruments, such as an IR-UWB heart rate monitor or an invasive blood pressure monitor, are required to measure them. However, these devices are intended for professional use, are expensive, and are not suitable for home use. In addition to professional measuring instruments, vital signs such as heart rate and blood pressure can be inferred from an electrocardiogram (ECG). Although electrocardiography is the most accurate method, photoplethysmography (PPG) has been developed as an inexpensive and simple way to infer the heartbeat. PPG is an optical technology that requires only a single sensor, and it is reported to be 98% similar to ECG [2]. PPG has become common in recent years and is widely used in wearable vital sign measuring devices, such as smartwatches.
Recently, research has been moving beyond contact-type devices, such as wearable devices and heart rate monitors, toward noncontact technology. PPG measurement from a facial image is called remote PPG (rPPG) or face PPG (fPPG); rPPG requires only an RGB video camera. Research on rPPG technology builds on the PPG technology used in oximeters. PPG noninvasively acquires the pulse waveform of blood vessels by exploiting the optical properties of blood volume changes under the skin, and it is used to assess the state of the heartbeat. According to the Beer–Lambert law [3], the absorbance of a single compound is proportional to its concentration. PPG utilizes the fact that biological tissue reflects and transmits part of the light when a light source illuminates the body, and that hemoglobin absorbs strongly at green wavelengths, such as 532 nm. The rPPG measurement method using an RGB camera is based on the observation that the value extracted from the ROI in each frame follows a waveform similar to the PPG waveform [4].
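For reference, the Beer–Lambert relation cited above is commonly written as follows. The symbols are standard textbook notation rather than notation taken from this paper: A is the absorbance, ε is the molar absorption coefficient of the compound at a given wavelength, c is its concentration, and l is the optical path length.

```latex
A = \varepsilon \, c \, l
```

For a fixed wavelength and path length, a change in the hemoglobin concentration along the light path therefore produces a proportional change in absorbance, which is the variation that PPG and rPPG aim to observe.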
Figure 1 shows the light absorption of deoxyhemoglobin (HHb), oxyhemoglobin (O2Hb), and carbaminohemoglobin (COHb), which are the most abundant in blood. The amount of light absorbed depends on the wavelength of the light: absorption is greatest at 400–440 nm and remains high in the green band, which is why the green channel is important for rPPG. The wavelength-dependent absorption changes the diffuse reflection value, which in turn changes the information received by the RGB camera.
Representative rPPG methods include the ICA [5], GREEN [6,7,8,9], CHROM [10], POS [11], SSR [12], PBV [13], and LGI [14] methods. ROI selection is largely divided into color-based skin detection and designation of a chosen area, and no clear rationale has been given for either choice. In this paper, seven representative rPPG methods are evaluated using pyVHR [15], and the accuracy obtained with the ROI proposed by each method is compared with that of the proposed ROIs. Experiments on publicly available datasets, LGI [14] and UBFC [16], suggest that the proposed ROIs yield higher accuracy.
The main contributions of this work are:
  • Proposal of 31 ROIs, selected on an anatomical basis, that can be used in rPPG methods.
  • Proposal of a relative BVP similarity (rBS) metric for performance evaluation across various ROIs.
  • Performance evaluation of ROI combinations built from the TOP-5 and BOT-5 regions of the rBS ranking.
The software is available on GitHub (https://github.com/TVS-AI/Pytorch_rppgs (accessed date 26 November 2021)) for experimentation.
This paper is organized as follows. The rPPG methods will be described in Section 2. Section 3 describes the ROI of Section 2’s algorithms and the proposed region of interest. Section 4 presents the experimental results, and Section 5 provides a conclusion.

2. rPPG (Remote Photoplethysmography)

The pixels extracted from a face image taken with the RGB camera have face reproduction information, noise, and BVP values. Various methods for extracting the BVP have been studied by analyzing the raw signal in which various information is combined.
Table 1 summarizes representative rPPG methods. Among them, the POS and CHROM methods show a relatively small spread of MAE and PCC values and yield highly accurate results [15].
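To illustrate how one of these methods turns an averaged RGB trace into a pulse signal, the following is a minimal sketch of the POS projection as described in [11]. It is not the pyVHR implementation used in this paper; the function name `pos_bvp`, the 1.6 s window, and the synthetic input are assumptions made for illustration only.

```python
import numpy as np

def pos_bvp(rgb, fps, window_sec=1.6):
    """Sketch of POS: rgb is an (n_frames, 3) array of mean R, G, B values per frame."""
    rgb = np.asarray(rgb, dtype=float)
    n_frames = rgb.shape[0]
    win = int(round(window_sec * fps))          # window length in frames (assumed 1.6 s)
    # Projection onto the plane orthogonal to the skin tone direction
    projection = np.array([[0.0, 1.0, -1.0],
                           [-2.0, 1.0, 1.0]])
    bvp = np.zeros(n_frames)
    for end in range(win, n_frames + 1):
        start = end - win
        block = rgb[start:end]
        normalized = block / block.mean(axis=0)  # temporal normalization per channel
        s = projection @ normalized.T            # two projected signals, shape (2, win)
        alpha = s[0].std() / (s[1].std() + 1e-12)
        h = s[0] + alpha * s[1]                  # alpha tuning
        bvp[start:end] += h - h.mean()           # overlap-add of zero-mean segments
    return bvp

# Hypothetical input: a 1.2 Hz pulse plus a slow illumination change common to all channels
fps = 30.0
t = np.arange(300) / fps
pulse = np.sin(2 * np.pi * 1.2 * t)
illumination = 1.0 + 0.05 * np.sin(2 * np.pi * 0.3 * t)
base = np.array([150.0, 110.0, 90.0])
pulse_strength = np.array([0.0033, 0.0077, 0.0053])
rgb_trace = base * illumination[:, None] * (1.0 + pulse_strength * pulse[:, None])
estimated_bvp = pos_bvp(rgb_trace, fps)
```

Because both projection rows sum to zero, an intensity change that scales all three channels equally is largely cancelled, while the channel-dependent pulse component is kept.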

3. ROI (Region of Interest)

3.1. Typical ROI Methods

A facial image-based rPPG algorithm requires finding the face region and then selecting an ROI within it for efficient signal extraction. Two main methods are used to detect the face area. The most widely used is (1) the Viola–Jones method, which detects a face using Haar-like features [18]. As an alternative to feature-based face detection, there is (2) a skin region detection method [19]. In early work, the ROI was based on the face area detected by the Viola–Jones algorithm [20]. This approach had the problem of including background pixels near the border of the face area in the ROI. In another study, the forehead, cheeks, and other proposed regions were selected as ROIs using single or additional coordinates within the face area [21].
Table 2 shows the ROI selection method of each representative rPPG method mentioned in Section 2. The representative rPPG methods try to use as much of the face area as possible without focusing on a specific ROI. GREEN and ICA use a cropped face image, whereas CHROM, SSR, POS, PBV, and LGI generate the rPPG signal from pixels selected by skin color.
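The color-based ROI selection described above can be sketched as follows. This is only an illustration: the explicit RGB threshold rule is a commonly cited heuristic that is assumed here, not the skin detector used by any of the methods in Table 2, and the function names `skin_mask_rgb` and `mean_rgb_trace` are hypothetical. Frames are assumed to be in RGB channel order (OpenCV, for example, delivers BGR).

```python
import numpy as np

def skin_mask_rgb(frame):
    """Boolean mask of skin-colored pixels using a simple RGB rule (assumed thresholds)."""
    rgb = frame.astype(np.int32)                 # avoid uint8 overflow in differences
    r, g, b = rgb[..., 0], rgb[..., 1], rgb[..., 2]
    spread = rgb.max(axis=-1) - rgb.min(axis=-1)
    return ((r > 95) & (g > 40) & (b > 20) & (spread > 15)
            & (np.abs(r - g) > 15) & (r > g) & (r > b))

def mean_rgb_trace(frames, mask_fn=skin_mask_rgb):
    """Average R, G, B over the masked pixels of every frame; returns (n_frames, 3)."""
    trace = []
    for frame in frames:
        mask = mask_fn(frame)
        if mask.any():
            trace.append(frame[mask].mean(axis=0))
        else:
            trace.append(np.full(3, np.nan))     # no skin pixels found in this frame
    return np.array(trace)

# Hypothetical 4x4 frame: dark background with a 2x2 skin-colored patch
demo_frame = np.zeros((4, 4, 3), dtype=np.uint8)
demo_frame[:] = (30, 30, 30)
demo_frame[1:3, 1:3] = (200, 120, 100)
demo_trace = mean_rgb_trace([demo_frame, demo_frame])
```

The resulting per-frame RGB averages are the raw signal that the rPPG methods operate on; the green column alone corresponds to the input of the GREEN method.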

3.2. ROI Analysis Studies

A previous study reported that the ROI affects signal quality and computational load in the rPPG method [21]. Studies have also raised the problem of designating the entire face as the ROI. Assuming that some parts of the face have a more prominent blood vessel distribution, accuracy was evaluated for the forehead, left and right cheeks, nose, mouth, nasal dorsum, and chin. As a result, the cheeks and forehead were selected as excellent ROIs.

3.3. Proposal of ROI Selection

3.3.1. Thickness of Human Face Skin

rPPG relies on the contrast between the specular reflection and the diffuse reflection that occur when light hits the skin. Specular reflection is light reflected directly from the skin surface, while diffuse reflection is light that has been absorbed and scattered by skin tissue and therefore depends on changes in blood volume [22].
Figure 2 shows the principle by which the camera receives BVP (blood volume pulse) information. When light from the source hits the skin, some of it is absorbed by the skin and blood vessels, and the remaining diffusely reflected light is received by the camera. The reflected light can differ depending on the thickness of the skin. Although blood vessels decrease reflectance and transmittance, diffuse reflection depends sensitively on the depth of the blood vessels, that is, on the thickness of the skin [23]. The amount of light absorbed varies with the thickness of the skin, which produces a large difference between the specular and diffuse reflection information. In [24], the thickness of the dermis and epidermis was measured at 39 anatomical sites in 10 cadavers. Table 3 lists the 39 areas used in [24], the relative thickness of the dermis and the epidermis, and the relative thickness of the skin calculated from these values.

3.3.2. Proposed ROI

To conduct rPPG experiments on the anatomical regions mentioned in Section 3.3.1, we selected the experimental regions. When selecting ROI candidates, the scalp area (temporal scalp, anterior scalp, posterior scalp), ear area (preauricular, upper helix, mid helix, conchal bowl, earlobe, rear ear), and neck area (anterior neck, lateral neck) were excluded. In addition, the area around the eyes (upper medial eyelid, upper lateral eyelid, lower eyelid, and tear trough) was integrated into one region because the size of the region was small. Finally, the symmetrical parts such as the nasolabial fold and marionette fold were divided into two areas, left and right.
Table 4 shows the proposed 31 regions and skin thickness.

3.4. Assessment Metric of Proposed ROI

We used three metrics commonly used in rPPG to evaluate the performance of the proposed ROIs. In addition, we propose a relative BVP similarity (rBS) metric for evaluating the relative superiority of each ROI.
  • MAE (Mean Absolute Error): MAE was used to assess the accuracy of the estimated waveform for each rPPG method.
$$\mathrm{MAE} = \frac{1}{N} \sum_{t} \left| \hat{S}(t) - S(t) \right|$$
  • RMSE (Root Mean Square Error): RMSE was used to assess the root of the mean squared error.
$$\mathrm{RMSE} = \sqrt{\frac{1}{N} \sum_{t} \left( \hat{S}(t) - S(t) \right)^{2}}$$
  • PCC (Pearson’s Correlation Coefficient): PCC is a method for interpreting the linear relationship between two given signals. The closer the absolute value of the PCC is to 1, the stronger the linear relationship.
$$\mathrm{PCC} = \frac{\sum_{t} \left( \hat{S}(t) - \hat{\mu} \right) \left( S(t) - \mu \right)}{\sqrt{\sum_{t} \left( \hat{S}(t) - \hat{\mu} \right)^{2} \sum_{t} \left( S(t) - \mu \right)^{2}}}$$
S(t) is the ground truth, Ŝ(t) is the result of the rPPG method, and N is the number of samples. In addition, μ is the average value of S(t), and μ̂ is the average value of Ŝ(t). The results of the above three metrics were combined to generate the rBS (relative BVP similarity), which is the final evaluation metric.
$$\mathrm{rBS} = \left( \log\left( \frac{\max(\mathrm{MAE})}{\mathrm{MAE}} + e \right) + \log\left( \frac{\max(\mathrm{RMSE})}{\mathrm{RMSE}} + e \right) \right) \left| \mathrm{PCC} \right|$$
In the rPPG method, the MAE measures the absolute difference from the actual BVP waveform, and the RMSE measures the spread of that difference. The PCC measures the linear relationship between the estimated signal and the ground truth; the closer its absolute value is to 1, the stronger the linear relationship. The shape of the BVP waveform is important for reliably extracting the ultralow frequency (ULF), very low frequency (VLF), low frequency (LF), and high frequency (HF) components. The disease information associated with each band is shown in Table 5.
The smaller the MAE and RMSE values, the more similar the estimate is to the actual data, so a region with smaller values is more effective. Because each frequency band carries different information, the rBS was designed so that the linearity of the waveform has a large impact on the score.
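The metrics of this section can be sketched in plain Python as shown below. The sketch rests on several assumptions that the text does not state explicitly: the logarithm is taken to be the natural logarithm, e is taken to be Euler's number, max(MAE) and max(RMSE) are taken over all compared regions, and all error values are assumed to be strictly positive. The function names are hypothetical.

```python
import math

def mae(estimate, truth):
    """Mean absolute error between the estimated and the ground truth waveform."""
    return sum(abs(e - t) for e, t in zip(estimate, truth)) / len(truth)

def rmse(estimate, truth):
    """Root mean square error between the estimated and the ground truth waveform."""
    return math.sqrt(sum((e - t) ** 2 for e, t in zip(estimate, truth)) / len(truth))

def pcc(estimate, truth):
    """Pearson's correlation coefficient of the two waveforms."""
    mu_hat = sum(estimate) / len(estimate)
    mu = sum(truth) / len(truth)
    numerator = sum((e - mu_hat) * (t - mu) for e, t in zip(estimate, truth))
    denominator = math.sqrt(sum((e - mu_hat) ** 2 for e in estimate)
                            * sum((t - mu) ** 2 for t in truth))
    return numerator / denominator

def rbs_scores(mae_values, rmse_values, pcc_values):
    """rBS per region; the maxima are taken over all regions (assumption)."""
    max_mae, max_rmse = max(mae_values), max(rmse_values)
    return [
        (math.log(max_mae / m + math.e) + math.log(max_rmse / r + math.e)) * abs(p)
        for m, r, p in zip(mae_values, rmse_values, pcc_values)
    ]

# Hypothetical values for two regions: the first has smaller errors and higher |PCC|
example_scores = rbs_scores([0.1, 0.4], [0.2, 0.5], [0.9, 0.3])
```

Under these assumptions, the region with the largest errors contributes log(1 + e) from each error term, regions with smaller errors contribute more, and the whole sum is scaled by |PCC|, so a waveform with poor linearity receives a low score even when its errors are small.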

3.5. ROI Assessment Procedure

To set the ROIs suggested in Section 3.3.2, three procedures were performed: face mesh generation, ROI candidate setting, and ROI selection.
Figure 3 shows the procedure for assessing the proposed ROIs. The ROIs were set in the preprocessing step of rPPG and were created from landmarks generated by a face mesh method. Face mesh extraction methods can be divided into cascaded regression-based and deep learning-based methods.
Figure 4a shows the face landmark key points of the cascaded regression-based OpenFace project, while Figure 4b shows the face mesh provided by the deep learning-based MediaPipe project; both are open-source face mesh projects [25]. As a representative cascaded regression-based method, OpenFace creates a face mesh with 68 key points and is available in Dlib. As a deep learning-based method, Google’s MediaPipe creates a face mesh with 468 key points [26]. In [27], the two were compared on the SAMM dataset, which is composed of videos of various facial emotions, and MediaPipe showed slightly higher performance. Therefore, in this paper, face landmarks were created using MediaPipe, which is well suited to generating various ROIs, and ROI candidates were created by combining landmarks.
Figure 5 shows (a) the face mesh image generated using MediaPipe and (b) a visualization of the ROI candidates.
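The step of combining landmarks into an ROI mask can be sketched as follows. The sketch assumes that landmarks are available as normalized (x, y) coordinates in [0, 1], as face mesh libraries such as MediaPipe provide them, and it fills the polygon with a plain NumPy even-odd test instead of a library call. The landmark indices in the example are hypothetical; the mapping from mesh landmarks to the 31 regions is not given in this section.

```python
import numpy as np

def polygon_mask(height, width, polygon_xy):
    """Even-odd rule test of every pixel center against a polygon in pixel coordinates."""
    ys, xs = np.mgrid[0:height, 0:width]
    px, py = xs + 0.5, ys + 0.5                  # pixel centers
    inside = np.zeros((height, width), dtype=bool)
    n = len(polygon_xy)
    for i in range(n):
        x1, y1 = polygon_xy[i]
        x2, y2 = polygon_xy[(i + 1) % n]
        if y1 == y2:
            continue                             # horizontal edges never cross a horizontal ray
        crosses = (y1 > py) != (y2 > py)         # edge spans the row of this pixel
        x_at = x1 + (py - y1) * (x2 - x1) / (y2 - y1)
        inside ^= crosses & (px < x_at)          # toggle on each crossing to the right
    return inside

def roi_mask_from_landmarks(landmarks_norm, roi_indices, height, width):
    """Build an ROI mask from normalized landmarks listed in polygon order."""
    points = np.asarray(landmarks_norm, dtype=float)[list(roi_indices)]
    polygon = points * np.array([width, height], dtype=float)
    return polygon_mask(height, width, polygon)

# Hypothetical landmarks: indices 1 to 4 outline a square region of an 8x8 image
demo_landmarks = [(0.0, 0.0), (0.25, 0.25), (0.75, 0.25), (0.75, 0.75), (0.25, 0.75)]
demo_mask = roi_mask_from_landmarks(demo_landmarks, [1, 2, 3, 4], height=8, width=8)
```

Summing such a mask gives the number of pixels in a region, which is the quantity later compared against the rBS rank.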

4. Data and Statistical Analysis

The rPPG method is affected by whether the input video is encoded, by the uniformity of the lighting, and by skin color. When the video is encoded, the rPPG information is quantized, and the complete information may not be preserved [28]. If the light is not uniform, the face may not be properly detected [29]. The darker the skin color, the lower the amount of diffuse reflection, because of the higher melanin content [30].
In this paper, the UBFC and LGI-PPGI datasets, which are least affected by the three factors listed above, were selected to verify the validity of the proposed ROIs [14,16]. Both datasets consist of raw video data recorded under uniform lighting.
Figure 6 shows the Fitzpatrick skin types. Type I is pale white skin, Type II is fair skin, Type III is darker white skin, Type IV is light brown skin, Type V is brown skin, and Type VI is dark brown or black skin. In this paper, experiments were conducted with the light skin colors of Types I and II among the six types of the Fitzpatrick scale. A mask for each proposed ROI was generated for the two datasets, POS and CHROM were applied to the masked images, and the superiority of each region was verified using the proposed metric.

4.1. Benchmark Dataset

  • UBFC [16]: It consists of 42 videos, each with heart rate labels and a recorded heart waveform. The participants were filmed at a distance of 1 m, looking directly at the camera while solving a given quiz.
  • LGI-PPGI [14]: Videos of 6 subjects were recorded under four conditions: no motion, motion, vigorous motion, and dialogue.

4.2. Assessment of Proposed ROI

The results of the experiment with POS and CHROM on the UBFC and LGI-PPGI datasets are as follows. Figure 4 and Figure 5 show the results of running the seven methods on the UBFC and LGI-PPGI datasets for each of the 31 regions. The MAE and RMSE values of regions 0, 1, 3, and 27 are excellent regardless of the method. Figure 6 shows the PCC results, in which regions 0, 10, 27, and 28 have values close to 1.
Figure 7 shows the results of the MAE, RMSE, and PCC metrics on the UBFC data. The yellow boxes mark the TOP-5 scores, and the blue boxes mark the BOT-5 scores for each metric. Regions 0 and 10 are included in the TOP-5 for all metrics.
Figure 8 shows the results of the MAE, RMSE, and PCC metrics on the LGI-PPGI data. Regions 0, 10, and 27 are included in the TOP-5 for all metrics. Regions 0 and 10 were found to be the best regions in both datasets.
Figure 9 shows the rBS values computed from the MAE, RMSE, and PCC results. To derive a representative rBS value for each mask, the median was used, and meaningful masks were selected based on this median.
Table 6 shows the median rBS values for each mask. Regions 27, 10, 3, 0, and 28 showed high scores, whereas regions 15, 13, 12, 20, and 19 showed low scores, in that order. The high-scoring regions have skin thicknesses of 1086.2 μm, 1386.11 μm, 1221.88 μm, 1245.63 μm, and 1086.2 μm, respectively, while the low-scoring regions have relatively thick skin, with thicknesses of 2015.89 μm, 1794.71 μm, 1794.71 μm, 1496.12 μm, and 1496.12 μm.
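The selection of the TOP-5 and BOT-5 regions from the median rBS can be sketched as below. Which rBS values are pooled for each region (for example, one value per method and video) is an assumption, as are the function name and the small data set in the example.

```python
from statistics import median

def rank_regions_by_median(rbs_by_region, k=5):
    """Return the median rBS per region, the k best regions, and the k worst regions."""
    medians = {region: median(values) for region, values in rbs_by_region.items()}
    ranked = sorted(medians, key=medians.get, reverse=True)  # highest median first
    return medians, ranked[:k], ranked[-k:]

# Hypothetical rBS values for four regions, three measurements each
example_rbs = {
    0: [3.0, 2.8, 3.2],
    1: [1.0, 1.2, 0.8],
    2: [2.0, 2.1, 9.0],   # the outlier 9.0 does not move the median
    3: [0.5, 0.4, 0.6],
}
example_medians, example_top, example_bottom = rank_regions_by_median(example_rbs, k=2)
```

Using the median rather than the mean keeps a single unusually high or low score from changing the rank of a region, as the third region in the example shows.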
Figure 10 is a visualization of the results of Table 6. The yellow areas are the TOP-5 regions, and the blue areas are the BOT-5 regions. The white areas are the remaining 21 regions.
Table 7 analyzes the correlation among the rBS rank of the ROIs, the thickness of the skin, and the number of pixels in each region. Pearson’s correlation between skin thickness and rBS rank was 0.50, a moderate positive linear relationship, and that between the number of pixels in each region and rBS rank was −0.53, a moderate negative linear relationship. These results indicate that the rBS rank is correlated with both the thickness of the skin and the number of pixels in the region: the thinner the skin and the larger the region, the better the results. However, the average number of pixels in the proposed TOP-5 regions is 696 pixels, which is very different from the 25,000 pixels used by existing methods for the entire face. The smaller the region, the more susceptible it is to noise, such as light distortion or movement. To address this problem, combinations of regions were proposed, and an additional experiment was conducted.
Table 8 shows the evaluation results for the combinations of the proposed regions and for the existing ROI methods. The TOP-5 combination has an average thickness of 1191.11 and 2431 pixels, while the BOT-5 combination has an average thickness of 1581.39 and 1030 pixels. Combining regions improved the results: the proposed TOP-5 combination showed higher accuracy than the Face + Skin method, whereas BOT-5 showed lower accuracy.
Figure 11 is the BVP extracted from the proposed ROI using the POS method. Yellow is the BVP extracted from the TOP-5 ROI. Comparing it with the blue BOT-5 BVP, the yellow waveform is more similar to the green ground truth. In particular, there is less variability and less noise than the blue waveform.

5. Conclusions

In summary, in this paper we have proposed:
  • ROI candidates among 31 facial regions, based on skin thickness and anatomical analysis.
  • A metric called rBS that can be used to assess the quality of each ROI.
In conclusion, ROI selection in the rPPG method is as important as the signal extraction method. As rPPG uses diffuse reflection information, it was shown that the thickness of the skin affects the result. To verify the validity of skin thickness-based ROI selection, 31 masks and the rBS metric were proposed. CHROM, GREEN, ICA, PBV, POS, SSR, and LGI were experimentally evaluated on the UBFC and LGI-PPGI datasets. In addition, using the proposed rBS metric, experiments were conducted on 31 areas of the face. The right malar, left malar, glabella, lower medial forehead, and upper medial forehead showed the best results for BVP and BPM extraction. Each of these areas showed a strong correlation with the actual signal, and the PCC results in particular were excellent.
Lastly, as the information that can be obtained from a single proposed ROI is limited, experiments were conducted on the TOP-5 combination, the entire face, and the BOT-5 combination, and the TOP-5 combination proved superior. This work is therefore expected to contribute to effective ROI selection in future facial image-based rPPG methods and to improve their reliability and accuracy.
Existing rPPG methods have focused on extracting the color information of the ROI and on how well noise can be removed from it. In this study, the superiority of the proposed ROIs was verified using existing rPPG methods, and the ROI was found to affect the accuracy of the rPPG method. The rPPG methods developed so far lack research on the correlation between ROIs. In a future study, we intend to develop an rPPG algorithm that learns the correlation between regions using a GNN (Graph Neural Network).

Author Contributions

Conceptualization, D.-Y.K. and K.L.; methodology, D.-Y.K.; software, D.-Y.K.; validation, D.-Y.K.; formal analysis, K.L. and C.-B.S.; investigation, K.L. and C.-B.S.; resources, D.-Y.K.; data curation, D.-Y.K.; writing—original draft preparation, D.-Y.K.; writing—review and editing, D.-Y.K., K.L. and C.-B.S.; visualization, D.-Y.K.; supervision, K.L. and C.-B.S.; project administration, K.L.; funding acquisition, K.L. All authors have read and agreed to the published version of the manuscript.

Funding

This work was supported by the Institute of Information & Communications Technology Planning & Evaluation (IITP) grant funded by the Korean government (MSIT). (No. 2021-0-00900, Adaptive Federated Learning in Dynamic Heterogeneous Environment).

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

The UBFC dataset is available at https://sites.google.com/view/ybenezeth/ubfcrppg (accessed date 26 November 2021). The LGI-PPGI dataset is available at https://sites.google.com/view/ybenezeth/ubfcrppg (accessed date 26 November 2021).

Conflicts of Interest

The authors declare no conflict of interest.

References

  1. Bernell, S.; Howard, S.W. Use your words carefully: What is a chronic disease? Front. Public Health 2016, 4, 159.
  2. Kim, J.S.; Lee, K. Untact Abnormal Heartbeat Wave Detection Using Non-Contact Sensor through Transfer Learning. IEEE Access 2020, 8, 217791–217799.
  3. Taparia, N.; Platten, K.C.; Anderson, K.B.; Sniadecki, N.J. A microfluidic approach for hemoglobin detection in whole blood. AIP Adv. 2017, 7, 105102.
  4. Damianou, D. The Wavelength Dependence of the Photoplethysmogram and its Implication to Pulse Oximetry. PhD Thesis, University of Nottingham, Nottingham, UK, 1995.
  5. Poh, M.Z.; McDuff, D.J.; Picard, R.W. Non-contact, automated cardiac pulse measurements using video imaging and blind source separation. Opt. Express 2010, 18, 10762–10774.
  6. Martinez, L.F.C.; Paez, G.; Strojnik, M. Optimal wavelength selection for noncontact reflection photoplethysmography. In Proceedings of the 22nd Congress of the International Commission for Optics: Light for the Development of the World, Puebla, Mexico, 15–19 August 2011; Volume 8011, p. 801191.
  7. Verkruysse, W.; Svaasand, L.O.; Nelson, J.S. Remote plethysmographic imaging using ambient light. Opt. Express 2008, 16, 21434–21445.
  8. Kim, D.Y.; Kim, J.S.; Lee, K.K. Real-time vital signs measurement system using facial image on Mobile. In Proceedings of the Korean Society of Broadcast Engineers Conference; The Korean Institute of Broadcast and Media Engineers: Seoul, Korea, 2020; pp. 94–97.
  9. Kim, D.; Kim, J.; Lee, K. Real-time Vital Signs Measurement System using Facial Image Data. J. Broadcast Eng. 2021, 26, 132–142.
  10. De Haan, G.; Jeanne, V. Robust pulse rate from chrominance-based rPPG. IEEE Trans. Biomed. Eng. 2013, 60, 2878–2886.
  11. Wang, W.; den Brinker, A.C.; Stuijk, S.; De Haan, G. Algorithmic principles of remote PPG. IEEE Trans. Biomed. Eng. 2016, 64, 1479–1491.
  12. Wang, W.; Stuijk, S.; De Haan, G. A novel algorithm for remote photoplethysmography: Spatial subspace rotation. IEEE Trans. Biomed. Eng. 2015, 63, 1974–1984.
  13. De Haan, G.; Van Leest, A. Improved motion robustness of remote-PPG by using the blood volume pulse signature. Physiol. Meas. 2014, 35, 1913.
  14. Pilz, C.S.; Zaunseder, S.; Krajewski, J.; Blazek, V. Local group invariance for heart rate estimation from face videos in the wild. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition Workshops, Salt Lake City, UT, USA, 18–23 June 2018; pp. 1254–1262.
  15. Boccignone, G.; Conte, D.; Cuculo, V.; D’Amelio, A.; Grossi, G.; Lanzarotti, R. An Open Framework for Remote-PPG Methods and their Assessment. IEEE Access 2020, 8, 216083–216103.
  16. Bobbia, S.; Macwan, R.; Benezeth, Y.; Mansouri, A.; Dubois, J. Unsupervised skin tissue segmentation for remote photoplethysmography. Pattern Recognit. Lett. 2019, 124, 82–90.
  17. Wu, H.; Rubinstein, M.; Shih, E.; Guttag, J.; Durand, F.; Freeman, W.T. Eulerian video magnification for revealing subtle changes in the world. ACM Trans. Graph. (TOG) 2012, 31, 1–8.
  18. Viola, P.; Jones, M. Rapid object detection using a boosted cascade of simple features. In Proceedings of the 2001 IEEE Computer Society Conference on Computer Vision and Pattern Recognition, CVPR, 2001, Kauai, HI, USA, 8–14 December 2001; Volume 1, p. I.
  19. Lee, K.Z.; Hung, P.C.; Tsai, L.W. Contact-free heart rate measurement using a camera. In Proceedings of the 2012 Ninth Conference on Computer and Robot Vision, 2012, Toronto, ON, Canada, 28–30 May 2012; pp. 147–152.
  20. Li, P.; Benezeth, Y.; Nakamura, K.; Gomez, R.; Yang, F. Model-based region of interest segmentation for remote photoplethysmography. SCITEPRESS-Sci. Technol. Publ. 2019, 383–388.
  21. Kwon, S.; Kim, J.; Lee, D.; Park, K. ROI analysis for remote photoplethysmography on facial video. In Proceedings of the 37th Annual International Conference of the IEEE Engineering in Medicine and Biology Society (EMBC) 2015, Milan, Italy, 25–29 August 2015; pp. 4938–4941.
  22. Sahin, S.M.; Deng, Q.; Castelo, J.; Lee, D. Non-Contact Heart Rate Monitoring from Face Video Utilizing Color Intensity. J. Multimed. Inf. Syst. 2021, 8, 1–10.
  23. Chen, B.; Zhang, Y.; Gao, S.; Li, D. Extraction of the Structural Properties of Skin Tissue via Diffuse Reflectance Spectroscopy: An Inverse Methodology. Sensors 2021, 21, 3745.
  24. Chopra, K.; Calva, D.; Sosin, M.; Tadisina, K.K.; Banda, A.; De La Cruz, C.; Chaudhry, M.R.; Legesse, T.; Drachenberg, C.B.; Manson, P.N.; et al. A comprehensive examination of topographic thickness of skin in the human face. Aesthetic Surg. J. 2015, 35, 1007–1013.
  25. Kartynnik, Y.; Ablavatski, A.; Grishchenko, I.; Grundmann, M. Real-time facial surface geometry from monocular video on mobile GPUs. arXiv 2019, arXiv:1907.06724.
  26. Amos, B.; Ludwiczuk, B.; Satyanarayanan, M. Openface: A general-purpose face recognition library with mobile applications. CMU Sch. Comput. Sci. Tech. Rep. 2016, 6, CMU-CS-16-118.
  27. Savin, A.V.; Sablina, V.A.; Nikiforov, M.B. Comparison of Facial Landmark Detection Methods for Micro-Expressions Analysis. In Proceedings of the 2021 10th Mediterranean Conference on Embedded Computing (MECO), Budva, Montenegro, 7–10 June 2021; pp. 1–4.
  28. Mcduff, D.J.; Blackford, E.B.; Estepp, J.R. The impact of video compression on remote cardiac pulse measurement using imaging photoplethysmography. In Proceedings of the 2017 12th IEEE International Conference on Automatic Face & Gesture Recognition (FG 2017), Washington, DC, USA, 30 May–3 June 2017; IEEE: Bellingham, WA, USA, 2017; pp. 63–70.
  29. Heusch, G.; Anjos, A.; Marcel, S. A reproducible study on remote heart rate measurement. arXiv 2017, arXiv:1709.00962.
  30. Liu, X.; Jiang, Z.; Fromm, J.; Xu, X.; Patel, S.N.; McDuff, D.J. MetaPhys: Few-shot adaptation for non-contact physiological measurement. In Proceedings of the Conference on Health, Inference, and Learning, New York, NY, USA, 8–10 April 2021.
Figure 1. The absorbance of hemoglobin according to the wavelength of light.
Figure 2. The absorbance of hemoglobin according to the wavelength of light.
Figure 3. ROI assessment procedure.
Figure 4. Open-source Face Mesh Projects (a) Open Face (b) Media-pipe.
Figure 5. Proposed ROI list (a) face mesh image (b) ROI index.
Figure 6. Fitzpatrick skin type and the type used in experiment.
Figure 7. Performance evaluation of proposed ROIs on UBFC data (yellow: TOP-5, blue: BOT-5).
Figure 8. Performance evaluation of proposed ROIs on LGI-PPGI data (yellow: TOP-5, blue: BOT-5).
Figure 9. Evaluation of rBS in proposed regions.
Figure 10. Evaluation of rBS in proposed regions (yellow: TOP-5 regions, blue: BOT-5 regions, white: the other regions).
Figure 11. rPPG signal extracted with the proposed ROI using POS (yellow: TOP-5, green: Ground truth, blue: BOT-5).
Table 1. Summary of representative rPPG algorithms.
| Method | Characteristic |
| --- | --- |
| GREEN [6,7,8,9] | The green channel is preferred for BVP extraction because it carries more diffuse reflection information from hemoglobin than the other channels. In [17], an attempt was made to visualize the pulse by maximizing the amount of change in the green channel. |
| ICA [5] | A method of splitting a multidimensional signal into multiple components. The whitening matrix is obtained using Jacobi rotation, and the original signal is separated by multiplying the whitening matrix by the mixed signal. In [5], the mixed signal was separated into four independent components using the JADE method, and, empirically, the second component was used as the PPG signal. |
| CHROM [10] | The CHROM method removes noise caused by light reflection through normalization of color-difference channels. |
| SSR [12] | The SSR method is based on the absorbance of hemoglobin. Its use of subspace rotation and temporal rotation has the advantage of extending the pulse amplitude and reducing the distortion caused by light reflection. |
| POS [11] | The POS ("plane orthogonal to skin") method aims to reduce the specular noise problem of CHROM. A PPG signal is generated by projecting the temporally normalized RGB signal onto the plane orthogonal to the skin tone. |
| PBV [13] | Proposes a pulse blood vector that distinguishes pulse-induced color changes from motion noise in the RGB source. |
| LGI [14] | Proposes an algorithm that is robust in various environments, using differentiable local transformations. |
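To make the POS row of Table 1 concrete, the following is a minimal sketch of the projection step for a single short window, written in plain Python. It follows the commonly published POS formulation (temporal normalization, projection onto two axes orthogonal to the skin tone, then an amplitude-tuned combination); the projection coefficients come from that general formulation rather than from this article, and the function name `pos_window` and the synthetic test signal are hypothetical.

```python
import math
from statistics import mean, pstdev

def pos_window(rgb):
    """Return a zero-mean pulse estimate for one window of (r, g, b) spatial means.

    Assumes every channel has a positive mean over the window.
    """
    r, g, b = zip(*rgb)
    # Temporal normalization: divide each channel by its mean over the window.
    rn = [v / mean(r) for v in r]
    gn = [v / mean(g) for v in g]
    bn = [v / mean(b) for v in b]
    # Projection onto two axes of the plane orthogonal to the skin tone
    # (rows [0, 1, -1] and [-2, 1, 1] of the usual POS projection matrix).
    s1 = [gi - bi for gi, bi in zip(gn, bn)]
    s2 = [-2.0 * ri + gi + bi for ri, gi, bi in zip(rn, gn, bn)]
    # Combine the two projections; alpha balances their amplitudes.
    sd2 = pstdev(s2)
    alpha = pstdev(s1) / sd2 if sd2 > 0 else 0.0
    h = [a + alpha * c for a, c in zip(s1, s2)]
    m = mean(h)
    return [v - m for v in h]

# Synthetic demo (hypothetical values): a small sinusoidal "pulse" that
# modulates the three channels with different, arbitrarily chosen strengths.
true_pulse = [0.01 * math.sin(2 * math.pi * i / 16) for i in range(32)]
frames = [(100 * (1 + 0.33 * p), 100 * (1 + 0.77 * p), 100 * (1 + 0.53 * p))
          for p in true_pulse]
pulse = pos_window(frames)
```

In a full pipeline this step would be applied to overlapping windows and the results overlap-added; that part is omitted here to keep the sketch short.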
Table 2. ROI method of each rPPG algorithm.
| Method | GREEN | ICA | CHROM | SSR | POS | PBV | LGI |
| --- | --- | --- | --- | --- | --- | --- | --- |
| ROI | (1) Face | (1) Face | (1) Face + (2) Skin | (2) Skin | (2) Skin | (2) Skin | (2) Skin |
Table 3. Dermal and epidermal thickness of 39 facial areas.
| Region | Location | Average Epidermal Thickness (μm) | Average Dermal Thickness (μm) | eRT (1),* | dRT (2),* | RT (3),* |
| --- | --- | --- | --- | --- | --- | --- |
| 0 | Upper Medial Forehead | 44.70 | 1200.93 | 1.51 | 1.58 | 1.56 |
| 1 | Lower Medial Forehead | 45.76 | 1176.11 | 1.55 | 1.55 | 1.53 |
| 2 | Upper Lateral Forehead | 44.80 | 1252.50 | 1.52 | 1.65 | 1.62 |
| 3 | Lower Lateral Forehead | 39.86 | 1172.34 | 1.35 | 1.54 | 1.52 |
| 4 | Upper Medial Eyelid | 40.31 | 758.85 | 1.36 | 1.00 | 1.00 |
| 5 | Upper Lateral Eyelid | 42.39 | 1088.58 | 1.43 | 1.43 | 1.42 |
| 6 | Lower Lateral Eyelid | 38.58 | 1227.10 | 1.30 | 1.62 | 1.58 |
| 7 | Tear Trough | 47.00 | 1178.64 | 1.59 | 1.55 | 1.53 |
| 8 | Glabella | 46.59 | 1339.52 | 1.58 | 1.77 | 1.73 |
| 9 | Upper Nasal Dorsum | 52.19 | 1475.42 | 1.77 | 1.94 | 1.91 |
| 10 | Lower Nasal Dorsum | 61.60 | 1198.61 | 2.08 | 1.58 | 1.58 |
| 11 | Medial Canthus | 42.81 | 840.36 | 1.45 | 1.11 | 1.11 |
| 12 | Mid Nasal Sidewall | 48.45 | 1746.27 | 1.64 | 2.30 | 2.25 |
| 13 | Lower Nasal Sidewall | 46.70 | 1969.20 | 1.58 | 2.59 | 2.52 |
| 14 | ALA | 51.57 | 1941.03 | 1.74 | 2.56 | 2.49 |
| 15 | Columella | 44.17 | 1160.76 | 1.49 | 1.53 | 1.56 |
| 16 | Philtrum | 48.07 | 1196.17 | 1.63 | 1.58 | 1.56 |
| 17 | Nasal Tip | 59.77 | 1288.00 | 1.68 | 1.70 | 1.67 |
| 18 | Soft Triangle | 51.44 | 1477.47 | 1.74 | 1.95 | 1.91 |
| 19 | Malar | 45.73 | 1040.46 | 1.55 | 1.37 | 1.36 |
| 20 | Lower Cheek | 44.66 | 1291.26 | 1.51 | 1.70 | 1.67 |
| 21 | Upper Lip | 62.62 | 1433.49 | 2.12 | 1.89 | 1.87 |
| 22 | Nasolabial Fold | 48.91 | 1250.18 | 1.65 | 1.65 | 1.63 |
| 23 | Marionette Fold | 40.87 | 989.41 | 1.38 | 1.30 | 1.29 |
| 24 | Chin | 45.37 | 1165.77 | 1.53 | 1.54 | 1.52 |
| 25 | Temporal | 42.18 | 1245.77 | 1.43 | 1.64 | 1.61 |
| 26 | Preauricular | 37.53 | 1251.84 | 1.27 | 1.65 | 1.61 |
| 27 | Upper Helix | 42.29 | 1074.90 | 1.43 | 1.42 | 1.40 |
| 28 | Mid Helix | 56.89 | 1052.43 | 1.92 | 1.39 | 1.39 |
| 29 | Conchal Bowl | 32.92 | 999.14 | 1.11 | 1.32 | 1.29 |
| 30 | Earlobe | 44.65 | 1191.90 | 1.51 | 1.57 | 1.55 |
| 31 | Lower Medial Eyelid | 48.01 | 868.39 | 1.62 | 1.14 | 1.15 |
| 32 | Anterior Neck | 40.69 | 1237.68 | 1.38 | 1.63 | 1.60 |
| 33 | Lateral Neck | 32.89 | 1440.71 | 1.11 | 1.90 | 1.84 |
| 34 | Posterior Scalp | 35.36 | 1443.86 | 1.20 | 2.27 | 1.85 |
| 35 | Posterior Auricular | 29.57 | 1724.21 | 1.00 | 1.78 | 2.19 |
| 36 | Temporal Scalp | 33.25 | 1349.52 | 1.12 | 1.51 | 1.73 |
| 37 | Anterior Scalp | 37.54 | 1146.13 | 1.27 | 1.21 | 1.48 |
| 38 | Vertex | 37.42 | 919.45 | 1.27 | 1.58 | 1.20 |
| | Maximum Value | 62.62 | 1969.20 | 2.12 | 2.58 | 2.52 |
| | Minimum Value | 29.57 | 758.85 | 1.00 | 1.00 | 1.00 |

* Normalized ratios, calculated by dividing each thickness by the thinnest value in its category. (1) eRT: relative thickness of the epidermis. (2) dRT: relative thickness of the dermis. (3) RT: relative thickness.
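The normalization described in the footnote can be illustrated with a short Python sketch. The function name `relative_thickness` and the four-row subset are chosen here for illustration only. Treating RT as the total (epidermis + dermis) thickness divided by the smallest total is an assumption inferred from spot-checking a few rows, since the footnote does not spell it out.

```python
# Rows copied from Table 3: region -> (epidermal, dermal) thickness in micrometres.
# Regions 4 and 35 are included because they hold the table-wide minima.
ROWS = {
    0: (44.70, 1200.93),   # Upper Medial Forehead
    4: (40.31, 758.85),    # Upper Medial Eyelid (thinnest dermis and thinnest total)
    13: (46.70, 1969.20),  # Lower Nasal Sidewall
    35: (29.57, 1724.21),  # Posterior Auricular (thinnest epidermis)
}

def relative_thickness(rows):
    """Return region -> (eRT, dRT, RT), each rounded to two decimals."""
    min_epi = min(e for e, _ in rows.values())
    min_derm = min(d for _, d in rows.values())
    # Assumption: RT normalizes the summed thickness by the smallest sum.
    min_total = min(e + d for e, d in rows.values())
    return {
        region: (round(e / min_epi, 2),
                 round(d / min_derm, 2),
                 round((e + d) / min_total, 2))
        for region, (e, d) in rows.items()
    }

ratios = relative_thickness(ROWS)
print(ratios[0])  # (1.51, 1.58, 1.56), matching region 0 in Table 3
```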
Table 4. Proposed 31 regions.
| Region | Location | Thickness (μm) |
| --- | --- | --- |
| 0 | Upper Medial Forehead | 1245.63 |
| 1 | Right Upper Lateral Forehead | 1297.30 |
| 2 | Left Upper Lateral Forehead | 1297.30 |
| 3 | Lower Medial Forehead | 1221.88 |
| 4 | Right Eye | - |
| 5 | Left Eye | - |
| 6 | Right Temporal Lobe | 1287.96 |
| 7 | Left Temporal Lobe | 1287.96 |
| 8 | Right Lower Lateral Forehead | 1212.20 |
| 9 | Left Lower Lateral Forehead | 1212.20 |
| 10 | Glabella | 1386.11 |
| 11 | Upper Nasal Dorsum | 1527.60 |
| 12 | Right Mid Nasal Sidewall | 1794.71 |
| 13 | Left Mid Nasal Sidewall | 1794.71 |
| 14 | Right Lower Nasal Sidewall | 2015.89 |
| 15 | Left Lower Nasal Sidewall | 2015.89 |
| 16 | Lower Nasal Dorsum | 1496.12 |
| 17 | Nasal Tip | 1496.12 |
| 18 | Philtrum | 1496.12 |
| 19 | Right Upper Lip | 1496.12 |
| 20 | Left Upper Lip | 1496.12 |
| 21 | Lower Nasal Sidewall | 2015.89 |
| 22 | Right Nasolabial Fold | 1299.08 |
| 23 | Left Nasolabial Fold | 1299.08 |
| 24 | Chin | 1211.14 |
| 25 | Right Marionette Fold | 1030.28 |
| 26 | Left Marionette Fold | 1030.28 |
| 27 | Right Malar | 1086.20 |
| 28 | Left Malar | 1086.20 |
| 29 | Right Lower Cheek | 1335.91 |
| 30 | Left Lower Cheek | 1335.91 |
Table 5. Disease information according to frequency band.
| Parameter | Frequency | Description |
| --- | --- | --- |
| ULF | ≤0.003 Hz | Associated with acute heart attack and arrhythmias |
| VLF | 0.0033 Hz–0.04 Hz | Dependent on the renin–angiotensin system |
| LF | 0.04 Hz–0.15 Hz | Controlled by the sympathetic and parasympathetic nervous systems |
| HF | 0.15 Hz–0.4 Hz | Heart rate variability related to the respiratory system, called respiratory arrhythmia |
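Table 6. The median value of rBS and rBS ranking of regions.
| Region | rBS | Rank |
| --- | --- | --- |
| 0 | 2.88 | 4 |
| 1 | 1.83 | 9 |
| 2 | 1.81 | 10 |
| 3 | 2.98 | 3 |
| 4 | 1.70 | 12 |
| 5 | 1.32 | 22 |
| 6 | 1.36 | 19 |
| 7 | 1.24 | 25 |
| 8 | 1.97 | 7 |
| 9 | 1.87 | 8 |
| 10 | 3.33 | 2 |
| 11 | 1.98 | 6 |
| 12 | 1.16 | 28 |
| 13 | 1.14 | 30 |
| 14 | 1.33 | 21 |
| 15 | 1.07 | 31 |
| 16 | 1.43 | 18 |
| 17 | 1.36 | 19 |
| 18 | 1.17 | 26 |
| 19 | 1.17 | 26 |
| 20 | 1.16 | 28 |
| 21 | 1.44 | 17 |
| 22 | 1.45 | 16 |
| 23 | 1.49 | 14 |
| 24 | 1.46 | 15 |
| 25 | 1.30 | 23 |
| 26 | 1.27 | 24 |
| 27 | 3.64 | 1 |
| 28 | 2.68 | 5 |
| 29 | 1.81 | 10 |
| 30 | 1.65 | 13 |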
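The ranks in Table 6 follow from the median rBS values if regions are ordered from highest to lowest rBS and tied regions share the best rank. That tie rule is an assumption inferred from the tied entries in the table, not a rule stated in the text. A minimal Python sketch, with the hypothetical helper name `competition_rank`:

```python
# Median rBS per region 0..30, copied from Table 6.
MEDIAN_RBS = [
    2.88, 1.83, 1.81, 2.98, 1.70, 1.32, 1.36, 1.24, 1.97, 1.87, 3.33,
    1.98, 1.16, 1.14, 1.33, 1.07, 1.43, 1.36, 1.17, 1.17, 1.16, 1.44,
    1.45, 1.49, 1.46, 1.30, 1.27, 3.64, 2.68, 1.81, 1.65,
]

def competition_rank(scores):
    """Rank in descending order; tied scores share the best rank (1, 2, 2, 4, ...)."""
    return [1 + sum(other > s for other in scores) for s in scores]

ranks = competition_rank(MEDIAN_RBS)

# Regions ordered by rank; the first five are the TOP-5 regions.
top5 = sorted(range(len(ranks)), key=ranks.__getitem__)[:5]
print(top5)  # [27, 10, 3, 0, 28]
```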
Table 7. Thickness and number of pixels in each region.
| Region | Location | Thickness (μm) | # of Pixels | rBS Rank |
| --- | --- | --- | --- | --- |
| 0 | Upper Medial Forehead | 1245.63 | 504 | 4 |
| 1 | Right Upper Lateral Forehead | 1297.30 | 389 | 9 |
| 2 | Left Upper Lateral Forehead | 1297.30 | 473 | 10 |
| 3 | Lower Medial Forehead | 1221.88 | 454 | 3 |
| 4 | Right Eye | - | 865 | 12 |
| 5 | Left Eye | - | 1255 | 22 |
| 6 | Right Temporal Lobe | 1287.96 | 17 | 19 |
| 7 | Left Temporal Lobe | 1287.96 | 414 | 25 |
| 8 | Right Lower Lateral Forehead | 1212.20 | 527 | 7 |
| 9 | Left Lower Lateral Forehead | 1212.20 | 597 | 8 |
| 10 | Glabella | 1386.11 | 775 | 2 |
| 11 | Upper Nasal Dorsum | 1527.60 | 456 | 6 |
| 12 | Right Mid Nasal Sidewall | 1794.71 | 46 | 28 |
| 13 | Left Mid Nasal Sidewall | 1794.71 | 57 | 30 |
| 14 | Right Lower Nasal Sidewall | 2015.89 | 38 | 21 |
| 15 | Left Lower Nasal Sidewall | 2015.89 | 48 | 31 |
| 16 | Lower Nasal Dorsum | 1496.12 | 124 | 18 |
| 17 | Nasal Tip | 1496.12 | 150 | 19 |
| 18 | Philtrum | 1496.12 | 140 | 26 |
| 19 | Right Upper Lip | 1496.12 | 179 | 26 |
| 20 | Left Upper Lip | 1496.12 | 202 | 28 |
| 21 | Lower Nasal Sidewall | 2015.89 | 268 | 17 |
| 22 | Right Nasolabial Fold | 1299.08 | 186 | 16 |
| 23 | Left Nasolabial Fold | 1299.08 | 213 | 14 |
| 24 | Chin | 1211.14 | 990 | 15 |
| 25 | Right Marionette Fold | 1030.28 | 312 | 23 |
| 26 | Left Marionette Fold | 1030.28 | 408 | 24 |
| 27 | Right Malar | 1086.20 | 794 | 1 |
| 28 | Left Malar | 1086.20 | 955 | 5 |
| 29 | Right Lower Cheek | 1335.91 | 840 | 10 |
| 30 | Left Lower Cheek | 1335.91 | 1174 | 13 |

Correlation coefficient (Thickness, rBS rank): 0.50
Correlation coefficient (# of pixels, rBS rank): −0.53
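The two correlation coefficients can be recomputed from the columns of Table 7 with a plain Pearson correlation, as in the Python sketch below. Two choices here are assumptions, because the text does not state how the authors handled them: the eye regions, which have no thickness value, are dropped from the thickness correlation only, and the coefficient is Pearson's r computed on the raw rank numbers. The helper name `pearson` is hypothetical.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)

# (thickness in micrometres or None, number of pixels, rBS rank) for regions 0..30,
# copied from Table 7.
ROWS = [
    (1245.63, 504, 4), (1297.30, 389, 9), (1297.30, 473, 10), (1221.88, 454, 3),
    (None, 865, 12), (None, 1255, 22), (1287.96, 17, 19), (1287.96, 414, 25),
    (1212.20, 527, 7), (1212.20, 597, 8), (1386.11, 775, 2), (1527.60, 456, 6),
    (1794.71, 46, 28), (1794.71, 57, 30), (2015.89, 38, 21), (2015.89, 48, 31),
    (1496.12, 124, 18), (1496.12, 150, 19), (1496.12, 140, 26), (1496.12, 179, 26),
    (1496.12, 202, 28), (2015.89, 268, 17), (1299.08, 186, 16), (1299.08, 213, 14),
    (1211.14, 990, 15), (1030.28, 312, 23), (1030.28, 408, 24), (1086.20, 794, 1),
    (1086.20, 955, 5), (1335.91, 840, 10), (1335.91, 1174, 13),
]

# Assumption: regions without a thickness value are skipped for this coefficient.
with_thickness = [(t, rank) for t, _, rank in ROWS if t is not None]
r_thickness = pearson([t for t, _ in with_thickness],
                      [rank for _, rank in with_thickness])

# All 31 regions have a pixel count, so none are skipped here.
r_pixels = pearson([p for _, p, _ in ROWS], [rank for _, _, rank in ROWS])

print(round(r_thickness, 2), round(r_pixels, 2))
```

Under these assumptions the two values should come out close to the 0.50 and −0.53 reported in the table. Because a lower rank number means a better rBS, the positive sign for thickness indicates that thicker regions tend to rank worse, and the negative sign for pixel count indicates that larger regions tend to rank better.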
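Table 8. Experimental results for TOP-5, Face + Skin, and BOT-5.
| Metric | Algorithm | ROI | UBFC | LGI-PPGI |
| --- | --- | --- | --- | --- |
| MAE | POS | TOP-5 | 1.85 | 3.61 |
| MAE | POS | Face + Skin | 1.87 | 4.04 |
| MAE | POS | BOT-5 | 7.26 | 6.21 |
| MAE | CHROM | TOP-5 | 1.5 | 2.93 |
| MAE | CHROM | Face + Skin | 2.67 | 4.04 |
| MAE | CHROM | BOT-5 | 5.9 | 10.71 |
| PCC | POS | TOP-5 | 0.80 | 0.34 |
| PCC | POS | Face + Skin | 0.85 | 0.30 |
| PCC | POS | BOT-5 | 0.29 | 0.30 |
| PCC | CHROM | TOP-5 | 0.87 | 0.59 |
| PCC | CHROM | Face + Skin | 0.80 | 0.38 |
| PCC | CHROM | BOT-5 | 0.36 | 0.35 |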
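For readers unfamiliar with the MAE metric in Table 8, the sketch below shows how a mean absolute error between estimated and reference heart rates is typically computed. It is a generic illustration in Python: the function name `mean_absolute_error` and the BPM values are hypothetical and do not come from the UBFC-rPPG or LGI-PPGI experiments.

```python
def mean_absolute_error(estimated_bpm, reference_bpm):
    """Average absolute difference between paired BPM estimates and references."""
    if not estimated_bpm or len(estimated_bpm) != len(reference_bpm):
        raise ValueError("inputs must be non-empty and of equal length")
    total = sum(abs(e - r) for e, r in zip(estimated_bpm, reference_bpm))
    return total / len(estimated_bpm)

# Hypothetical per-window heart rates (beats per minute).
estimated = [72.0, 75.0, 80.0]
reference = [70.0, 76.0, 80.0]
mae = mean_absolute_error(estimated, reference)
print(mae)  # 1.0
```

A lower MAE means the estimated heart rate tracks the reference more closely, which is the sense in which the TOP-5 rows of Table 8 compare favorably with the BOT-5 rows.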
Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Share and Cite

MDPI and ACS Style

Kim, D.-Y.; Lee, K.; Sohn, C.-B. Assessment of ROI Selection for Facial Video-Based rPPG. Sensors 2021, 21, 7923. https://doi.org/10.3390/s21237923
