Review

Remote Monitoring of Vital Signs in Diverse Non-Clinical and Clinical Scenarios Using Computer Vision Systems: A Review

by Fatema-Tuz-Zohra Khanam 1,*, Ali Al-Naji 1,2 and Javaan Chahl 1,3
1 School of Engineering, University of South Australia, Mawson Lakes, SA 5095, Australia
2 Electrical Engineering Technical College, Middle Technical University, Al Doura, Baghdad 10022, Iraq
3 Joint and Operations Analysis Division, Defence Science and Technology Group, Melbourne, VIC 3207, Australia
* Author to whom correspondence should be addressed.
Appl. Sci. 2019, 9(20), 4474; https://doi.org/10.3390/app9204474
Submission received: 29 August 2019 / Revised: 17 October 2019 / Accepted: 18 October 2019 / Published: 22 October 2019
(This article belongs to the Special Issue Contactless Vital Signs Monitoring)

Abstract:
Techniques for noncontact measurement of vital signs using camera imaging technologies have been attracting increasing attention. For noncontact physiological assessment, computer vision-based methods appear to be an advantageous approach that could be robust, hygienic, reliable, safe, cost-effective and suitable for long-distance and long-term monitoring. In addition, video techniques allow measurements from multiple individuals opportunistically and simultaneously in groups. This paper aims to explore the progress of the technology from controlled clinical scenarios with fixed monitoring installations and controlled lighting, towards uncontrolled environments, crowds and moving sensor platforms. We focus on the diversity of applications and scenarios being studied in this topic. From this review it emerges that automatic selection of multiple regions of interest (ROIs), removal of noise caused by illumination variations and motion artefacts, simultaneous monitoring of multiple people, long-distance detection, multi-camera fusion and accepted publicly available datasets are topics that still require research to enable the technology to mature into many real-world applications.

1. Introduction

The monitoring of human vital signs, for example respiratory rate (RR), blood oxygen saturation (SpO2), heart rate (HR), heart rate variability (HRV) and blood pressure (BP), plays a significant role in modern clinical care of patients in hospitals and at home [1]. Applications include medical diagnosis, training programs, fitness assessment, lie detection and stress measurement [2]. There are various instruments for measuring these vital signs, such as electrocardiograms (ECGs), pulse oximeters, nasal thermocouples, respiratory belt transducers and piezoelectric transducers [3]. These instruments require direct physical contact with the human body as they use contact-based sensor modalities, straps, probes or electrodes [4]. These instruments may cause skin infection, injury or harmful reactions in patients, especially premature babies, aged people or burn victims who have fragile skin [5]. Moreover, there is a risk of entanglement or strangulation of infants who are attached to monitors by means of wires and leads [6]. These instruments are also not appropriate for long-term monitoring as they may cause discomfort, irritation and a cumulative risk of fungal and bacterial infection [7]. In addition, a reduced amplitude of chest wall expansion can affect the respiratory rate (RR) input signal from the impedance lead [8]. Furthermore, cost is an important issue as the monitoring electrodes and leads are only intended and certified for a single use, followed by disposal [9]. Placement of the sensors with self-adhesive pads leads to difficulties with wet, oily, dirty or hairy subjects, which is a limitation of these technologies in emergency situations [10]. Accuracy is another issue with conventional contact methods, since they are sensitive to artefacts produced by the subject's movement [11]. Therefore, to minimise these limitations, there is a need for an alternative method where vital signs can be measured without any physical contact.
As presented in Figure 1, there are several noncontact approaches, based on magnetic induction, the Doppler effect, thermal imaging and video camera imaging, which can be effective alternative means of monitoring vital signs with acceptable reliability and accuracy [12]. These methods depend on the observation of physical and physiological variations including skin colour, temperature, impedance changes, head motion, arterial pulse motion, and importantly, thoracic and abdominal motion due to the activity of both the respiratory and cardiovascular systems. Magnetic induction-based methods can detect the impedance changes caused by blood and air volume variations due to the mechanical action of the heart, diaphragm and thorax. The basic principle is to induce eddy currents in the tissue and to measure the re-induced magnetic field externally; the impedance changes can then be observed remotely to extract vital signs [13]. The method uses a simple arrangement based on multiple coils [14] or a single coil [15] integrated into a mattress [16], bed [17] or seat [18]. However, the method is highly susceptible to relative movement between the coil and the body.
The Doppler effect is an active noncontact method that can detect subtle chest movements due to cardiorespiratory activity. In this method, vital signs are extracted using Doppler radar [19,20] or laser sensors [21] together with digital signal processing (DSP) techniques [22], where the phase shift between the transmitted waves and the waves reflected from a region of interest (ROI) is calculated. There are three types of Doppler-based methods: Doppler with electromagnetic waves [23,24], lasers [25,26] and ultrasonics [27,28,29].
Thermal imaging [30,31] is a passive noncontact method that can detect the radiation emitted from particular parts of the human body in the infrared (IR) range of the electromagnetic spectrum to measure the physiological signal using a thermal camera [32,33,34]. Thermal imaging-based methods extract vital signs by measuring temperature changes around the nostril area [35,36,37,38] as well as heat differences due to pulsating blood flow in the main superficial arteries at various regions, such as the carotid artery in the neck [39,40] and the temporal artery in the forehead [41,42]. However, both Doppler- and thermal imaging-based approaches are susceptible to noise and motion artefacts and constrain the movement of the subjects due to the high cost of the sensor, preventing saturation sampling of the environment. Their relatively low resolution limits the detection range and specificity to one subject. Moreover, these methods need an exposed ROI and specialised hardware, making them costly [4]. They are also constrained to short-term monitoring and to monitoring a single subject at a time. Additionally, Doppler-based methods may have biological effects on humans [43], with unknown future population risks if broadly adopted.
Digital cameras offer high resolution in space (number of pixels per degree), time (number of frames per second), intensity (number of bits per pixel) and spectrum (at least three visible channels, with hyperspectral options increasingly common), all due to consumer market demand. Furthermore, a large base of research assets exists for processing imagery, much of it free for use, for example, OpenCV [44]. Flexibility in visible light optical design, offering panoramic, microscopic and telescopic solutions in well-integrated commercial product families, allows diverse measurement scenarios. Tailored fields of view allow analysis of multiple ROIs in parallel, or in series based on availability. The mass market has led to low cost [4,12] and affordable optics that can be used in almost any conceivable application scenario [9].
Video camera imaging is a passive contactless method where video cameras are used to extract different physiological signals from several regions of the human body, exploiting two principles. The first principle relies on skin colour variations due to cardiorespiratory activity, known as photoplethysmography (PPG). Vital signs are measured by exploiting variations in the reflectance properties of human skin in video, which cause variations in brightness values across sequences of images. The second principle depends on cyclic body motion owing to cardiorespiratory activity, in techniques that can be broadly characterised as motion-based methods. Motion in the regions of the head, the arterial pulse, and the thoracic and abdominal region is exploited in these methods. For noncontact physiological assessment, camera imaging-based methods seem to be a promising approach since they are robust, reliable, safe, cost-effective, suitable for long-distance and long-term monitoring, and capable of simultaneously detecting multiple people [12].
Camera imaging-based methods have been attracting increasing attention in the literature. This paper aims to explore the progress of video camera imaging-based technology from controlled clinical scenarios with fixed monitoring installations and controlled lighting, towards uncontrolled environments, crowds and moving sensor platforms. We focus on the diversity of applications and scenarios being studied in this topic. We emphasise visible light sensing, since these cameras represent the largest installed base, the lowest costs, the highest rate of improvement and the greatest opportunity to insert new capability into existing devices. First, we discuss studies of motion- and colour-based methods. Then, we discuss the considerations and scenarios appropriate to colour-based methods, for example, in the presence of motion artefacts, illumination variation, different sensors, different subjects, different vital signs, multiple ROIs, long distance and multiple persons. Additionally, potential applications of iPPG in both clinical and non-clinical sectors are described. We then consider research gaps and challenges of existing studies that may inform researchers who wish to further progress the techniques and applications.
Several review papers based on video camera imaging have been published in recent years. McDuff et al. [45] presented a review of the work on remote PPG imaging using digital cameras. Sun et al. [7] introduced PPG measurement techniques from contact to noncontact and from points to images. Sikdar et al. [46] conducted a methodological review of contactless vision-guided pulse rate estimation. Hassan et al. [47] investigated both iPPG and ballistocardiography (BCG) estimation based on digital cameras. Rouast et al. [5] provided a technical literature review on remote heart rate measurement using low-cost RGB face video. Al-Naji et al. [12] provided a broad literature survey of remote cardiorespiratory monitoring, including the Doppler effect, thermal imaging and video camera imaging. Zaunseder et al. [48] reviewed the technical background of cardiovascular assessment using iPPG. However, a thorough review of iPPG in diverse scenarios is still needed, covering issues such as motion artefacts, illumination variations, alternate sensor modalities, different subjects, different vital signs, multiple ROIs, long distance and multiple persons. Moreover, there is no review paper that focuses on applications of iPPG considering both clinical and non-clinical sectors.
To fill these gaps, this paper provides a comprehensive review of the recent advances in iPPG studies, focusing on diverse clinical and non-clinical scenarios. We compare different techniques to give a clear summary of the state of the field. We describe potential applications of iPPG in the clinical and non-clinical sectors separately to show the value of the iPPG technique in real-world applications. Finally, we present several issues and scenarios for future studies.

2. Video Camera Imaging-Based Method

2.1. Basic Framework

Using video camera imaging, a series of image and signal processing techniques are required to extract vital signs from the image. Figure 2 shows the basic framework, which includes data acquisition, ROI detection, raw signal extraction, noise artefact removal and vital sign extraction.

2.1.1. Data Acquisition

First, data, i.e., video of a skin area of the human body, is collected using an imaging sensor such as a digital camera, webcam, smartphone, Microsoft Kinect or unmanned aerial vehicle (UAV), as depicted in Figure 3. A dedicated light source or ambient light can be used for illumination. Frame rates in the literature range from as low as 10 fps up to 60 fps.
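As a concrete illustration, the short Python sketch below grabs a clip with OpenCV; the device index, frame count and 30 fps request are illustrative choices, not values fixed by the text.

```python
import cv2

cap = cv2.VideoCapture(0)              # open the default camera (index 0)
cap.set(cv2.CAP_PROP_FPS, 30)          # request 30 fps, within the 10-60 fps range above

frames = []
while len(frames) < 300:               # roughly 10 s of video at 30 fps
    ok, frame = cap.read()             # frame is an HxWx3 BGR uint8 array
    if not ok:
        break
    frames.append(frame)
cap.release()
```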

2.1.2. ROI Detection

After collecting video, regions of interest (ROIs) such as the face, forehead, chest and palm are detected within the video frames, either manually or automatically.
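A minimal sketch of automatic ROI detection, using OpenCV's bundled Haar-cascade (Viola-Jones) face detector; a real pipeline would also track the detected ROI across frames.

```python
import cv2

# Load the frontal-face cascade that ships with opencv-python.
cascade = cv2.CascadeClassifier(
    cv2.data.haarcascades + "haarcascade_frontalface_default.xml")

def detect_face_roi(frame):
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    faces = cascade.detectMultiScale(gray, scaleFactor=1.1, minNeighbors=5)
    if len(faces) == 0:
        return None
    x, y, w, h = faces[0]              # take the first detection
    return frame[y:y + h, x:x + w]     # rectangular facial ROI
```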

2.1.3. Raw Signal Extraction

Then, the raw signals are extracted from the selected ROI by calculating the spatial average of the pixel values in the ROI for each frame using Equation (1). This describes an intensity-based method that uses spatial averaging. The purpose is to average out the camera noise contained in each single pixel, thus improving the signal-to-noise ratio.
$$i_R(t),\ i_G(t),\ i_B(t) = \frac{\sum_{(x,y) \in ROI} I(x,y,t)}{|ROI|} \qquad (1)$$

where $i_R(t)$, $i_G(t)$ and $i_B(t)$ are the three source signals from the red, green and blue components, respectively, $I(x,y,t)$ is the brightness (pixel value) at image location $(x,y)$ at time $t$, and $|ROI|$ is the size of the selected ROI.
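Equation (1) translates directly into a few lines of Python; the BGR channel order below is an OpenCV convention, not part of the equation.

```python
import numpy as np

def raw_rgb_traces(roi_frames):
    """roi_frames: list of HxWx3 BGR uint8 ROI images (OpenCV order)."""
    # Spatially average all ROI pixels per channel, for each frame.
    traces = np.array([roi.reshape(-1, 3).mean(axis=0) for roi in roi_frames])
    i_b, i_g, i_r = traces.T           # split the BGR columns into iB, iG, iR
    return i_r, i_g, i_b
```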

2.1.4. Noise Artefact Removal

The raw signal may contain unwanted noise due to factors such as subject movement, illumination changes, camera movement and skin tone. To remove unwanted noise from the raw signal, various signal processing techniques may be applied, such as low-pass filtering, bandpass filtering, adaptive bandpass filtering, signal decomposition, blind source separation (BSS) and model-based methods.
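For instance, a bandpass step might look like the following scipy sketch; the 0.75-4 Hz band (roughly 45-240 beats/min) and the filter order are common choices rather than values mandated here.

```python
from scipy.signal import butter, filtfilt

def bandpass(signal, fs, low=0.75, high=4.0, order=4):
    # Butterworth bandpass keeping the cardiac band, applied zero-phase.
    b, a = butter(order, [low, high], btype="band", fs=fs)
    return filtfilt(b, a, signal)
```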

2.1.5. Vital Sign Extraction

Finally, vital signs are extracted using frequency analysis or peak detection. For frequency analysis, a signal that contains a distinct periodicity is converted to the frequency domain using a discrete Fourier transform. The fast Fourier transform (FFT) is generally applied to find the dominant frequency, $F_s$. The discrete cosine transform (DCT), Welch's method or the short-time Fourier transform (STFT) can also be used. When using a peak detection algorithm, the number of peaks, $N_s$, is counted over the processing period $T$ (in seconds). Heart rate and respiratory rate per minute can be calculated as follows:
$$\mathrm{HR\ or\ RR} = 60 \times F_s \qquad (2)$$
$$\mathrm{HR\ or\ RR} = 60 \times (N_s / T) \qquad (3)$$
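Both extraction routes can be sketched as follows, implementing Equations (2) and (3); the band limits and the minimum peak spacing are illustrative assumptions.

```python
import numpy as np
from scipy.signal import find_peaks

def rate_from_fft(signal, fs, fmin=0.75, fmax=4.0):
    freqs = np.fft.rfftfreq(len(signal), d=1.0 / fs)
    power = np.abs(np.fft.rfft(signal)) ** 2
    band = (freqs >= fmin) & (freqs <= fmax)   # operational frequency band
    f_s = freqs[band][np.argmax(power[band])]  # dominant frequency F_s
    return 60.0 * f_s                          # Equation (2)

def rate_from_peaks(signal, fs):
    peaks, _ = find_peaks(signal, distance=fs * 0.25)  # >= 0.25 s apart
    T = len(signal) / fs                       # processing period in seconds
    return 60.0 * len(peaks) / T               # Equation (3)
```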

2.2. Motion-Based Methods

Cardiorespiratory activity causes subtle motion in various parts of the human body that can be measured from video to extract vital signs. Numerous researchers have introduced different techniques to monitor vital signs using information extracted from motion. Nakajima et al. [49] proposed a contactless method based on optical flow analysis to extract RR from video of whole-body motion captured by a CCD (charge-coupled device) camera. Another optical flow-based respiratory monitoring system was introduced by Frigola et al. [50], using videos of a subject's chest movement recorded by a video camera. Optical flow-based techniques have been found to be susceptible to motion artefacts caused by the subject's movement, ambient light and an unclear ROI. Computational complexity is also a consideration for optical flow-based analysis compared to other processing options.
Several studies have been published based on blind source separation (BSS) using either principal component analysis (PCA) or independent component analysis (ICA). Balakrishnan et al. [51] proposed a novel noncontact method based on head motion, using facial video recorded by a digital camera to measure heart rate. In their proposed method, the Viola-Jones (V-J) face detector [52] and the Kanade-Lucas-Tomasi (KLT) [53] tracking algorithm were exploited for detecting the face, extracting the ROI and tracking the ROI feature points based on the good feature tracking (GFT) method. Then, using a Butterworth filter, the tracked points were temporally filtered, and principal component analysis (PCA) was exploited to remove artefacts and recover the physiological signal. Finally, the pulse rate was extracted by means of a simple peak detection algorithm followed by a fast Fourier transform (FFT). To monitor HR, Shan et al. [54] introduced another noncontact method using head motion captured by a smartphone camera, based on independent component analysis (ICA) rather than PCA. These methods are susceptible to motion artefacts, as they considered only stationary subjects without any internal or external movement. To mitigate these limitations, an improved head motion-based method to extract HR using a webcam was designed by Irani et al. [55], based on the discrete cosine transform (DCT) and a moving average filter, considering various facial expressions and head poses. Haque et al. [56] further improved this technique by integrating both the GFT method and the supervised descent method (SDM), with training against the MAHNOB-HCI (human-computer interaction) database with moving subjects. Lomaliza et al. [57] proposed a method based on tracking both background and facial features to solve the hand-shaking problem with a smartphone camera. This method was further enhanced in [58] by exploiting a system with two cameras in a smartphone, where the front camera tracked facial features and the rear camera tracked background features.
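A compressed sketch of that head-motion pipeline (feature detection, KLT tracking, temporal filtering, PCA) is given below; it assumes the bandpass() helper from the Section 2.1.4 sketch and, for brevity, picks the first principal component, whereas Balakrishnan et al. select the component with the most periodic spectrum.

```python
import cv2
import numpy as np
from sklearn.decomposition import PCA

def head_motion_trajectories(gray_frames):
    # Find trackable facial feature points in the first frame.
    pts = cv2.goodFeaturesToTrack(gray_frames[0], maxCorners=100,
                                  qualityLevel=0.01, minDistance=5)
    traj = []
    for prev, cur in zip(gray_frames, gray_frames[1:]):
        pts, status, _ = cv2.calcOpticalFlowPyrLK(prev, cur, pts, None)
        traj.append(pts[:, 0, 1])      # vertical coordinate of each point
    return np.array(traj)              # shape: (frames - 1, n_points)

def pulse_component(traj, fs):
    # Temporally filter each trajectory, then separate sources with PCA.
    filtered = np.apply_along_axis(bandpass, 0, traj, fs, 0.75, 5.0)
    comps = PCA(n_components=5).fit_transform(filtered)
    return comps[:, 0]                 # illustrative component choice
```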
Some researchers have used video magnification to extract vital signs from motion. He et al. [59] introduced a contactless method combining both Eulerian video magnification (EVM) [60] and 2-Gaussian curve modelling for extracting pulse signals from a subject's wrist using a digital camera. A study by Al-Naji et al. [61] used video magnification and a frame subtraction method to remotely measure cardiac activity (heart rate, pulse width and total cycle length) from head motion. Another study by Al-Naji et al. [62] presented a noncontact respiratory monitoring scheme for detecting and measuring respiratory rates as well as respiratory cycle timing parameters using a video camera, from the movement of the chest or a blanket draped over an infant subject in various sleeping postures, based on two processing techniques, as shown in Figure 4. The first technique magnified the motion resulting from chest movement based on 8-level wavelet pyramid decomposition and 5th-order temporal elliptical filtering. The second technique measured respiratory rates using motion detection exploiting frame subtraction, local contrast enhancement, binarisation, morphological and masked filtering, white area detection and logical matrix calculation, where the mean distance between ones in the logical matrix was used to calculate RR. Although the system was an effective solution to the unclear ROI problem, it still has some limitations. It considered only a single vital sign, i.e., RR, and a single infant, which limits the real applicability of the system. In [63], a contactless real-time monitoring system was introduced for measuring respiratory rates and detecting apnoea by exploiting a Microsoft Kinect v2 sensor, as presented in Figure 3c, from the movement of the thorax and abdomen in different sleeping positions and under various lighting conditions, including dark environments. They used an improved motion magnification scheme based on the Lanczos resampling method, wavelet pyramid decomposition, temporal bandpass filtering and image denoising to magnify the input signal. Frame subtraction was used to calculate the respiratory rate. Nevertheless, the system suffered from motion artefacts, short range and a limited number of viable ROIs. Furthermore, they only considered respiratory activity and apnoea, and did not consider other vital signs. To mitigate this limitation, an improved contactless system was presented to measure HR and RR and to sense irregular cardiopulmonary function such as bradycardia, tachycardia, bradypnoea, tachypnoea and central apnoea, using the signal from the thoracic-abdominal region based on image sequences taken by the Microsoft Kinect v2 sensor, with processing that considered an unclear ROI, various illumination conditions and different sleeping postures [64]. The basic block diagram of the proposed system is presented in Figure 5. An efficient motion magnification technique (EMMS) [65] was used to magnify the input data to make the movement apparent. Then, to calculate HR, an intensity-based technique was used, including signal decomposition, blind source separation, spectral analysis, filtering and peak detection. To calculate RR, they used a frame subtraction technique, binarisation, filtering, white area detection and binary matrix calculation. If any irregularity was detected, an alarm would be sent to a carer to notify them of the irregularity. However, the detection range was limited to 4.5 m and the number of ROIs was also limited. Furthermore, system failure may occur in the case of a fully covered subject or if the subject should lie on their stomach.
The motion-based method was beneficial in this case, since the method is robust to illumination variance, skin tone and unclear ROI. However, the main limitation of this method is the dependency on motion features to extract the physiological signals. As a result, the technique may be highly vulnerable to voluntary motion variations such as different facial expressions, walking and talking by the subject. Consequently, determining vital signs during voluntary motion may reduce the consistency and reliability of the method, and remains a research problem to be overcome.

2.3. Colour-Based Method

Photoplethysmography (PPG) is a non-invasive, low-cost, passive, optical technique first proposed by Hertzman et al. in 1937 [66]. It can monitor three vital signs: HR, RR and SpO2. PPG is used extensively in biomedical, clinical and non-clinical fields due to its simple design and relatively low cost. PPG measures variations in the optical properties of transmitted or reflected light from the human skin caused by blood volume changes during cardiorespiratory activity. Blood absorbs light more than the adjacent tissue, which causes subtle optical property variations of the skin because of the haemoglobin in blood. Generally, PPG can function in either a transmissive or a reflective mode; however, the first mode is limited to regions such as the ear lobes and fingertips. In the contact PPG technique, a dedicated light source is required to illuminate a part of the body, and an optical sensor or photodetector attached to the skin is needed to sense the optical properties of the skin. Imaging photoplethysmography (iPPG) is essentially remote or non-contact PPG, in which a camera replaces the skin-contact PPG sensor and illuminator. To detect PPG signals remotely, many researchers have used a video camera as the optical sensor. Some studies have used a dedicated light source and others have used just ambient light. Nevertheless, a dedicated light source adds to the experimental hardware setup and starts to become intrusive to the participant, particularly for infants.
As shown in Figure 6, the iPPG reflection model mainly consists of three parts: a light source, a patch of human skin containing pulsating blood, and a video camera. The light source can be a dedicated source or ambient light. When the light source illuminates human skin, subtle colour changes can be observed in the videos captured by the camera. In the absence of illumination variations and motion artefacts, these colour changes denote the blood volume changes in the microvascular tissue bed under the skin, because pulsatile blood flow varies over each cardiac cycle. Nevertheless, illumination variations and motion artefacts can also cause intensity and spectral composition variations. Thus, the skin area observed by the camera shows colour variation arising from motion-induced and illumination-induced intensity/specular changes and from pulse-induced subtle colour changes. It is assumed that the spectral composition of the light is fixed. The variation of light intensity depends on the distances from the light source to the skin and from the skin to the camera, and on the specific geometry of the situation.
Based on the dichromatic model [67], the light reflected from the skin can be presented as follows:

$$P_k(t) = I(t)\,\big(R_s(t) + R_d(t)\big) + R_n(t) \qquad (4)$$

where $P_k(t)$ represents the RGB channels of the $k$th skin pixel; $I(t)$ denotes the illumination intensity level; $R_s(t)$ signifies the specular reflection and $R_d(t)$ represents the diffuse reflection. $I(t)$ is modulated by both the specular and diffuse reflections. $R_n(t)$ signifies the quantisation noise of the camera sensor.
The specular reflection of light from the skin surface is like light reflected from a mirror. As specularly reflected light does not penetrate skin tissues, it does not carry any information about the physiological signal. Additionally, the spectral composition of specular reflected light is identical to the light source [67]. On the other hand, diffuse reflected light penetrates the skin, it is absorbed and scattered inside skin tissues and then reflected. Useful information about the physiological signal can be extracted from the diffuse reflected light.
Even though the slight colour variations of the skin are invisible to human eyes, they can be detected by video cameras. From the frame sequences of the video captured by a camera, the PPG signal can be extracted by image and signal processing techniques.
Over the past several years, numerous techniques in iPPG have been proposed under well-controlled situations, such as stationary subjects and stable illumination, as listed in Table 1. Verkruysse et al. [68] used normal ambient light as the light source to measure HR and RR from the human face using a digital camera. In their proposed method, the ROI was selected manually, and the raw PPG signal was calculated per frame by an intensity method: averaging the spatial pixel values of the red, green and blue (RGB) colour channels. The fast Fourier transform (FFT) algorithm was used to calculate the power spectral density of the signal to extract HR. They showed that, under these circumstances, the green channel, with the highest signal-to-noise ratio (SNR), contains the strongest plethysmographic signal, because oxygenated haemoglobin absorbs green light more than blue and red. In addition to HR and RR monitoring, they showed that PPG imaging can be used to characterise regions of high and low pulsatility on facial port wine stains (PWS). They used amplitude and phase maps to show the difference between normal skin and PWS skin. In normal skin, HR pulse amplitudes (G channel) were typically 2 to 4 times higher than in adjacent PWS skin, e.g., 0.75 and 0.25 PV, respectively. However, this method is strongly affected by motion artefacts caused by the subject's movement, as they did not consider methods for noise removal, and their ROI selection was manual. To overcome the limitations of the previous technique, Poh et al. [69] proposed a novel method to measure HR, using blind source separation (BSS) with a webcam and considering three subjects at a time. Moreover, they used the OpenCV face detection algorithm, which is automatic and based on the Viola and Jones (V-J) [52] method. The facial ROI was defined as a rectangular bounding box. However, the proposed approach only considered very small movements and did not consider illumination variations. Additionally, Poh et al. always considered the second component produced by ICA to be the PPG signal. Moreover, they only measured HR in that study, a drawback that was addressed in [70], where they presented an enhanced method to measure RR and heart rate variability (HRV) along with HR. Another ICA-based method was introduced by Pursche et al. [71], where two algorithms were used for extracting the pulse rate, a peak detection algorithm and a power-spectrum analysis algorithm, under both relaxed and active conditions; power-spectrum analysis performed better than peak detection. However, this method is highly susceptible to noise artefacts caused by illumination variations, subject movement and facial expressions. Lewandowska et al. [72] proposed another method based on BSS using principal component analysis (PCA) and reported that PCA is faster than ICA in terms of computation time and can be a good alternative if only HR is extracted. However, this method only considered the stationary case and HR.

3. Different Aspects of Colour-Based Method

3.1. Motion Artefacts

Some studies treat motion artefacts as a serious issue and have proposed various methods to suppress them based on blind source separation (BSS) or model-based approaches, as listed in Table 2.
Poh et al. [69] proposed an automatic and robust method to measure HR using a webcam, considering three subjects at a time. They first introduced a novel method to remove motion artefacts using BSS based on independent component analysis (ICA). In their proposed method, the region of interest (ROI) is first detected automatically using a face tracker. They used the OpenCV face detection algorithm, which is based on the Viola and Jones (V-J) [52] method. The facial ROI is defined as a rectangular bounding box. Then, the ROI is decomposed into the RGB channels and spatially averaged to obtain the raw RGB traces. After that, ICA is applied to the normalised RGB traces to recover three independent source signals.
To show how ICA [73] works as BSS, let the source signals $s(t) = [s_1(t), s_2(t), \ldots, s_K(t)]^T$ be transmitted independently by $K$ sources. The signals $x(t) = [x_1(t), x_2(t), \ldots, x_K(t)]^T$ observed by the various sensors, $i = 1, 2, \ldots, K$, can be written as follows:

$$x(t) = A s(t) + n(t) \qquad (5)$$

where the mixing matrix $A$, with column vectors $a_i$, is unknown and $n(t)$ is additive noise. The observations $x(t)$ are mixtures, i.e., linear combinations, of the source signals $s(t)$. To estimate $A$ and $s(t)$, all source signals are assumed to be statistically independent and non-Gaussian. To reconstruct the source signals as $u(t)$, the following expression can be used:

$$u(t) = W x(t) = W A s(t) \approx s(t) \qquad (6)$$

where $W$ is the demixing matrix, the inverse of matrix $A$.
The joint approximate diagonalization of eigenmatrices (JADE) algorithm is then used to perform ICA. From the three recovered independent source signals, the second component is always chosen as the desired source signal. Finally, the FFT is used on the selected source signal to obtain the power spectrum. The pulse frequency is designated as the frequency that corresponds to the highest power in the spectrum within an operational frequency band. To conduct their experiment, Poh et al. considered 12 participants remaining still and engaging in natural movement. Using Bland-Altman and correlation analysis, they compared the HR extracted from videos recorded by a basic webcam to an FDA-approved finger blood volume pulse (BVP) sensor and achieved high accuracy and correlation even in the presence of movement artefacts. They also measured the HR of multiple persons. The study considered only small movements and did not consider illumination variations. Additionally, they treated the second component produced by ICA as the PPG signal in every case.
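The RGB-trace BSS step can be sketched as follows; scikit-learn's FastICA stands in here for the JADE algorithm the authors used, and hard-coding the second component merely mirrors the heuristic criticised above.

```python
import numpy as np
from sklearn.decomposition import FastICA

def ica_pulse_source(i_r, i_g, i_b):
    X = np.column_stack([i_r, i_g, i_b])
    X = (X - X.mean(axis=0)) / X.std(axis=0)   # normalise the RGB traces
    # Recover three independent source signals (FastICA in place of JADE).
    sources = FastICA(n_components=3, random_state=0).fit_transform(X)
    return sources[:, 1]                       # "second component" heuristic
```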
In [74], a motion-compensated method was designed based on single-channel independent component analysis (SCICA) to extract both HR and RR using a CMOS video camera during exercise. Moreover, they implemented image stabilisation via 2D cross-correlation and used time-frequency representation (TFR) for spectral analysis. Still, the motion artefacts could not be totally removed using this method, and the extracted physiological waveform components differed across situations. Feng et al. [75] introduced an improved motion-tolerant technique to measure both average and instantaneous HR via webcam, considering a substantial amount of movement by the body as well as the head. After detecting a face by the V-J method, they used the speeded-up robust features (SURF) detection algorithm to find trackable interest points in the facial region and exploited the Kanade-Lucas-Tomasi (KLT) algorithm for tracking the faces. To extract the signal, they used an adaptive bandpass filter and an automatic sorting algorithm to sort ICA output components, taking a sine function as a reference signal. Nevertheless, this method is not suitable if the subject walks or engages in other difficult movements, the range tested was also very short, and ROI options were limited. Using a digital camera to remotely measure heart rate, Qi et al. [76] proposed a new technique to improve the photoplethysmography signal by combining facial sub-region landmark localisation and joint blind source separation (JBSS) methods to extract the physiological signal.
However, blind source separation-based approaches may have limitations when exposed to other incidental periodic signals, including illuminance or motion variations. To overcome this limitation, some researchers have considered techniques other than BSS to suppress motion artefacts. For instance, a continuous wavelet transform-based method was presented in [77] to extract both instantaneous heart and respiratory rates using a webcam with normal head movements. However, their motion consideration was limited to head movements, and they did not consider illumination variation or other movements. De Haan et al. [78] proposed a robust technique to extract HR from CCD camera video based on chrominance (CHROM) during exercise. In chrominance-based methods, the colour difference signals are combined linearly assuming a standardised skin tone. However, in this case they considered only two fitness devices, a stationary bike and a stepping device, with only one subject. Their method was also affected by skin colour, especially dark skin. This method was further improved in [79] by using the blood volume pulse vector and five different exercise devices. Here, the blood volume pulse (BVP) was considered as a signature for the various reflection spectra of skin, explicitly distinguishing the physiological signal from noise caused by motion. Another study by Feng et al. [80] presented a system, robust to motion, using adaptive colour variation between the red and green channels as well as an adaptive bandpass filter (ABF) to measure HR from moving subjects, exploiting a webcam. Using a CCD camera, Wang et al. [81] introduced a new robust technique exploiting spatial subspace rotation (2SR), where a spatial subspace of skin pixels was estimated and its temporal rotation was measured in the image domain to measure HR. Nevertheless, as 2SR is a completely data-driven algorithm, it might produce undetected inaccurate results because of noise or a poorly selected skin mask. Another algorithm to measure HR by exploiting a plane orthogonal to the skin (POS) was presented in [67], where normalised RGB channels were combined into two new channels which were merged by weighting into the desired signal. Later, to allow the independent reduction of various motion frequencies by exploiting subband (SB) decomposition, Wang et al. [82] enhanced the POS method to measure continuous HR considering different fitness applications. However, this technique cannot suppress motion artefacts when the motion and the pulse have the same frequency. Moreover, this method is susceptible to degradation with skin tone and illumination variation. Wu et al. introduced a time-frequency analysis method using the continuous wavelet transform (CWT) [83] and a motion-resistant spectral peak tracking (MRSPT) method [84] to measure HR via webcam, considering seven motion circumstances in which subjects were driving, running and engaged in fitness training. Nevertheless, these methods considered very short ranges of around 1.5 m. For monitoring precise HR using a video camera, Xie et al. [85] presented another method using singular spectrum analysis (SSA), where the motion signal extracted from the video and SSA based on singular value decomposition (SVD) were employed to rectify motion artefacts during treadmill exercise. However, they only considered treadmill exercise and a small number of subjects. Moreover, only the first three leading decomposed components of the motion signal were removed from the BVP, which may affect the performance of the proposed method. McDuff et al. [86] designed a computationally efficient method based on linear transformation, exploiting parameters attained from tissue-like models of the skin, to extract HR and HRV accurately in the presence of various head motions using a colour camera. However, the apparatus needed to be calibrated via a known colour grid, and only systematic head motions were considered. Fallet et al. [87] introduced a signal quality index (SQI) and verified the capability of the SQI as a tool to enhance the consistency of heart rate measuring applications, considering videos of a moving subject. However, this SQI needs to be tested on videos with participants carrying out smoother movements in time-varying illumination environments.
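Among these, the CHROM combination of de Haan et al. [78] is compact enough to sketch directly; the sketch below assumes the bandpass() helper from the Section 2.1.4 sketch and uses the fixed chrominance coefficients reported for a standardised skin tone.

```python
import numpy as np

def chrom_pulse(i_r, i_g, i_b, fs):
    r = i_r / i_r.mean() - 1.0                 # zero-mean normalised channels
    g = i_g / i_g.mean() - 1.0
    b = i_b / i_b.mean() - 1.0
    x = 3.0 * r - 2.0 * g                      # chrominance signal X
    y = 1.5 * r + g - 1.5 * b                  # chrominance signal Y
    xf, yf = bandpass(x, fs), bandpass(y, fs)  # cardiac-band filtering
    alpha = xf.std() / yf.std()                # adaptive weighting
    return xf - alpha * yf                     # motion-tolerant pulse signal
```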
Motion artefacts are not only caused by camera or subject movement but also by cardiac-related, i.e., ballistocardiographic (BCG), artefacts. Using a CCD camera, Moco et al. [88] proposed motion-robust PPG imaging through colour channel mapping, based on the chrominance (CHROM) and blood-volume pulse (PBV) signature methods, to overcome BCG artefacts. They extracted a reference PPG signal from the palm and obtained PPG images as the normalised inner product between this reference and the streams from the skin sensor array. In the proposed method, the videos are first pre-processed. After frame pre-processing, remote PPG signals are acquired at each sensor element, mapped according to the CHROM or PBV algorithms and correlated with the reference PPG signal from the palm. PPG amplitude and phase images are obtained using channel mapping algorithms. The results showed that the proposed method reduced BCG artefacts to less than 10% of the reference PPG signal strength at the palm.

3.2. Illumination Variations

While most studies are concerned with motion artefacts, other studies have considered how to mitigate illumination variations, as listed in Table 3. For example, Chen et al. [89] proposed a new robust method to measure the pulse signal using a camera, based on reflectance decomposition of the green channel and ensemble empirical mode decomposition (EEMD) to suppress the noise caused by illumination variation. First, videos covering the brow area were captured using a digital camera. Then, the green channel was selected for reflectance decomposition, as oxygenated blood absorbs green light more than red and blue. Reflectance decomposition was done using the alternating direction method of multipliers (ADMM). To remove noise caused by illumination variations, EEMD was used to decompose the original time signal from face reflectance into a set of intrinsic mode functions (IMFs). They selected IMF4 as it was closest to the normal heart rate frequency. A peak detection algorithm was then applied to detect the number of peaks and, finally, HR was calculated. The experimental results showed that their method outperformed Poh's method [69], providing better measurement accuracy with a smaller variance. This method was further improved in [90], where HR was evaluated by means of a multiple linear regression (MLR) model followed by a Poisson distribution to reduce the effects of ambient light changes. Nevertheless, EEMD has the limitation of inadvertently treating periodic illuminant variations as physiological signals, especially if their frequency is near the normal cardiac frequency range, particularly 0.75 to 4 Hz. Moreover, both methods are not suitable for real-time application. To measure HR, Lee et al. [91] used a different approach based on multi-order curve fitting (MOCF) to remove noise artefacts, considering subjects watching television in a dark room. In the proposed approach, they subtracted the estimated brightness signal from the raw PPG signal to cancel out the noise caused by illumination variations. Another study by Tarassenko et al. [92] presented a new technique based on autoregressive (AR) modelling and pole cancellation to cancel out the aliased frequency components caused by artificial light flicker, which enhanced the robustness of measurement under strong fluorescent lights. Nevertheless, periodic illumination variations may affect the performance, as AR modelling is a spectral analysis technique. Moreover, there is also a calibration issue when calculating oxygen saturation. To extract HR using a webcam, Cheng et al. [93] designed a method robust to illumination variation by combining joint blind source separation (JBSS) and EEMD and considering background images as well. However, this method is not free from motion artefacts, as they only considered stationary subjects. Moreover, the illumination variation considered here was mostly controlled, which restricts the real-time applicability of the proposed method. Another method robust to illumination variations was proposed by Xu et al. [94], using partial least squares (PLS) and multivariate empirical mode decomposition (MEMD) to efficiently extract HR using a webcam under varying illumination conditions. However, they only considered artificial illumination created by an LED lamp.
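Several of these methods hinge on an EEMD-style decomposition. Below is a hedged sketch of such a step using the third-party PyEMD package (pip install EMD-signal); selecting the IMF whose dominant frequency lies in the 0.75-4 Hz cardiac band mirrors, but does not reproduce exactly, the IMF4 choice reported in [89].

```python
import numpy as np
from PyEMD import EEMD

def cardiac_imf(signal, fs, fmin=0.75, fmax=4.0):
    imfs = EEMD().eemd(signal)                 # decompose into IMFs
    for imf in imfs:
        freqs = np.fft.rfftfreq(len(imf), d=1.0 / fs)
        f_dom = freqs[np.argmax(np.abs(np.fft.rfft(imf)))]
        if fmin <= f_dom <= fmax:              # first IMF in the cardiac band
            return imf
    return None
```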
Most of the above-mentioned works are concerned with eliminating the effect of illumination variations while keeping participants still, which means that they did not consider motion artefacts. Nevertheless, both illumination variations and motion artefacts matter in realistic applications. These approaches, initially designed to remove illumination variations, should therefore be further improved to handle motion artefacts.

3.3. Both Illumination Variations and Motion Artefacts

Most studies are concerned with either motion artefacts or illumination variations; only a few have considered both as issues. Table 4 summarises the studies concerned with both illumination and motion. Li et al. [95] introduced a normalised least mean square (NLMS) adaptive filtering scheme to suppress the effect of both the subject's motion and illumination variations, considering realistic situations such as watching movies or playing games, using the iSight camera of an iPad to measure the cardiac pulse. Nevertheless, this method was seriously affected by large head movements. Using a monochrome camera, Kumar et al. [96] designed a robust method to extract HR and HRV based on a weighted average, considering different skin tones, illumination variation and several motion scenarios. Al-Naji et al. [97] presented a new noise elimination technique using both complete ensemble empirical mode decomposition with adaptive noise (CEEMDAN) and canonical correlation analysis (CCA) to eliminate noise artefacts caused by movement of subjects and camera, variations of illumination and variation of skin tone. As shown in Figure 7, data were first collected using a UAV from a 3 m distance at various times of day with various illumination levels. Then, an improved video magnification technique [98] was used to magnify skin colour variation. After that, an efficient face detection method [99] was used to detect the facial ROI, as it is more effective with an inclined or angled face. The raw iPPG signal was extracted using spatial averaging within the facial ROI of the green channel. To suppress the noise caused by illumination variations, CEEMDAN was applied, as it performed better than EMD and EEMD by decreasing noise from the intrinsic mode functions (IMFs) with more physical meaning. Using CEEMDAN, the iPPG signal was decomposed into eight IMFs, and the 5th, 6th and 7th IMFs were selected as their frequency bands fall within 0.2-4 Hz, corresponding to 12 to 240 beats/min. Then, to reduce motion artefacts, CCA was applied to the chosen IMFs, as CCA generates components derived from uncorrelated signals rather than the independent components used in ICA, and gives better performance than ICA.
To explain how CCA works as BSS [100], let us consider j and k to be two multidimensional random signals with N mixtures. The linear combinations of these signals are known as the canonical variates and can be written as follows:
$$\tilde{j} = w_{j1} j_1 + w_{j2} j_2 + \cdots + w_{jN} j_N = w_j^T j \qquad (7)$$
$$\tilde{k} = w_{k1} k_1 + w_{k2} k_2 + \cdots + w_{kN} k_N = w_k^T k \qquad (8)$$

where the weighting vectors of $j$ and $k$ are $w_j = [w_{j1}, w_{j2}, \ldots, w_{jN}]^T$ and $w_k = [w_{k1}, w_{k2}, \ldots, w_{kN}]^T$, respectively, which maximise the correlation between $\tilde{j}$ and $\tilde{k}$ by resolving the following maximisation problem:

$$\rho = \max_{w_j, w_k} \operatorname{cor}(\tilde{j}, \tilde{k}) = \frac{E[\tilde{j}\tilde{k}]}{\sqrt{E[\tilde{j}^2]\,E[\tilde{k}^2]}} = \frac{w_j^T C_{jk} w_k}{\sqrt{(w_j^T C_{jj} w_j)(w_k^T C_{kk} w_k)}} \qquad (9)$$

where the non-singular within-set covariance matrices of $j$ and $k$ are $C_{jj}$ and $C_{kk}$, respectively; $C_{jk}$ is the between-sets covariance matrix; and $E$ represents the expected value operator of the corresponding variables. The maximisation problem with respect to $w_j$ and $w_k$ can be resolved as follows:

$$\begin{pmatrix} C_{jj} & 0 \\ 0 & C_{kk} \end{pmatrix}^{-1} \begin{pmatrix} 0 & C_{jk} \\ C_{kj} & 0 \end{pmatrix} \begin{pmatrix} \hat{w}_j \\ \hat{w}_k \end{pmatrix} = \rho \begin{pmatrix} \hat{w}_j \\ \hat{w}_k \end{pmatrix} \qquad (10)$$

After solving Equation (10), a complete description of the canonical correlations can be written as follows:

$$C_{jj}^{-1} C_{jk} \hat{w}_k = \rho \hat{w}_j, \qquad C_{kk}^{-1} C_{kj} \hat{w}_j = \rho \hat{w}_k \qquad (11)$$

The $K$ approximations of the source signals, $z_i(t)$, $i = 1, 2, \ldots, K$, can be obtained by:

$$z_i(t) = \hat{w}_{ji}^T j(t) \approx s_i(t) \qquad (12)$$
It is noted from Equation (12) that the CCA technique yields the same outcome each time it is applied to a given dataset, which is not the case for the ICA technique.
Furthermore, spectral analysis and filtering were performed using the FFT and two Butterworth bandpass filters, respectively. Finally, HR and RR were extracted using a peak detection algorithm. To obtain experimental results, the authors considered 15 subjects with different skin tones in four scenarios: no movement, different facial expressions, talking and different illumination levels. Figure 8 shows that the proposed method (with and without magnification) achieved better performance than ICA and PCA in all four scenarios. However, CCA (1.22 s) requires more computation time than ICA (0.86 s) and PCA (0.79 s). Moreover, in the proposed method, they did not consider higher levels of movement such as walking or exercising. In addition, this method is also constrained to limited distance and single-subject detection.
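For a concrete handle on the CCA step, the sketch below canonically correlates the selected IMFs with a one-sample-delayed copy of themselves, a common CCA-as-BSS construction that favours maximally autocorrelated (periodic) sources; this illustrates the idea, not the authors' exact implementation.

```python
import numpy as np
from sklearn.cross_decomposition import CCA

def cca_sources(imfs):
    """imfs: (n_samples, n_imfs) matrix of selected IMFs."""
    j = imfs[1:]                               # the signals themselves
    k = imfs[:-1]                              # one-sample-delayed copy
    cca = CCA(n_components=imfs.shape[1])
    j_c, _ = cca.fit_transform(j, k)           # canonical variates of j
    return j_c                                 # columns ordered by correlation
```

With the three IMFs retained in [97], n_components would be 3, and the most autocorrelated variate would be taken as the cleaned iPPG component.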

3.4. Alternate Sensors

Rather than using digital cameras or webcams, some researchers have used other sensors to capture PPG data, as shown in Table 5. For example, Kwon et al. [101] extracted HR by exploiting the built-in camera of a smartphone and introduced FaceBEAT, an iPhone application for measuring HR remotely. Al-Naji et al. [102] presented a robust method to monitor cardiorespiratory signals remotely from video taken by a hovering unmanned aerial vehicle (UAV), as shown in Figure 9. They used an improved video magnification technique, an intensity-based method and advanced signal processing methods, such as signal decomposition based on complete EEMD and blind source separation based on ICA, to eliminate noise artefacts. Nevertheless, only slow and small movements were considered, and they did not address low-light situations either. To extract vital signs, McDuff et al. [103] used a DSLR camera with five bands (red, green, blue, cyan and orange), considering only stationary subjects both at rest and under cognitive stress, and reported that the CGO combination is better than RGB. However, this method is susceptible to noise artefacts and is not suitable for real-time application. In [104], a CMOS camera and a webcam were compared for extracting HR during cycling exercise, and the reported HR values were independent of the measurement method. However, the proposed method is applicable over a very limited range of 0.2 to 0.35 m. Blanik et al. [36] combined both a CCD camera and a thermal camera to monitor a broad range of physiological signs. Using a Kinect device, Bernacchia et al. [105] measured HR and RR by means of spatial averaging and ICA. Smilkstein et al. [106] and Gambi et al. [107] used the Microsoft Kinect to extract HR, exploiting the EVM technique based on RGB signals.

3.5. Different Subjects

Most of the work considered healthy adult subjects and a limited number of babies. However, some researchers have considered infants in neonatal intensive care units (NICUs), as described in Table 5. Scalise et al. [108] proposed a system to measure heart rate using a web camera in a NICU, using an intensity-based method and ICA. However, this method is susceptible to noise artefacts such as motion and illumination variations, and the range is very short, at only 0.20 m. Another work, by Aarts et al. [10], monitored the HR of newborn infants in NICUs using a digital camera at up to 1 m range, exploiting ambient light. Still, this method is not free from motion artefacts, illumination variation and skin tone problems. To suppress illumination variations and moderate motion artefacts, Cobos-Torres et al. [109] designed a computationally efficient method based on numerical analysis techniques and filtering, using a digital camera to measure the HR and RR of preterm infants in NICUs. Nevertheless, the proposed method is affected by strong motion, poor lighting and shadow. Gibson et al. [110] introduced a remote monitoring system to monitor the HR and RR of infants in NICUs using a digital camera based on video magnification, and compared their results with ECG data. Their system was also able to detect a real apnoea event in clinical settings.

3.6. Different Vital Signs

Almost all studies aimed to extract heart and respiratory rates; few consider other vital signs such as heart rate variability and oxygen saturation, as shown in Table 5. To extract blood oxygen saturation (SpO2) along with HR and RR, Tarassenko et al. [92] presented a new technique based on autoregressive (AR) modelling and pole cancellation to cancel out the aliased frequency components caused by artificial light flicker, which enhanced the robustness of the method under strong fluorescent lights. By means of a webcam, another robust method to calculate SpO2 and average HR was introduced by Bal [111], using a noise removal algorithm based on the dual-tree complex wavelet transform (DTCWT) to rectify the artefacts caused by movement and artificial lighting. However, the range was very short, at only 50 cm.

3.7. Multiple ROIs

Almost all of the prior works consider a limited set of ROIs, particularly the face, for vital sign extraction. Some studies have used the face and cheeks [10,112], face and forehead [72], cheeks and forehead [92], and nose, forehead and mouth [71] as ROIs to measure physiological signals. Another study by Datcu et al. [113] considered 10 different parts of the face as ROIs. To extract vital signs, Yu et al. [74] and Feng et al. [114] used both the face and the palm. Another study by Bernacchia et al. [105] considered the neck, thorax and abdominal area to calculate HR and RR. Zhao et al. [115] extracted physiological signals from the face, arm and hand using a webcam, considering two subjects simultaneously under stationary conditions. However, this method is susceptible to motion artefacts and was only tested over short ranges. Al-Naji et al. [116] measured cardiopulmonary signals from various regions of the human body, such as the face, palm, wrist, arm, neck, leg, forehead, head and chest, under stationary scenarios using a digital camera. They proposed a noise elimination technique based on EEMD and ICA. They used various methods such as intensity, frame subtraction and feature tracking to extract cardiopulmonary signals by considering skin colour variation, chest motion and head motion. Nevertheless, the proposed methods suffer from various issues such as limited range, a single subject at a time and noise artefacts.

3.8. Long Distance and Multiple Persons Simultaneously

Most of the previously discussed methods have some limitations. Firstly, most of the works considered very short ranges; the greatest distance considered was only 3 m. Secondly, other than Poh et al. (three subjects) and Zhao et al. (two subjects), all other methods measured vital signs for one subject at a time. In [117], a robust contactless method was proposed to calculate HR and RR using both iPPG and head motion from videos taken by both a digital camera and a hovering UAV, considering long ranges of up to 60 m and multiple subjects in groups of up to six people simultaneously, under both stationary and non-stationary circumstances, using a video magnification technique. To eliminate noise artefacts caused by movement of subjects and camera, variations of illumination and variation of skin tone, they used a noise elimination technique based on both CEEMDAN and CCA. The FFT was used for spectral analysis and two Butterworth bandpass filters were used for filtering. Finally, HR and RR were extracted by calculating the number of peaks using the MATLAB built-in function 'findpeaks'. Moreover, they introduced a graphical user interface (GUI) that allows a user to load video data, select the magnification type, and configure and execute the proposed system.
The experimental setup and data acquisition of the proposed system are presented in Figure 10, where three groups of 15, 20 and 10 subjects with different skin tones were considered under three conditions: noise artefacts, multiple-subject detection and long distance. Figure 11 and Figure 12 demonstrate that the proposed method achieved better performance than ICA and PCA for both stationary and non-stationary scenarios, with multiple-subject detection and long distance, respectively. From these figures, it can also be noted that colour-based methods gave better performance than motion-based methods. However, the system used a limited ROI, observing only the face, and will be affected by an unclear ROI.

3.9. Others

Based on machine learning, Hsu et al. [118] presented a novel method to extract HR using a video camera, where the PPG signal was recovered by either ICA or chrominance-based methods and could be enhanced by utilising mid-level PPG-based features with support vector regression (SVR). However, they only considered the stationary case. Using a deep convolutional attention network (CAN), Chen et al. [119] introduced the first end-to-end system, named DeepPhys, for video-based measurement of HR and breathing rate (BR). They proposed a new motion representation based on a skin reflection model and a new attention mechanism using appearance information to guide motion estimation, both of which enabled robust measurement under varying lighting and significant head motions. The normalised frame difference was used as the input motion representation. The network learnt spatial masks, shared between the models, and features important for recovering the BVP and respiration signals. The motion model and the appearance model were learnt jointly to find the best motion estimator and the best ROI detector simultaneously. They evaluated their method on three datasets of RGB videos and a dataset of IR videos. The proposed approach significantly outperformed prior state-of-the-art methods (CHROM, POS, etc.) on both RGB and infrared video datasets and allowed spatio-temporal visualisation of physiological information in video. Moreover, the participant-dependent vs. independent performance, as well as the transfer learning results, showed that the supervised method generalised to other people, skin types and illumination conditions. Yu et al. [120] proposed the first end-to-end spatio-temporal network (PhysNet) to recover iPPG signals precisely from raw facial videos, considering a temporal context that was not considered in previous works. They conducted comprehensive experiments on the OBF and MAHNOB-HCI datasets. First, the face area was detected using the V-J face detector. Then, to extract the iPPG signal, a spatio-temporal network with 3D convolutional neural networks (3DCNN) and a recurrent neural network (RNN) was used. After that, filtering, normalisation and peak detection were performed to obtain the inter-beat intervals. Finally, the average HR and HRV were calculated. Experimental results showed that the proposed PhysNet reconstructed iPPG signals with an accurate time location for each individual pulse peak and achieved better performance at both the HR and HRV levels compared to the state-of-the-art methods (CHROM, POS, etc.). This method has potential applications in remote atrial fibrillation detection and emotion recognition.
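For orientation, a toy spatio-temporal network in PyTorch is sketched below; the hypothetical TinyPhysNet only gestures at the PhysNet/DeepPhys family, with arbitrary layer sizes far smaller than the published architectures.

```python
import torch
import torch.nn as nn

class TinyPhysNet(nn.Module):
    """Toy 3D CNN mapping a video clip to a 1D pulse trace (illustrative only)."""
    def __init__(self):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv3d(3, 16, kernel_size=3, padding=1), nn.ReLU(),
            nn.AvgPool3d((1, 2, 2)),             # pool space, keep time
            nn.Conv3d(16, 32, kernel_size=3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool3d((None, 1, 1)))  # collapse space, keep time
        self.head = nn.Conv3d(32, 1, kernel_size=1)

    def forward(self, clip):                     # clip: (batch, 3, T, H, W)
        x = self.features(clip)
        return self.head(x).squeeze()            # predicted iPPG trace, length T
```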
To integrate colour-based and motion-based approaches, Wiede et al. [121] introduced two fusion techniques, the mean method and the ratio-of-variations method, to extract HR using an industrial camera. They reported that the colour-based method was more reliable than the motion-based method. However, the accuracy of the fusion-based methods was not satisfactory, and the approach was confined to rehabilitation scenarios.

4. Applications

As iPPG overcomes various limitations of contact sensors, especially damage to or infection of sensitive skin, contactless vital sign monitoring has the promise to become a realistic and attractive option for numerous applications in both clinical and non-clinical scenarios.

4.1. Clinical Applications

Because of its contactless sensing mode, iPPG can offer a comfortable way to monitor elderly people, infants, and patients with chronic pain or burnt skin, patients under dialysis, and patients during and after surgery.

4.1.1. Neonatal Monitoring

Neonates and infants have very delicate and sensitive skin, so non-contact methods would be the preferred approach to monitoring them. Several studies have used iPPG methods to monitor neonates. For example, Klaessens et al. [122] introduced a baby-friendly non-contact approach to monitor several vital signs, such as HR, RR, skin temperature and SpO2, of infants and neonates in NICUs with gestational ages of 24 to 39 weeks; RGB colour magnification and IR thermography were used to measure HR and RR, respectively. Another study by Villarroel et al. [123] applied the iPPG technique to continuously monitor the HR, RR and SpO2 of neonates and infants in an NICU, and also detected bradycardia while monitoring participants continuously over long periods. Other studies [10,108,109,110] have likewise demonstrated the promise of iPPG in neonatal monitoring.

4.1.2. Assessing Patients with Chronic Pain

Rubins et al. [124] used imaging photoplethysmography with a monochrome camera to assess patients with chronic pain, particularly neuropathic pain. Another study by Zaproudina et al. [125], also using a monochrome camera, employed the iPPG method to search for new biomarkers in migraine patients.

4.1.3. Critical Patient Monitoring

Rasche et al. [126] used the iPPG technique with a mobile camera on patients after heart surgery under critical care conditions. In their proposed method, the ROI was selected manually, and optical flow techniques were used to detect motion artefacts. Moreover, a high-pass filter and Fourier spectra were employed to calculate HR. Nevertheless, their method is affected by patient movement and illumination variations.
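A hedged sketch of the general idea of optical-flow-based motion artefact detection is given below. It is our own illustration using OpenCV’s dense Farnebäck flow, not the code of [126], and the motion threshold is an arbitrary assumption.

```python
import cv2
import numpy as np

def motion_mask(gray_frames, thresh=0.5):
    """gray_frames: list of uint8 grayscale frames. Returns per-frame keep-flags."""
    keep = [True]
    for prev, nxt in zip(gray_frames[:-1], gray_frames[1:]):
        # Dense optical flow between consecutive frames (Farnebäck method).
        flow = cv2.calcOpticalFlowFarneback(prev, nxt, None,
                                            0.5, 3, 15, 3, 5, 1.2, 0)
        magnitude = np.linalg.norm(flow, axis=2).mean()
        keep.append(magnitude < thresh)  # drop frames with large mean motion
    return np.array(keep)
```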

4.1.4. Arrhythmia Detection

An abnormal heart rhythm is called an arrhythmia. Amelard et al. [127] applied the iPPG technique to detect arrhythmia using a monochrome camera and a signal fusion technique, in which the blood pulse signal is extracted using prior information. They formulated the problem in a Bayesian framework and developed a novel probabilistic pulsatility model that combines spectral and spatial priors derived from the physiology of the blood pulse signal.

4.1.5. Anesthesia Monitoring

Rubins et al. [128] introduced the concept that a PPG imaging system could be employed to continuously monitor regional anesthesia via skin microcirculation. Using both an RGB camera and a near-infrared camera, Trumpp et al. [129] applied camera-based PPG to monitor cardiovascular disease patients during surgery in an intraoperative environment. Such monitoring can help anesthetists respond to cardiovascular events and administer appropriate medication.

4.1.6. Monitoring Dialysis Patients

Villarroel et al. [130] extracted the HR, RR and SpO2 of patients undergoing haemodialysis treatment in the Renal Unit of the Churchill Hospital in Oxford, UK, based on iPPG with a high-quality 5-megapixel camera. Tarassenko et al. [92] also applied iPPG to monitor the HR, RR and SpO2 of haemodialysis patients in the Oxford Kidney Unit.

4.1.7. Burn Care

Thatcher et al. [131] showed that PPG imaging can be used to identify the correct depth of burn excision, as blood circulation is significantly reduced in burnt tissue. Consequently, it can guide surgeons on where and how much to resect, and iPPG can thus also help to improve burn care.

4.2. Non-Clinical Applications

In non-clinical sectors such as home health care, fitness monitoring, sleep monitoring, polygraph testing, living skin detection, stress monitoring, driver monitoring, security, war zones, natural calamities and animal research, iPPG may play an important future role in monitoring vital signs.

4.2.1. Home Health Care

Hospital places and resources for patients are limited. Moreover, staying in hospital for a long time is uncomfortable and expensive. Therefore, there is a need for patients, particularly elderly people and infants, to be monitored at home. As iPPG is a non-contact, low-cost method with no consumable supplies, it can be used to monitor people’s vital signs in home environments [3]. Bernacchia et al. [105] proposed a method using a Kinect device that could be applied to monitoring the HR and RR of subjects at home, without the presence of experts or clinicians.

4.2.2. Fitness Monitoring

Exercise is good for health; however, over-exercising can have severe adverse health effects, including death from heart attack. It can therefore be important to monitor HR and RR to assess the health status of an exerciser and to adjust the training program according to changing vital signs. Monitoring physiological state with iPPG during exercise is a convenient way to mitigate the adverse effects of over-exercising. In [74,78,79,82,85,93], iPPG was applied to monitor HR and RR during exercise on various fitness devices, including a stepping device, a treadmill, a bike, a hand bike and a synchro-device.

4.2.3. Sleep Monitoring

Continuous monitoring of vital signs, even at night, can be achieved by combining visible RGB cameras with infrared cameras; for sleep monitoring at night, iPPG with infrared cameras is particularly suitable [3]. The HR and RR of sleep apnea patients can also be monitored using the Microsoft Kinect sensor and colour magnification [106]. Using three monochrome cameras, Vogels et al. [132] proposed a fully automated method to remotely and continuously monitor HR and SpO2 during sleep. In their method, videos are first pre-processed by Gaussian smoothing with a 2D Gaussian kernel and by rigid block segmentation. The pulse signal, extracted using the PBV method [79], is then exploited as a feature to distinguish living from non-living tissue based on similarity mapping. After that, a hybrid method combining iPPG-based subject detection with a tracker is used to select an ROI, and finally the vital signs are extracted. Five healthy subjects sleeping in different supine positions were used to simulate realistic sleep scenarios. The results showed that the proposed method outperformed state-of-the-art methods (VPS, V–J, etc.) for the estimation of oxygen saturation.
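The following is a minimal sketch of the described pre-processing, assuming nothing beyond the paper’s outline: each frame is Gaussian-smoothed and then rigidly divided into blocks whose mean intensities form candidate pulse traces. The block size and smoothing width are our own illustrative choices, not values from [132].

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def block_signals(frames, block=20, sigma=2.0):
    """frames: (T, H, W) single-channel float array -> (T, nBlocks) block means."""
    T, H, W = frames.shape
    # Gaussian pre-smoothing of every frame with a 2D kernel.
    smoothed = np.stack([gaussian_filter(f, sigma) for f in frames])
    # Rigid block segmentation: crop to a multiple of the block size, then average.
    h, w = H // block, W // block
    crop = smoothed[:, :h * block, :w * block]
    blocks = crop.reshape(T, h, block, w, block).mean(axis=(2, 4))
    return blocks.reshape(T, -1)  # one candidate time series per rigid block
```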

4.2.4. Polygraph

Telling a lie activates the autonomic nervous system (ANS), producing a measurable difference in mental stress and hence in physiological signs depending on whether someone is lying or telling the truth. iPPG has therefore been proposed as a polygraph for lie detection during interrogation [2].

4.2.5. Living Skin Detection

Recently, researchers have used the iPPG technique to detect living skin. For example, Wang et al. [133] proposed a novel unsupervised technique called “Voxel-Pulse-Spectral” (VPS) to detect living skin tissue using a CCD camera, with the pulse itself as the feature. However, the VPS method is time consuming because of the complexity of unsupervised learning for pulse extraction. A fast living-skin detection method was introduced in another study [134], in which the time-varying iPPG signal is transformed into signal shape descriptors using a technique known as multiresolution iterative spectra.
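The sketch below captures the underlying idea of pulse-based living-skin detection in a strongly simplified form; it is not the VPS or multiresolution-iterative-spectra algorithm. Blocks whose spectra carry a strong cardiac peak relative to the rest of the spectrum are labelled as living skin; the SNR definition and threshold are our own assumptions.

```python
import numpy as np

def pulse_snr(trace, fs, hr_band=(0.75, 4.0)):
    """SNR of the cardiac band relative to the rest of the spectrum."""
    spec = np.abs(np.fft.rfft(trace - trace.mean())) ** 2
    freqs = np.fft.rfftfreq(len(trace), 1.0 / fs)
    band = (freqs >= hr_band[0]) & (freqs <= hr_band[1])
    return spec[band].max() / (spec[~band].sum() + 1e-12)

def living_skin(block_traces, fs, thresh=0.05):
    """block_traces: (T, nBlocks) array. Returns boolean skin labels per block."""
    return np.array([pulse_snr(block_traces[:, i], fs) > thresh
                     for i in range(block_traces.shape[1])])
```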

4.2.6. Security

In security systems, the biometric information used for authentication is mostly face biometrics. However, such biometric information can be stolen or duplicated by attackers, which is called a biometric presentation attack (BPA). Lakshminarayana et al. [135] employed iPPG to differentiate authentic users with a deep convolutional neural network (CNN), evaluated on the CASIA and Replay-Attack datasets. Nowara et al. [136] and Seepers et al. [137] also reported that the iPPG technique can be exploited in biometric authentication.

4.2.7. Emotion/Stress Monitoring

Stress is a major contributor to various diseases such as cardiovascular diseases, diabetes and cerebrovascular diseases. To understand and control personal stress, it is necessary to monitor an individual’s emotional state or stress level continuously. Physiological signals are more reliable measures of emotional state than factors such as facial expression, gesture or voice. Several researchers have used iPPG to monitor stress. For instance, Maaoui et al. [138] detected and classified emotional states by applying machine learning algorithms to HR extracted from webcam videos. Another study by Madan et al. [139] also used a webcam to detect emotional state by measuring changes in HR during sitting and standing conditions. Using a five-band digital camera, McDuff et al. [140] measured cognitive stress via HRV while subjects performed a mental arithmetic task. Burzo et al. [141], Monkaresi et al. [142] and Rouast et al. [143] have shown that iPPG can be effectively used to monitor the emotional states of human beings.
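For reference, the sketch below computes standard time-domain HRV metrics (SDNN and RMSSD) from the pulse-peak times of a recovered iPPG signal; these are textbook formulas rather than code from [140].

```python
import numpy as np

def hrv_metrics(peak_times_s):
    """peak_times_s: sorted array of pulse-peak times in seconds."""
    ibi = np.diff(peak_times_s) * 1000.0          # inter-beat intervals in ms
    sdnn = np.std(ibi)                            # overall variability
    rmssd = np.sqrt(np.mean(np.diff(ibi) ** 2))   # short-term variability
    return {"mean_hr_bpm": 60000.0 / ibi.mean(),
            "SDNN_ms": sdnn,
            "RMSSD_ms": rmssd}
```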

4.2.8. Driver Monitoring

With the increasing number of vehicles on the road, there is a potential increase in the number of accidents as well. The primary causes of traffic accidents are physiological and psychological factors such as illness, mental illness, fatigue and the external ambient environment. iPPG can therefore be used to measure the physiological and psychological states of drivers. Kuo et al. [144] measured a driver’s HR using video cameras in an in-vehicle environment. Another study by Zhang et al. [145] used a webcam to monitor HR based on ICA under various driving conditions. Other studies [146,147] have also applied the iPPG method to monitor the physiological parameters of drivers.

4.2.9. War Zone or Natural Calamity

Al-Naji et al. [117] have demonstrated that a hovering UAV can capture video of multiple subjects at significant distances and extract their HR and RR. This has opened a new potential application area for iPPG during war or natural calamities such as earthquakes.

4.2.10. Animal Research

In animal research, camera-based techniques have numerous potential applications, such as monitoring vital signs, assessing motion activity and detecting wound infection [148]. Zhao et al. [3] extracted the HR and RR of mice, zebrafish and pigs with a CMOS camera using ICA-based iPPG. Blanik et al. [149] used the iPPG technique to extract the HR and RR of anesthetized pigs with acute respiratory distress syndrome (ARDS) using a CCD camera. By means of iPPG, Unakafov et al. [150] accurately estimated the HR of rhesus monkeys from facial videos captured by a Microsoft Kinect and a monochrome near-infrared (NIR) camera.

5. Research Gaps, Challenges and Future Directions

During the past few years, many advances have been made using camera imaging-based methods and numerous studies have been published in this field. However, the existing studies have several limitations that need to be overcome in the future.
  • Most of the current studies focus on either motion artefacts or illumination variations. However, in real-world scenarios, both motion and illumination play an important role in degrading accuracy and usability, and only a few researchers have offered solutions to both. Developing a method that can deal with both motion and illumination artefacts is therefore a promising topic for further research.
  • Researchers are still mainly interested in extracting HR and RR. Nevertheless, blood oxygen saturation also plays an important role in assessing a person’s health, and blood glucose [151,152,153] is another important vital sign that helps to identify and maintain the welfare of diabetes patients. Therefore, more research needs to be done on monitoring blood oxygen saturation and blood glucose (see the ratio-of-ratios sketch after this list).
  • All of the studies except Poh et al. (3 subjects), Zhao et al. (2 subjects) and Al-Naji et al. (6 subjects) considered only a single subject at a time when monitoring vital signs. Detecting multiple persons simultaneously is a major issue in the current studies that needs to be overcome in the future.
  • The distance between the camera and the subject is another challenge for current researchers. All the studies except Al-Naji et al. (60 m) considered very small distances while monitoring vital signs.
  • The number of ROIs and the method of ROI selection are important issues to overcome, as very few researchers have addressed these problems well enough for industrial or commercial use. Most of the studies have used manual selection, while some have used automatic methods that choose from a very limited number of ROIs in limited scenarios. Advanced techniques are required to select multiple ROIs automatically.
  • Researchers have mainly used healthy young participants as their experimental subjects rather than patients, elderly people or premature babies. More research needs to be done with such groups, particularly premature babies and elderly people.
  • Most of the existing works used privately owned databases. Only a few used publicly available databases such as MAHNOB-HCI (human–computer interaction) or DEAP (Database for Emotion Analysis using Physiological Signals). The lack of publicly available datasets captured under realistic conditions is another challenge to address in the future.
  • To validate proposed methods, researchers have mainly used a pulse oximeter as the ground truth; the ECG was used by very few researchers. There are indications that no commercial instrument is truly accurate and that most are simply accepted as accurate [110].
  • Future research could also include multi-camera fusion as well as new non-visible-light sensors to overcome the shortcomings of visible-light cameras.
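As a pointer for the blood oxygen gap noted above, the sketch below shows the classical ratio-of-ratios SpO2 estimate adapted to two camera channels. The AC/DC definition and the calibration constants A and B are illustrative placeholders of our own; practical systems calibrate them against a reference oximeter.

```python
import numpy as np

def spo2_ratio_of_ratios(red, ir, A=110.0, B=25.0):
    """red, ir: 1-D traces from the red and (near-)infrared channels."""
    def ac_dc(x):
        # Pulsatile (AC) component relative to the static (DC) component.
        return (x.max() - x.min()) / x.mean()
    R = ac_dc(np.asarray(red)) / ac_dc(np.asarray(ir))
    return A - B * R  # empirical linear calibration (placeholder constants)
```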

6. Conclusions

Computer vision-based methods for vital sign measurement have attracted increasing attention over the past few years. This paper reviews recent work on camera imaging-based methods, especially colour-based methods. First, numerous motion- and colour-based methods were discussed. Then, different aspects of colour-based methods, such as motion artefacts, illumination variations, alternative sensors, different subjects, different vital signs, multiple ROIs, long distance and multiple persons, were reviewed. Additionally, potential applications of iPPG in both clinical and non-clinical sectors were described. Moreover, we identified the research gaps and challenges of the existing studies and gave some indications for future research. We believe that this paper will provide a pathway for new researchers to understand and explore the gaps and challenges found in recent studies.

Author Contributions

F.-T.-Z.K. did the literature review and wrote the draft manuscript. A.A.-N. and J.C. supervised the work and contributed with valuable discussions and advice. All authors read and approved the final manuscript.

Funding

This research is funded by a Research Training Program domestic (RTPd) scholarship administered by the University of South Australia on behalf of the Australian Commonwealth Department of Education and Training.

Conflicts of Interest

The authors declare no conflict of interest.

References

  1. Brüser, C.; Antink, C.H.; Wartzek, T.; Walter, M.; Leonhardt, S. Ambient and unobtrusive cardiorespiratory monitoring techniques. IEEE Rev. Biomed. Eng. 2015, 8, 30–43. [Google Scholar] [CrossRef] [PubMed]
  2. Chen, X.; Cheng, J.; Song, R.; Liu, Y.; Ward, R.; Wang, Z.J. Video-Based Heart Rate Measurement: Recent Advances and Future Prospects. IEEE Trans. Instrum. Meas. 2018, 68, 3600–3615. [Google Scholar] [CrossRef]
  3. Zhao, F.; Li, M.; Qian, Y.; Tsien, J.Z. Remote measurements of heart and respiration rates for telemedicine. PLoS ONE 2013, 8, e71384. [Google Scholar] [CrossRef] [PubMed]
  4. Kranjec, J.; Beguš, S.; Geršak, G.; Drnovšek, J. Non-contact heart rate and heart rate variability measurements: A review. Biomed. Signal Process. Control 2014, 13, 102–112. [Google Scholar] [CrossRef]
  5. Rouast, P.V.; Adam, M.T.P.; Chiong, R.; Cornforth, D.; Lux, E. Remote heart rate measurement using low-cost RGB face video: A technical literature review. Front. Comput. Sci. 2018, 12, 858–872. [Google Scholar] [CrossRef]
  6. Phansalkar, S.; Edworthy, J.; Hellier, E.; Seger, D.L.; Schedlbauer, A.; Avery, A.J.; Bates, D.W. A review of human factors principles for the design and implementation of medication safety alerts in clinical information systems. J. Am. Med. Inform. Assoc. 2010, 17, 493–501. [Google Scholar] [CrossRef] [Green Version]
  7. Sun, Y.; Thakor, N. Photoplethysmography revisited: From contact to noncontact, from point to imaging. IEEE Trans. Biomed. Eng. 2016, 63, 463–477. [Google Scholar] [CrossRef]
  8. Charlton, P.H.; Bonnici, T.; Tarassenko, L.; Alastruey, J.; Clifton, D.; Beale, R.; Watkinson, P. Extraction of respiratory signals from the electrocardiogram and photoplethysmogram: Technical and physiological determinants. Physiol. Meas. 2017, 38, 669–690. [Google Scholar] [CrossRef]
  9. Al-Naji, A.; Gibson, K.; Chahl, J. Remote sensing of physiological signs using a machine vision system. J. Med. Eng. Technol. 2017, 41, 396–405. [Google Scholar] [CrossRef]
  10. Aarts, L.A.; Jeanne, V.; Cleary, J.P.; Lieber, C.; Nelson, J.S.; Oetomo, S.B.; Verkruysse, W. Non-contact heart rate monitoring utilizing camera photoplethysmography in the neonatal intensive care unit—A pilot study. Early Hum. Dev. 2013, 89, 943–948. [Google Scholar] [CrossRef]
  11. Zaunseder, S.; Henning, A.; Wedekind, D.; Trumpp, A.; Malberg, H. Unobtrusive acquisition of cardiorespiratory signals. Somnologie 2017, 21, 93–100. [Google Scholar] [CrossRef]
  12. Al-Naji, A.; Gibson, K.; Lee, S.-H.; Chahl, J. Monitoring of cardiorespiratory signal: Principles of remote measurements and review of methods. IEEE Access 2017, 5, 15776–15790. [Google Scholar] [CrossRef]
  13. Scalise, L. Non contact heart monitoring. In Advances in Electrocardiograms-Methods and Analysis; IntechOpen: London, UK, 2012. [Google Scholar]
  14. Tarjan, P.P.; McFee, R. Electrodeless measurements of the effective resistivity of the human torso and head by magnetic induction. IEEE Trans. Biomed. Eng. 1968, 4, 266–278. [Google Scholar] [CrossRef] [PubMed]
  15. Guardo, R.; Trudelle, S.; Adler, A.; Boulay, C.; Savard, P. Contactless recording of cardiac related thoracic conductivity changes. In Proceedings of the 1995 IEEE 17th International Conference of the Engineering in Medicine and Biology Society, Montreal, QC, Canada, 20–23 September 1995; pp. 1581–1582. [Google Scholar]
  16. Richer, A.; Adler, A. Eddy current based flexible sensor for contactless measurement of breathing. In Proceedings of the 2005 IEEE Instrumentation and Measurement Technology Conference Proceedings, Ottawa, ON, Canada, 16–19 May 2005; pp. 257–260. [Google Scholar]
  17. Steffen, M.; Aleksandrowicz, A.; Leonhardt, S. Mobile noncontact monitoring of heart and lung activity. IEEE Trans. Biomed. Circuits Syst. 2007, 1, 250–257. [Google Scholar] [CrossRef] [PubMed]
  18. Vetter, P.; Leicht, L.; Leonhardt, S.; Teichmann, D. Integration of an electromagnetic coupled sensor into a driver seat for vital sign monitoring: Initial insight. In Proceedings of the 2017 IEEE International Conference on Vehicular Electronics and Safety (ICVES), Vienna, Austria, 27–28 June 2017; pp. 185–190. [Google Scholar]
  19. Dalal, H.; Basu, A.; Abegaonkar, M.P. Remote sensing of vital sign of human body with radio frequency. CSI Trans. ICT 2017, 5, 161–166. [Google Scholar] [CrossRef]
  20. Rabbani, M.S.; Ghafouri-Shiraz, H. Ultra-wide patch antenna array design at 60 GHz band for remote vital sign monitoring with Doppler radar principle. J. Infrared Millim. Terahertz Waves 2017, 38, 548–566. [Google Scholar] [CrossRef]
  21. Scalise, L.; Marchionni, P.; Ercoli, I. Optical method for measurement of respiration rate. In Proceedings of the 2010 IEEE International Workshop on Medical Measurements and Applications, Ottawa, ON, Canada, 30 April–1 May 2010; pp. 19–22. [Google Scholar]
  22. Lohman, B.; Boric-Lubecke, O.; Lubecke, V.; Ong, P.; Sondhi, M. A digital signal processor for Doppler radar sensing of vital signs. IEEE Eng. Med. Biol. Mag. 2002, 21, 161–164. [Google Scholar] [CrossRef] [Green Version]
  23. Mercuri, M.; Liu, Y.-H.; Lorato, I.; Torfs, T.; Bourdoux, A.; Van Hoof, C. Frequency-Tracking CW doppler radar solving small-angle approximation and null point issues in non-contact vital signs monitoring. IEEE Trans. Biomed. Circuits Syst. 2017, 11, 671–680. [Google Scholar] [CrossRef]
  24. Nosrati, M.; Shahsavari, S.; Lee, S.; Wang, H.; Tavassolian, N. A Concurrent Dual-Beam Phased-Array Doppler Radar Using MIMO Beamforming Techniques for Short-Range Vital-Signs Monitoring. IEEE Trans. Antennas Propag. 2019, 67, 2390–2404. [Google Scholar] [CrossRef]
  25. Marchionni, P.; Scalise, L.; Ercoli, I.; Tomasini, E.P. An optical measurement method for the simultaneous assessment of respiration and heart rates in preterm infants. Rev. Sci. Instrum. 2013, 84, 121705. [Google Scholar] [CrossRef]
  26. Sirevaag, E.J.; Casaccia, S.; Richter, E.A.; O’Sullivan, J.A.; Scalise, L.; Rohrbaugh, J.W. Cardiorespiratory interactions: Noncontact assessment using laser Doppler vibrometry. Psychophysiology 2016, 53, 847–867. [Google Scholar] [CrossRef] [PubMed]
  27. Holdsworth, D.W.; Norley, C.J.D.; Frayne, R.; Steinman, D.A.; Rutt, B.K. Characterization of common carotid artery blood-flow waveforms in normal human subjects. Physiol. Meas. 1999, 20, 219–240. [Google Scholar] [CrossRef] [PubMed]
  28. Arlotto, P.; Grimaldi, M.; Naeck, R.; Ginoux, J.-M. An ultrasonic contactless sensor for breathing monitoring. Sensors 2014, 14, 15371–15386. [Google Scholar] [CrossRef] [PubMed]
  29. Min, S.D.; Yoon, D.J.; Yoon, S.W.; Yun, Y.H.; Lee, M. A study on a non-contacting respiration signal monitoring system using Doppler ultrasound. Med. Biol. Eng. 2007, 45, 1113–1119. [Google Scholar]
  30. Yang, M.; Liu, Q.; Turner, T.; Wu, Y. Vital sign estimation from passive thermal video. In Proceedings of the 2008 IEEE Conference on Computer Vision and Pattern Recognition, Anchorage, AK, USA, 23–28 June 2008; pp. 1–8. [Google Scholar]
  31. Bennett, S.; El Harake, T.N.; Goubran, R.; Knoefel, F. Adaptive Eulerian Video Processing of Thermal Video: An Experimental Analysis. IEEE Trans. Instrum. Meas. 2017, 66, 2516–2524. [Google Scholar] [CrossRef]
  32. Cardoso, J.-F. Blind signal separation: Statistical principles. Proc. IEEE 1998, 86, 2009–2025. [Google Scholar] [CrossRef]
  33. Elphick, H.E.; Alkali, A.H.; Kingshott, R.K.; Burke, D.; Saatchi, R. Exploratory study to evaluate respiratory rate using a thermal imaging camera. Respiration 2019, 97, 205–212. [Google Scholar] [CrossRef]
  34. Luca, C.; Corciovă, C.; Andriţoi, D.; Ciorap, R. The Use of Thermal Imaging Techniques as a Method of Monitoring the New Born. In Proceedings of the 6th International Conference on Advancements of Medicine and Health Care through Technology, Cluj-Napoca, Romania, 17–20 October 2018; pp. 35–39. [Google Scholar]
  35. Abbas, A.K.; Heimann, K.; Jergus, K.; Orlikowsky, T.; Leonhardt, S. Neonatal non-contact respiratory monitoring based on real-time infrared thermography. Biomed. Eng. Online 2011, 10, 93. [Google Scholar] [CrossRef]
  36. Blanik, N.; Abbas, A.K.; Venema, B.; Blazek, V.; Leonhardt, S. Hybrid optical imaging technology for long-term remote monitoring of skin perfusion and temperature behavior. J. Biomed. Opt. 2014, 19, 16012. [Google Scholar] [CrossRef]
  37. Chauvin, R.; Hamel, M.; Brière, S.; Ferland, F.; Grondin, F.; Létourneau, D.; Tousignant, M.; Michaud, F. Contact-Free respiration rate monitoring using a pan–tilt thermal camera for stationary bike telerehabilitation sessions. IEEE Syst. J. 2014, 10, 1046–1055. [Google Scholar] [CrossRef]
  38. Pereira, C.B.; Yu, X.; Czaplik, M.; Rossaint, R.; Blazek, V.; Leonhardt, S. Remote monitoring of breathing dynamics using infrared thermography. Biomed. Opt. Express 2015, 6, 4378–4394. [Google Scholar] [CrossRef] [PubMed]
  39. Garbey, M.; Sun, N.; Merla, A.; Pavlidis, I. Contact-Free measurement of cardiac pulse based on the analysis of thermal imagery. IEEE Trans. Biomed. Eng. 2007, 54, 1418–1426. [Google Scholar] [CrossRef] [PubMed]
  40. Pavlidis, I.; Dowdall, J.; Sun, N.; Puri, C.; Fei, J.; Garbey, M. Interacting with human physiology. Comput. Vis. Image Underst. 2007, 108, 150–170. [Google Scholar] [CrossRef]
  41. Sun, N.; Garbey, M.; Merla, A.; Pavlidis, I. Imaging the cardiovascular pulse. In Proceedings of the 2005 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR’05), San Diego, CA, USA, 20–25 June 2005; pp. 416–421. [Google Scholar]
  42. Chekmenev, S.Y.; Farag, A.A.; Miller, W.M.; Essock, E.A.; Bhatnagar, A. Multiresolution approach for noncontact measurements of arterial pulse using thermal imaging. In Augmented Vision Perception in Infrared; Springer Science and Business Media LLC: Berlin, Germany, 2009; pp. 87–112. [Google Scholar]
  43. Abbas, A.K.; Heiman, K.; Orlikowsky, T.; Leonhardt, S. Non-Contact Respiratory Monitoring Based on Real-Time IR-Thermography. In Proceedings of the World Congress on Medical Physics and Biomedical Engineering, Munich, Germany, 7–12 September 2009; pp. 1306–1309. [Google Scholar]
  44. Pulli, K.; Baksheev, A.; Kornyakov, K.; Eruhimov, V. Real-time computer vision with OpenCV. Commun. ACM 2012, 55, 61–69. [Google Scholar] [CrossRef]
  45. McDuff, D.J.; Estepp, J.R.; Piasecki, A.M.; Blackford, E.B. A survey of remote optical photoplethysmographic imaging methods. In Proceedings of the 2015 37th Annual International Conference of the IEEE Engineering in Medicine and Biology Society (EMBC), Milan, Italy, 25–29 August 2015; pp. 6398–6404. [Google Scholar]
  46. Sikdar, A.; Behera, S.K.; Dogra, D.P. Computer-Vision-Guided human pulse rate estimation: A Review. IEEE Rev. Biomed. Eng. 2016, 9, 91–105. [Google Scholar] [CrossRef] [PubMed]
  47. Hassan, M.; Malik, A.; Fofi, D.; Saad, N.; Karasfi, B.; Ali, Y.; Mériaudeau, F. Heart rate estimation using facial video: A review. Biomed. Signal Process. Control 2017, 38, 346–360. [Google Scholar] [CrossRef]
  48. Zaunseder, S.; Trumpp, A.; Wedekind, D.; Malberg, H. Cardiovascular assessment by imaging photoplethysmography—A review. Biomed. Tech. Eng. 2018, 63, 617–634. [Google Scholar] [CrossRef]
  49. Nakajima, K.; Osa, A.; Miike, H. A method for measuring respiration and physical activity in bed by optical flow analysis. In Proceedings of the 19th Annual International Conference of the IEEE Engineering in Medicine and Biology Society. ‘Magnificent Milestones and Emerging Opportunities in Medical Engineering’ (Cat. No. 97CH36136), Chicago, IL, USA, 30 October–2 November 1997; pp. 2054–2057. [Google Scholar]
  50. Frigola, M.; Amat, J.; Pagès, J. Vision based respiratory monitoring system. In Proceedings of the 10th Mediterranean Conference on Control and Automation (MED 2002), Lisbon, Portugal, 9–12 July 2002; pp. 9–13. [Google Scholar]
  51. Balakrishnan, G.; Durand, F.; Guttag, J. Detecting pulse from head motions in video. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Portland, OR, USA, 23–28 June 2013; pp. 3430–3437. [Google Scholar]
  52. Viola, P.; Jones, M. Rapid object detection using a boosted cascade of simple features. In Proceedings of the 2001 IEEE Computer Society Conference on Computer Vision and Pattern Recognition, Kauai, HI, USA, 8–14 December 2001; pp. 1–6. [Google Scholar]
  53. Lucas, B.D.; Kanade, T. An Iterative Image Registration Technique with an Application to Stereo Vision. In Proceedings of the 7th International Joint Conference on Artificial Intelligence, Vancouver, BC, Canada, 24–28 August 1981; pp. 674–679. [Google Scholar]
  54. Shan, L.; Yu, M. Video-based heart rate measurement using head motion tracking and ICA. In Proceedings of the 2013 6th International Congress on Image and Signal Processing (CISP), Hangzhou, China, 16–18 December 2013; pp. 160–164. [Google Scholar]
  55. Irani, R.; Nasrollahi, K.; Moeslund, T.B. Improved pulse detection from head motions using DCT. In Proceedings of the 2014 International Conference on Computer Vision Theory and Applications (VISAPP), Lisbon, Portugal, 5–8 January 2014; pp. 118–124. [Google Scholar]
  56. Haque, M.A.; Irani, R.; Nasrollahi, K.; Moeslund, T.B. Heartbeat rate measurement from facial video. IEEE Intell. Syst. 2016, 31, 40–48. [Google Scholar] [CrossRef]
  57. Lomaliza, J.-P.; Park, H. Detecting Pulse from Head Motions Using Smartphone Camera. In Proceedings of the International Conference on Advanced Engineering Theory and Applications, Busan, Korea, 8–10 December 2016; pp. 243–251. [Google Scholar]
  58. Lomaliza, J.-P.; Park, H. Improved Heart-Rate Measurement from Mobile Face Videos. Electronics 2019, 8, 663. [Google Scholar] [CrossRef]
  59. He, X.; Goubran, R.A.; Liu, X.P. Wrist pulse measurement and analysis using Eulerian video magnification. In Proceedings of the 2016 IEEE-EMBS International Conference on Biomedical and Health Informatics (BHI), Las Vegas, NV, USA, 24–27 February 2016; pp. 41–44. [Google Scholar]
  60. Wu, H.-Y.; Rubinstein, M.; Shih, E.; Guttag, J.; Durand, F.; Freeman, W. Eulerian video magnification for revealing subtle changes in the world. ACM Trans. Graph. 2012, 31, 1–8. [Google Scholar] [CrossRef]
  61. Al-Naji, A.; Chahl, J. Contactless cardiac activity detection based on head motion magnification. Int. J. Image Graph. 2017, 17, 1750001. [Google Scholar] [CrossRef]
  62. Al-Naji, A.; Chahl, J. Remote respiratory monitoring system based on developing motion magnification technique. Biomed. Signal Process. Control 2016, 29, 1–10. [Google Scholar] [CrossRef]
  63. Al-Naji, A.; Gibson, K.; Lee, S.-H.; Chahl, J. Real time apnoea monitoring of children using the microsoft kinect sensor: A pilot study. Sensors 2017, 17, 286. [Google Scholar] [CrossRef] [PubMed]
  64. Al-Naji, A.; Chahl, J. Detection of cardiopulmonary activity and related abnormal events using microsoft kinect sensor. Sensors 2018, 18, 920. [Google Scholar] [CrossRef]
  65. Al-Naji, A.; Lee, S.-H.; Chahl, J. An efficient motion magnification system for real-time applications. Mach. Vis. Appl. 2018, 29, 585–600. [Google Scholar] [CrossRef]
  66. Hertzman, A.B. Observations on the finger volume pulse recorded photoelectrically. Am. J. Physiol. 1937, 119, 334–335. [Google Scholar]
  67. Wang, W.; den Brinker, A.C.; Stuijk, S.; de Haan, G. Algorithmic principles of remote PPG. IEEE Trans. Biomed. Eng. 2017, 64, 1479–1491. [Google Scholar] [CrossRef]
  68. Verkruysse, W.; Svaasand, L.O.; Nelson, J.S. Remote plethysmographic imaging using ambient light. Opt. Express 2008, 16, 21434–21445. [Google Scholar] [CrossRef] [Green Version]
  69. Poh, M.-Z.; McDuff, D.J.; Picard, R.W. Non-contact, automated cardiac pulse measurements using video imaging and blind source separation. Opt. Express 2010, 18, 10762–10774. [Google Scholar] [CrossRef]
  70. Poh, M.-Z.; McDuff, D.J.; Picard, R.W. Advancements in noncontact, multiparameter physiological measurements using a webcam. IEEE Trans. Biomed. Eng. 2011, 58, 7–11. [Google Scholar] [CrossRef]
  71. Pursche, T.; Krajewski, J.; Moeller, R. Video-based heart rate measurement from human faces. In Proceedings of the 2012 IEEE International Conference on Consumer Electronics (ICCE), Las Vegas, NV, USA, 13–16 January 2012; pp. 544–545. [Google Scholar]
  72. Lewandowska, M.; Rumiński, J.; Kocejko, T.; Nowak, J. Measuring pulse rate with a webcam—A non-contact method for evaluating cardiac activity. In Proceedings of the 2011 Federated Conference on Computer Science and Information Systems (FedCSIS), Szczecin, Poland, 18–21 September 2011; pp. 405–410. [Google Scholar]
  73. Hyvärinen, A.; Oja, E. Independent component analysis: Algorithms and applications. Neural Netw. 2000, 13, 411–430. [Google Scholar] [CrossRef]
  74. Sun, Y.; Hu, S.; Azorin-Peris, V.; Greenwald, S.; Chambers, J.; Zhu, Y. Motion-compensated noncontact imaging photoplethysmography to monitor cardiorespiratory status during exercise. J. Biomed. Opt. 2011, 16, 077010. [Google Scholar] [CrossRef] [PubMed]
  75. Feng, L.; Po, L.-M.; Xu, X.; Li, Y. Motion artifacts suppression for remote imaging photoplethysmography. In Proceedings of the 2014 19th International Conference on Digital Signal Processing, Hong Kong, China, 20–23 August 2014; pp. 18–23. [Google Scholar]
  76. Qi, H.; Guo, Z.; Chen, X.; Shen, Z.; Wang, Z.J. Video-based human heart rate measurement using joint blind source separation. Biomed. Signal Process. Control 2017, 31, 309–320. [Google Scholar] [CrossRef]
  77. Bousefsaf, F.; Maaoui, C.; Pruski, A. Continuous wavelet filtering on webcam photoplethysmographic signals to remotely assess the instantaneous heart rate. Biomed. Signal Process. Control 2013, 8, 568–574. [Google Scholar] [CrossRef]
  78. De Haan, G.; Jeanne, V. Robust pulse rate from chrominance-based rPPG. IEEE Trans. Biomed. Eng. 2013, 60, 2878–2886. [Google Scholar] [CrossRef]
  79. De Haan, G.; Van Leest, A. Improved motion robustness of remote-PPG by using the blood volume pulse signature. Physiol. Meas. 2014, 35, 1913–1926. [Google Scholar] [CrossRef]
  80. Feng, L.; Po, L.-M.; Xu, X.; Li, Y.; Ma, R. Motion-resistant remote imaging photoplethysmography based on the optical properties of skin. IEEE Trans. Circuits Syst. Video Technol. 2015, 25, 879–891. [Google Scholar] [CrossRef]
  81. Wang, W.; Stuijk, S.; De Haan, G. A novel algorithm for remote photoplethysmography: Spatial subspace rotation. IEEE Trans. Biomed. Eng. 2016, 63, 1974–1984. [Google Scholar] [CrossRef]
  82. Wang, W.; Brinker, A.C.D.; Stuijk, S.; De Haan, G. Robust heart rate from fitness videos. Physiol. Meas. 2017, 38, 1023–1044. [Google Scholar] [CrossRef]
  83. Wu, B.-F.; Huang, P.-W.; Tsou, T.-Y.; Lin, T.-M.; Chung, M.-L. Camera-based heart rate measurement using continuous wavelet transform. In Proceedings of the 2017 International Conference on System Science and Engineering (ICSSE), Ho Chi Minh City, Vietnam, 21–23 July 2017; pp. 7–11. [Google Scholar]
  84. Wu, B.-F.; Huang, P.-W.; Lin, C.-H.; Chung, M.-L.; Tsou, T.-Y.; Wu, Y.-L. Motion resistant image-photoplethysmography based on spectral peak tracking algorithm. IEEE Access 2018, 6, 21621–21634. [Google Scholar] [CrossRef]
  85. Xie, K.; Fu, C.-H.; Liang, H.; Hong, H.; Zhu, X. Non-contact Heart Rate Monitoring for Intensive Exercise Based on Singular Spectrum Analysis. In Proceedings of the 2019 IEEE Conference on Multimedia Information Processing and Retrieval (MIPR), San Jose, CA, USA, 28–30 March 2019; pp. 228–233. [Google Scholar]
  86. McDuff, D.J.; Blackford, E.B.; Estepp, J.R.; Nishidate, I. A Fast Non-Contact Imaging Photoplethysmography Method Using a Tissue-Like Model. In Optical Diagnostics and Sensing XVIII: Toward Point-of-Care Diagnostics; International Society for Optics and Photonics: San Francisco, CA, USA, 2018; p. 105010Q-1-9. [Google Scholar]
  87. Fallet, S.; Schoenenberger, Y.; Martin, L.; Braun, F.; Moser, V.; Vesin, J.-M. Imaging photoplethysmography: A real-time signal quality index. In Proceedings of the 2017 Computing in Cardiology (CinC), Rennes, France, 24–27 September 2017; pp. 1–4. [Google Scholar]
  88. Moço, A.V.; Stuijk, S.; De Haan, G. Motion robust PPG-imaging through color channel mapping. Biomed. Opt. Express 2016, 7, 1737–1754. [Google Scholar] [CrossRef] [PubMed] [Green Version]
  89. Chen, D.-Y.; Wang, J.-J.; Lin, K.-Y.; Chang, H.-H.; Wu, H.-K.; Chen, Y.-S.; Lee, S.-Y. Image sensor-based heart rate evaluation from face reflectance using Hilbert–Huang transform. IEEE Sens. J. 2015, 15, 618–627. [Google Scholar] [CrossRef]
  90. Lin, K.-Y.; Chen, D.-Y.; Tsai, W.-J. Face-based heart rate signal decomposition and evaluation using multiple linear regression. IEEE Sens. J. 2016, 16, 1351–1360. [Google Scholar] [CrossRef]
  91. Lee, D.; Kim, J.; Kwon, S.; Park, K. Heart rate estimation from facial photoplethysmography during dynamic illuminance changes. In Proceedings of the 2015 37th Annual International Conference of the IEEE Engineering in Medicine and Biology Society (EMBC), Milan, Italy, 25–29 August 2015; pp. 2758–2761. [Google Scholar]
  92. Tarassenko, L.; Villarroel, M.; Guazzi, A.; Jorge, J.; Clifton, D.A.; Pugh, C. Non-contact video-based vital sign monitoring using ambient light and auto-regressive models. Physiol. Meas. 2014, 35, 807–831. [Google Scholar] [CrossRef] [PubMed]
  93. Cheng, J.; Chen, X.; Xu, L.; Wang, Z.J. Illumination variation-resistant video-based heart rate measurement using joint blind source separation and ensemble empirical mode decomposition. IEEE J. Biomed. Health Inform. 2017, 21, 1422–1433. [Google Scholar] [CrossRef] [PubMed]
  94. Xu, L.; Cheng, J.; Chen, X. Illumination variation interference suppression in remote PPG using PLS and MEMD. Electron. Lett. 2017, 53, 216–218. [Google Scholar] [CrossRef]
  95. Li, X.; Chen, J.; Zhao, G.; Pietikainen, M. Remote heart rate measurement from face videos under realistic situations. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Columbus, OH, USA, 23–28 June 2014; pp. 4264–4271. [Google Scholar]
  96. Kumar, M.; Veeraraghavan, A.; Sabharwal, A. DistancePPG: Robust non-contact vital signs monitoring using a camera. Biomed. Opt. Express 2015, 6, 1565–1588. [Google Scholar] [CrossRef] [Green Version]
  97. Al-Naji, A.; Perera, A.G.; Chahl, J. Remote monitoring of cardiorespiratory signals from a hovering unmanned aerial vehicle. Biomed. Eng. Online 2017, 16, 101. [Google Scholar] [CrossRef]
  98. Al-Naji, A.; Lee, S.-H.; Chahl, J. Quality index evaluation of videos based on fuzzy interface system. IET Image Process. 2017, 11, 292–300. [Google Scholar] [CrossRef]
  99. Chen, J.-H.; Tang, I.-L.; Chang, C.-H. Enhancing the detection rate of inclined faces. In Proceedings of the 2015 IEEE Trustcom/BigDataSE/ISPA, Helsinki, Finland, 20–22 August 2015; pp. 143–146. [Google Scholar]
  100. Borga, M.; Knutsson, H. A Canonical Correlation Approach to Blind Source Separation; Report LiU-IMT-EX-0062; Department of Biomedical Engineering, Linkping University: Linköping, Sweden, 2001. [Google Scholar]
  101. Kwon, S.; Kim, H.; Park, K.S. Validation of heart rate extraction using video imaging on a built-in camera system of a smartphone. In Proceedings of the 2012 Annual International Conference of the IEEE Engineering in Medicine and Biology Society, San Diego, CA, USA, 28 August–1 September 2012; pp. 2174–2177. [Google Scholar]
  102. Al-Naji, A.; Perera, A.G.; Chahl, J. Remote Measurement of Cardiopulmonary Signal Using an Unmanned Aerial Vehicle; IOP Conference Series: Materials Science and Engineering; IOP Publishing: Bristol, UK, 2018; p. 012001. [Google Scholar]
  103. McDuff, D.; Gontarek, S.; Picard, R.W. Improvements in remote cardiopulmonary measurement using a five band digital camera. IEEE Trans. Biomed. Eng. 2014, 61, 2593–2601. [Google Scholar] [CrossRef]
  104. Sun, Y.; Azorin-Peris, V.; Kalawsky, R.; Hu, S.; Papin, C.; Greenwald, S.E. Use of ambient light in remote photoplethysmographic systems: Comparison between a high-performance camera and a low-cost webcam. J. Biomed. Opt. 2012, 17, 37005. [Google Scholar] [CrossRef] [PubMed]
  105. Bernacchia, N.; Scalise, L.; Casacanditella, L.; Ercoli, I.; Marchionni, P.; Tomasini, E.P. Non contact measurement of heart and respiration rates based on Kinect™. In Proceedings of the 2014 IEEE International Symposium on Medical Measurements and Applications (MeMeA), Lisboa, Portugal, 11–12 June 2014; pp. 1–5. [Google Scholar]
  106. Smilkstein, T.; Buenrostro, M.; Kenyon, A.; Lienemann, M.; Larson, G. Heart rate monitoring using Kinect and color amplification. In Proceedings of the 2014 IEEE Healthcare Innovation Conference (HIC), Seattle, WA, USA, 8–10 October 2014; pp. 60–62. [Google Scholar]
  107. Gambi, E.; Agostinelli, A.; Belli, A.; Burattini, L.; Cippitelli, E.; Fioretti, S.; Pierleoni, P.; Ricciuti, M.; Sbrollini, A.; Spinsante, S. Heart rate detection using microsoft kinect: Validation and comparison to wearable devices. Sensors 2017, 17, 1776. [Google Scholar] [CrossRef] [PubMed]
  108. Scalise, L.; Bernacchia, N.; Ercoli, I.; Marchionni, P. Heart rate measurement in neonatal patients using a webcamera. In Proceedings of the 2012 IEEE International Symposium on Medical Measurements and Applications Proceedings, Budapest, Hungary, 18–19 May 2012; pp. 1–4. [Google Scholar]
  109. Cobos-Torres, J.-C.; Abderrahim, M.; Martínez-Orgado, J. Non-Contact, Simple Neonatal Monitoring by Photoplethysmography. Sensors 2018, 18, 4362. [Google Scholar] [CrossRef] [PubMed]
  110. Gibson, K.; Al-Naji, A.; Fleet, J.; Steen, M.; Esterman, A.; Chahl, J.; Huynh, J.; Morris, S. Non-contact heart and respiratory rate monitoring of preterm infants based on a computer vision system: A method comparison study. Pediatr. Res. 2019, 1–4. [Google Scholar]
  111. Bal, U. Non-contact estimation of heart rate and oxygen saturation using ambient light. Biomed. Opt. Express 2015, 6, 86–97. [Google Scholar] [CrossRef]
  112. Fouad, R.M.; Omer, O.A.; Aly, M.H. Optimizing Remote Photoplethysmography Using Adaptive Skin Segmentation for Real-Time Heart Rate Monitoring. IEEE Access 2019, 7, 76513–76528. [Google Scholar] [CrossRef]
  113. Datcu, D.; Cidota, M.; Lukosch, S.; Rothkrantz, L. Noncontact automatic heart rate analysis in visible spectrum by specific face regions. In Proceedings of the 14th International Conference on Computer Systems and Technologies, Ruse, Bulgaria, 28–29 June 2013; pp. 120–127. [Google Scholar]
  114. Feng, L.; Po, L.-M.; Xu, X.; Li, Y.; Cheung, C.-H.; Cheung, K.-W.; Yuan, F. Dynamic ROI based on K-means for remote photoplethysmography. In Proceedings of the 2015 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Brisbane, QLD, Australia, 19–24 April 2015; pp. 1310–1314. [Google Scholar]
  115. Zhao, C.; Chen, W.; Lin, C.-L.; Wu, X. Physiological signal preserving video compression for remote photoplethysmography. IEEE Sens. J. 2019, 19, 4537–4548. [Google Scholar] [CrossRef]
  116. Al-Naji, A.; Chahl, J.; Lee, S.-H. Cardiopulmonary Signal Acquisition from Different Regions Using Video Imaging Analysis. Comput. Methods Biomech. Biomed. Eng. Imaging Vis. 2018, 7, 117–131. [Google Scholar] [CrossRef]
  117. Al-Naji, A.; Chahl, J. Remote Optical Cardiopulmonary Signal Extraction With Noise Artifact Removal, Multiple Subject Detection & Long-Distance. IEEE Access 2018, 6, 11573–11595. [Google Scholar]
  118. Hsu, Y.; Lin, Y.-L.; Hsu, W. Learning-based heart rate detection from remote photoplethysmography features. In Proceedings of the 2014 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Florence, Italy, 4–9 May 2014; pp. 4433–4437. [Google Scholar]
  119. Chen, W.; McDuff, D. DeepPhys: Video-based physiological measurement using convolutional attention networks. In Proceedings of the European Conference on Computer Vision (ECCV), Munich, Germany, 8–14 September 2018; pp. 349–365. [Google Scholar]
  120. Yu, Z.; Li, X.; Zhao, G. Remote Photoplethysmograph Signal Measurement from Facial Videos Using Spatio-Temporal Networks. arXiv 2019, arXiv:1905.02419. [Google Scholar]
  121. Wiede, C.; Richter, J.; Hirtz, G. Signal fusion based on intensity and motion variations for remote heart rate determination. In Proceedings of the 2016 IEEE International Conference on Imaging Systems and Techniques (IST), Chania, Greece, 4–6 October 2016; pp. 526–531. [Google Scholar]
  122. Klaessens, J.H.; van den Born, M.; van der Veen, A.; Sikkens-van de Kraats, J.; van den Dungen, F.A.; Verdaasdonk, R.M. Development of a baby friendly non-contact method for measuring vital signs: First results of clinical measurements in an open incubator at a neonatal intensive care unit. In Advanced Biomedical and Clinical Diagnostic Systems XII; International Society for Optics and Photonics: San Francisco, CA, USA, 2014; pp. 89351P-1–89351P-7. [Google Scholar]
  123. Villarroel, M.; Guazzi, A.; Jorge, J.; Davis, S.; Watkinson, P.; Green, G.; Shenvi, A.; McCormick, K.; Tarassenko, L. Continuous non-contact vital sign monitoring in neonatal intensive care unit. Healthc. Technol. Lett. 2014, 1, 87–91. [Google Scholar] [CrossRef] [PubMed]
  124. Rubins, U.; Marcinkevics, Z.; Logina, I.; Grabovskis, A.; Kviesis-Kipge, E. Imaging Photoplethysmography for Assessment of Chronic Pain Patients. In Optical Diagnostics and Sensing XIX: Toward Point-of-Care Diagnostics; International Society for Optics and Photonics: San Francisco, CA, USA, 2019; pp. 1088508-1–1088508-8. [Google Scholar]
  125. Zaproudina, N.; Teplov, V.; Nippolainen, E.; Lipponen, J.A.; Kamshilin, A.A.; Närhi, M.; Karjalainen, P.A.; Giniatullin, R. Asynchronicity of facial blood perfusion in migraine. PLoS ONE 2013, 8, e80189. [Google Scholar] [CrossRef] [PubMed]
  126. Rasche, S.; Trumpp, A.; Waldow, T.; Gaetjen, F.; Plötze, K.; Wedekind, D.; Schmidt, M.; Malberg, H.; Matschke, K.; Zaunseder, S. Camera-based photoplethysmography in critical care patients. Clin. Hemorheol. Microcirc. 2016, 64, 77–90. [Google Scholar] [CrossRef] [PubMed]
  127. Amelard, R.; Clausi, D.A.; Wong, A. Spectral-spatial fusion model for robust blood pulse waveform extraction in photoplethysmographic imaging. Biomed. Opt. Express 2016, 7, 4874–4885. [Google Scholar] [CrossRef] [Green Version]
  128. Rubīns, U.; Spīgulis, J.; Miščuks, A. Photoplethysmography imaging algorithm for continuous monitoring of regional anesthesia. In Proceedings of the 2016 14th ACM/IEEE Symposium on Embedded Systems For Real-time Multimedia (ESTIMedia), Pittsburgh, PA, USA, 6–7 October 2016; pp. 1–5. [Google Scholar]
  129. Trumpp, A.; Lohr, J.; Wedekind, D.; Schmidt, M.; Burghardt, M.; Heller, A.R.; Malberg, H.; Zaunseder, S. Camera-based photoplethysmography in an intraoperative setting. Biomed. Eng. Online 2018, 17, 33. [Google Scholar] [CrossRef]
  130. Villarroel, M.; Jorge, J.; Pugh, C.; Tarassenko, L. Non-contact vital sign monitoring in the clinic. In Proceedings of the 2017 12th IEEE International Conference on Automatic Face & Gesture Recognition (FG 2017), Washington, DC, USA, 30 May–3 June 2017; pp. 278–285. [Google Scholar]
  131. Thatcher, J.E.; Li, W.; Rodriguez-Vaqueiro, Y.; Squiers, J.J.; Mo, W.; Lu, Y.; Plant, K.D.; Sellke, E.; King, D.R.; Fan, W. Multispectral and photoplethysmography optical imaging techniques identify important tissue characteristics in an animal model of tangential burn excision. J. Burn. Care Res. 2016, 37, 38–52. [Google Scholar] [CrossRef]
  132. Vogels, T.; van Gastel, M.; Wang, W.; de Haan, G. Fully-automatic camera-based pulse-oximetry during sleep. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition Workshops, Salt Lake City, UT, USA, 18–22 June 2018; pp. 1349–1357. [Google Scholar]
  133. Wang, W.; Stuijk, S.; De Haan, G. Unsupervised subject detection via remote PPG. IEEE Trans. Biomed. Eng. 2015, 62, 2629–2637. [Google Scholar] [CrossRef]
  134. Wang, W.; Stuijk, S.; De Haan, G. Living-Skin classification via remote-PPG. IEEE Trans. Biomed. Eng. 2017, 64, 2781–2792. [Google Scholar]
  135. Lakshminarayana, N.N.; Narayan, N.; Napp, N.; Setlur, S.; Govindaraju, V. A discriminative spatio-temporal mapping of face for liveness detection. In Proceedings of the 2017 IEEE International Conference on Identity, Security and Behavior Analysis (ISBA), New Delhi, India, 22–24 February 2017; pp. 1–7. [Google Scholar]
  136. Nowara, E.M.; Sabharwal, A.; Veeraraghavan, A. PPGSecure: Biometric presentation attack detection using photopletysmograms. In Proceedings of the 2017 12th IEEE International Conference on Automatic Face & Gesture Recognition (FG 2017), Washington, DC, USA, 30 May–3 June 2017; pp. 56–62. [Google Scholar]
  137. Seepers, R.M.; Wang, W.; de Haan, G.; Sourdis, I.; Strydis, C. Attacks on heartbeat-based security using remote photoplethysmography. IEEE J. Biomed. Health Inform. 2017, 22, 714–721. [Google Scholar] [CrossRef]
  138. Maaoui, C.; Bousefsaf, F.; Pruski, A. Automatic human stress detection based on webcam photoplethysmographic signals. J. Mech. Med. Biol. 2016, 16, 1650039. [Google Scholar] [CrossRef]
  139. Madan, C.R.; Harrison, T.; Mathewson, K.E. Noncontact measurement of emotional and physiological changes in heart rate from a webcam. Psychophysiology 2018, 55, e13005. [Google Scholar] [CrossRef] [PubMed]
  140. McDuff, D.; Gontarek, S.; Picard, R. Remote measurement of cognitive stress via heart rate variability. In Proceedings of the 2014 36th Annual International Conference of the IEEE Engineering in Medicine and Biology Society, Chicago, IL, USA, 26–30 August 2014; pp. 2957–2960. [Google Scholar]
  141. Burzo, M.; McDuff, D.; Mihalcea, R.; Morency, L.-P.; Narvaez, A.; Pérez-Rosas, V. Towards sensing the influence of visual narratives on human affect. In Proceedings of the 14th ACM International Conference on Multimodal Interaction, Santa Monica, CA, USA, 22–26 October 2012; pp. 153–160. [Google Scholar]
  142. Monkaresi, H.; Bosch, N.; Calvo, R.A.; D’Mello, S.K. Automated detection of engagement using video-based estimation of facial expressions and heart rate. IEEE Trans. Affect. Comput. 2016, 8, 15–28. [Google Scholar] [CrossRef]
  143. Rouast, P.V.; Adam, M.T.; Cornforth, D.J.; Lux, E.; Weinhardt, C. Using contactless heart rate measurements for real-time assessment of affective states. In Information Systems and Neuroscience; Springer: Berlin, Germany, 2017; pp. 157–163. [Google Scholar]
  144. Kuo, J.; Koppel, S.; Charlton, J.L.; Rudin-Brown, C.M. Evaluation of a video-based measure of driver heart rate. J. Saf. Res. 2015, 54, 55.e29–59. [Google Scholar] [CrossRef] [Green Version]
  145. Zhang, Q.; Wu, Q.; Zhou, Y.; Wu, X.; Ou, Y.; Zhou, H. Webcam-based, non-contact, real-time measurement for the physiological parameters of drivers. Measurement 2017, 100, 311–321. [Google Scholar] [CrossRef]
  146. Lee, K.; Han, D.K.; Ko, H. Video Analytic Based Health Monitoring for Driver in Moving Vehicle by Extracting Effective Heart Rate Inducing Features. J. Adv. Transp. 2018, 2018, 8513487. [Google Scholar] [CrossRef]
  147. Zhang, Q.; Zhou, Y.; Song, S.; Liang, G.; Ni, H. Heart Rate Extraction Based on Near-Infrared Camera: Towards Driver State Monitoring. IEEE Access 2018, 6, 33076–33087. [Google Scholar] [CrossRef]
  148. Pereira, C.B.; Kunczik, J.; Bleich, A.; Haeger, C.; Kiessling, F.; Thum, T.; Tolba, R.; Lindauer, U.; Treue, S.; Czaplik, M. Perspective review of optical imaging in welfare assessment in animal-based research. J. Biomed. Opt. 2019, 24, 070601. [Google Scholar] [CrossRef]
  149. Blanik, N.; Pereira, C.; Czaplik, M.; Blazek, V.; Leonhardt, S. Remote Photopletysmographic Imaging of Dermal Perfusion in a porcine animal model. In Proceedings of the 15th International Conference on Biomedical Engineering, Singapore, 4–7 December 2013; pp. 92–95. [Google Scholar]
  150. Unakafov, A.M.; Möller, S.; Kagan, I.; Gail, A.; Treue, S.; Wolf, F. Using imaging photoplethysmography for heart rate estimation in non-human primates. PLoS ONE 2018, 13, e0202581. [Google Scholar] [CrossRef]
  151. Dantu, V.; Vempati, J.; Srivilliputhur, S. Non-invasive blood glucose monitor based on spectroscopy using a smartphone. In Proceedings of the 2014 36th Annual International Conference of the IEEE Engineering in Medicine and Biology Society, Chicago, IL, USA, 26–30 August 2014; pp. 3695–3698. [Google Scholar]
  152. Devadhasan, J.P.; Oh, H.; Choi, C.S.; Kim, S. Whole blood glucose analysis based on smartphone camera module. J. Biomed. Opt. 2015, 20, 117001. [Google Scholar] [CrossRef]
  153. Leijdekkers, P.; Gay, V.; Lawrence, E. Smart homecare system for health tele-monitoring. In Proceedings of the First International Conference on the Digital Society (ICDS’07), Guadeloupe, France, 2–6 January 2007; pp. 1–5. [Google Scholar]
Figure 1. Contactless measuring methods. (a) The Doppler effect; (b) thermal imaging; (c) video camera imaging.
Figure 2. Block diagram of contactless monitoring system.
Figure 3. Different sensors used for research surveyed in this study. (a) Digital camera, (b) webcam, (c) Microsoft Kinect, (d) smart phone, (e) laptop, (f) unmanned aerial vehicle (UAV).
Figure 4. System diagram of the remote respiratory monitoring system [62].
Figure 5. Block diagram of the contactless vital sign monitoring system using a Microsoft Kinect sensor [64].
Figure 6. Skin reflection model of imaging photoplethysmography (iPPG) [adapted from [67]].
Figure 7. System overview of the noise removal technique using a UAV [97].
Figure 8. RMSE performance comparison of complete ensemble empirical mode decomposition with adaptive noise (CEEMDAN) and canonical correlation analysis (CCA) techniques for four scenarios [97]. (a) RMSE performance comparison for HR, (b) RMSE performance comparison for RR.
Figure 9. Data collection using a UAV (3DR solo drone) [102].
Figure 10. The experimental setup and data acquisition of the noise artefact removal, multiple-person and long-distance detection system [117].
Figure 11. RMSE performance comparison of CEEMDAN and CCA techniques under multiple detection [117]. (a) RMSE performance comparison for HR, (b) RMSE performance comparison for RR.
Figure 12. RMSE performance comparison of CEEMDAN and CCA techniques at long range [117]. (a) RMSE performance comparison for HR, (b) RMSE performance comparison for RR.
Table 1. Imaging photoplethysmography (iPPG)-based methods under well-controlled conditions.

| Ref. | Used Sensors with Parameters | Vital Signs | ROIs | Used Technique | Distance (m) | No. of Participants at a Time |
|---|---|---|---|---|---|---|
| Verkruysse et al. [68] | CCD, 15 or 30 fps | HR and RR | Face and forehead | Single/multichannel analysis | 1–2 m | 1 |
| Poh et al. [70] | Webcam, 640 × 480, 15 fps | HR, RR and HRV | Face | ICA | 0.5 m | 1 |
| Pursche et al. [71] | Webcam, 640 × 480, 30 fps | HR | Forehead, nose and mouth | ICA | 0.5 m | 1 |
| Lewandowska et al. [72] | Webcam, 640 × 480, 20 fps | HR | Face and forehead | PCA | 1 m | 1 |

Note: fps = frames per second; HR = heart rate; RR = respiratory rate; HRV = heart rate variability.
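Several of the Table 1 methods (e.g., [70,71]) recover the pulse by blind source separation of ROI-averaged colour traces. Below is a minimal sketch of an ICA-based heart rate estimator in that spirit, assuming an (n_frames × 3) RGB trace and a uniform frame rate; selecting the component with the strongest in-band spectral peak is one common heuristic, not the criterion of any single paper.

```python
# Sketch of the ICA stage used by several Table 1 methods: unmix detrended,
# standardised RGB traces and pick the component with the strongest spectral
# peak in the cardiac band (0.7-4.0 Hz, i.e., 42-240 bpm).
import numpy as np
from scipy.signal import detrend
from sklearn.decomposition import FastICA

def hr_from_rgb(rgb, fps, band=(0.7, 4.0)):
    """rgb: array of shape (n_frames, 3) holding ROI-averaged R, G, B values."""
    x = detrend(rgb, axis=0)
    x = (x - x.mean(axis=0)) / x.std(axis=0)
    sources = FastICA(n_components=3, random_state=0).fit_transform(x)
    freqs = np.fft.rfftfreq(len(sources), d=1.0 / fps)
    in_band = (freqs >= band[0]) & (freqs <= band[1])
    spectra = np.abs(np.fft.rfft(sources, axis=0))
    best = spectra[in_band].max(axis=0).argmax()   # strongest cardiac peak
    f_hr = freqs[in_band][spectra[in_band, best].argmax()]
    return 60.0 * f_hr                             # beats per minute
```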
Table 2. iPPG-based methods that address motion artefacts.

| Ref. | Used Sensors with Parameters | Vital Signs | ROIs | Used Technique | Distance (m) | No. of Participants at a Time | Results |
|---|---|---|---|---|---|---|---|
| Poh et al. [69] | Webcam, 640 × 480, 25 fps | HR | Face | ICA | 0.5 m | 3 | PCC = 0.95, RMSE = 4.63 bpm |
| Yu et al. [74] | CMOS, 720 × 576 face detector, 25 fps | HR and RR | Palm and face | SCICA | 0.35–0.4 m | 1 | PCC = 0.9 |
| Feng et al. [75] | Webcam, 640 × 480, 30 fps | HR | Forehead | ICA | 0.75 m | 1 | PCC = 0.99 |
| Qi et al. [76] | Digital camera, 720 × 576, 50 fps | HR | Face | JBSS | - | 1 | PCC = 0.74 |
| Bousefsaf et al. [77] | Webcam, 320 × 240, 30 fps | HR | Face | CWT | 1 m | 1 | PCC = 1 |
| De Haan et al. [78] | CCD, 1024 × 752, 20 fps | HR | Face | CHROM | - | 1 | PCC = 1, RMSE = 0.5 |
| De Haan et al. [79] | CCD, 1024 × 752, 20 fps | HR | Face | PBV | - | 1 | PCC = 0.99, RMSE = 0.64 |
| Feng et al. [80] | Webcam, 640 × 480, 30 fps | HR | Cheeks | GRD | 0.75 m | 1 | PCC = 0.97 |
| Wang et al. [81] | CCD, 768 × 576, 20 fps | HR | Face and forehead | 2SR | 1.5 m | 1 | PCC = 0.94 |
| Wang et al. [67] | CCD, 768 × 576, 20 fps | HR | Face | POS | 1.5 m | 1 | SNR (dB) = 5.16 |
| Wang et al. [82] | CCD, 768 × 576, 20 fps | HR | Face | Sub-band decomposition | 2 m | 1 | SNR (dB) = 4.77 |
| Wu et al. [83] | Webcam | HR | Face | CWT | 0.5–1.5 m | 1 | SNR (dB) = −3.01 |
| Wu et al. [84] | Webcam | HR | Cheeks | MRSPT | 0.6–1.6 m | 1 | RMSE = 6.44 bpm |
| Xie et al. [85] | Video camera, 30 fps | HR | Face | SSA and SVD | - | 1 | PCC = 0.99, RMSE = 3.99 |
| McDuff et al. [86] | Colour camera, 658 × 492, 120 fps | HR and HRV | Face | Linear transformation | - | 1 | MAE = 4.17 bpm, SNR (dB) = 3.13 |
| Fallet et al. [87] | Video camera, 1.3 megapixels, 20 fps | HR | Forehead and face | SQI | - | 1 | MAE = 4.72 bpm |

Note: PCC = Pearson correlation coefficient; RMSE = root mean square error; SNR = signal-to-noise ratio; MAE = mean absolute error; bpm = beats per minute.
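Among the motion-robust colour projections in Table 2, CHROM [78] is representative: temporally normalised RGB traces are combined into two chrominance signals whose weighted difference cancels common motion-induced intensity changes. The sketch below is a whole-signal simplification; the original method operates on short overlapping windows.

```python
# Simplified CHROM projection in the spirit of De Haan et al. [78]:
# temporally normalise each channel, form two chrominance signals, and
# combine them so that the shared motion term is suppressed.
import numpy as np

def chrom_pulse(rgb):
    """rgb: array (n_frames, 3) of ROI-averaged R, G, B values."""
    norm = rgb / rgb.mean(axis=0)          # temporal normalisation per channel
    xs = 3.0 * norm[:, 0] - 2.0 * norm[:, 1]
    ys = 1.5 * norm[:, 0] + norm[:, 1] - 1.5 * norm[:, 2]
    alpha = xs.std() / ys.std()            # weighting that cancels motion terms
    return xs - alpha * ys                 # pulse signal candidate
```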
Table 3. iPPG-based methods that address illumination variations.

| Ref. | Used Sensors with Parameters | Vital Signs | ROIs | Used Technique | Distance (m) | No. of Participants at a Time | Results |
|---|---|---|---|---|---|---|---|
| Chen et al. [89] | Digital camera, 30 fps | HR | Brow area | EEMD | 0.07–0.09 m | 1 | PCC = 0.91 |
| Lin et al. [90] | Digital camera, 30 fps | HR | Brow area | EEMD + MLR | 0.10–0.25 m | 1 | PCC = 0.96 |
| Lee et al. [91] | Digital camera, 1280 × 720 | HR | Cheek | MOCF | 1 m | 1 | RMSE = 1.8 bpm |
| Tarassenko et al. [92] | Digital camera, 5 megapixels, 12 fps | HR, RR and SpO2 | Forehead and cheek | AR modelling and pole cancellation | 1 m | 1 | MAE = 3 bpm |
| Cheng et al. [93] | Webcam, 640 × 480, 30 fps | HR | Face | JBSS + EEMD | 0.50 m | 1 | PCC = 0.91 |
| Xu et al. [94] | Webcam, 640 × 480, 30 fps | HR | Face | PLS + MEMD | 0.50 m | 1 | PCC = 0.81 |
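Most of the Table 3 entries use empirical mode decomposition variants to separate the quasi-periodic cardiac component from slow illumination drift. The rough sketch below illustrates that idea, assuming the PyEMD package (installed as EMD-signal); the in-band IMF-selection rule is a common heuristic rather than the exact criterion of [89] or [93].

```python
# Sketch of EEMD-based clean-up against illumination changes: decompose the
# ROI trace into intrinsic mode functions (IMFs) and keep only those whose
# dominant frequency falls in the cardiac band. Requires PyEMD (EMD-signal).
import numpy as np
from PyEMD import EEMD

def cardiac_component(trace, fps, band=(0.7, 4.0)):
    imfs = EEMD().eemd(trace)                     # shape (n_imfs, n_samples)
    freqs = np.fft.rfftfreq(trace.size, d=1.0 / fps)
    keep = []
    for imf in imfs:
        dominant = freqs[np.abs(np.fft.rfft(imf)).argmax()]
        if band[0] <= dominant <= band[1]:        # IMF oscillates at cardiac rate
            keep.append(imf)
    return np.sum(keep, axis=0) if keep else trace
```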
Table 4. iPPG-based methods that address both illumination variations and motion artefacts.

| Ref. | Used Sensors with Parameters | Vital Signs | ROIs | Used Technique | Distance (m) | No. of Participants at a Time | Results |
|---|---|---|---|---|---|---|---|
| Li et al. [95] | iSight camera of an iPad, 780 × 580, 61 fps | HR | Face | NLMS adaptive filtering | 0.35–0.50 m | 1 | RMSE = 1.8 bpm |
| Kumar et al. [96] | Monochrome camera, 30 fps | HR and HRV | Face | Weighted average | 0.5 m | 1 | SNR (dB) = 6.5 |
| Al-Naji et al. [97] | Hovering UAV, 1920 × 1080, 60 fps | HR and RR | Face | CEEMDAN + CCA | 3 m | 1 | PCC = 0.99, RMSE = 0.65 bpm |
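Li et al. [95] cancel illumination variation with normalised least mean squares (NLMS) adaptive filtering, using a trace from a non-skin background region as the noise reference. A self-contained sketch of an NLMS canceller of that kind follows; the filter order and step size are illustrative choices.

```python
# Sketch of an NLMS illumination canceller: the filter predicts the part of
# the facial trace explained by the background reference, and the residual
# retains the pulse component that the background does not share.
import numpy as np

def nlms_cancel(face, background, order=8, mu=0.5, eps=1e-6):
    face = np.asarray(face, dtype=float)
    background = np.asarray(background, dtype=float)
    w = np.zeros(order)
    cleaned = np.zeros_like(face)
    for n in range(order, face.size):
        x = background[n - order:n][::-1]       # reference tap vector
        y = w @ x                               # illumination estimate
        e = face[n] - y                         # residual = pulse + noise
        w += mu * e * x / (x @ x + eps)         # normalised LMS weight update
        cleaned[n] = e
    return cleaned
```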
Table 5. Summary of different iPPG-based methods.

| Ref. | Used Sensors with Parameters | Vital Signs | ROIs | Used Technique | Distance (m) | No. of Participants at a Time | Results |
|---|---|---|---|---|---|---|---|
| Kwon et al. [101] | Smartphone, 640 × 480, 30 fps | HR | Face | ICA | 0.3 m | 1 | MAE = 1.47 bpm |
| Al-Naji et al. [102] | UAV, 3840 × 2160, 25 fps | HR and RR | Face | CEEMD + ICA | 3 m | 1 | PCC = 0.99, RMSE = 0.7 bpm |
| McDuff et al. [103] | Five-band digital camera, 960 × 720, 30 fps | HR, RR and HRV | Face | ICA | 3 m | 1 | PCC = 0.92 |
| Sun et al. [104] | CMOS, 1280 × 1024; webcam | HR | Face | TFR | 0.2–0.35 m | 1 | PCC = 0.85 |
| Bernacchia et al. [105] | Microsoft Kinect, 30 fps | HR and RR | Neck, thorax and abdominal area | ICA | - | 1 | PCC = 0.91 |
| Smilkstein et al. [106] | Microsoft Kinect | HR | Face | EVM | - | 1 | - |
| Gambi et al. [107] | Microsoft Kinect, 1920 × 1080, 30 fps | HR | Forehead, cheeks and neck | EVM | - | 1 | RMSE = 2.2 bpm |
| Scalise et al. [108] | Webcam, 640 × 480, 30 fps | HR | Forehead | ICA | 0.2 m | 1 | PCC = 0.94 |
| Arts et al. [10] | Digital camera, 300 × 300 face detector, 15 fps | HR | Face and cheek | JFTD | 1 m | 1 | - |
| Cobos-Torres et al. [109] | Digital camera, 1920 × 1080 or 1280 × 720, 24 or 30 fps | HR | Abdominal area | Stack FIFO | 0.5 m | 1 | PCC = 0.94 |
| Gibson et al. [110] | Digital camera, 1920 × 1080, 30 fps | HR and RR | Face and chest | EVM | 1–2 m | 1 | Mean bias = 4.5 bpm |
| Bal [111] | Webcam, 640 × 480, 30 fps | HR and SpO2 | Face | DTCWT | 0.5 m | 1 | PCC = 0.92, RMSE = 2.05 bpm |
| Tarassenko et al. [92] | Digital camera, 5 megapixels, 12 fps | HR, RR and SpO2 | Forehead and cheek | AR modelling and pole cancellation | 1 m | 1 | MAE = 3 bpm |
| Foud et al. [112] | Webcam, 640 × 480, 30 fps | HR | Face and cheeks | ICA | 1–2 m | 1 | RMSE = 2.7 bpm |
| Datcu et al. [113] | Video camera, 252 × 350, 15 fps | HR | 10 different parts of the face | ICA | - | 1 | RMSE = 1.47 bpm |
| Feng et al. [114] | Webcam, 640 × 480, 30 fps | HR | Face and palm | Block division | 0.75 m | 1 | PCC = 0.96 |
| Zhao et al. [115] | Webcam, 640 × 480, 30 fps | HR | Face, arm and hand | POSCC | 1.5 m | 2 | SNR (dB) = 4.5 |
| Al-Naji et al. [116] | Digital camera, 1920 × 1080 or 1080 × 720, 60 or 30 fps | HR and RR | Face, palm, wrist, arm, neck, leg, forehead, head and chest | EEMD + ICA | 2 m | 1 | PCC = 0.96, RMSE = 3.52 |
| Al-Naji et al. [117] | Digital camera, 1080 × 720, 60 fps; UAV camera, 10 megapixels | HR and RR | Face and forehead | CEEMDAN + CCA | 60 m | 6 | PCC = 0.99, RMSE = 0.89 bpm |
| Hsu et al. [118] | Video camera, 1920 × 1080, 29.97 fps | HR | Face | SVR | - | 1 | PCC = 0.88, RMSE = 5.48 bpm |
| Chen et al. [119] | Video camera, 658 × 492, 120 fps | HR and BR | Face | Convolutional attention network | - | 1 | MAE = 1.50 bpm |
| Yu et al. [120] | Video camera, 1920 × 1080, 60 fps | HR and HRV | Face | Spatio-temporal network | - | 1 | PCC = 0.99, RMSE = 1.8 bpm |
| Weidi et al. [121] | Industrial camera | HR | Forehead | ICA | 2 m | 1 | RMSE = 4.83 bpm |
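The Results columns in Tables 2–5 mix several agreement metrics (see the note under Table 2). For reference, a minimal sketch of how PCC, RMSE and MAE are computed between camera-derived and reference readings, assuming paired heart rate estimates in bpm:

```python
# Agreement metrics used throughout the tables: Pearson correlation
# coefficient, root mean square error and mean absolute error between
# camera-derived estimates and a contact reference (e.g., ECG or oximeter).
import numpy as np

def agreement(estimated, reference):
    est = np.asarray(estimated, dtype=float)
    ref = np.asarray(reference, dtype=float)
    pcc = np.corrcoef(est, ref)[0, 1]           # Pearson correlation coefficient
    rmse = np.sqrt(np.mean((est - ref) ** 2))   # root mean square error (bpm)
    mae = np.mean(np.abs(est - ref))            # mean absolute error (bpm)
    return pcc, rmse, mae
```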
