1. Introduction
Photoplethysmography (PPG) is a kind of popular optical measurement technique that can be used to detect blood volume changes in the micro-vascular bed of tissue [
1]. Because of its simplicity and low-cost advantages, PPG has become the new technique for wearable measurement instead of the conventional electrocardiography (ECG) technique. The pulsatile “AC” physiological waveform can be obtained due to cardiac synchronous changes in blood volume with the heartbeat. Due to this property, PPG can be a source of real-time heart rate (HR) information calculation [
2]. Ear, finger and wrist are all optional PPG measurement sites. However, in measurement sites, noise interference produced by motion artifacts (MAs) and cardiac arrhythmia is inevitable. Due to human movement, relative motion may occur between the sensor and skin so that the principle component of true HR information is weakened. The quality of the PPG sensor signal is especially susceptible to motion artifacts. In other words, the accuracy of heart rate estimation depends on the quality of the PPG. Therefore, it is essential to minimize the influence of motion artifacts in order to improve the accuracy of heart rate estimation.
Several methods are usually used to recover or reconstruct a clean PPG signal from a corrupted one before HR is extracted. Generally speaking, there are five kinds of these MA-removal algorithms: classical digital filters [
3]; adaptive filters [
4]; time-frequency analysis (wavelet decomposition [
5], singular value decomposition [
6], and empirical mode decomposition [
7]); spectrum analysis; and bland signal processing [
8].
Those algorithms mentioned above can be applied to the signals which are corrupted slightly when motion artifacts are not strong. However, these techniques cannot figure out the precise heart rate estimation when intense physical exercise such as boxing occurs. So, people would always prefer to use complex algorithms when extracting HR from the corrupted PPG signals, rather than a single technique. When the motion artifacts are strong, heart-rate information can be mostly masked by the noise component. Thus, the removal of MAs in intense exercise from the PPG is always challenging. Fukushima et al. [
9] and Zhang et al. [
10] argue that acceleration data are also helpful for removing MAs. In this paper, tri-axis acceleration data play an important role in MA removal.
Recently, some study groups have concentrated on the solution of strong MA removal and have made progress. Therefore, many state-of-the-art algorithms are proposed. Zhang et al. [
11] put forward the TROIKA framework which consists of three key parts, namely signal decomposition, sparse signal reconstruction, and spectral peak tracking. In this framework, signal decomposition using singular spectrum analysis is applied to cancel partial MAs. Sparse PPG signal reconstruction puts the sparse signal into a high-solution spectrum so that the true peak corresponding to the heart rate is found. Then, Zhang et al. proposed an improved algorithm JOSS [
10] with the help of acceleration data. The spectra of PPG signals and simultaneous acceleration signals are jointly estimated using the multiple measurement vector (MMV) model in sparse signal recovery. This algorithm shows the effect of acceleration data on the accuracy of heart-rate estimation from the PPG. However, on the one hand, the sparse signal reconstruction entails high computational complexity and will take a long time in practical use. On the other hand, all sliding windows in the two methods mentioned above must experience every step, which can result in a complex algorithm and a high computational cost. In order to save computational cost, for windows that indicate MAs are not too significant an influence upon finding the true peak of the HR with acceleration data, there is no need to go through every step.
The SpaMA algorithm [
12] proposed by Salehizadeh et al. combines the PPG signal and acceleration data. Its key idea is to calculate the power spectral density of both PPG and acceleration data, and the related frequency peaks resulting from MAs can be distinguished from the PPG spectrum. This method performs better in the first 12 datasets, but the off-track error is large in other datasets that have stronger MAs during intense arm movements. Another algorithm named CNAFSD [
13] proposed a hybrid-motion artifacts-removal method, which combines non-linear adaptive filtering and signal decomposition (singular spectrum analysis). Sun et al. proposed SPECTRAP [
14] using a new spectrum-subtraction algorithm. Another algorithm, WFPV [
15], uses a Wiener filter to suppress MAs and a phase vocoder to improve the HR estimate. The algorithms mentioned above apply every step for all sliding windows so that the calculation and run time are relatively large. Therefore, it is essential to make multilevel decisions according to the different features of the PPG.
In this paper, a new approach combining the single notch filter and improved EEMD (NFEEMD) is proposed to estimate HR, during intense exercise, from the PPG. Acceleration data are adopted to detect if MAs exist and provide several reliable spectral peaks corresponding to MAs. The proposed algorithm framework determines the different signal-processing methods for different PPG signals by designing multiple decisions. Two binary decisions related to acceleration data and two ternary decisions related to the number of spectral peaks are applied to the NFEEMD algorithm. Our proposed algorithm is different from the previous one, which uses sequential execution for every sliding window. This approach consists of three stages (
Figure 1). Stage 1 aims to find out whether the original PPG signal is seriously corrupted by MAs and then chooses one way to estimate the HR by a precise binary-decision algorithm. When there are no significant MAs, HR can be obtained from the pre-processed signal spectrum peaks. When there are significant MAs, then stage 2 appears to work. Stage 2 removes the corresponding MA spectrum peaks from pre-processed signal spectrum peaks and then goes through the next step combined with the single notch filter and the improved EEMD by a ternary-decision algorithm. Every step has one kind of HR tracking method. With the purpose of avoiding spectrum peak loss to keep tracking valid, stage 3 will calibrate the value of HR.
The main contribution of this algorithm has two aspects: (1) The different signal processing methods with high estimation accuracy are proposed according to different PPG features instead of using the sequential execution mode for every sliding window. (2) The running time in real-time HR estimation is greatly reduced, due to reduced computational complexity.
This paper is organized as follows:
Section 2 discusses various conditions where there is very strong MAs so that HR estimation is difficult.
Section 3 introduces the proposed algorithm framework to deal with the problem mentioned above.
Section 4 illustrates the experimental results using 23 datasets and compares these with other framework algorithms in this field, with corresponding discussions. Moreover, all datasets used in this paper will be available at Zhilin Zhang’s homepage:
https://sites.google.com/site/researchbyzhang/ and the related paper is [
11].
2. Materials and Methods
In this section, the complete NFEEMD algorithm framework will be mentioned and the corresponding details of the proposed algorithm will be given. The overall flowchart of NFEEMD could be shown in
Figure 1.
The framework of HR estimation during intense physical exercise is presented in
Figure 1, then the signal-processing method in every case will be illustrated. For this algorithm, one-channel raw PPG and tri-axis acceleration data are needed. This framework includes two binary decisions and two ternary decisions.
Initialization: In the initialization stage, the system can enter into a stable heart-rate tracking state if HR can be estimated by choosing only one spectrum peak from the raw PPG periodogram. Before the stable heart-rate tracking state, the estimated HR values can be discarded, since the contaminated PPG with many spectral peaks may provide wrong prior HR information initially. It is known that the HR value cannot change rapidly in a short time [
16], therefore correct prior HR information is essential to maintain continuous HR tracking. Once stable tracking starts, the first HR value is obtained. Then the main algorithm uses the previous HR estimation
bpm_prev as its base. It should be noted that we identify the spectral peaks if they are larger than 30% of the maximum amplitude in the periodogram.
Pre-processing: It is necessary for raw PPG to be pre-processed in order to highlight the HR information. The repeated moving average filter [
3] is applied to obtain the signal component of 0.5–3 Hz. The pre-processing signal is used in two cases: (1) in the HR tracking when there are no MAs; (2) in the first ternary decision for finding the number of PPG spectral peaks when there are MAs.
Binary Decision 1: This step aims to determine if there are MAs through the amplitude of the acceleration periodogram. MAs exist when the spectral amplitude of every axis acceleration data is greater than 0.1 (power spectral density). Of course, there are other algorithms for determining the existence of MAs, such as the correlation coefficient (CC) [
17]. Through the comparison of three methods, the resulting judgement can be described in
Figure 2.
It is known that there must be motion artifacts in the running state. Moreover, PPG and MAs may overlap when MAs are very strong. From
Figure 2, it is obvious that our method is more suitable for the experiment than the CC algorithm.
Case 1: When MAs do not exist, an HR-estimation process is proposed. bpm_track (the final estimated HR value) depends on the highest spectral peak of pre-processed PPG. If |bpm_track − bpm_prev| > th_pass (a constant threshold), bpm_track is equal to bpm_prev.
HR Calibration 1: According to the calculation rules above, an HR-calibration mechanism is designed in order to prevent the correct spectral peak being lost due to over-reliance on the past heart rate values. If two consecutive heart-rate values are dependent on the past, the HR Calibration 1 algorithm will start when MAs do not exist. In this condition, bpm_track will be recalculated through the original PPG signal. Meanwhile, if |bpm_track − bpm_prev| th_pass, the value of bpm_track will be replaced by the bpm_prev’s.
Binary Decision 2: The purpose of this step is to determine whether the accelerometer signal is reliable and whether the frequency component associated with motion artifacts can be accurately provided during strenuous exercise. It is known that the spectrum of tri-axis acceleration data is too messy to accurately extract the MA component by exploring the acceleration data. If the number of spectral peaks in any axis accelerometer data exceeds th_acc_peaks (a constant), then we consider the acceleration data of this window to be unreliable.
Case 2: At this time, bpm_track will be replaced by past HR value.
After
Binary Decision 2, if the acceleration data is reliable, then we will remove the spectral peaks (absolute value less than or equal to 8) associated with the tri-axis acceleration from the spectral peaks of pre-processed PPG. The number of remaining spectral peak is then counted.
Ternary Decision 1 divides the situation into three categories: the remaining zero (the peak corresponding to HR cannot be detected), the remaining one, leaving more than one. For the second case, a
Ternary Decision 2 was designed by counting the peak number with a difference of
th_pass BPM compared to
bpm_prev. This ternary decision also divides the situation into three categories: the remaining zero, the remaining one, leaving more than one. As shown in
Figure 1, the same situation undergoes the same treatment.
Case 3: (the remaining zero): This is the main part of the algorithm used to remove motion artifacts. The repeated single notch filter and improved EEMD algorithm are adopted herein.
(1) Single notch filter
A second-order IIR filter is used to implement the single notch filter here, whose amplitude response is zero at a certain frequency. It is usually used to eliminate a particular frequency component, such as the removal of 50 Hz power-line interference. In this paper, the single notch filter is applied to the motion artifacts’ removal from the PPG signal spectrum peaks based on the tri-axis acceleration signal spectrum peaks. The system function of the single notch filter is defined in Equation (1) as follows:
where
is the notch digital frequency (rad);
is notch frequency (Hz);
is sampling frequency; and
r is a constant (
r = 0.96). The single notch filter can be applied when the MAs of the original signal exist after the binary decision. According to the tri-axis acceleration data, several frequencies corresponding to MAs are identified through the periodogram. Then, the related frequencies in the original PPG signal are removed, and the reconstructed, clean signal is obtained.
Figure 3 shows a flowchart depicting the removal of MAs to reconstruct the PPG signal for accurate HR evaluation. It should be noted that |
bpm_track_NF −
bpm_prev| >
th_NF_1 (a constant threshold) and |
bpm_track_EEMD −
bpm_prev| >
th_EEMD_1 after the
Ternary Decision 1, |
bpm_track_NF −
bpm_prev| >
th_NF_2 (a constant threshold) and |
bpm_track_EEMD −
bpm_prev| >
th_EEMD_2 after the
Ternary Decision 2.
th_NF and
th_EEMD are the empirical values obtained from the first 12 experiments.
It should be noted that the usage number of the repeated single notch filter is different in a different signal window. On the one hand, this number depends on all spectral peaks detected in the tri-axis acceleration signal spectrum. If there is a repeated value, it is recorded as one. On the other hand, if the number does not exceed three, then this number is the times the single notch filter is applied, and the frequencies of the notch are the detected peaks. If the number exceeds three, then a spectral peak value is found from the spectrum of each axis acceleration data; that is, the number of times the single notch filter is used is up to three. The purpose of limiting the number of application times is to retain the component corresponding to heart rate of the original signal as much as possible.
Figure 4 shows an example of case 3.
From
Figure 4, it is known that the final heart rate estimation after the NFEEMD algorithm is 164.8 BPM. The true HR value by ECG is 163.2 BPM. The absolute error is 1.6 BPM. In fact, we can find the only spectral peak, 172.1 BPM, in the periodogram of the original PPG signal. However, this value is quite different from the true HR. It is the effect of strong motion artifacts that results in a large error in HR estimation. It is worth noting that MAs are superimposed on the PPG signal closely in this example, so that the true HR information is covered by MAs. Although MAs conceal the true HR information, the tri-axis motion acceleration data accurately reflects the frequency components of motion artifacts. After removing the MAs by the single notch filter, the estimated HR value calculated from the final reconstructed signal is 164.8 BPM, which is very close to the true heart rate value. It is shown that using the periodogram directly may lead to a large error and lost HR tracking. This example also demonstrates that the repeated use of the single notch filter can accurately remove the MA components without affecting the nearby real HR components, and the HR-value resolution is high.
(2) Improved Ensemble Empirical Mode Decomposition
Empirical mode decomposition (EMD) is a kind of non-linear and non-stationary signal decomposition algorithm proposed by Professor Huang, which aims to decompose any complicated time-series signal into a number of “intrinsic mode functions” [
18]. However, its drawbacks are obvious. End effect and mode mixing are the primary factors that can limit the development of EMD. On the basis of EMD, Wu et al. proposed a noise-assisted data analysis method, Ensemble Empirical Mode Decomposition (EEMD) [
19]. In this algorithm, mirror continuation is used to remove end effect, and its key idea is that the mode-mixing phenomenon can be weakened by adding white Gaussian noise. Zhang et al. [
17] and Khan et al. [
20] have proven that EMD or the EEMD algorithm performance well in the analysis of biomedical signals.
In the EMD approach, the given original signal
is decomposed into the IMF components and a residue. Each IMF component should satisfy two conditions: (1) the number of extrema and the number of zero-crossings must either be the same or differ at most by 1; and (2) the mean value of the upper envelope and lower envelope must be zero at any positions in the signal. EMD uses a “sifting process” to obtain the local zero-mean until an IMF and residue are found. The “sifting process” includes several steps as described follows: (1) identifying all extremes of
; (2) interpolating the upper envelope of maxima and the lower envelop of minima; (3) computing the mean of upper and lower envelope
; (4) extracting the detail
and the residual; (5) iterating on the residual until
satisfies the IMF conditions and then
will be an IMF component; (6) repeating on the residual using sifting process (1)–(5) until the residual component is a monotonic function. The final decomposition result is described as follows:
The updated EEMD algorithm adds an ensemble of
Ne signals into the given signal
by adding white Gaussian noise
which has the same variance. The equation can be described as follows:
Then, the EMD algorithm is applied to each of the new signals with noise, and the decomposition results can be described as follows:
Then, the optimum choice of IMFs is obtained by computing the average of an ensemble of
Ne signals of the EMD algorithm:
The basic principle of EEMD based on noise-assisted analysis is that the time-frequency space is composed of different scale components divided by the filter group when the additional white noise is evenly distributed over the entire time-frequency space. Of course, EEMD cannot eliminate the effect of mode mixing completely due to the additional noise, especially when the parameter Ne is small.
To make the reconstructed signal derived from EEMD cleaner, singular spectrum analysis (SSA) [
11] is combined with EEMD to extract the usable reconstructed signal. SSA generates a trajectory matrix from the original signal by a sliding window of length L. The trajectory matrix is approximated using singular value decomposition (SVD). The last step is to reconstruct the series.
In this paper, SSA is applied to each IMF component after EEMD, and then the signal is reconstructed by only the first eigenvalue which corresponds to the most principal component over the whole signal. Finally, the only IMF whose frequency is closest to the previous heart rate frequency is chosen as the reconstructed signal.
Selecting which IMF belongs to the constructed signal is crucial. In this paper, the cycle of each IMF is first estimated. Then the IMF signal whose cycle is closest to the bpm_prev is chosen as the constructed signal.
In the example above, although the PPG signal is contaminated by MAs, the HR information is still obtained by the single notch filter. However, it is almost impossible to extract the HR information by the single notch filter when the PPG coincides with the motion artifacts (the spectral peak of the PPG is completely overlapped by the MA component.) (
Figure 5e,f). So EEMD is another proposed method in this case.
Figure 5 gives an example of HR estimation using EEMD instead of using the single notch filter. The final HR estimation value is 153.8 BPM, showing that EEMD is an effective method. It should be noted that the signal which has high signal quality in
Figure 5g is the reconstructed one after the EEMD algorithm. From
Figure 5, compared to the
bpm_prev, the estimated HR by EEMD is closer to the true value, which shows the effectiveness of EEMD. On the other hand, there is a high-frequency resolution of estimation using EEMD, for example,
Figure 6. The final estimated HR is 159.3 BPM, which is different from the spectral peak of the original PPG, 161.1 BPM.
Case 4: (the remaining one): The reason why this situation is carried out alone is that in this case the remaining frequency component is the most reliable. Therefore, if the absolute value of the heart-rate value corresponding to the remaining one and bpm_prev is less than th_pass, this value is considered acceptable. In other words, bpm_track = bpm_prev. According to experience, it is not necessary to use the above complex algorithm due to the low probability of it not being accepted in this case.
Case 5: (leaving more than one): In this circumstance, the heart rate (bpm_track) closest to the absolute value of the past is found. If |bpm_track − bpm_prev| < th_pass, then the value is accepted. Or bpm_track = bpm_prev.
HR Calibration 2: This is similar to the HR Calibration 1 mechanism. In fact, there is often a lack of tracking as the heart rate continues to rise. Therefore, HR calibration is employed mainly to solve this problem. On the one hand, the heart-rate value prediction of the current window is obtained through gray system prediction (GM (1, 1)) of the first 30 values when MAs exist. If the absolute value of the predicted value and the past value is greater than 12, then the next condition will be checked.
On the other hand, three consecutive heart-rate values are dependent on the past. As long as these two conditions occur, the HR Calibration 1 algorithm will start when MAs exist to avoid losing track of spectral peaks. Then, a new
bpm_track is calculated by the repeated single notch filter. If |
bpm_track −
bpm_prev| < 40, it is considered that
bpm_track can be accepted. In order to better verify the effectiveness of the HR Calibration 2 model, we have done a set of comparative experiments to perform our algorithm in conditions using both HR Calibration 2 and not using it.
Figure 7 illustrates the corresponding result, which shows that the HR-tracking trajectory (the blue curve) deviates from normal orbit when there is no calibration model. In contrast, the estimated HR (the black curve) can reflect the correct heart beat changes.
Additionally, the final output HR value of the current window is the average value of the previous windows by the cyclic moving average filter [
16]. By virtue of the moving average filter with different windows, the final output result will be smooth.
4. Discussion
From
Table 2, the NFEEMD algorithm performs better compared to the others. For the first 12 of 23 datasets, the average absolute error (AAE) is 1.12 + 0.51 (mean ± standard deviation) BPM, and AAE is 2.68 + 2.19 BPM for the remaining 11 datasets. For all 23 datasets, an average absolute error of 1.87 BPM and standard deviation of 1.79 BPM are recorded using the NFEEMD framework under intense physical activities.
It should be noted that the most obvious difference between the first 12 datasets and the last 11 datasets is the severity of motion. The activities of sample set T0 on the treadmill have a certain regularity, and the activities of sample set T1 and sample set T2 including arm movements are intense and random. In
Table 2, the average absolute error of the last 11 datasets (2.68 BPM) by using NFEEMD is significantly larger than the first 12 datasets (1.12 BPM). This result is consistent with the severity of the state of motion, so the intenser the movements, the larger the HR-estimation error obtained. Although the errors are a bit larger for the last 11 datasets, HR estimates do not get derailed, which indicates that the direction of the HR estimates is correct. If the HR estimates get derailed for some reason in the process of estimation, our HR-calibration section could work (
Figure 7, for example). As the results of comparisons in
Table 2 show, the NFEEMD algorithm could obtain the most accurate results on HR estimates for the last 11 datasets, as well as the second most accurate results on HR estimates for the first 12 datasets. Compared with the SPECTRAP algorithm, the accuracy of the NFEEMD algorithm is slightly lower than the SPECTRAP for 22 datasets (except dataset 13). In a word, the results in
Table 2 indicate that the NFEEMD algorithm can adapt to more intense circumstances like boxing in the last 11 recordings and our algorithm is more robust.
The Bland–Altman plot is given in
Figure 8 to test agreement between the ground-truth HR values and the estimation HR values, in order to show our algorithm’s better performance. From
Figure 8, we can see that more than 95% of estimated HR values are incorporated in the limit of agreement (LOA) expressed by
where the absolute value of mean
BPM and standard deviation
BPM. The difference error is large when the real HR values are between 50 and 80, which indicates that several HR estimations using NFEEMD do not work well when the subject is in a stationary state except for intense arm exercise like boxing. The algorithm works well when HR is increasing continually, which indicates that the MA-removal method and the heart-rate tracking method are valid when the subject is in a sustained exercise state like running.
In addition,
Figure 9 gives the scatter plot of 23 recordings between the ground-truth HR values and the estimated HR values, where the Pearson correlation coefficient is 0.992 and the fitted line is
(x is the ground-truth HR values and y is the estimated HR value). In
Figure 9, the high Pearson coefficient and the small absolute value of mean indicate the NFEEMD algorithm’s better performance.
In order to verify the performance of the NFEEMD algorithm on HR tracking during intense physical exercise, we applied the proposed algorithm along with the same set of parameters to a new dataset where the subject performed intensive running and upper arm movements. The sampling frequency is 100 Hz. The results are given in
Figure 10. It should be noted that the true HR values are obtained by the heart-rate monitor made by Decathlon and there would be errors in data obtained between this monitor and the ECG. From
Figure 10, the average absolute error is 3.72 BPM, which indicates that the proposed algorithm performs well during intensive physical activities and the algorithm could be available for PPG derived from other platforms.
To examine in more detail the performance of the NFEEMD algorithm with a change in sampling frequency, we experimented in 25 Hz sampling frequency using the same algorithm for one channel PPG and three-channel ACC. The corresponding AAE results for all datasets are listed in
Table 3, which demonstrates that the NFEEMD algorithm performs better in 125 Hz sampling frequency than in 25 Hz. In other words, more detailed information can be recorded at a high sampling frequency so that the HR-estimation accuracy can be improved.
Moreover, the signal sparsification technique through the M-FOCUSS in TROIKA and JOSS is applied on the HR estimation algorithm, which involves extensive computational complexity. For example, for the sampling frequency of 125 Hz, TROIKA takes about 3.5 h to estimate HR for the first 12 datasets on a computer equipped with Intel Core-i7 4790 at 3.6 GHz, 8-GB RAM, Windows 7 64 bit, and MATLAB 2013a. The algorithm proposed by Khan [
20] takes 668 s on the same computer. The NFEEMD algorithm takes 229 s for calculation of the first 12 datasets and 476 s for all 23 datasets using the same computer configuration. In addition, our proposed algorithm takes 86 s for the first 12 datasets and 191 s for all 23 datasets when the sampling frequency is 25 Hz. JOSS takes 300 s for all the datasets at 25 Hz sampling frequency. It is obvious that our algorithm has the advantage of low computational complexity and short running time.
Of course, our algorithm also needs to be improved. On the one hand,
Figure 8 shows that the difference error is large when the real HR values are between 50 and 80. The subjects tend to be in the rest state, or a critical state between rest and movement, when this circumstance occurs. In fact, HR estimation during running may be easier sometimes than in the rest or critical states, since in the latter process, there are large MA peaks and these peaks are actually scattered all over the spectrum. At this moment, there is no effective method to solve this problem. What is more, two-channel compared PPG signals can improve the algorithm performance and the correct HR peak can be determined with more confidence. On the other hand, high-frequency resolution and accuracy are obtained after using the repeated single notch filter advanced in this paper. Of course, an adaptive filter is also an option, although computation is increased. This research will be continued in future work.