Article

Subject-Independent Model for Reconstructing Electrocardiography Signals from Photoplethysmography Signals

1 School of Electronic Engineering and Automation, Guilin University of Electronic Technology, Guilin 541004, China
2 School of Life and Environmental Sciences, Guilin University of Electronic Technology, Guilin 541004, China
* Authors to whom correspondence should be addressed.
Appl. Sci. 2024, 14(13), 5773; https://doi.org/10.3390/app14135773
Submission received: 8 May 2024 / Revised: 24 June 2024 / Accepted: 26 June 2024 / Published: 2 July 2024
(This article belongs to the Special Issue Machine Learning Based Biomedical Signal Processing)

Abstract

Electrocardiography (ECG) is the gold standard for monitoring vital signs and for diagnosing, controlling, and preventing cardiovascular diseases (CVDs). However, ECG measurement requires active user participation and therefore cannot provide continuous cardiac monitoring. In contrast, photoplethysmography (PPG) devices do not require continued user involvement and can offer ongoing, long-term monitoring, although, from a medical perspective, ECG provides more information about the heart. Most existing work places signals recorded from the same subject in both the training and test sets. This study proposes a neural network model based on a 1D convolutional neural network (CNN) and a bidirectional long short-term memory (BiLSTM) network that reconstructs ECG signals directly from PPG signals: features learned by the CNN are fed into the BiLSTM. To verify its validity, the model is evaluated on the MIMIC II dataset in a completely subject-independent setting (each record appears only once, in either the training set or the test set, so every test signal belongs to a record that is not in the training set). In this setting, the Pearson's correlation coefficient between the reconstructed ECG and the reference ECG of the proposed model is 0.963, which is better than that of several cited state-of-the-art models. These results indicate that the trained model can produce reconstructed ECGs that are highly similar to the reference ECGs even for subjects it has never seen.

1. Introduction

Cardiovascular diseases (CVDs) are the leading cause of death worldwide. According to the statistics in [1], one person dies from CVDs every 37 s in the United States. The early diagnosis and detection of CVDs can help in treating and avoiding complications that can lead to death [2]. ECG is the gold standard for monitoring vital signs and for diagnosing, controlling, and preventing CVDs [3,4]. Studies have shown that young people, especially athletes, are more susceptible to sudden cardiac arrest than ever before [5]. Regular ECG monitoring facilitates the early detection of CVDs by measuring the electrical activity of the heart and conveying information about cardiac function [2]. However, most traditional ECG devices limit the user's physical activity. For instance, the Zio Patch is a wearable device worn on the chest, but prolonged use during multi-day monitoring may elevate the risk of skin sensitization in individuals with sensitive skin [6]. The Apple Watch, a popular wrist-based ECG monitoring device, can only measure signals for 30 s and requires active user participation, thereby hindering long-term continuous ECG monitoring. The Holter monitor records the electrical activity of the heart over time as a dynamic electrocardiogram (DCG); however, it uses a limited number of recording leads and therefore cannot reflect the condition of the entire heart, and because the patient remains active during recording, the quality of the electrocardiogram is affected to some extent.
PPG is a noninvasive signal that reflects the amount of blood pulsation in tissues [7]. The waveform shape (i.e., signal morphology), pulse interval, and amplitude characteristics of PPG provide crucial information about the cardiovascular system [8], including heart rate, heart rate variability [9], respiration [10], and blood pressure [11]. PPG devices are more accessible and less expensive to set up than ECGs, making them nearly ubiquitous in clinics and hospitals as finger or toe clips and oximeters. These devices do not require constant user intervention and offer continuous, long-term detection without causing skin irritation. Despite the frequent use of PPG for healthcare monitoring [12], ECG remains the standard and essential measure for medical diagnosis, supported by a large body of literature and research, and clinicians still rely on ECG rather than PPG in clinical diagnosis. However, the peak-to-peak intervals of PPG are highly correlated with the RR intervals of ECG [8], suggesting the possibility of deriving ECG from PPG. If ECG signals can be successfully reconstructed from the PPG signals acquired by today's wearable devices, this correlation could be exploited to reconstruct ECGs directly from PPGs, and machine learning would enable the clinical diagnosis of cardiovascular diseases at any time, in any location, and in real time.
Several studies have proposed using the discrete cosine transform (DCT) method [13], the cross-domain joint dictionary learning (XDJDL) method [14], the scattering wavelet transform (SWT) method [15], a lightweight neural network model [16], and deep learning models based on encoder–decoder structures [17] to reconstruct ECG signals beat by beat. The first three studies use mathematical methods to investigate the correlation between ECG and PPG. In [13], the authors propose a beat-based linear regression model; however, the relationship between ECG and PPG is not linear. In [14], the authors use dictionary learning to learn the features relating ECG and PPG, but the correlation coefficient on the BIDMC dataset [18] is only 0.82. In [15], the authors propose a beat-based nonlinear regression model. The last two studies use deep learning methods to explore the correlation between ECG and PPG. In [16], the authors propose a beat-based autoencoder model; it introduces a compressed version with fewer parameters, but the correlation between the reconstructed and actual ECG beats on the BIDMC dataset [18] is only about 0.89. In [17], the authors also propose a beat-based autoencoder model. These studies perform ECG reconstruction on a beat-by-beat basis, conduct period segmentation of the signal during data preprocessing, and use a peak-to-peak segmentation technique to minimize errors in PPG onset detection. However, the varying segment lengths require resampling of the beat data, which can introduce errors.
Several studies have proposed the P2E-WGAN [19], CardioGAN [20], UNet-BiLSTM [21], bidirectional long short-term memory (BiLSTM) [22], and PPG2ECGps [23] models to reconstruct fixed-length ECG signals. In these studies, the signal is segmented at a fixed length during data preprocessing to avoid the errors caused by beat segmentation. Both the P2E-WGAN [19] and CardioGAN [20] models are based on generative adversarial network (GAN) structures. The former is a conditional generative adversarial network that operates directly on one-dimensional time series to reconstruct ECG signals from PPGs; however, the correlation between the reconstructed and actual ECG on the MIMIC II dataset [24] is only 0.835. The latter uses an attention-based generator, dual time- and frequency-domain discriminators, and CycleGAN [25] to obtain ECG signals and to build a representation connecting medical knowledge to the PPG-to-ECG reconstruction task. However, the attention-based network increases the number of model parameters, making it unsuitable for wearable devices. The UNet-BiLSTM model [21] performs segmentation after aligning the signals during preprocessing; however, signal alignment requires peak detection, whose accuracy depends on the peak detection algorithm, so errors may be introduced in this process. The BiLSTM [22] and PPG2ECGps [23] models were proposed for specific subjects.
In [17], the authors categorize the training process as subject-dependent, partially subject-independent, or completely subject-independent according to the level of subject dependency. The subject-dependent case is a subject-specific model, i.e., a model designed for a single person. The partially and completely subject-independent cases are group models, differing in how the dataset is divided: partially subject-independent uses a portion of all subjects' records for training and another portion for testing, whereas completely subject-independent trains on one set of subjects and tests on an entirely different set of subjects. The UNet-BiLSTM model [21] was proposed to reconstruct ECG signals in the partially subject-independent setting; it only studies the performance of ECG reconstruction from PPG when the test and training sets come from the same subjects, not when they come from different subjects. This study proposes a neural network model called CNN-BiLSTM, based on a 1D convolutional neural network (CNN) and bidirectional long short-term memory (BiLSTM), for reconstructing ECGs from PPGs. This network is designed as a subject-independent rather than a subject-specific model, and its effectiveness is validated in the completely subject-independent setting. The aim is to verify the proposed model's performance without including any subject information, so that good ECG reconstructions can be obtained even when facing unknown subjects.

2. Materials and Methods

This section introduces the dataset used in this study, the preprocessing of the ECG and PPG signals, the architecture of the proposed CNN and BiLSTM hybrid model, and the model performance evaluation. Figure 1 shows the flowchart of the proposed CNN and BiLSTM hybrid model: Figure 1a shows the training and validation process, and Figure 1b shows the testing process.

2.1. Dataset

In this study, the data used to test the model were extracted from PhysioNet's Multiparameter Intelligent Monitoring in Intensive Care (MIMIC II) database [24]. This dataset contains 12,000 records of different lengths. Each record comprises three signals, namely arterial blood pressure (ABP), photoplethysmogram (PPG), and lead II electrocardiogram (ECG) signals, all sampled at 125 Hz. The present study analyzes only the PPG and ECG signals. First, records with a duration of at least 8 min were selected. Since the records vary in length and cannot be handled uniformly, only the first 8 min of data from each record were kept and the remainder was discarded. The first 600 of these 8 min records were used as our dataset.
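As a minimal sketch of this selection step (assuming each record has already been loaded as a pair of PPG and ECG arrays sampled at 125 Hz; the record-loading code is omitted and the `records` iterable is hypothetical), the first 600 records of at least 8 min could be gathered and truncated as follows:

```python
import numpy as np

FS = 125                   # sampling frequency of the MIMIC II waveforms (Hz)
MIN_LEN = 8 * 60 * FS      # 8 min = 60,000 samples

def select_records(records, n_records=600):
    """Keep the first n_records records that are at least 8 min long,
    truncated to exactly the first 8 min of PPG and ECG."""
    selected = []
    for ppg, ecg in records:   # each record: (ppg, ecg) 1-D arrays at 125 Hz
        if len(ppg) >= MIN_LEN and len(ecg) >= MIN_LEN:
            selected.append((np.asarray(ppg[:MIN_LEN]),
                             np.asarray(ecg[:MIN_LEN])))
        if len(selected) == n_records:
            break
    return selected
```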

2.2. Preprocessing

After data preprocessing of the ECG and PPG signals, paired ECG and PPG signal segments are obtained. Data preprocessing mainly includes four steps: filtering, normalization, segmentation, and data splitting (a code sketch of the first three steps follows the list).
  • Filtering. A fourth-order Chebyshev bandpass filter is applied to the PPG and ECG signals in the range of 0.5–10 Hz and 0.5–20 Hz, respectively.
  • Normalization. After filtering, the PPG signal was scaled to the range [0, 1]. Note that the amplitude of the ECG signal is meaningful, whereas the amplitude of the PPG signal is not; therefore, only the PPG signal is normalized, and the ECG signal is left unnormalized.
  • Segmentation. The filtered ECG and PPG signals were divided into 8 s segments. In [26], a segment length of 1000 samples was used; therefore, in this study, the signals are segmented into 8 s (1000-sample) segments.
  • Data splitting. The records were split in a completely subject-independent manner: 60% of the records were used as the training set, 20% as the validation set, and 20% as the test set.
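The sketch below illustrates the filtering, normalization, and segmentation steps listed above. The fourth-order Chebyshev bandpass filters and their cut-off frequencies follow the description in the list; the Chebyshev type (type I), the 1 dB passband ripple, and the use of zero-phase filtering (`filtfilt`) are assumptions, since they are not specified in the text.

```python
import numpy as np
from scipy.signal import cheby1, filtfilt

FS = 125  # sampling frequency (Hz)

def bandpass(x, low, high, order=4, ripple_db=1.0):
    # Fourth-order Chebyshev (type I) bandpass filter; the ripple value and
    # zero-phase filtering are assumptions not stated in the paper.
    b, a = cheby1(order, ripple_db, [low, high], btype="bandpass", fs=FS)
    return filtfilt(b, a, x)

def preprocess(ppg, ecg, seg_len=1000):
    ppg_f = bandpass(ppg, 0.5, 10.0)   # PPG passband: 0.5-10 Hz
    ecg_f = bandpass(ecg, 0.5, 20.0)   # ECG passband: 0.5-20 Hz
    # Min-max normalize only the PPG to [0, 1]; the ECG amplitude is kept.
    ppg_f = (ppg_f - ppg_f.min()) / (ppg_f.max() - ppg_f.min() + 1e-12)
    # Split into non-overlapping 8 s (1000-sample) segments.
    n_seg = len(ppg_f) // seg_len
    ppg_segs = ppg_f[:n_seg * seg_len].reshape(n_seg, seg_len)
    ecg_segs = ecg_f[:n_seg * seg_len].reshape(n_seg, seg_len)
    return ppg_segs, ecg_segs
```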

2.3. Model Architecture

This study proposes a neural network model called CNN-BiLSTM for reconstructing ECG signals, as illustrated in Figure 2. In Figure 2, "Conv1D" represents a one-dimensional convolutional layer; "ReLU" is the activation function used in the corresponding convolutional layer; "Maxpooling" refers to a one-dimensional maximum pooling layer; "BiLSTM" refers to the bidirectional long short-term memory layer; and "Fully Connected" represents the fully connected layer.
As shown in Figure 2, our proposed CNN-BiLSTM model combines a one-dimensional convolutional neural network (CNN) structure with a bidirectional long short-term memory (BiLSTM) network. The model consists of an input layer, a CNN block, a BiLSTM block, a fully connected block, and an output layer. The input layer takes the PPG signal segment. The CNN block contains two convolutional layers, each followed by a ReLU activation and a max-pooling layer. CNN is the most popular technique in deep learning [27]. Also known as a feature learner, it can automatically extract useful features from the raw input data. A CNN is generally composed of an input layer, convolutional layers, pooling layers, fully connected layers, and an output layer. A convolutional layer convolves the input signal with convolution kernels and then applies an activation function to introduce nonlinearity, enabling the network to learn complex features. A pooling layer reduces computational complexity by shrinking the feature map and helps retain the most important features. CNNs usually stack multiple convolutional and pooling layers to extract higher-level features. The advantage of the CNN is its ability to learn ECG-related features from PPG signals. The kernel size, stride, and padding of the convolutional layers were 3, 1, and 1, respectively. The BiLSTM block contains two BiLSTM layers. BiLSTM was chosen because it can effectively handle sequential and time series problems [28,29]; long short-term memory (LSTM) and BiLSTM networks are well suited to such problems, and although BiLSTM models take longer to converge than LSTM models, they provide better predictions [30]. The fully connected block consists of two fully connected layers, and the output layer produces the reconstructed ECG signal segment. We combined CNN and BiLSTM so that the CNN extracts local features while the BiLSTM captures the global features of the sequence.
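A PyTorch sketch of this architecture is given below. The kernel size, stride, and padding of 3, 1, and 1 and the overall layer ordering follow the description above; the channel counts, pooling factors, BiLSTM hidden size, and fully connected layer widths are illustrative assumptions, as they are not reported.

```python
import torch
import torch.nn as nn

class CNNBiLSTM(nn.Module):
    """Maps a 1000-sample (8 s) PPG segment to a 1000-sample ECG segment."""

    def __init__(self, seg_len=1000, hidden=128):
        super().__init__()
        self.cnn = nn.Sequential(
            nn.Conv1d(1, 32, kernel_size=3, stride=1, padding=1),
            nn.ReLU(),
            nn.MaxPool1d(2),                       # 1000 -> 500
            nn.Conv1d(32, 64, kernel_size=3, stride=1, padding=1),
            nn.ReLU(),
            nn.MaxPool1d(2),                       # 500 -> 250
        )
        self.bilstm = nn.LSTM(input_size=64, hidden_size=hidden,
                              num_layers=2, batch_first=True,
                              bidirectional=True)
        self.fc = nn.Sequential(                   # two fully connected layers
            nn.Linear(250 * 2 * hidden, 512),
            nn.ReLU(),
            nn.Linear(512, seg_len),
        )

    def forward(self, ppg):                        # ppg: (batch, 1000)
        x = self.cnn(ppg.unsqueeze(1))             # (batch, 64, 250)
        x, _ = self.bilstm(x.permute(0, 2, 1))     # (batch, 250, 2*hidden)
        return self.fc(x.flatten(1))               # (batch, 1000)
```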

2.4. Training Options

This study used the PyTorch 2.0.0 deep learning framework for training and testing the proposed model, with all code implemented in Python 3.9.16. The model was optimized with the Adam optimizer; the initial learning rate was set to 0.001 and decayed by a factor of 0.1 every 50 steps. The neural network was trained for 500 epochs with a batch size of 256 pairs of ECG and PPG segments across all records. Training ran on an NVIDIA GeForce RTX 3060 Ti GPU (Santa Clara, CA, USA) and an Intel Core CPU (Santa Clara, CA, USA).
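A sketch of these training options is shown below; `train_loader` (a DataLoader yielding paired PPG/ECG segment batches of size 256) is assumed, and `reconstruction_loss` refers to the loss sketched after the equations below. Whether the 50-step decay counts epochs or optimizer steps is not stated; it is interpreted here as every 50 epochs.

```python
import torch

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
model = CNNBiLSTM().to(device)                 # architecture sketch above
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
# Decay the learning rate by a factor of 0.1 every 50 steps (epochs here).
scheduler = torch.optim.lr_scheduler.StepLR(optimizer, step_size=50, gamma=0.1)

for epoch in range(500):                       # 500 training epochs
    for ppg_batch, ecg_batch in train_loader:  # batches of 256 segment pairs
        ppg_batch, ecg_batch = ppg_batch.to(device), ecg_batch.to(device)
        optimizer.zero_grad()
        loss = reconstruction_loss(model(ppg_batch), ecg_batch)
        loss.backward()
        optimizer.step()
    scheduler.step()
```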
To make the reconstructed ECG more similar to the reference ECG, the loss function used in this study is defined as follows:
$$\mathrm{Loss} = \mathrm{mse} + (1 - |r|) + \mathrm{mal}$$

where

$$\mathrm{mse} = \frac{1}{l}\sum_{i=1}^{l}\left(E(i) - E_r(i)\right)^2, \quad
\mathrm{mal} = \max_{1 \le i \le l}\left|E(i) - E_r(i)\right|, \quad
r = \frac{\sum_{i=1}^{l}\left(E(i) - \bar{E}\right)\left(E_r(i) - \bar{E}_r\right)}{\sqrt{\sum_{i=1}^{l}\left(E(i) - \bar{E}\right)^2}\,\sqrt{\sum_{i=1}^{l}\left(E_r(i) - \bar{E}_r\right)^2}}$$
where mal, mse, and r refer to the maximum absolute loss (MAL), the mean square error, and Pearson's correlation coefficient, respectively. E(i) and E_r(i) are the i-th samples of the reference ECG and the reconstructed ECG, respectively, and l is the number of samples in the reference ECG. r is often used to measure the linear correlation between signals; r and mse ensure the global similarity between the two signals, while mal encourages a closer match between the R wave of the reconstructed ECG and the R wave of the reference ECG. The loss therefore combines r, mse, and mal so that the R waves match well while global similarity is preserved, giving the reconstructed and reference ECG signals higher overall similarity.
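A direct PyTorch implementation of this loss might look as follows; computing the terms per segment and averaging over the batch is an assumption.

```python
import torch

def reconstruction_loss(ecg_hat, ecg_ref, eps=1e-8):
    """Loss = mse + (1 - |r|) + mal, per segment, averaged over the batch."""
    mse = ((ecg_hat - ecg_ref) ** 2).mean(dim=1)            # mean square error
    mal = (ecg_hat - ecg_ref).abs().max(dim=1).values       # max absolute loss
    xc = ecg_hat - ecg_hat.mean(dim=1, keepdim=True)        # centered signals
    yc = ecg_ref - ecg_ref.mean(dim=1, keepdim=True)
    r = (xc * yc).sum(dim=1) / (xc.norm(dim=1) * yc.norm(dim=1) + eps)
    return (mse + (1.0 - r.abs()) + mal).mean()
```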

2.5. Stitching the Reconstructed ECG Segments and Cross-Correlation Alignment

  • Stitching the reconstructed ECG segments. The model outputs reconstructed ECG segments of 1000 samples (8 s) each, so they must be spliced together to form a continuous reconstructed ECG signal. When combining two ECG segments, the second segment is appended after the first; the spliced signal then serves as the first segment and the next segment is appended in turn. This step is repeated until all test segments of a record are connected, yielding a completely reconstructed ECG signal. In this study, 600 records of 8 min each were selected and split into 360 training records, 120 validation records, and 120 test records. Segmenting each 8 min record into 8 s windows yields 60 segments per record; during splicing, the sixty 8 s reconstructed segments of each record are joined back into an 8 min signal, giving 120 reconstructed 8 min ECG signals in total.
  • Cross-correlation alignment. Cross-correlation alignment is applied to the spliced reconstructed ECG signal and the reference ECG signal to reduce the offset between them and to better assess their similarity. In this study, cross-correlation alignment was applied between each spliced reconstructed ECG signal and its reference ECG signal (a sketch of this alignment step follows the list).
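The alignment step can be sketched as follows, assuming the spliced reconstructed and reference signals are NumPy arrays of equal length; applying the lag with a circular shift is an implementation choice.

```python
import numpy as np

def align_by_xcorr(ecg_rec, ecg_ref):
    """Shift the reconstructed ECG by the lag that maximizes its
    cross-correlation with the reference ECG."""
    xcorr = np.correlate(ecg_ref, ecg_rec, mode="full")
    lag = int(np.argmax(xcorr)) - (len(ecg_rec) - 1)   # best lag in samples
    return np.roll(ecg_rec, lag), lag

# Usage: ecg_aligned, lag = align_by_xcorr(ecg_reconstructed, ecg_reference)
```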

2.6. Performance Evaluation

To evaluate the performance of the proposed model, we use the same three metrics as in [21] on the test set: the Pearson's correlation coefficient (r), the root mean square error (RMSE), and the percentage root mean square difference (PRD) between the reconstructed ECG signal and the reference ECG signal.
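The three metrics can be computed per record as sketched below. The PRD formula shown uses the common definition normalized by the energy of the reference signal, which may differ slightly from the exact normalization used in [21].

```python
import numpy as np

def evaluate(ecg_ref, ecg_rec):
    """Pearson's r, RMSE, and PRD between reference and reconstructed ECG."""
    r = np.corrcoef(ecg_ref, ecg_rec)[0, 1]
    rmse = np.sqrt(np.mean((ecg_ref - ecg_rec) ** 2))
    prd = 100.0 * np.sqrt(np.sum((ecg_ref - ecg_rec) ** 2) / np.sum(ecg_ref ** 2))
    return r, rmse, prd
```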

3. Results

To evaluate the performance of the CNN-BiLSTM model on the MIMIC II dataset, this section reports the experimental results of the proposed model under the completely subject-independent scheme, i.e., without including any subject information. Figure 3 shows the PPG signal input to the model; here, a 3 s segment of the input PPG signal is shown.
Figure 4 shows a comparison between the reference ECG and the reconstructed ECG signals. In Figure 4, the blue line represents the actual ECG signal (reference ECG), and the red line represents the reconstructed ECG signal. Figure 4a shows the ECG signal reconstructed by the proposed model and the reference ECG signal. Figure 4b shows the same signals after cross-correlation alignment. Without cross-correlation alignment, the r, RMSE, and PRD between the reference and reconstructed ECG signals are 0.996, 0.027, and 1.745, respectively. After cross-correlation alignment, the r value becomes 0.995, and the other values remain unchanged. As can be seen from Figure 4, the ECG reconstructed by the proposed model is very similar to the reference ECG, and applying cross-correlation alignment to the reconstructed and reference ECG signals changes the results very little.
To verify the performance of the proposed CNN and BiLSTM hybrid model, this study compares its experimental results with those of a CNN model. Figure 5 shows how the r, RMSE, and PRD values between the reconstructed and reference ECG signals vary for the CNN model and the CNN-BiLSTM model, both with and without cross-correlation alignment, giving four experiments in total. The test set contains 120 ECG signals of 8 min each, and Figure 5 shows the performance evaluation of these 120 records in the four experiments. In Figure 5, the blue solid line shows the performance of the CNN model; the red solid line shows the performance of the proposed CNN and BiLSTM hybrid model; the purple dotted line shows the performance of the CNN model after cross-correlation alignment of the reconstructed and reference ECG signals; and the green dotted line shows the performance of the proposed hybrid model after cross-correlation alignment.
Figure 5a shows the r values between the 120 reference ECG signals in the test set and the ECG signals reconstructed by each model. For some records the r value of the proposed CNN and BiLSTM hybrid model is higher than that of the CNN model, for some it is lower, and for others it is the same; overall, the r value of the proposed hybrid model is higher than that of the CNN model. The values obtained with cross-correlation alignment do not differ much. Figure 5b,c show the corresponding RMSE and PRD values, respectively. Comparing Figure 5b,c shows that the RMSE and PRD values between the reconstructed and reference ECG signals are lower for the proposed hybrid model than for the CNN model. Overall, Figure 5 shows that the proposed CNN and BiLSTM hybrid model performs better than the CNN model.
Table 1 presents the numerical results under the completely subject-independent scheme. Without cross-correlation alignment, the r, RMSE, and PRD between the reference ECG and the ECG reconstructed by the proposed CNN and BiLSTM hybrid model were 0.963, 0.119, and 7.885, respectively. After cross-correlation alignment, the r value increases to 0.965, the PRD value decreases to 7.867, and the RMSE value remains unchanged. For the CNN model without alignment, the r, RMSE, and PRD were 0.953, 0.237, and 12.884, respectively; after alignment, the r value increases to 0.954 while the RMSE and PRD values remain unchanged. Table 1 shows that the proposed CNN and BiLSTM hybrid model performs better than the CNN model, and that aligning the reconstructed and reference ECG signals with cross-correlation changes the performance very little. Therefore, in this study, cross-correlation alignment has almost no effect on the model performance.

4. Discussion

This section discusses the experimental results of the proposed CNN and BiLSTM hybrid model under the completely subject-independent scheme and the limitations of this study. Table 2 compares the results of this study and other studies under the completely subject-independent scheme.
We compare our results with the existing studies we were able to identify. The deep learning model in [17] reconstructs ECG signals beat by beat and optionally uses data augmentation; its correlation coefficients without and with data augmentation are 0.846 and 0.908, respectively. The correlation coefficient, RMSE, and PRD of the CNN-BiLSTM model proposed in this study are 0.963, 0.119, and 7.885, respectively. In [17], data augmentation was found to improve model performance; this study did not use data augmentation, yet its performance exceeded that of the augmented model in [17]. Compared with that deep learning model, the model proposed in this study therefore improves performance to a certain extent.
In this study, we propose a model named CNN-BiLSTM to reconstruct ECG signals. Although the CNN-BiLSTM model shows significant advantages compared with models in previous studies, it also has certain limitations.
  • The model proposed in this study was only verified on the MIMIC II dataset, and was not verified on multiple different datasets. Validation against multiple datasets is necessary in subsequent studies.
  • An ECG signal has 12 leads, but we only used the signal from lead II and did not analyze the ECG signals of multiple leads. The model proposed in this study may not reconstruct the ECG signals of other leads from the PPG signal. In subsequent research, we will study multi-lead signals and reconstruct ECG signals of more leads from PPG.
  • This study focused only on the properties of the complete ECG waveform and did not examine other characteristics, such as R waves and ST segments. A more comprehensive evaluation of the differences between reconstructed and reference ECG features is warranted in follow-up studies.
  • The PPG signal needed to reconstruct the ECG signal in this study is the preprocessed signal. The preprocessed PPG signal is a relatively clean signal. In future studies, ECG signals will be reconstructed from noisy PPG signals.

5. Conclusions

This study introduces an innovative hybrid model that integrates a convolutional neural network (CNN) with a bidirectional long short-term memory (BiLSTM) network to reconstruct ECG signals from PPG signals. The proposed method generates ECG segments of the same length from 8 s PPG signals. To verify the model's effectiveness, this study compares the performance of the proposed CNN-BiLSTM model with that of a CNN-only model, i.e., before and after adding the BiLSTM. The average Pearson's correlation coefficient between the reconstructed ECG and the reference ECG on the MIMIC II dataset in the completely subject-independent setting is 0.963. The experimental results indicate that the model can effectively reconstruct ECG signals from PPG signals and that the reconstructed ECG signals are similar to the reference ECG signals.

Author Contributions

Z.C. designed the study. Y.G., Q.T., S.L. and Z.C. conceived the study, provided directions and feedback, and revised the manuscript. Y.G. drafted the manuscript for submission with revisions and feedback from the contributing authors. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by the project supported by the Joint Funds of the National Natural Science Foundation of China (U22A2092), the National Major Scientific Research Instrument and Equipment Development Project (61627807), the Guangxi Science and Technology Major Special Project (2019AA12005), and the Innovation Project of GUET Graduate Education (Grant No. 2022YCXB08).

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

The data can be downloaded via the following link: https://archive.ics.uci.edu/dataset/340/cuff+less+blood+pressure+estimation (accessed on 10 January 2021). The source code will be made available upon request.

Conflicts of Interest

The authors declare no conflicts of interest.

Abbreviations

The following abbreviations are used in this manuscript:
ABP      arterial blood pressure
BiLSTM   bidirectional long short-term memory
CNN      convolutional neural network
CVD      cardiovascular disease
DCT      discrete cosine transform
ECG      electrocardiogram
GAN      generative adversarial network
MIMIC    Multiparameter Intelligent Monitoring in Intensive Care
r        Pearson's correlation coefficient
PPG      photoplethysmography
PRD      percentage root mean square difference
RMSE     root mean square error
SWT      scattering wavelet transform
WHO      World Health Organization
XDJDL    cross-domain joint dictionary learning

References

  1. Heron, M. Heart disease facts. Nat. Vital Statist. Rep. 2019, 68, 1–77. [Google Scholar]
  2. Rosiek, A.; Leksowski, K. The risk factors and prevention of cardiovascular disease: The importance of electrocardiogram in the diagnosis and treatment of acute coronary syndrome. Ther. Clin. Risk Manag. 2016, 12, 1223. [Google Scholar] [CrossRef] [PubMed]
  3. Kligfield, P.; Gettes, L.S.; Bailey, J.J.; Childers, R.; Deal, B.J.; Hancock, E.W.; van Herpen, G.; Kors, J.A.; Macfarlane, P.; Mirvis, D.M.; et al. Recommendations for the standardization and interpretation of the electrocardiogram: Part I: The electrocardiogram and its technology a scientific statement from the American Heart Association Electrocardiography and Arrhythmias Committee, Council on Clinical Cardiology; the American College of Cardiology Foundation; and the Heart Rhythm Society endorsed by the International Society for Computerized Electrocardiology. J. Am. Coll. Cardiol. 2007, 49, 1109–1127. [Google Scholar] [PubMed]
  4. Le, T.; Clark, I.; Fortunato, J.; Sharma, M.; Xu, X.; Hsiai, T.K.; Cao, H. Electrocardiogram: Acquisition and Analysis for Biological Investigations and Health Monitoring. In Interfacing Bioelectronics and Biomedical Sensing; Springer: Cham, Switzerland, 2020; pp. 117–142. [Google Scholar]
  5. Mayo Clinic. Sudden Death in Young People: Heart Problems Often Blamed. Available online: https://www.mayoclinic.org/diseases-conditions/sudden-cardiac-arrest/indepth/sudden-death/art-20047571 (accessed on 10 January 2024).
  6. Wang, D.; Yang, X.; Liu, X.; Jing, J.; Fang, S. Detail-preserving pulse wave extraction from facial videos using consumer-level camera. Biomed. Opt. Express 2020, 11, 1876–1891. [Google Scholar] [CrossRef] [PubMed]
  7. Reisner, A.; Shaltis, P.A.; McCombie, D.; Asada, H.H. Utility of the photoplethysmogram in circulatory monitoring. Anesth. Amer. Soc. Anesthesiol. 2008, 108, 950–958. [Google Scholar] [CrossRef] [PubMed]
  8. Allen, J. Photoplethysmography and Its Application in Clinical Physiological Measurement. Physiol. Meas. 2007, 28, R1–R39. [Google Scholar] [CrossRef] [PubMed]
  9. Gil, E.; Orini, M.; Bailon, R.; Vergara, J.M.; Mainardi, L.; Laguna, P. Photoplethysmography Pulse Rate Variability as a Surrogate Measurement of Heart Rate Variability During Non-stationary Conditions. Physiol. Meas. 2010, 31, 1271–1290. [Google Scholar] [CrossRef] [PubMed]
  10. Johansson, A. Neural Network for Photoplethysmographic Respiratory Rate Monitoring. Med. Biol. Eng. Comput. 2003, 41, 242–248. [Google Scholar] [CrossRef] [PubMed]
  11. Chua, E.C.-P.; Redmond, S.J.; McDarby, G.; Heneghan, C. Towards Using Photo-plethysmogram Amplitude to Measure Blood Pressure during Sleep. Ann. Biomed. Eng. 2010, 38, 945–954. [Google Scholar] [CrossRef] [PubMed]
  12. Castaneda, D.; Esparza, A.; Ghamari, M.; Soltanpur, C.; Nazeran, H. A review on wearable photoplethysmography sensors and their potential future applications in health care. Int. J. Biosens. Bioelectron. 2018, 4, 195–202. [Google Scholar] [PubMed]
  13. Zhu, Q.; Tian, X.; Wong, C.W.; Wu, M. Learning your heart actions from pulse: ECG waveform reconstruction from PPG. IEEE Internet Things J. 2021, 8, 16734–16748. [Google Scholar] [CrossRef]
  14. Tian, X.; Zhu, Q.; Li, Y.; Wu, M. Cross-domain joint dictionary learning for ECG reconstruction from PPG. In Proceedings of the 2020 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Barcelona, Spain, 4–8 May 2020. [Google Scholar]
  15. Omer, O.A.; Salah, M.; Hassan, A.M.; Mubarak, A.S. Beat-by-Beat ECG Monitoring from Photoplythmography Based on Scattering Wavelet Transform. Trait. Signal 2022, 39, 1483–1488. [Google Scholar] [CrossRef]
  16. Li, Y.; Tian, X.; Zhu, Q.; Wu, M. Inferring ECG from PPG for Continuous Cardiac Monitoring Using Lightweight Neural Network. arXiv 2023, arXiv:2012.04949. [Google Scholar]
  17. Abdelgaber, K.M.; Salah, M.; Omer, O.A.; Farghal, A.E.A.; Mubarak, A.S. Subject-Independent per Beat PPG to Single-Lead ECG Mapping. Information 2023, 14, 377. [Google Scholar] [CrossRef]
  18. Pimentel, M.A.; Johnson, A.E.; Charlton, P.; Birrenkott, D.; Watkinson, P.J.; Tarassenko, L.; Clifton, D.A. Toward a robust estimation of respiratory rate from pulse oximeters. IEEE Trans. Biomed. Eng. 2016, 64, 1914–1923. [Google Scholar] [CrossRef] [PubMed]
  19. Vo, K.; Naeini, E.K.; Naderi, A.; Jilani, D.; Rahmani, A.M.; Dutt, N.; Cao, H. P2E-WGAN: ECG waveform synthesis from PPG with conditional wasserstein generative adversarial networks. In Proceedings of the 36th Annual ACM Symposium on Applied Computing, Virtual Event, 22–26 March 2021; pp. 1030–1036. [Google Scholar]
  20. Sarkar, P.; Etemad, A. CardioGAN: Attentive Generative Adversarial Network with Dual Discriminators for Synthesis of ECG from PPG. In Proceedings of the AAAI Conference on Artificial Intelligence, Delhi, India, 2–9 February 2021; Volume 35, pp. 488–496. [Google Scholar]
  21. Guo, Y.; Tang, Q.; Chen, Z.; Li, S. UNet-BiLSTM: A Deep Learning Method for Reconstructing Electrocardiography from Photoplethysmography. Electronics 2024, 13, 1869. [Google Scholar] [CrossRef]
  22. Tang, Q.; Chen, Z.; Guo, Y.; Liang, Y.; Ward, R.; Menon, C.; Elgendi, M. Robust reconstruction of electrocardiogram using photoplethysmography: A subject-based Model. Front. Physiol. 2022, 13, 859763. [Google Scholar] [CrossRef] [PubMed]
  23. Tang, Q.; Chen, Z.; Ward, R.; Menon, C.; Elgendi, M. PPG2ECGps: An End-to-End Subject-Specific Deep Neural Network Model for Electrocardiogram Reconstruction from Photoplethysmography Signals without Pulse Arrival Time Adjustments. Bioengineering 2023, 10, 630. [Google Scholar] [CrossRef] [PubMed]
  24. Saeed, M.; Lieu, C.; Raber, G.; Mark, R.G. (Eds.) MIMIC II: A massive temporal ICU patient database to support research in intelligent patient monitoring. In Proceedings of the Computers in Cardiology, Memphis, TN, USA, 22–25 September 2002. [Google Scholar]
  25. Zhu, J.-Y.; Park, T.; Isola, P.; Efros, A.A. (Eds.) Unpaired image-to-image translation using cycle-consistent adversarial networks. In Proceedings of the IEEE International Conference on Computer Vision, Venice, Italy, 27–29 October 2017. [Google Scholar]
  26. Mai, Y.; Chen, Z.; Yu, B.; Li, Y.; Pang, Z.; Zhang, H. Non-Contact Heartbeat Detection Based on Ballistocardiogram Using UNet and Bidirectional Long Short-Term Memory. IEEE J. Biomed. Health Informatics 2022, 26, 3720–3730. [Google Scholar] [CrossRef] [PubMed]
  27. Tsinalis, O.; Matthews, P.M.; Guo, Y.; Zafeiriou, S. Automatic Sleep Stage Scoring with Single-Channel EEG Using Convolutional Neural Networks; Cornell University: Ithaca, NY, USA, 2016. [Google Scholar]
  28. Sakib, M.A.M.; Sharif, O.; Hoque, M.M. Offline Bengali Handwritten Sentence Recognition Using BiLSTM and CTC Networks. In Internet of Things and Connected Technologies, Proceedings of the 5th International Conference on Internet of Things and Connected Technologies (ICIoTCT), Patna, India, 3–5 July 2020; Springer: Cham, Switzerland, 2020. [Google Scholar]
  29. Wang, Q.; Feng, C.; Xu, Y.; Zhong, H.; Sheng, V.S. A Novel PrivacyPreserving Speech Recognition Framework Using Bidirectional LSTM. J. Cloud Comput. 2020, 9, 36. [Google Scholar] [CrossRef]
  30. Siami-Namini, S.; Tavakoli, N.; Namin, A.S. The Performance of LSTM and BiLSTM in Forecasting Time Series. In Proceedings of the 2019 IEEE International Conference on Big Data (Big Data), Los Angeles, CA, USA, 9–12 December 2019. [Google Scholar] [CrossRef]
Figure 1. Flowchart of the proposed CNN and BiLSTM hybrid model. (a) Flowchart of the training and validation process. (b) Flowchart of the testing process.
Figure 2. The architecture of the proposed CNN-BiLSTM model.
Figure 3. Model input: PPG signal.
Figure 4. ECG signal reconstructed from PPG signal experimental results. The abbreviations r, RMSE, and PRD are the Pearson’s correlation coefficient, root mean square error, and percentage root mean square difference, respectively. (a) Comparison of the reconstructed ECG signal and reference ECG signal. (b) Comparison of the reconstructed ECG signal and reference ECG signal with cross-correlation alignment.
Figure 5. ECG reconstruction performance of the proposed CNN and BiLSTM hybrid model and the CNN model. The statistics of (a) Pearson’s correlation coefficient r, (b) the root mean square error (RMSE), and (c) the percentage root mean square difference (PRD).
Table 1. Comparison of the CNN-BiLSTM and CNN model’s performance. Note: NR stands for not reported. r, RMSE, and PRD represent Pearson’s correlation coefficient, root mean square error, and percentage root mean square difference, respectively. E and E r represent the ECG signal and the reconstructed ECG signal, respectively.
Model         Align E with E_r    r                RMSE             PRD
CNN           No                  0.953 ± 0.046    0.237 ± 0.077    12.884 ± 3.974
CNN           Yes                 0.954 ± 0.044    0.237 ± 0.077    12.884 ± 3.974
CNN-BiLSTM    No                  0.963 ± 0.067    0.119 ± 0.085    7.885 ± 5.637
CNN-BiLSTM    Yes                 0.965 ± 0.063    0.119 ± 0.085    7.867 ± 5.638
Table 2. Results of the completely subject-independent scheme. Note: NR stands for not reported. r, RMSE, FD, and PRD represent Pearson’s correlation coefficient, root mean square error, Fréchet distance, and percentage root mean square difference, respectively.
Method                     Data             Segment Length    Data Augmentation    r        RMSE     PRD
Deep learning [17]         MIMIC II [24]    Beat              No                   0.846    NR       NR
Deep learning [17]         MIMIC II [24]    Beat              Yes                  0.908    NR       NR
CNN-BiLSTM (this study)    MIMIC II [24]    8 s               No                   0.963    0.119    7.885
