1. Introduction
Work pressure and fast-paced lifestyles contribute to a variety of health problems, especially cardiovascular disease (CVD), which can cause great damage to the heart and brain [1]. Once CVD occurs, there is a high probability of severe cerebral hemorrhage or ischemia, which seriously threatens the patient's health. The World Health Organization (WHO) considers CVD the leading cause of death worldwide, as it accounted for 31% of the global deaths reported in 2018 [2]. Blood pressure (BP), one of the main characteristics of the cardiovascular system, is a main basis for clinical diagnosis and treatment. Keeping BP within a reasonable range is a primary goal of daily health management [3,4]. Therefore, measuring BP and predicting its trend effectively has become key to preventing CVD [5].
BP measurement methods can be divided into two types: invasive and non-invasive [6]. The invasive method refers to arterial puncture, in which the blood vessel is punctured or cut and the high-precision sensor of the measuring device is placed directly into the vessel [7,8]. Although the invasive method is the most accurate, it requires a high level of professional skill and is not suitable for daily health monitoring. Non-invasive methods use various sensing technologies and signal processing methods to measure blood pressure indirectly, including the auscultatory method [9], the oscillometric method [10], and the pulse wave based method [11,12]. Among them, the pulse wave based method has attracted wide research interest because the signal is easy to collect and the measurement results are accurate.
The pulse wave (PW), the regular pulsation of the arteries, is the cyclical change in pressure caused by the contraction and relaxation of the ventricles [13]. Because the PW signal carries important information, such as blood pressure and blood oxygen, several studies have focused on PW-based BP measurement. In 1984, Tanaka et al. [14] first proposed that BP could be measured by pulse wave velocity (PWV). PWV refers to the propagation speed of pulse waves along the arterial vessel wall, which is affected by factors such as the elasticity and thickness of the vessel wall. Furthermore, PWV can be calculated from the pulse wave transit time (PWTT) and has a positive correlation with BP. Accordingly, Wibmer et al. [15] used PWTT to calculate the BP value indirectly. However, current PWTT methods based on wearable devices still lack clinical accuracy. Thus, a novel non-invasive BP monitoring method called CNAP2GO was proposed in [16], which has become a breakthrough for wearable sensors for BP monitoring in clinical settings. Characteristics can also be extracted from the PW, and models can then be established based on the correlation between these characteristics and BP for dynamic BP measurement [17,18,19].
However, current BP measurement methods require the PW data to be pre-processed, which is cumbersome and requires medical knowledge. Owing to the rapid development of artificial intelligence (AI), machine learning has become a potential solution for BP measurement and prediction, but the features still need to be extracted manually before they are input into the neural network. Therefore, we consider introducing Convolutional Neural Networks (CNN) for feature extraction. CNN is commonly used in pattern recognition [20,21], natural language processing [22,23] and computer vision [24,25]. Furthermore, since PW is a time sequence, we can introduce the Long Short-Term Memory (LSTM) network to further process the features. LSTM is a specific type of Recurrent Neural Network (RNN) that can alleviate the exploding gradient problem and the vanishing gradient problem when processing long sequences [26,27]. LSTM is widely used in speech recognition [28,29] and time sequence forecasting [30,31]. Some researchers combine the advantages of CNN and LSTM to design models for sentiment analysis [32], time sequence forecasting [33] and other fields. In particular, several kinds of CNN-LSTM methods are used for BP estimation based on electrocardiograms (ECGs) and photoplethysmography (PPG) [34,35], but these methods are still not convenient enough for BP monitoring. In this paper, we focus on designing CNN-LSTM methods for BP prediction based directly on PW. Our contributions are as follows:
We introduce state-of-the-art LSTM networks to predict blood pressure based on easy-to-collect pulse wave data so as to realize fast and convenient blood pressure monitoring;
In order to avoid complicated processing of pulse wave data, we further use CNN to extract features from the pulse wave before inputting them to LSTM, so as to achieve direct blood pressure prediction;
We carry out experiments on real-life data sets and set two groups of benchmarks, where Group 1 only uses neural networks without CNN and Group 2 uses CNN to extract features first. The numerical results show that the proposed method can improve the prediction accuracy by up to 30.41% while saving training time.
The remainder of this paper is organized as follows. In Section 2, we introduce related work on BP estimation and prediction. Section 3 describes the BP prediction problem based on PW data. The CNN-LSTM prediction method is then proposed in Section 4. We present the numerical results and analyze the performance in Section 5. Finally, we draw brief conclusions and discuss future work in Section 6.
2. Related Work
Artificial intelligence (AI) has been used for BP measurement and prediction without requiring professional medical skills. Zhang et al. [36] used the Genetic Algorithm-Back Propagation Neural Network to estimate BP after extracting 13 parameters from PPG signals. Chen et al. [37] proposed a continuous BP measurement method based on the K-nearest-neighbor (KNN) algorithm; their experiments on the MIMIC II data set achieved a root mean square error of 2.47 mmHg. In Reference [38], the authors set up a support vector machine regression model and a random forest regression model for BP prediction, and the average absolute error was less than 5 mmHg. It is worth noting that all of the above methods require data pre-processing.
In recent years, CNN has also been applied to BP estimation and BP risk level prediction. In Reference [39], the authors used CNN to generate features from PW automatically to estimate BP from PPG, and achieved better accuracy than the conventional method. Sun et al. [40] adopted a new kind of CNN based on the Hilbert–Huang Transform (HHT) to predict the BP risk level from PPG. However, they did not predict BP values. Generally speaking, CNN is rarely used in the field of BP measurement, although some researchers use CNN for feature extraction from medical images.
Because of the advantages of LSTM for long sequences, it has been applied to BP estimation. Zhao et al. [41] utilized the efficient processing characteristics of LSTM for time series to predict systolic and diastolic BP. However, their LSTM model was used to predict BP for adult goats and cannot be applied to humans. In Reference [42], the authors used LSTM for BP estimation. However, they designed a two-stage zero-order holding (TZH) algorithm to process the BP data before the LSTM networks. Tanveer et al. [43] proposed a hierarchical Artificial Neural Network-Long Short Term Memory (ANN-LSTM) model for BP estimation, where ANN layers were used to extract features from ECG and PPG waveforms, and LSTM layers were used to account for the time domain variation of the features. The mean absolute errors (MAEs) of systolic and diastolic blood pressure were 1.10 and 0.58 mmHg, respectively. Furthermore, Eom et al. [35] proposed an end-to-end CNN-RNN architecture using raw signals (ECG, PPG, etc.) without the process of extracting features; the MAE values were 4.06 and 3.33 mmHg for systolic and diastolic BP, respectively.
However, current BP estimation methods depend strongly on ECG and PPG, which makes them unsuitable for convenient BP estimation. Therefore, we introduce CNN-LSTM networks to predict BP based on easy-to-collect PW data.
3. Problem Description
In our work, the PW and the BP data are obtained from the Multi-parameter Intelligent Monitoring in Intensive Care (MIMIC) data set [44], which can be downloaded freely from PhysioBank (https://archive.physionet.org/cgi-bin/atm/ATM, accessed on 12 July 2021). The MIMIC data set collects physiological data of over 90 patients in the intensive care unit (ICU), and each patient is denoted by $u$, where $u \in \mathcal{U}$. The sampling frequency of the data is 500 Hz, and thus we denote time as time slots, i.e., $t \in \{1, 2, \ldots, T\}$.
The PW data are obtained via fingertip pulse oximeter, and the unit is millivolt (mV). The PW data of patient $u$ can be denoted by $x_u(t)$. The BP data used in this paper are the arterial blood pressure (ABP), which is obtained by the invasive method, and the unit is millimeter of mercury (mmHg), which is denoted by $y_u(t)$. When making BP predictions of $u$ for time slot $t+1$ in time slot $t$, we use $n$ time slots PW data, i.e., $\{x_u(t-n+1), \ldots, x_u(t)\}$, to predict $y_u(t+1)$. The BP prediction can be described as a stochastic process, which is shown as follows:
$$y_u(t+1) = f\big(x_u(t-n+1), \ldots, x_u(t); \theta\big) + \varepsilon, \tag{1}$$
where $f$ is a non-linear and complicated function. $\varepsilon$ is the white noise and $\theta$ is the parameter set. In other words, $y_u(t+1)$ is obtained from the mapping of time sequence $\{x_u(t-n+1), \ldots, x_u(t)\}$. The white noise $\varepsilon$ obeys a normal distribution with a mean of zero and the standard deviation of $\sigma$, i.e., $\varepsilon \sim N(0, \sigma^2)$.
In general, the function $f$ is difficult to obtain by traditional methods. Therefore, we introduce neural networks to obtain the approximate function $\hat{f}$, which is shown as follows:
$$\hat{y}_u(t+1) = \hat{f}\big(x_u(t-n+1), \ldots, x_u(t); \theta\big). \tag{2}$$
In order to evaluate the performance of the approximate function $\hat{f}$, a loss function is constructed as follows:
$$L(\theta) = \frac{1}{|\mathcal{D}|} \sum_{(u,t) \in \mathcal{D}} \big(y_u(t+1) - \hat{y}_u(t+1)\big)^2, \tag{3}$$
where $\mathcal{D}$ denotes the test set for the training of neural networks. Obviously, the smaller the $L(\theta)$, the better the performance of $\hat{f}$. Therefore, the aim of training neural networks is to minimize $L(\theta)$ and find the optimal $\theta^{*}$:
$$\theta^{*} = \arg\min_{\theta} L(\theta). \tag{4}$$
During the training process, $\hat{y}_u(t+1)$ is obtained based on feedforward propagation, and the neural networks update the weight parameters according to the gradient based on back propagation. By repeating this process, the weight parameters are continuously updated until the error of the loss function meets the precision requirement, which means that the approximate function $\hat{f}$ is fitted [31].
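As a rough illustration of the training loop described in this section, the following sketch minimizes a mean squared error loss by repeated forward passes and gradient updates. The one-parameter linear model, the toy data, the learning rate, and the choice of squared error are all assumptions made for illustration; they stand in for the neural network and are not the configuration used in this paper.

```python
def mse(y_true, y_pred):
    """Mean squared error between two equally long sequences."""
    return sum((a - b) ** 2 for a, b in zip(y_true, y_pred)) / len(y_true)


def train_linear(xs, ys, lr=0.1, steps=100):
    """Fit y_hat = w * x by gradient descent on the MSE (hypothetical stand-in model)."""
    w = 0.0
    for _ in range(steps):
        # Forward pass: compute predictions with the current weight.
        preds = [w * x for x in xs]
        # Backward pass: dL/dw = (2/N) * sum((y_hat - y) * x).
        grad = 2.0 * sum((p - y) * x for p, y, x in zip(preds, ys, xs)) / len(xs)
        # Update step: move the weight against the gradient.
        w -= lr * grad
    return w


xs = [0.0, 0.5, 1.0, 1.5]   # toy inputs (hypothetical)
ys = [2.0 * x for x in xs]  # toy targets generated with true weight 2.0
loss_before = mse(ys, [0.0 * x for x in xs])
w_fit = train_linear(xs, ys)
loss_after = mse(ys, [w_fit * x for x in xs])
```

In practice the stopping rule would be a loss threshold or a fixed number of epochs rather than a hard-coded step count; the fixed count is used here only to keep the sketch short.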
4. Proposed CNN-LSTM Prediction Method
In this section, we first introduce Convolutional Neural Networks (CNN) and Long Short-Term Memory (LSTM) Networks. Next, the proposed prediction method, which is based on CNN and LSTM, is described.
4.1. Convolutional Neural Networks
Convolutional Neural Networks (CNNs) are a kind of feedforward neural network. In this work, we mainly consider the convolutional layer and pooling layer of CNN.
Convolutional Layer. In CNN, convolution is the most fundamental operation. Basically, the filter can be seen as the neuron of this layer, which has a weighted input and produces an output value. The essence of convolution is 2-D spatial filtering; filters can only slide on the x-axis and y-axis to extract features. The number of filters and their kernel size need to be carefully determined to form feature maps.
Pooling Layer. There are still too many parameters in feature maps after the convolutional layer; thus, the pooling layer is used for subsampling. On the one hand, pooling makes the feature maps smaller to reduce complexity, and on the other hand, it extracts important features. The general idea of pooling is to create a new feature map by taking the maximum or average value, i.e., max pooling and average pooling, respectively [22].
In summary, we can take advantage of CNN to extract the features of the input data for obtaining better models.
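The two layers can be sketched in a few lines. The text above describes the 2-D case; the example below is a 1-D analogue, chosen because a pulse wave is a one-dimensional signal. The signal values, the filter weights, and the helper names `conv1d_valid` and `max_pool1d` are hypothetical, and a real layer would learn many filters rather than use one fixed kernel.

```python
def conv1d_valid(signal, kernel):
    """Slide the kernel over the signal without padding.

    Like CNN layers in most frameworks, this is a cross-correlation
    (the kernel is not flipped).
    """
    k = len(kernel)
    return [
        sum(signal[i + j] * kernel[j] for j in range(k))
        for i in range(len(signal) - k + 1)
    ]


def max_pool1d(feature_map, pool_size=2):
    """Keep the maximum of each non-overlapping window; a trailing partial window is dropped."""
    return [
        max(feature_map[i:i + pool_size])
        for i in range(0, len(feature_map) - pool_size + 1, pool_size)
    ]


signal = [0, 1, 3, 2, 5, 4]   # toy 1-D input (hypothetical values)
kernel = [1, 0, -1]           # one filter with kernel size 3 (hypothetical weights)
feature_map = conv1d_valid(signal, kernel)  # [-3, -1, -2, -2]
pooled = max_pool1d(feature_map)            # [-1, -2]
```

A kernel of size 3 shortens the sequence by 2 samples, and a pool size of 2 then halves it, which is how the two layers together reduce the amount of data passed on.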
4.2. Long Short-Term Memory Networks
In a traditional neural network, adjacent layers are fully connected, but the nodes within each layer are not connected to one another [45]. As a result, it is inefficient at handling sequence problems. As mentioned above, the PW and BP data are both time sequences. Accordingly, Recurrent Neural Networks (RNN) can be used for prediction. Unlike in traditional neural networks, the nodes in the hidden layer of an RNN are connected. Moreover, the input of the hidden layer includes not only the output of the input layer, but also the output of the hidden layer in the previous time slot. The recurrent connections of RNN add feedback and memory to the network over time. In summary, RNN has a strong learning ability and input generalization ability for sequence problems.
However, when back-propagation is used in a very deep RNN, the gradient of the neural network may become unstable, causing the exploding gradient problem or the vanishing gradient problem [46] and making the generated model unreliable. These problems can be addressed by Long Short-Term Memory (LSTM) networks. As a variant of RNN, LSTM is composed of memory units and several gates. Figure 1 describes the structure of the LSTM unit.
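In time slot $t$, we presume that the input and the output are $x_t$ and $h_t$, respectively. There are three gates in each unit, called input gate $i$, forget gate $f$ and output gate $o$. The values of the three gates are calculated as follows:
$$i_t = \sigma\big(W_i [h_{t-1}, x_t] + b_i\big), \tag{5}$$
$$f_t = \sigma\big(W_f [h_{t-1}, x_t] + b_f\big), \tag{6}$$
$$o_t = \sigma\big(W_o [h_{t-1}, x_t] + b_o\big), \tag{7}$$
where $W_i$, $W_f$, $W_o$, $b_i$, $b_f$, $b_o$ are the corresponding weight matrices and biases, $\sigma$ is the sigmoid function, and $[h_{t-1}, x_t]$ is the concatenation of the previous output and the current input.
Then, the process of information updating in the LSTM unit is as follows. In time slot $t$, the forget gate decides which part of $c_{t-1}$ to drop, according to $f_t$. The input gate decides what information to store in the unit, according to $i_t$, and $\tilde{c}_t$ is calculated as follows:
$$\tilde{c}_t = \tanh\big(W_c [h_{t-1}, x_t] + b_c\big), \tag{8}$$
where $W_c$ and $b_c$ are the weight matrix and bias of the memory unit. In this way, the current unit is updated as follows:
$$c_t = f_t \odot c_{t-1} + i_t \odot \tilde{c}_t. \tag{9}$$
In addition, the output data are calculated by the following:
$$h_t = o_t \odot \tanh(c_t). \tag{10}$$
After that, $c_t$ and $h_t$ pass to the next cell in the next time slot. In conclusion, each unit works like a state machine in which three gates have their own weights, so that LSTM can deal with sequence problems better.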
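To make the gate and state updates concrete, the following sketch performs one LSTM time step in plain Python. It follows the standard LSTM formulation, in which each gate is a sigmoid of an affine map of the concatenated previous output and current input; that concatenated form, the `params` layout, and the helper names are assumptions for illustration, not code from this paper.

```python
import math


def sigmoid(z):
    """Logistic function used by the three gates."""
    return 1.0 / (1.0 + math.exp(-z))


def affine(W, b, v):
    """Compute W @ v + b for a list-of-rows matrix W."""
    return [sum(w * x for w, x in zip(row, v)) + bi for row, bi in zip(W, b)]


def lstm_step(x_t, h_prev, c_prev, params):
    """One LSTM time step; params maps 'i', 'f', 'o', 'c' to (W, b) pairs."""
    z = h_prev + x_t  # list concatenation [h_{t-1}, x_t]
    i_t = [sigmoid(a) for a in affine(*params["i"], z)]        # input gate
    f_t = [sigmoid(a) for a in affine(*params["f"], z)]        # forget gate
    o_t = [sigmoid(a) for a in affine(*params["o"], z)]        # output gate
    c_tilde = [math.tanh(a) for a in affine(*params["c"], z)]  # candidate state
    # Drop part of the old state and store part of the candidate.
    c_t = [f * c + i * g for f, c, i, g in zip(f_t, c_prev, i_t, c_tilde)]
    # Expose a gated view of the new state as the output.
    h_t = [o * math.tanh(c) for o, c in zip(o_t, c_t)]
    return h_t, c_t


# Hypothetical unit with one hidden state and one input; all weights are zero,
# so every gate evaluates to 0.5 and the candidate state to 0.
zero_params = {name: ([[0.0, 0.0]], [0.0]) for name in ("i", "f", "o", "c")}
h_t, c_t = lstm_step([0.3], [0.0], [1.0], zero_params)
```

With all-zero weights the forget gate keeps exactly half of the previous cell state, which gives an easy check on the update rule: the new cell state is 0.5 and the output is 0.5 times tanh(0.5).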
4.3. CNN-LSTM Prediction Method
As mentioned above, we use a hybrid model that consists of CNN and LSTM to predict BP values based on PW data. After the PW data are input, feature extraction is first performed by CNN. Then, the features are input into LSTM for further training. The topology of the proposed model is shown in Figure 2.
In order to improve the accuracy of the BP prediction, we use the PW data of multiple recent time slots to predict the BP values of the next time slot, which is called the windows prediction method. As a result, we need to correlate the input data and output data to generate a data set for training and validation.
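Pairing each window of recent PW samples with the BP value of the following time slot can be sketched as below. The function name `make_windows`, the window length, and the toy values are hypothetical; the sketch also assumes that the PW and BP series are already aligned sample by sample.

```python
def make_windows(pw, bp, n):
    """Pair each run of n consecutive PW samples with the BP value of the next slot."""
    if len(pw) != len(bp):
        raise ValueError("PW and BP series must have the same length")
    inputs, targets = [], []
    for end in range(n, len(pw)):
        inputs.append(pw[end - n:end])  # PW samples for slots end-n .. end-1
        targets.append(bp[end])         # BP value for slot end
    return inputs, targets


pw = [1, 2, 3, 4, 5]        # toy PW samples (hypothetical)
bp = [10, 20, 30, 40, 50]   # toy BP samples (hypothetical)
X, y = make_windows(pw, bp, n=3)
# X == [[1, 2, 3], [2, 3, 4]], y == [40, 50]
```

A series of length N yields N - n training pairs, so a longer window leaves slightly fewer samples for training and validation.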
Subsequently, we use zero-mean (z-score) normalization to standardize the PW and BP data. Assume that the data are denoted by $X = \{x_1, x_2, \ldots, x_N\}$, for each input data $x_i \in X$. The idea of z-score normalization is shown in Equation (11):
$$x_i' = \frac{x_i - \mu}{\sigma}, \tag{11}$$
where $\mu$ is the mean value of $X$ and $\sigma$ is the standard deviation of $X$. The mean of the processed data is 0, and the standard deviation is 1.
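A minimal sketch of z-score normalization is given below. The helper names and toy values are hypothetical, and the use of the population standard deviation (rather than the sample standard deviation) is an assumption. The inverse transform is included because predictions made on normalized BP values have to be mapped back to mmHg.

```python
import statistics


def zscore_fit(values):
    """Return the mean and population standard deviation of the data."""
    return statistics.fmean(values), statistics.pstdev(values)


def zscore_apply(values, mu, sigma):
    """Standardize values to zero mean and unit standard deviation."""
    return [(v - mu) / sigma for v in values]


def zscore_invert(scaled, mu, sigma):
    """Map standardized values back to the original unit."""
    return [s * sigma + mu for s in scaled]


data = [2, 4, 4, 4, 5, 5, 7, 9]   # toy samples (hypothetical)
mu, sigma = zscore_fit(data)      # mean 5.0, standard deviation 2.0
scaled = zscore_apply(data, mu, sigma)
restored = zscore_invert(scaled, mu, sigma)
```

When a separate validation set is used, the mean and standard deviation are usually fitted on the training portion only and then reused for the rest of the data.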
Notice that there are $n$ data points in the input of the model, one per time slot of the input window, which increases the complexity of the model. Furthermore, as $n$ grows, the features in the input become sparse and difficult to extract. Hence, we use CNN to extract features from the PW data. In detail, a convolutional layer with 32 filters and a kernel size of 3 is added, followed by a max pooling layer with a pool size of 2 for further subsampling. At this point, the input features are successfully extracted and are more streamlined than the original input data.
Then, the features are passed to two LSTM hidden layers for further training, and each hidden layer has 50 units. Since the BP prediction problem is a regression problem, we use the dense layer (i.e., fully connected layer) with one neuron to receive the tensor from the LSTM hidden layer and output the BP value. Finally, the output is the predicted BP values. As a result, the proposed method is more efficient for the prediction of BP, due to the combination of the feature extraction capability of CNN with the advantage of LSTM for the time sequence.
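The layer sizes above determine the sequence lengths and the number of trainable parameters, which the following sketch works out. Several details are assumptions not stated in the text: a single-channel input, no padding and stride 1 in the convolution, the standard LSTM parameter count of four weight blocks per layer, and a hypothetical window length of 100 time slots.

```python
def cnn_lstm_summary(n, filters=32, kernel_size=3, pool_size=2, lstm_units=(50, 50)):
    """Return sequence lengths and parameter counts for the described layer stack."""
    conv_len = n - kernel_size + 1        # 'valid' convolution with stride 1 (assumed)
    pooled_len = conv_len // pool_size    # non-overlapping max pooling
    # One input channel (assumed): kernel weights plus one bias per filter.
    params = {"conv": (kernel_size * 1 + 1) * filters}
    in_dim = filters
    for idx, units in enumerate(lstm_units, start=1):
        # Four blocks (three gates and the candidate state), each with
        # input weights, recurrent weights and a bias vector.
        params[f"lstm{idx}"] = 4 * ((in_dim + units) * units + units)
        in_dim = units
    params["dense"] = in_dim + 1          # one output neuron plus its bias
    return conv_len, pooled_len, params


conv_len, pooled_len, params = cnn_lstm_summary(n=100)  # n=100 is hypothetical
total_params = sum(params.values())
```

Under these assumptions the LSTM layers read a sequence about half as long as the raw window (49 steps instead of 100), which is consistent with the reduced training time attributed to the CNN front end.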
6. Conclusions and Discussion
In this paper, we have proposed a novel CNN-LSTM prediction model of blood pressure (BP) based on pulse wave (PW) data. In the CNN-LSTM model, a convolutional layer and a pooling layer are used to extract features from PW data. Then two LSTM hidden layers are used for further training. We set two groups of benchmarks. The experimental results based on the MIMIC data set show that the proposed method is close to or better than the existing BP estimation methods. In particular, the proposed method can improve the prediction accuracy by up to 30.41% while saving training time, owing to the combined advantages of CNN and LSTM.
However, in our current work, a separate model has to be trained for each patient, so the method cannot yet be applied widely to a large number of patients. In the future, we will consider introducing state-of-the-art AI technologies, e.g., transfer learning, to improve the generalization ability of the model. By placing the generalized model in portable devices that can receive PW data from a fingertip pulse oximeter, we can provide convenient and real-time BP monitoring for a wider range of people.