Peer-Review Record

Classification of Photoplethysmographic Signal Quality with Deep Convolution Neural Networks for Accurate Measurement of Cardiac Stroke Volume

Appl. Sci. 2020, 10(13), 4612; https://doi.org/10.3390/app10134612

by Shing-Hong Liu¹, Ren-Xuan Li¹, Jia-Jung Wang²,*, Wenxi Chen³ and Chun-Hung Su⁴,⁵,*

Reviewer 1:

Robert Amelard

Reviewer 2: Anonymous

Reviewer 3: Anonymous

Submission received: 14 May 2020 / Revised: 22 June 2020 / Accepted: 30 June 2020 / Published: 3 July 2020

(This article belongs to the Special Issue Deep Learning and Neuro-Evolution Methods in Biomedicine and Bioinformatics)

Round 1

Reviewer 1 Report

This paper proposes a deep learning method for quantifying the signal quality of individual PPG pulses for the application of estimating stroke volume. The authors propose minor modifications to two existing CNN models using image representations of the PPG pulses, and train them according to estimated SV using a custom system. Results show good classification accuracy.

This paper presents an interesting and novel approach to improving SV estimation, building off of the authors' previous developments in ICG. I recommend that a number of changes should be made prior to publication.

Looking at the authors' previous paper [5], it looks like there is substantial proportional error in the Bland-Altman analysis. Although I acknowledge that this isn't the paper under review, it brings into question the underlying assumptions of system accuracy, since error was computed between the custom stroke volume estimation system against a commercial system for training and evaluation. Please discuss the limitations of this study in the Discussion section.

I was surprised to not find any results on improving stroke volume estimation, since that was the primary goal (and is a major component of the title). Fig 6 and 7 show examples of classified pulses, but a statistical assessment of the test set is needed in order to show that the proposed SQI performance enhances SV estimation.

Furthermore, the proposed method was not compared to existing methods. This should be done to show the novelty of the proposed method.

It is unclear whether the training and testing dataset contained distinct participants. The authors state that it was trained on 1200 pulses. How were these chosen? Were some participants used for a hold-out test set? If there are overlapping participants between the train and test set, the evaluation needs to be redesigned and re-evaluated so that the train and test set are not overlapping.

Since the image is a fixed size, and thus requires 0-padding, how do high heart rates affect system accuracy? I did not see a summary of heart rates during the data collection, which would be useful to know when evaluating the system. Please add this piece to the text.

Why were only male participants recruited? There does not seem to be any scientific justification for listing gender as an exclusion criterion, and training solely on male data limits the generalizability of the algorithm.

In general, some figures are small (especially Fig 2) and hard to read/interpret. The quality of the figures should be improved for enhanced readability.

The paper can be made much stronger through improved grammar use throughout the manuscript.

SPECIFIC COMMENTS:
Intro: "unlike previous approaches, our method does not require feature extraction". I don't imagine the other DNN methods required explicit feature extractions either. Please clarify.

Section 2 claims that SV is linearly dependent on LVET. However, it is well known that SV is not linearly dependent on LVET. It has a non-linear response due to the volumetric changes in the ventricle during contraction. Is this a fundamental limitation in the formulation of ICG? Can we trust these values then? See for example:
GS Slavin and M Fung, Electromechanical analysis of optimal trigger delays for cardiac MRI, Journal of Cardiovascular Magnetic Resonance 16(Suppl 1):P73, 2014.

Fig 3 doesn't need to span such a large space. Also, the pulses currently look like they're floating in free space. It may be more visually appealing to denote some sort of boundary. What do the colours represent? I found this figure a little confusing.

Fig 3b: the authors claim (in the text) that these are "mid" pulses because the "dicrotic notch is not distinct". However, I can distinctly see the time at which the dicrotic notch appears. Please clarify.

Section 3.1: how were the DPPG and DICG computed? Was it a standard first order discrete derivative? Please clarify in the text.

Section 3.1: "thus the PPG and DPPG signals do not have the time lag." Why is this true? Please elaborate.

Section 3.2: why were these two networks chosen? Why did you choose two networks instead of one? Additional motivation is required here.

It isn't clear in the methods how the 2 networks' outputs were fused. Please explain.

Section 3.4: why was 18% used as a cutoff? This seems arbitrary.

Fig 6: some of these "weak" signals look visually very clean (e.g. all three of the full pulses labeled "3"). Is it valid to call these pulses "weak"?

It seems that there are some important limitations of the study that are not discussed (e.g., suspected performance in cardiovascular disease, age, hypertension, heart rate, etc.). Please add limitations to the Discussion section.

MINOR COMMENTS:
I believe you are taking PPG on the neck (as opposed to the finger), but it is never explicitly stated. Please state this early.

You cannot "prove" that the PPG signal is better than ICG (Section 2, line 110). Please rephrase.

Fig 5: what is the x-axis? epochs? Please label.

Author Response

To Reviewer #1:

We thank the first reviewer for the valuable comments, which have improved this manuscript. The changes in the revised manuscript are marked in red.

Comments and Suggestions for Authors

Looking at the authors' previous paper [5], it looks like there is substantial proportional error in the Bland-Altman analysis. Although I acknowledge that this isn't the paper under review, it brings into question the underlying assumptions of system accuracy, since error was computed between the custom stroke volume estimation system against a commercial system for training and evaluation. Please discuss the limitations of this study in the Discussion section.

ANS: We have added a paragraph to discuss this problem in Discussion section.

In the previous study [5], the LVETs measured by either the PPG or the ICG all showed a substantial proportional error relative to the standard reference measured by phonocardiography (PCG). Although Equation (1) gives a linear relationship between the SV and the LVET, the SV reported by the medis® CS2000 has been calibrated by means of some parameters. In this study, the SV and LVET measured by the medis® CS2000 are used as the references against which those measured by our ICG device are compared. Relative to these references, lower errors are found in both the SV and the LVET for the pulses selected by the proposed DCNN models, as shown in Figures 8 and 9. Thus, only LVETs measured from high-quality PPG pulses should be used to obtain accurate SVs. Table 5 indicates that, with the proposed VGG-19 and ResNet-50, the statistical errors of the measured SV are 4.5 ± 14.7 and 2.6 ± 14.2 ml, respectively.

I was surprised to not find any results on improving stroke volume estimation, since that was the primary goal (and is a major component of the title). Fig 6 and 7 show examples of classified pulses, but a statistical assessment of the test set is needed in order to show that the proposed SQI performance enhances SV estimation.

ANS: We have added another table (Table 5) to show that the SV can be improved by the high SQI PPG pulse. Also, we have modified both Figures 8 and 9 and added more sentences to explain them.

The testing samples comprised 1935 PPG pulses that did not overlap the training samples. The high-quality, middle-quality and low-quality samples comprised 942, 73 and 920 PPG pulses, respectively. A PPG pulse was classified as high-quality, middle-quality or low-quality when the output value of the 2-D DCNN was between 0.8 and 1.0, between 0.5 and 0.8, or between 0 and 0.5, respectively. Table 4 shows the performance of the VGG-19 and ResNet-50 in classifying the high- and low-quality levels. The average accuracy of the VGG-19 (0.895) is lower than that of the ResNet-50 (0.925). However, the sensitivity (0.970) and specificity (0.970) of the VGG-19 are higher than those of the ResNet-50 (0.915 and 0.920, respectively). Across all testing data, the statistical error of SV is high, 33.5 ± 76.8 ml. Table 5 shows the statistical errors of SV for the three groups (high-quality, middle-quality and low-quality), as classified by the VGG-19 and ResNet-50. With either model, the high-quality group yields the smallest SV error. In addition, the SV errors obtained with the ResNet-50 are lower than those obtained with the VGG-19 for all three quality levels.
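The output thresholds described above can be sketched as a small function. This is only an illustration: the function name `classify_pulse` is hypothetical, and the handling of values exactly at 0.5 and 0.8 is an assumption, since the text does not say which class the boundaries belong to.

```python
def classify_pulse(output: float) -> str:
    """Map a DCNN sigmoid output in [0, 1] to a quality level.

    Thresholds (0.5 and 0.8) come from the text; assigning the boundary
    values to the upper class is an assumption made for this sketch.
    """
    if not 0.0 <= output <= 1.0:
        raise ValueError("output must lie in [0, 1]")
    if output >= 0.8:
        return "high"
    if output >= 0.5:
        return "middle"
    return "low"


# Output values quoted for pulses 7, 5 and 2 in the Figure 8 discussion.
for value in (1.0, 0.6, 0.0):
    print(value, classify_pulse(value))
```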

Table 5. The statistical errors of SV for the three groups as classified by the VGG-19 and ResNet-50.

Group	SV error, VGG-19 (ml)	SV error, ResNet-50 (ml)
High-quality group (N=942)	4.5 ± 14.7	2.6 ± 14.2
Middle-quality group (N=73)	25.4 ± 42.3	19.9 ± 35.1
Low-quality group (N=920)	64.6 ± 102.1	57.67 ± 95.4
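As a rough illustration of how a "mean ± standard deviation" error such as those in Table 5 could be computed, the sketch below summarizes per-pulse differences between measured and reference SV. The function name `error_summary`, the sign convention (measured minus reference), the use of the sample standard deviation, and the input values are all assumptions; the response does not state how the statistic was calculated.

```python
from statistics import mean, stdev


def error_summary(sv_measured, sv_reference):
    """Return (mean, sample SD) of the per-pulse SV differences in ml."""
    if len(sv_measured) != len(sv_reference):
        raise ValueError("both series must have the same length")
    diffs = [m - r for m, r in zip(sv_measured, sv_reference)]
    return mean(diffs), stdev(diffs)


# Hypothetical values, not data from the study.
bias, spread = error_summary([72.0, 68.0, 75.0], [70.0, 70.0, 70.0])
print(f"{bias:.1f} ± {spread:.1f} ml")
```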

Figure 8 shows the results of SQI classification with the ResNet-50 for PPG (blue line) and DPPG (orange line) signals moderately corrupted by baseline drift. The SQI level of each pulse was determined from the error percentage between the reference SV measured by the medis® CS2000 and the SV measured by our ICG device. An error percentage below 18%, between 18% and 20%, or above 20% represents a high-quality, middle-quality, or low-quality PPG pulse, respectively. The 1st and 3rd rows of the data are the SVs, and the 2nd and 4th rows are the LVETs, measured by the medis® CS2000 and our ICG device, respectively. The 5th row denotes the error percentage of SV. The red line represents the output value of the ResNet-50. If the output value is larger than 0.8, between 0.5 and 0.8, or less than 0.5, the PPG pulse is classified as a high-, middle- or low-quality one, respectively. The cross and circle symbols denote the first zero-crossing point and the minimum-value point of the DPPG pulse, respectively. The 7th PPG pulse in the figure is of high quality because it has a sharp valley in the starting ejection zone and a clear dicrotic notch. Accordingly, its SV error percentage is low, 0.02, and the output value of the ResNet-50 for this pulse is 1.0. In addition, the 2nd and 3rd pulses both have low SQIs, although they have clear dicrotic notches and a flat shape in the starting ejection zone. Because their LVET errors are 80 and 97 ms, their SV error percentages are 0.49 and 0.42, respectively, and the output values of the ResNet-50 for both pulses are 0. The 5th pulse has a middle SQI because it does not have a sharp valley in the starting ejection zone. Its SV error percentage is 0.2, and the output value of the ResNet-50 for this pulse is 0.6.
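The labeling rule above can be sketched as follows. The error percentage is assumed here to be the absolute difference between measured and reference SV divided by the reference SV; the response does not give the formula, so `sv_error_fraction` and `label_from_error` are hypothetical names, and the treatment of values exactly at 18% or 20% is likewise an assumption.

```python
def sv_error_fraction(sv_reference: float, sv_measured: float) -> float:
    """Relative SV error against the reference (assumed definition)."""
    if sv_reference <= 0:
        raise ValueError("reference SV must be positive")
    return abs(sv_measured - sv_reference) / sv_reference


def label_from_error(error_fraction: float) -> str:
    """Label a pulse from its SV error using the 18% and 20% cutoffs."""
    if error_fraction < 0.18:
        return "high"
    if error_fraction <= 0.20:
        return "middle"
    return "low"


# Error fractions quoted in the text for pulses 7, 5, 2 and 3 of Figure 8.
for err in (0.02, 0.2, 0.49, 0.42):
    print(err, label_from_error(err))
```

With these cutoffs, the four quoted pulses fall into the high, middle, low and low groups, which matches the description of Figure 8.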

Figure 8. The results of SQI classification with the ResNet-50 for the PPG (blue line) and DPPG (orange line) signals moderately corrupted by the baseline drift. The 1st and 3rd rows of the data are the two SVs with medis® CS 2000 and our ICG device, respectively, while the 2nd and 4th rows are the two LVETs with medis® CS 2000 and our ICG device, respectively. The 5th row denotes the error percentages of SV. The red line is the output value of the ResNet-50. The cross and circle symbols represent the first zero-crossing point and minimum-value point of DPPG pulse, respectively.

Figure 9 shows the results of SQI classification with the ResNet-50 for PPG (blue line) and DPPG (orange line) signals in the presence of serious baseline drift. Even when the baseline of the PPG pulses wanders heavily, the proposed ResNet-50 still identifies these pulses as low-SQI ones: its output values for these pulses are all 0.

Figure 9. The results of SQI classification with the ResNet-50 for the PPG (blue line) and DPPG (orange line) signals corrupted by a serious baseline drift. The 1st and 3rd rows of the data are the two SVs with medis® CS 2000 and our ICG device, respectively, while the 2nd and 4th rows are the two LVETs with medis® CS 2000 and our ICG device, respectively. The 5th row represents the error percentages of SV. The red line denotes the output value of the ResNet-50. The cross and circle symbols represent the first zero-crossing point and minimum-value point of DPPG pulse, respectively.

Furthermore, the proposed method was not compared to existing methods. This should be done to show the novelty of the proposed method.

ANS: We have added some paragraphs in Discussion to compare the proposed DCNN method with the rule-based method.

A classification approach using the DCNN does not need predetermined characteristics or features and makes use of the full information embedded in the PPG pulse by taking advantage of a deep learning process [28,29]. In our previous study, we proposed a rule-based method combined with a fuzzy neural network to determine the SQIs of PPG pulses [12]. To increase the tolerance of the rule-based method, a PPG pulse with an SV error percentage of less than 40% was considered to be of high quality. In the test data, the statistical error of the PPG pulses classified as high quality was 6.4 ± 12.8 ml, but the accuracies for determining high- and low-quality pulses reached only 0.83 and 0.86, respectively. In the present work, by contrast, we label a PPG pulse as high quality when its SV error percentage is less than 20%. In the test data, the statistical error of the pulses classified as high quality by the proposed ResNet-50 is 2.6 ± 14.2 ml, and the accuracies for classifying high- and low-quality PPG pulses are 0.91 and 0.94, respectively. Since the proposed 2-D DCNN approach appears to outperform the rule-based method in SQI classification, the DCNN method may be applied to increase the measurement accuracy of SV.

Moreover, in the rule-based method, PPG pulses corrupted by serious baseline drift must be removed by additional algorithms before their SQIs are classified. In this study, the proposed 2-D DCNN approaches (VGG-19 or ResNet-50) use the morphologies of the PPG and DPPG waveforms to determine the SQIs. The PPG and DPPG signals are first merged and transformed into an image, as shown in Figure 5, before the classification task is performed. In Figure 5(c), the image is constructed from PPG and DPPG pulses in which the PPG pulse almost entirely lacks the fundamental morphology of a traditional PPG waveform, yet it is still correctly classified as a low-quality pulse by the proposed ResNet-50 (Figure 9). This suggests that the proposed 2-D DCNN approaches may be useful for quality classification of PPG pulses, even those seriously corrupted by motion artifacts and power line interference.

It is unclear whether the training and testing dataset contained distinct participants. The authors state that it was trained on 1200 pulses. How were these chosen? Were some participants used for a hold-out test set? If there are overlapping participants between the train and test set, the evaluation needs to be redesigned and re-evaluated so that the train and test set are not overlapping.

ANS: We have modified the sentences to explain how the training and testing samples were obtained.

4.1. Training Outcomes of Deep Convolution Neural Networks

The proposed VGG-19 and ResNet-50 were trained on 1200 PPG pulses divided into two categories, high quality (d=1) and low quality (d=0). The high-quality samples comprised 400 pulses randomly chosen from the 1342 samples, and the low-quality samples comprised 800 pulses randomly chosen from the 1720 samples. We did not use the middle-quality pulses to train the networks in this study because there were too few of them, only 73 pulses. To balance the sample numbers of the two levels, the 400 high-quality samples were expanded to 800. Figure 7(a) shows the training results for the VGG-19 model. After training 105 times, a stopping point chosen to avoid overfitting, the training and validation accuracies are 0.88 and 0.90, respectively, and the training and validation losses are 0.28 and 0.29, respectively. Figure 7(b) shows the training results for the ResNet-50 model. After training 130 times, the training and validation accuracies are 0.94 and 0.96, and the losses are 0.15 and 0.16, respectively.

4.2. Testing Outcomes of Deep Convolution Neural Networks

Since the image is a fixed size, and thus requires 0-padding, how do high heart rates affect system accuracy? I did not see a summary of heart rates during the data collection, which would be useful to know when evaluating the system. Please add this piece to the text.

ANS: We have added some sentences to explain this question.

3.1. Data Acquisition

Our ICG device was described in the previous study [5], in which the analog ICG and PPG signals were all digitized at a sampling frequency of 500 Hz. The PPG sensors were placed on the neck of the subject. The ICG and PPG signals were filtered to remove baseline drift and high-frequency noise using a second-order Butterworth bandpass filter with lower and upper cutoff frequencies of 0.2 Hz and 10 Hz, respectively. Then, the DPPG and DICG signals were obtained from the PPG and ICG signals by a first-order discrete derivative and passed through a zero-phase (forward and reverse) second-order Butterworth lowpass filter with a cutoff frequency of 10 Hz. The first zero-crossing point of the DPPG signal during one heart cycle was used to segment the pulse. Because the heart rates of the included subjects were not lower than 60 beats/minute, each image consisted of two pulses, PPG and DPPG, whose length was set to 500 points. A pulse shorter than 500 points was zero-padded to 500 points. Figure 5 shows 150 × 150 images obtained from the segmented PPG (blue line) and DPPG (orange line) signals at three different SQI levels. In Figure 5(a), because the morphologies of the two PPG pulses within the systolic phase are well defined (i.e., they show a distinct dicrotic notch and starting ejection point), their SQIs are high. The two PPG pulses in Figure 5(b) have middle SQIs: their morphology at the starting ejection point is good, but their dicrotic notches are not distinct in the PPG signals. Therefore, the values of their differential signals in the dicrotic notch zone may not exceed zero. As shown in Figure 5(c), the two PPG pulses have low SQIs, since their amplitudes or baselines have been greatly distorted by severe motion artifacts.
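The derivative and zero-padding steps, and the reason a 60 beats/minute floor fits the 500-point window, can be sketched in plain Python. The Butterworth filtering is deliberately omitted, and the function names (`samples_per_beat`, `first_difference`, `pad_pulse`) are hypothetical; whether the authors scaled the derivative by the sampling rate is not stated, so that scaling is an assumption.

```python
FS = 500.0          # sampling frequency in Hz, as stated in the text
PULSE_LENGTH = 500  # fixed pulse length in points, as stated in the text


def samples_per_beat(heart_rate_bpm: float, fs: float = FS) -> int:
    """Number of samples spanned by one cardiac cycle at a given heart rate."""
    return round(60.0 / heart_rate_bpm * fs)


def first_difference(signal, fs: float = FS):
    """First-order discrete derivative (scaling by fs is an assumption)."""
    return [(b - a) * fs for a, b in zip(signal, signal[1:])]


def pad_pulse(pulse, length: int = PULSE_LENGTH):
    """Zero-pad a segmented pulse to the fixed length."""
    if len(pulse) > length:
        raise ValueError("pulse is longer than the fixed window")
    return list(pulse) + [0.0] * (length - len(pulse))


# The reported heart rates ranged from 65 to 78 beats/minute.
print(samples_per_beat(65), samples_per_beat(78))  # 462 385
print(len(pad_pulse([1.0] * samples_per_beat(78))))  # 500
```

At 60 beats/minute one cycle spans exactly 500 samples, so any faster heart rate yields a shorter pulse that needs padding, and a slower one would not fit the window.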

3.3. Experimental Protocol

This study recruited fourteen healthy male subjects without cardiovascular disease or injured limbs. Their age was between 22 and 29 years (22.7 ± 2.1 years, mean ± standard deviation), weight was between 46 and 78 kg (61.8 ± 8.8 kg), height was between 165 and 188 cm (173.1 ± 6.1 cm), and heart rate was between 65 and 78 beats/minute (70.5 ± 3.4 beats/minute). A commercial medical device (medis® CS2000, medis, Germany) based on ICG technology was used to measure the beat-to-beat SV, which was taken as the reference value in the study. This experiment was approved by the Research Ethics Committee of China Medical University & Hospital (No. CMUH107-REC3-061), Taichung, Taiwan.

Why were only male participants recruited? It does not seem like there are any scientific justification to list gender as an exclusion criterion, and solely training on male data affects generalizability of the algorithm.

ANS: Some of the electrodes must be properly placed on the chest of the subject. For convenience, all subjects recruited in the study are male. This inclusion criterion may be one limitation of this study. We have added a paragraph to the Discussion section on the limitations of the present study.

Some limitations exist in the present study. First, because all subjects recruited in this study are healthy males, the middle- and low-quality pulses are mostly corrupted by motion artifacts. We did not acquire PPG pulses from arrhythmic beats. Thus, gender and cardiovascular disease may affect the current results to some extent. Second, the pulse morphology varies with arterial compliance, which is associated with age and hypertension [30,31]. The included subjects are between 22 and 29 years old, and their systolic and diastolic blood pressures are all in the normal range. Thus, subjects of different ages or with hypertension may have different pulse morphologies, which could affect the present results. Third, since only PPG pulses of one-second duration are used in the study, the heart rates of the recruited participants must be higher than 60 beats/minute.

In general, some figures are small (especially Fig 2) and hard to read/interpret. The quality of the figures should be improved for enhanced readability.

ANS: We have replaced the figures with higher-resolution ones (Figures 4 and 6; the original Figure 2 has been replaced by Figure 4).

Figure 4. Overall procedure for SQI classification of PPG pulses.

Figure 6. The architectures of ResNet-50 and VGG-19.

The paper can be made much stronger through improved grammar use throughout the manuscript.

ANS: We have revised this manuscript and have improved grammar usage throughout the text of the manuscript.

SPECIFIC COMMENTS:
Intro: "unlike previous approaches, our method does not require feature extraction". I don't imagine the other DNN methods required explicit feature extractions either. Please clarify.

ANS: We have modified this sentence.

Unlike previous rule-based methods, our proposed approach did not require any features directly extracted from the PPG signal.

Section 2 claims that SV is linearly dependent on LVET. However, it is well known that SV is not linearly dependent on LVET. It has a non-linear response due to the volumetric changes in the ventricle during contraction. Is this a fundamental limitation in the formulation of ICG? Can we trust these values then? See for example:
GS Slavin and M Fung, Electromechanical analysis of optimal trigger delays for cardiac MRI, Journal of Cardiovascular Magnetic Resonance 16(Suppl 1):P73, 2014.

ANS: Because the SV is measured by the ICG method in this study, it has a linear relation with the LVET according to Eq. (1). The SV measured by the medis® CS2000 has been calibrated by some parameters. Thus, in this study, we chose the SV and LVET from the medis® CS2000 as the references. Our ICG device also applies the ICG method to measure the SV, with PPG pulses used to help assess the LVET. We have added a paragraph on this issue to the Discussion section.

In the previous study [5], a substantial error was usually present in the LVET measured by the PPG or ICG, as compared with the standard reference measured by phonocardiography. Although the SV has a linear relation with the LVET according to Equation (1), the SV measured by the medis® CS2000 has been calibrated through some parameters. In this study, both the SV and the LVET measured by the medis® CS2000 are used as the references against which those measured by our ICG device are compared. One of our findings is that using high-quality PPG pulses leads to lower errors in the SV and LVET measurements, as shown in Figures 8 and 9. Thus, only PPG pulses of high quality can be used to obtain a reliable LVET and, subsequently, an accurate SV. In Table 5, when the SV is measured from high-quality PPG pulses, the statistical errors of SV for the VGG-19 and ResNet-50 are relatively low (4.5 ± 14.7 ml and 2.6 ± 14.2 ml, respectively).

Fig 3 doesn't need to span such a large space. Also, the pulses currently look like they're floating in free space. It may be more visually appealing to denote some sort of boundary. What do the colours represent? I found this figure a little confusing.

ANS: We have modified Figure 5 (i.e., the original Figure 3) and described the colors of the PPG and DPPG pulses in the text.

3.1. Data Acquisition

Figure 5. PPG (blue line) and DPPG (orange line) pulses with different quality levels. (a) High quality, (b) Middle quality, (c) Low quality.

Fig 3b: the authors claim (in the text) that these are "mid" pulses because the "dicrotic notch is not distinct". However, I can distinctly see the time at which the dicrotic notch appears. Please clarify.

ANS: We have modified this sentence to make it clearer.

3.1. Data Acquisition

Section 3.1: how were the DPPG and DICG computed? Was it a standard first order discrete derivative? Please clarify in the text.

ANS: We have modified that sentence.

Then, the DPPG and DICG signals were obtained from the PPG and ICG signals by a first-order discrete derivative and passed through a zero-phase (forward and reverse) second-order Butterworth lowpass filter.

Section 3.1: "thus the PPG and DPPG signals do not have the time lag." Why is this true? Please elaborate.

ANS: We have deleted this sentence.

Section 3.2: why were these two networks chosen? Why did you choose two networks instead of one? Additional motivation is required here.

ANS: Since the number of samples is not large and the characteristics of the patterns do not differ much, we chose DCNNs with many layers. We have added this sentence to the text.

3.2. Network Architectures

Since the number of samples was not large and the characteristics of the patterns did not differ much, 2-D DCNNs were chosen to perform the classification task in the study. We built two 2-D DCNNs based on the trained DRNN architecture with a 50-layer network (ResNet-50) [22] and the trained DCNN architecture with a 19-layer network (VGG-19) [23]. In the output layer, we replaced the 1000-unit fully connected layer with softmax activation by a single-unit fully connected layer with sigmoid activation. The VGG-19 and ResNet-50 are the base models in this study, pretrained for the object detection task on the ImageNet dataset [24]. The architectures of the two 2-D DCNNs are shown in Figure 6, and their detailed descriptions are given in Tables 1 and 2, respectively. In Table 1, the filters in the VGG-19 are all of size 3 × 3. Down-sampling is performed directly by maximum pooling layers with a stride of 2, and batch normalization is performed right after each convolution and before the ReLU activation. Two fully connected layers have sizes of 1024. For the ResNet-50, the main idea is to skip blocks of convolutional layers by using shortcut connections, as shown in Figure 6. The dotted lines indicate that the dimensions of the input and output differ, so a 1 × 1 convolution with a stride of 2 is used as the projection shortcut. The solid lines indicate that the dimensions of the input and output are the same, so the identity shortcut is used. In Table 2, the filters in the ResNet-50 follow two design rules. First, when the feature map sizes of the input and output are the same, the layers have the same number of filters. Second, when the feature map size is halved, the number of filters is doubled. Down-sampling is performed directly by convolutional layers with a stride of 2, and batch normalization is performed right after each convolution and before the ReLU activation. The network ends with a global average pooling layer with a 7 × 7 filter.

It isn't clear in the methods how the 2 networks' outputs were fused. Please explain.

ANS: We have modified the related sentences and Tables 1 & 2.

3.2. Network Architectures

Table 1. Fundamental information about the VGG-19 layers and associated parameters of the network architecture.

Type	Filter Size	Channel #	Input size
Conv1	3×3 3×3	64 64	150×150×3 150×150×64
Max pool	3×3	-	150×150×64
Conv2	3×3 3×3	128 128	75×75×64 75×75×128
Max pool	3×3	-	75×75×128
Conv3	3×3 3×3 3×3 3×3	256 256 256 256	37×37×128 37×37×256 37×37×256 37×37×256
Max pool	3×3	-	37×37×256
Conv4	3×3 3×3 3×3 3×3	512 512 512 512	18×18×256 18×18×512 18×18×512 18×18×512
Max pool	3×3	-	18×18×512
Conv5	3×3 3×3 3×3 3×3	512 512 512 512	9×9×512 9×9×512 9×9×512 9×9×512
Max pool	3×3	-	9×9×512
Flatten	-	1	4×4×512
Fc	-	1	8192
Out	-	1	1024

Table 2. Fundamental information about the ResNet-50 layers and associated parameters of the network architecture.

Type	Filter Size	Channel #	Input size	Times
Conv1	7×7	64	156×156×3	1
Max pool	3×3	-	77×77×64	1
Conv2	1×1 3×3 1×1	64 64 256	38×38×64 38×38×64 38×38×256	3
Conv3	1×1 3×3 1×1	128 128 512	19×19×128 19×19×128 19×19×512	4
Conv4	1×1 3×3 1×1	256 256 1024	10×10×256 10×10×256 10×10×1024	6
Conv5	1×1 3×3 1×1	512 512 2048	5×5×512 5×5×512 5×5×2048	3
Avg pool	7×7	-	5×5×2048	1
Flatten	-	1	5×5×2048	1
Out	-	1	51200	1
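The spatial sizes and flattened lengths listed in Tables 1 and 2 can be checked with a short calculation. The sketch below assumes that each stride-2 pooling stage halves the side length with floor rounding, which is consistent with the 150, 75, 37, 18, 9, 4 sequence in Table 1; the function names are hypothetical and no network is actually built.

```python
def pooled_sides(input_side: int, n_pools: int) -> list:
    """Side length after each stride-2 pooling stage (floor rounding assumed)."""
    sides = [input_side]
    for _ in range(n_pools):
        sides.append(sides[-1] // 2)
    return sides


def flatten_length(height: int, width: int, channels: int) -> int:
    """Length of the vector produced by flattening a feature map."""
    return height * width * channels


# VGG-19 (Table 1): five max-pooling stages on a 150 x 150 input.
print(pooled_sides(150, 5))           # [150, 75, 37, 18, 9, 4]
print(flatten_length(4, 4, 512))      # 8192

# ResNet-50 (Table 2): flattening the 5 x 5 x 2048 feature map.
print(flatten_length(5, 5, 2048))     # 51200
```

Both flattened lengths agree with the input sizes given for the layers that follow the flatten step in the two tables.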

Section 3.4: why was 18% used as a cutoff? This seems arbitrary.

ANS: When the thresholds of error percentage of SV for the high- and low-quality pulses are respectively set at 18% and 20%, the highest accuracy can be obtained for the testing data.

Fig 6: some of these "weak" signals look visually very clean (e.g. all three of the full pulses labeled "3"). Is it valid to call these pulses "weak"?

ANS: We define the quality level of a pulse according to the error percentage of SV. Although a few pulses look clean, some degree of distortion does exist in their morphologies. We have modified Figures 8 and 9 and revised their descriptions in the text.

It seems that there are some important limitations of the study that are not discussed (e.g., suspected performance in cardiovascular disease, age, hypertension, heart rate, etc.). Please add limitations to the Discussion section.

ANS: We have added a paragraph to discuss the limitations in Discussion section.

There are some limitations in the present study. First, because all subjects recruited in this study are healthy males, the middle- and low-quality pulses are mostly corrupted by motion artifacts. We did not acquire PPG pulses from arrhythmic beats. Thus, gender and cardiovascular disease may somewhat affect the current results. Second, the PPG pulse morphology varies with vascular compliance, which is closely associated with age and hypertension [30,31]. The included subjects are between 22 and 29 years old, and their systolic and diastolic blood pressures are all in the normal range. Thus, subjects of different ages or with hypertension may have different pulse morphologies, which may influence the present outcome. Third, only one-second episodes of PPG signals are employed in the current study. To ensure that each one-second PPG signal contains at least one cardiac cycle, the heart rates of the recruited participants must be higher than 60 beats/minute.

MINOR COMMENTS:
I believe you are taking PPG on the neck (as opposed to the finger), but it is never explicitly stated. Please state this early.

ANS: We have added a sentence stating this in Section 3.1, Data Acquisition.

3.1. Data Acquisition

Our ICG device was described in the previous study [5], in which the analog ICG and PPG signals were all digitized with a sampling frequency of 500 Hz. The PPG sensors were placed on the neck of the subject.

You cannot "prove" that the PPG signal is better than ICG (Section 2, line 110). Please rephrase.

ANS: We have deleted this sentence.

Impedance Cardiography

In 1966, Kubicek et al. proposed ICG, a noninvasive technique, to assess SV continuously [16]. Equation (1) governs the ICG method,

$$SV = r_b \left(\frac{L}{Z_0}\right)^2 \cdot LVET \cdot \left(\frac{dZ}{dt}\right)_{\max}, \quad (1)$$

where r_b is the blood resistivity, assumed to be a constant 150 ohm×cm, L is the distance (cm) between the two recording electrodes on the neck and chest, Z₀ is the base impedance (ohm) between the recording electrodes, indicating the initial thoracic cavity, and dZ⁄dt(max) is the absolute value (ohm/sec) of the maximum change of the ICG impedance signal. According to Equation (1), the SV is linearly related to the LVET and dZ⁄dt(max). Figure 3 shows the synchronous ECG, ICG, and PPG signals, and the differential ICG (DICG) and differential PPG (DPPG) signals. The PPG signal seems to be corrupted by fewer motion artifacts than the ICG signal, as shown in Figure 3 [5]. The LVET is defined in the DPPG signal as the time interval between the first zero-crossing point and the minimum point. The LVETs of heart beats measured from high-quality PPG pulses would be more accurate.
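As a minimal sketch of how Equation (1) is evaluated, the function below assumes the standard Kubicek form, SV = r_b × (L/Z₀)² × LVET × dZ⁄dt(max), with the units given in the text. The function name and the input values in the usage line are hypothetical and chosen only for illustration; they are not measurements from the study.

```python
def kubicek_stroke_volume(l_cm, z0_ohm, lvet_s, dzdt_max_ohm_per_s, r_b_ohm_cm=150.0):
    """Stroke volume (mL) from the Kubicek equation.

    Assumed form: SV = r_b * (L / Z0)**2 * LVET * dZ/dt(max).
    Units: ohm*cm * (cm/ohm)**2 * s * (ohm/s) = cm**3 = mL.
    """
    if z0_ohm <= 0:
        raise ValueError("base impedance Z0 must be positive")
    if lvet_s < 0:
        raise ValueError("LVET must be non-negative")
    # The text defines dZ/dt(max) as an absolute value, so take abs() here.
    return r_b_ohm_cm * (l_cm / z0_ohm) ** 2 * lvet_s * abs(dzdt_max_ohm_per_s)


# Hypothetical inputs for illustration only.
sv_ml = kubicek_stroke_volume(l_cm=30.0, z0_ohm=25.0, lvet_s=0.3, dzdt_max_ohm_per_s=1.0)
print(f"SV = {sv_ml:.1f} mL")
```

Because SV scales linearly with LVET in this form, a relative error in the LVET read from a distorted pulse carries over directly as the same relative error in SV, which is why pulse quality matters for the estimate.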

Fig 5: what is the x-axis? epochs? Please label.

ANS: We have modified Figure 7.

(a) (b)

Figure 7. Accuracy and loss profiles of training and validation in the two models. (a) VGG-19 model; (b) ResNet-50 model.

Author Response File: Author Response.pdf

Reviewer 2 Report

The manuscript submitted by Liu and co-authors describes a deep learning approach to analyse signals from photoplethysmography.

The manuscript is interesting and relevant. It can be improved in several ways for which the following points are suggested. It would be good to start the article thinking that a reader may have never encountered a PPG signal previously. As it is written, it would be very difficult for a novice to understand what this paper is about. The second phrase states “… two types, transmission and reflection”, and I understand what this means because I know about PPG, but this should not be left to either readers familiar with the technique or to the imagination. A figure illustrating what PPG is, how it is obtained (transmission and reflection) and the signals obtained would be necessary. This would also support later text where the main peak, dicrotic notch etc. are mentioned. Again, for a reader who is not familiar with a PPG signal it would be impossible to know what these terms mean.

Figure 2 should be improved: 1) each signal should be displayed in an axis, otherwise it is misleading because as it stands it implies that the ICG signal has a higher amplitude than a PPG, which has a higher amplitude than DICG, etc. 2) The caption should be significantly expanded and explain the signals and the acronyms. Captions and figures should be self-explanatory so that a reader can understand them without reading the text. This applies to all figures. 3) it would be good to add an ECG signal, with which most people are familiar, to illustrate time.

Figs 3/4: the authors should be consistent and select either high quality, middle quality, low quality OR good, middle, bad. I am surprised that there are more low quality pulses than high quality, is this something the authors expected?

Figs 6 and 7 are very difficult to interpret, refer to my previous comment. Even if it has a legend, the caption should explain the figure COMPLETELY. What do you mean by 1,2,3? What are the x? The value of the red line changes between the last two “1”, why? Where is the “serious baseline drift” in Fig 7?

Finally, since this manuscript states the need of AI based methods against several rule-based algorithms it would be important to compare against one such method.

Author Response

To Reviewer #2:

We thank the reviewer for his/her valuable comments, which have improved this manuscript. The changes in the revised manuscript are marked in red.

Comments and Suggestions for Authors

The manuscript submitted by Liu and co-authors describes a deep learning approach to analyse signals from photoplethysmography.

The second phrase states “… two types, transmission and reflection”, and I understand what this means because I know about PPG, but this should not be left to either readers familiar with the technique or to the imagination. A figure illustrating what PPG is, how it is obtained (transmission and reflection) and the signals obtained would be necessary.

ANS: In the Introduction section, several sentences and a new figure (Fig. 1) have been added to describe the two types of PPG, transmission and reflection.

The non-invasive techniques for photoplethysmography include two optical types, transmission and reflection [6], as shown in Figure 1. A light-emitting diode (LED) is often used to shine low-intensity infrared light on the skin, and a portion of the light is absorbed, mainly by arterial and venous blood. For reflection PPG, the non-absorbed light is reflected and detected by a photo diode. The LED and photo diode are placed on the same side, as shown in Figure 1(a). For transmission PPG, the non-absorbed light is transmitted and detected by a photo diode. The LED and photo diode are placed on opposite sides, as shown in Figure 1(b). In either the reflection or the transmission method, the PPG signal represents the changes in blood volume (Figure 1), though it cannot be used to quantify the amount of blood.

(a) (b)

Figure 1. Schematic illustration of the photoplethysmography. The LED illuminates the skin and the non-absorbed light will be detected by the photo diode. (a) For the reflected method, the LED and photo diode are on the same side, (b) For the transmitted method, the LED and photo diode are on the opposite side.

This would also support later text where the main peak, dicrotic notch etc. are mentioned. Again, for a reader who is not familiar with a PPG signal it would be impossible to know what these terms mean.

ANS: We have added a new figure (Figure 2) to describe the characteristics of a PPG pulse.

PPG is a noninvasive optical measurement method, in which the change of blood volume reflects the physiological responses to circulatory events in peripheral blood vessels. Thus, its waveform bears regular morphological characteristics [7,8]. As shown in Figure 2, a PPG pulse has many physiological characteristics, including the main peak, dicrotic notch, pulse width, and amplitude. As a result, many researchers have used those characteristics (i.e., rule-based methods) to determine the quality of each PPG pulse. The signal quality index (SQI) then represents the degree to which the PPG pulse is corrupted. Liu et al. employed fuzzy rules to determine the SQI of PPG pulses [9], and Fischer et al. applied the characteristics of the PPG waveform and a decision tree to classify its SQI [10]. Li et al. used a Bayesian hypothesis testing method to analyze the SQI [11]. These studies all need to adjust the thresholds of the rule-based method to get the best results. Recently, Liu et al. used a fuzzy neural network to evaluate the SQI [12]. Although they used an artificial intelligence method to gauge the quality of slightly corrupted PPG pulses, a rule-based method was still used to delete the heavily corrupted PPG pulses.

Figure 2. The characteristics of a PPG pulse chiefly include the main peak, dicrotic notch, pulse width, and pulse amplitude.

Figure 2 should be improved: 1) each signal should be displayed in an axis, otherwise it is misleading because as it stands it implies that the ICG signal has a higher amplitude than a PPG, which has a higher amplitude than DICG, etc. 2) The caption should be significantly expanded and explain the signals and the acronyms. Captions and figures should be self-explanatory so that a reader can understand them without reading the text. This applies to all figures. 3) it would be good to add an ECG signal, with which most people are familiar, to illustrate time.

ANS: We have modified the captions of Figure 3 (original Figure 2), and modified the display of ECG, ICG, PPG, DICG and DPPG.

Impedance Cardiography

In 1966, Kubicek et al. proposed ICG, a noninvasive technique, to assess SV continuously [16]. Equation (1) governs the ICG method,

$$SV = r_b \left(\frac{L}{Z_0}\right)^2 \cdot LVET \cdot \left(\frac{dZ}{dt}\right)_{\max}, \quad (1)$$

where r_b is the blood resistivity, assumed to be a constant 150 ohm×cm, L is the distance (cm) between the two recording electrodes on the neck and chest, Z₀ is the base impedance (ohm) between the recording electrodes, indicating the initial thoracic cavity, and dZ⁄dt(max) is the absolute value (ohm/sec) of the maximum change of the ICG impedance signal. According to Equation (1), the SV is linearly related to the LVET and dZ⁄dt(max). Figure 3 shows the synchronous ECG, ICG, and PPG signals, and the differential ICG (DICG) and differential PPG (DPPG) signals. In this figure, the PPG signal seems to be corrupted by fewer motion artifacts than the ICG signal [5]. The LVET is defined in the DPPG signal as the time interval between the first zero-crossing point and the minimum point. The LVETs of heart beats measured from high-quality PPG pulses would be more accurate.

Figure 3. Typical signal registration in the study. From the first row to the fifth row are the ECG, ICG, PPG, DICG and DPPG, respectively. The LVET is defined as the time interval between the first zero crossing point (short arrow) and the minimum point (long arrow) in the DPPG signal.
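The LVET definition above can be sketched as follows. This is an illustration under stated assumptions, not the authors' implementation: the "first zero crossing" is taken to be the first upward (negative-to-positive) crossing of the DPPG, the "minimum point" is searched only after that crossing, and the 500 Hz sampling frequency comes from the Data Acquisition description. The function name and the sample values are hypothetical.

```python
def lvet_from_dppg(dppg, fs_hz=500.0):
    """Estimate LVET (seconds) from one DPPG pulse.

    Assumptions (not confirmed by the source):
      - the first zero crossing is the first upward crossing of DPPG;
      - the minimum point is the lowest DPPG sample after that crossing.
    Returns None when no upward zero crossing is found.
    """
    start = None
    for i in range(1, len(dppg)):
        if dppg[i - 1] <= 0 < dppg[i]:
            start = i
            break
    if start is None:
        return None
    # Index of the minimum DPPG value from the crossing to the end of the pulse.
    min_idx = min(range(start, len(dppg)), key=lambda i: dppg[i])
    return (min_idx - start) / fs_hz


# Hypothetical, very short DPPG segment for illustration only.
example_dppg = [-1.0, -0.5, 0.5, 2.0, 1.0, -1.0, -3.0, -1.0, 0.0]
example_lvet = lvet_from_dppg(example_dppg)
print(example_lvet)
```

A pulse with a flat or distorted foot shifts the detected crossing, which changes the LVET and therefore the SV computed from Equation (1).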

Figs 3/4: the authors should be consistent and select either high quality, middle quality, low quality OR good, middle, bad. I am surprised that there are more low quality pulses than high quality, is this something the authors expected?

ANS: We have corrected the mistakes in Figures 4 and 6.

Figure 4. Overall procedure for SQI classification of PPG pulses.

Figure 6. The architectures of ResNet-50 and VGG-19.

Figs 6 and 7 are very difficult to interpret, refer to my previous comment. Even if it has a legend, the caption should explain the figure COMPLETELY. What do you mean by 1,2,3? What are the x? The value of the red line changes between the last two “1”, why? Where is the “serious baseline drift” in Fig 7?

ANS: We have added two paragraphs in the text to explain Figures 8 and 9, respectively. Also, the figure captions have been modified to completely explain the figures.

Figure 8 shows the results of SQI classification with the ResNet-50 for the PPG (blue line) and DPPG (orange line) signals moderately corrupted by baseline drift. The SQI level of each pulse has been determined according to the error percentage between the reference SV measured by the medis® CS 2000 and the SV measured by our ICG device. An error percentage below 18%, between 18% and 20%, or above 20% represents a high-quality, middle-quality, or low-quality PPG pulse, respectively. The first and third rows of the data are the SVs, and the second and fourth rows are the LVETs, measured by the medis® CS 2000 and our ICG device, respectively. The fifth row of the data denotes the error percentages of SV. The red line represents the output value of the ResNet-50. If a value is larger than 0.8, between 0.5 and 0.8, or less than 0.5, the PPG pulse is classified as a high-, middle-, or low-quality one, respectively. The cross and circle symbols denote the first zero-crossing point and the minimum-value point of the DPPG pulse, respectively. The 7th PPG pulse in the figure is a high-quality pulse because it has a sharp valley in the starting ejection zone and a clear dicrotic notch. Thus, its corresponding SV error percentage is relatively low, 0.02, and the output of the ResNet-50 for this pulse is 1.0. In addition, the 2nd and 3rd pulses are both low-SQI pulses: although they have clear dicrotic notches, they have a flat shape in the starting ejection zone. Since their LVET errors are 80 and 97 ms, their corresponding SV error percentages are 0.49 and 0.42, respectively. Thus, both output values of the ResNet-50 for these two pulses are 0. The fifth pulse is a middle-SQI pulse because it does not have a sharp valley in the starting ejection zone. Thus, its SV error percentage is 0.2 and the output value of the ResNet-50 for this pulse is 0.6.

Figure 8. The results of SQI classification with the ResNet-50 for the PPG (blue line) and DPPG (orange line) signals moderately corrupted by the baseline drift. The first and third rows of the data are the SVs with medis® CS 2000 and our ICG device, respectively, while the second and fourth rows are the LVETs with medis® CS 2000 and our ICG device, respectively. The fifth row denotes the error percentages of SV. The red line is the output value of the ResNet-50. The cross and circle symbols represent the first zero-crossing point and minimum-value point of DPPG pulse, respectively.

Figure 9. The results of SQI classification with the ResNet-50 for the PPG (blue line) and DPPG (orange line) signals corrupted by a serious baseline drift. The first and third rows of the data are the SVs with medis® CS 2000 and our ICG device, respectively, while the second and fourth rows are the LVETs with medis® CS 2000 and our ICG device, respectively. The fifth row represents the error percentages of SV. The red line denotes the output value of the ResNet-50. The cross and circle symbols represent the first zero-crossing point and minimum-value point of DPPG pulse, respectively.
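The labelling and decision rules described for Figure 8 can be written out as a short sketch. The thresholds (18% and 20% for the SV error, 0.5 and 0.8 for the network output) are those stated in the text; how values exactly on a boundary are handled is an assumption made here, chosen so that the pulse with an SV error of 0.2 falls in the middle class as in the example. The function names are hypothetical.

```python
def label_from_sv_error(error_fraction):
    """Reference SQI label from the SV error (as a fraction, e.g. 0.2 for 20%).

    Assumed boundary handling: exactly 18% and exactly 20% count as middle.
    """
    if error_fraction < 0.18:
        return "high"
    if error_fraction <= 0.20:
        return "middle"
    return "low"


def class_from_network_output(output_value):
    """Predicted SQI class from the network output value in [0, 1].

    Assumed boundary handling: exactly 0.5 and exactly 0.8 count as middle.
    """
    if output_value > 0.8:
        return "high"
    if output_value >= 0.5:
        return "middle"
    return "low"


# Pulses discussed for Figure 8: (SV error, network output).
pulses = {"7th": (0.02, 1.0), "2nd": (0.49, 0.0), "3rd": (0.42, 0.0), "5th": (0.2, 0.6)}
for name, (err, out) in pulses.items():
    print(name, label_from_sv_error(err), class_from_network_output(out))
```

For the four pulses discussed in the text, the reference label and the predicted class agree under these rules.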

Finally, since this manuscript states the need of AI based methods against several rule-based algorithms it would be important to compare against one such method.

ANS: We have added some paragraphs to the Discussion section to compare the DCNN method with the rule-based method.

A classification approach using the DCNN does not need predetermined characteristics or features and makes use of the full information embedded in the PPG pulse by taking advantage of a deep learning process [28,29]. In our previous study, we proposed a rule-based method combined with a fuzzy neural network to determine the SQIs of PPG pulses [12]. In order to increase the tolerance of the rule-based method, a PPG pulse with an SV error percentage of less than 40% was considered to be of high quality. In the test data, the statistical error of PPG pulses classified as high quality was 6.4 ± 12.8 ml. However, the accuracies for determining high- and low-quality pulses reached only 0.83 and 0.86, respectively. In the present work, we label a PPG pulse as high quality when its error percentage is less than 20%. In the test data, the statistical error of pulses classified as high quality with the proposed ResNet-50 is 2.6 ± 14.2 ml. The accuracies for classifying high- and low-quality pulses are 0.91 and 0.94, respectively. Since the performance of the proposed 2-D DCNN approach for SQI classification seems to be better than that of the rule-based method, the DCNN method may be applied to increase the measurement accuracy of SV.

Moreover, when the PPG signals are corrupted by serious baseline drift, the rule-based method requires these PPG pulses to be removed by some algorithm before their SQIs are classified. In this study, the proposed 2-D DCNN approaches (VGG-19 or ResNet-50) can make use of the morphologies of the PPG and DPPG waveforms to determine their SQIs. The PPG and DPPG signals are first transformed to an image, as shown in Figure 5, before we use them to perform the classification task. As shown in Figure 5(c), the image is constructed from PPG and DPPG pulses without the fundamental morphology of a traditional PPG waveform, but it can still be correctly classified as a low-quality one by the proposed ResNet-50 (Figure 9). This suggests that the proposed 2-D DCNN approaches may be useful for quality classification of PPG pulses, even for those seriously corrupted by motion artifacts and power line interference.

Author Response File: Author Response.pdf

Reviewer 3 Report

The paper presents a good contribution; however, the results are not compared with previous methods. I recommend you compare your results with previous related papers.
The text in your figures is difficult to read. I suggest you improve the quality of your figures.
There are some grammatical errors; please review your manuscript.

Author Response

To Reviewer #3:

We thank the reviewer for his/her valuable comments, which have improved this manuscript. The changes in the revised manuscript are marked in red.

Comments and Suggestions for Authors

The paper presents a good contribution; however, the results are not compared with previous methods. I recommend you compare your results with previous related papers.

ANS: We have added some paragraphs to the Discussion section to compare the present results with previous related papers.

The text in your figures is difficult to read. I suggest you improve the quality of your figures.

ANS: We have replotted several figures at higher quality, such as Figures 3 to 6. Please see them in the revised manuscript.

There are some grammatical errors; please review your manuscript.

ANS: The authors have completely corrected the grammatical errors in the original manuscript.

Author Response File: Author Response.pdf
