Machine Learning Electrocardiogram for Mobile Cardiac Pattern Extraction

Zhang, Qingxue; Zhou, Dian

doi:10.3390/s23125723


by Qingxue Zhang ¹,* and Dian Zhou ²

¹ Department of Electrical and Computer Engineering, Department of Biomedical Engineering, Purdue School of Engineering and Technology, 723 W. Michigan St., Indianapolis, IN 46202, USA

² Department of Electrical and Computer Engineering, University of Texas at Dallas, 800 W Campbell Rd, Richardson, TX 75080, USA

* Author to whom correspondence should be addressed.

Sensors 2023, 23(12), 5723; https://doi.org/10.3390/s23125723

Submission received: 20 April 2023 / Revised: 26 May 2023 / Accepted: 30 May 2023 / Published: 19 June 2023

(This article belongs to the Special Issue Artificial Intelligence and Sensors II)


Abstract

Background: Internet-of-things technologies are reshaping healthcare applications. We take a special interest in long-term, out-of-clinic, electrocardiogram (ECG)-based heart health management and propose a machine learning framework to extract crucial patterns from noisy mobile ECG signals. Methods: A three-stage hybrid machine learning framework is proposed for estimating the heart-disease-related ECG QRS duration. First, raw heartbeats are recognized from the mobile ECG using a support vector machine (SVM). Then, the QRS boundaries are located using a novel pattern recognition approach, multiview dynamic time warping (MV-DTW). To enhance robustness against motion artifacts in the signal, the MV-DTW path distance is also used to quantize heartbeat-specific distortion conditions. Finally, a regression model is trained to transform the mobile ECG QRS duration into the commonly used standard chest ECG QRS duration. Results: With the proposed framework, the performance of ECG QRS duration estimation is very encouraging: compared with the traditional chest ECG-based measurements, the correlation coefficient, mean error ± standard deviation, mean absolute error, and root mean absolute error are 91.2%, 0.4 ± 2.6 ms, 1.7 ms, and 2.6 ms, respectively. Conclusions: Promising experimental results are demonstrated to indicate the effectiveness of the framework. This study will greatly advance machine-learning-enabled ECG data mining towards smart medical decision support.

Keywords: smart health; ECG; machine learning; pattern recognition; medical decision support

1. Introduction

Heart failure tops the list as the leading cause of death globally [1]. To manage heart diseases, for many decades, electrocardiograms (ECGs) have been used as a gold-standard vital signal, which intrinsically encode complex cardiac physiological processes [2]. However, traditional in-clinic/hospital ECG-based heart health management is usually expensive and inconvenient, which is a major impediment either for real-time cardiac emergency prediction or for long-term chronic heart disease tracking.

Nowadays, advances in internet-of-things technologies, such as mobile data capturing and signal processing/machine learning techniques [3,4], are enabling more and more out-of-clinic daily health applications [5,6]. Many promising practices in health management, COVID management, mobile health, and data-driven precision medicine have been advanced [7,8]. Many studies have also been reported on ECG-based heart health monitoring. A chest-worn ECG device is the most popular solution [9,10], but it usually suffers from discomfort and inconvenience due to the chest strap and sweating. Other studies place the ECG electrodes on both wrists or both index fingers for ECG acquisition [11,12], which usually causes inconvenience and discontinuity in monitoring.

Generally, it is highly desirable to explore convenient and comfortable health management systems to enable long-term, continuous, and real-time precision medicine [1,6,8,13,14,15,16,17,18,19,20,21]. We have previously reported a highly wearable ear-worn blood pressure monitoring system, in which the mobile ECG is used to determine the heartbeat occurrence time [22]. However, no further study has been carried out to explore other potential uses of the highly subtle mobile ECG, which is acquired using a nonstandard but unobtrusive lead configuration that can provide superior convenience and comfort. Such a configuration greatly facilitates long-term use because it does not require the strict placement of electrodes on the two arms or on the chest under clothing. However, it also makes signal analysis very challenging, because the acquired signals are sensitive to noise and/or interference. We name this kind of nonstandard and highly convenient lead configuration “mobile ECG” to distinguish it from traditional limb- or chest-based configurations. Mobile ECG is very suitable for long-term use compared with the configurations in many other studies, which usually use the interlimb ECG that requires a wire or the touch of a finger, or the chest ECG that may need a chest strap [1,6,8,13,14,15,16,17,18,19,20,21]. In this study, we take special interest in whether the mobile ECG can be used to track the duration of the QRS complex, which is the central part of an ECG heartbeat. QRS duration carries a great deal of medical information and has been reported to be relevant to many heart diseases, such as a coronary disease that may cause sudden death [23], a right ventricular disease that reduces blood volume [24], and many other diseases [25,26,27,28,29].

Specifically, in this study, we propose a convenient mobile-ECG-based cardiac health management system for long-term real-time ECG QRS duration tracking. To the best of our knowledge, it is the first study on mobile-ECG-based QRS duration estimation for heart health management. The proposed system is empowered by a novel machine learning framework. With this challenging mobile ECG lead configuration, the signal acquired is very tiny and sensitive to motion artifacts; therefore, it is processed by the proposed sophisticated three-stage machine learning framework (Figure 1). Firstly, a support vector machine (SVM) [30] is introduced to identify raw heartbeats from the subtle mobile ECG. Afterwards, a multiview dynamic time warping (DTW) approach [11] is developed, not only to locate the raw QRS complex in each raw heartbeat, but also to quantize the quality of that heartbeat, referring to a predefined high-quality ECG heartbeat template that is learned using a K-medoid clustering method [31]. The raw heartbeat quality information is used to purify the raw QRS complexes by comparing with a quality threshold learned using a histogram triangle method [32]. Finally, the mobile ECG QRS duration is estimated and then a regression-based calibration model is learned to transform the mobile ECG QRS duration estimate to the commonly used standard chest ECG QRS duration. Promising experimental results are demonstrated to indicate the effectiveness of the proposed framework. This study will greatly advance machine-learning-enabled ECG data mining towards smart medical decision support.

Our contributions are summarized below:

(1): The novel machine learning framework can intelligently and systematically identify the heartbeats from a noisy and subtle mobile ECG, localize the ECG QRS complexes, purify the complexes, and transform the mobile ECG QRS durations to the standard chest ECG QRS durations.
(2): The support vector machine classifier determines the raw heartbeats from the signal spikes, which include both real and false heartbeats that are due to the motion artifacts.
(3): The ECG QRS localization step leverages the multiview dynamic time warping for sophisticated pattern matching, to compare a given raw heartbeat with the high-quality heartbeat template determined using the k-medoid clustering method.
(4): The purification step further leverages the pattern matching scores to generate the signal quality indices and boost the performance.
(5): The transformation of mobile ECG QRS durations to the commonly used standard chest ECG QRS durations facilitates the convenient usage of the extracted cardiac pattern.

We will then detail the proposed novel machine learning algorithm, give the results and discussions, and conclude this study.

2. Materials and Methods

In this section, details of the proposed system are introduced according to the signal processing flow (Figure 1).

2.1. System Overview

The proposed system is shown in Figure 1, illustrating a three-stage framework leveraging advanced machine learning techniques for QRS duration estimation.

2.2. Stage I: ECG Heartbeat Identification

The filtered mobile ECG was fed to an SVM classifier [30] for raw heartbeat identification, as shown in Figure 1. Firstly, the ECG stream was segmented using an adaptive threshold approach to select signal spikes as heartbeat candidates. Afterwards, ten critical motion-artifact-tolerant features were extracted [22] and fed into the SVM classifier to learn a heartbeat identification model. A supervised learning strategy was used to train the SVM classifier, with the chest ECG heartbeats as the ground truth to differentiate real and false mobile ECG heartbeats in the learning process. The trained SVM decision model is given in (1), where $x_i$, $y_i$, and $\alpha_i$ are the $i$-th support vector, its class label, and its learned weight factor, respectively; $V$ is the number of support vectors; $b$ is the learned bias; $k$ is the kernel (a linear one is chosen to lower the computation load); $x$ is a ten-dimensional feature vector of a heartbeat candidate; and $f(x)$ is the predicted label (a raw heartbeat or a false heartbeat) [33].

$$f(x) = \mathrm{sign}\left(\sum_{i=1}^{V} \alpha_i y_i \cdot k(x, x_i) + b\right) \tag{1}$$
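The decision rule in (1) can be illustrated with a minimal sketch. The two-dimensional support vectors, labels, weights, and bias below are hypothetical toy values (the paper uses ten-dimensional features and learned parameters), and mapping a score of exactly zero to +1 is an assumption made here for definiteness.

```python
import numpy as np

def linear_kernel(x, x_i):
    """Linear kernel k(x, x_i) = <x, x_i>."""
    return float(np.dot(x, x_i))

def svm_decide(x, support_vectors, labels, alphas, bias):
    """Evaluate Eq. (1): sign(sum_i alpha_i * y_i * k(x, x_i) + b)."""
    score = bias
    for sv, y, alpha in zip(support_vectors, labels, alphas):
        score += alpha * y * linear_kernel(x, sv)
    # A zero score is mapped to +1 here (assumption, not from the paper).
    return 1 if score >= 0 else -1

# Hypothetical toy model: +1 = raw heartbeat, -1 = false heartbeat.
support_vectors = np.array([[1.0, 1.0], [-1.0, -1.0]])
labels = np.array([1, -1])
alphas = np.array([0.5, 0.5])
bias = 0.0

label_a = svm_decide(np.array([2.0, 1.0]), support_vectors, labels, alphas, bias)
label_b = svm_decide(np.array([-1.0, -2.0]), support_vectors, labels, alphas, bias)
```

With a linear kernel the sum collapses to a single weight vector, which is one reason a linear kernel keeps the computation load low on a mobile device.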

2.3. Stage II: QRS Localization and then Purification

After identifying raw heartbeats, we developed a pattern recognition approach, multiview dynamic time warping (DTW) [11], not only to locate raw QRS complexes, but also to quantize the raw heartbeat quality for purification purposes (due to motion artifacts, the raw heartbeats include many distorted heartbeats, as will be shown in Section 3).

Before the following processing steps, all heartbeats were segmented based on the heartbeat locations identified using the SVM and were scaled to be between 0 and 1. It is worth noting that these heartbeat segments are bounded by two adjacent R peaks, meaning that an ECG heartbeat segment mentioned in Stages II and III of our framework actually includes the second half of one heartbeat and the first half of the following heartbeat. This segmentation method facilitates the determination of appropriate boundaries for highly subtle and noisy mobile ECG heartbeats by leveraging the most distinguishable R peaks as the natural heartbeat boundaries.
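As a rough sketch of this segmentation step, the function below cuts the stream between adjacent R peaks, scales each segment to [0, 1], and resamples it to a common length. The function name, the target length of 100 samples, and the use of linear interpolation are assumptions for illustration; the paper does not specify them.

```python
import numpy as np

def segment_beats(ecg, r_peaks, length=100):
    """Cut R-peak-to-R-peak segments, min-max scale them, resample to `length`."""
    beats = []
    for start, end in zip(r_peaks[:-1], r_peaks[1:]):
        seg = np.asarray(ecg[start:end + 1], dtype=float)
        span = seg.max() - seg.min()
        # Scale to [0, 1]; a flat segment is mapped to all zeros.
        seg = (seg - seg.min()) / span if span > 0 else np.zeros_like(seg)
        old_t = np.linspace(0.0, 1.0, len(seg))
        new_t = np.linspace(0.0, 1.0, length)
        beats.append(np.interp(new_t, old_t, seg))
    return np.array(beats)

# Hypothetical toy stream: unit spikes at the R-peak positions.
ecg = np.zeros(150)
r_peaks = [10, 60, 130]
ecg[r_peaks] = 1.0
beats = segment_beats(ecg, r_peaks)
```

Each row of `beats` starts and ends on an R peak, so it spans the second half of one heartbeat and the first half of the next, as described above.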

2.3.1. Representative Heartbeat Template Learning by K-Medoid Clustering

DTW performed nonlinear sequence-to-template matching to determine the relation between a testing signal stream and a predefined template signal stream. To select a high-quality representative heartbeat as the template, we applied a k-medoid clustering method [31]. This approach leverages an unsupervised learning strategy to avoid manual template selection that is both inconvenient and subject to non-optimal selection.

Specifically, for each subject, we clustered all raw ECG heartbeats in the training session into K groups and selected the medoid as the template from the group that had the highest number of instances. K was set to 3, and the Euclidean distance was selected as the distance metric to lower the computation load (all heartbeats were resampled to possess the same length). Firstly, the initial medoid seeds were chosen using a K-means++ method [34] as in (2)–(3), where the $j$-th seed $c_j$ was selected from a set of resampled raw heartbeats $\hat{\Theta}$ with probability $w_{\hat{\theta}_j}$, which is proportional to the Euclidean distance between $\hat{\theta}_j$ and its closest preselected medoid $c_p$. $D_p$ is the set of all raw heartbeats closest to medoid $c_p$. Afterwards, the K-medoid problem was solved by a partitioning around medoids (PAM) strategy [35], which greedily checks whether swapping each medoid $c_j$ and each nonmedoid $c$ reduces the summarized instance-to-medoid dissimilarity $\xi$ defined in (4), until no progress can be obtained. Finally, a high-quality heartbeat representing most of the raw heartbeats was learned and selected as the DTW template as in (5). This learning process was performed for each subject, using the training data in the training phase.

$$c_j = \mathrm{Select}\left(\hat{\theta}_j \mid w_{\hat{\theta}_j},\ \forall \hat{\theta}_j \in \hat{\Theta}\right) \tag{2}$$

$$w_{\hat{\theta}_j} = \frac{\mathrm{Euclidean}(\hat{\theta}_j, c_p)}{\sum_{\{h \mid \hat{\theta}_h \in D_p\}} \mathrm{Euclidean}(\hat{\theta}_h, c_p)},\quad \forall \hat{\theta}_j \in \hat{\Theta},\ p < j \tag{3}$$

$$\xi = \sum_{\{c_p \mid p = 1, \dots, K\}} \ \sum_{\{h \mid \hat{\theta}_h \in D_p\}} \mathrm{Euclidean}(\hat{\theta}_h, c_p) \tag{4}$$

$$T = \underset{\forall c_j}{\mathrm{argmax}}\ \mathrm{Num}(D_j) \tag{5}$$
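The template learning procedure can be sketched as follows. This is a simplified illustration, not the authors' implementation: the seeding draws each new medoid with probability proportional to its Euclidean distance to the closest existing medoid, normalized over all heartbeats, the swap search accepts any improving swap, and the synthetic eight-sample "heartbeats" are hypothetical. Distinct heartbeats are assumed so that the seeding probabilities are well defined.

```python
import numpy as np

def pairwise_euclidean(X):
    """Matrix of Euclidean distances between all rows of X."""
    diff = X[:, None, :] - X[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=-1))

def seed_medoids(dist, k, rng):
    """K-means++-style seeding: far-away points are more likely to be picked."""
    n = dist.shape[0]
    medoids = [int(rng.integers(n))]
    while len(medoids) < k:
        closest = dist[:, medoids].min(axis=1)  # zero for existing medoids
        medoids.append(int(rng.choice(n, p=closest / closest.sum())))
    return medoids

def total_cost(dist, medoids):
    """Summarized instance-to-medoid dissimilarity, as in Eq. (4)."""
    return dist[:, medoids].min(axis=1).sum()

def pam(dist, k, seed=0):
    """Greedy medoid/non-medoid swaps until the cost stops decreasing."""
    rng = np.random.default_rng(seed)
    medoids = seed_medoids(dist, k, rng)
    best = total_cost(dist, medoids)
    improved = True
    while improved:
        improved = False
        for pos in range(k):
            for cand in range(dist.shape[0]):
                if cand in medoids:
                    continue
                trial = medoids.copy()
                trial[pos] = cand
                cost = total_cost(dist, trial)
                if cost < best:
                    medoids, best, improved = trial, cost, True
    labels = dist[:, medoids].argmin(axis=1)
    return medoids, labels

def representative_template(beats, k=3, seed=0):
    """Medoid of the most populated cluster, as in Eq. (5)."""
    medoids, labels = pam(pairwise_euclidean(beats), k, seed)
    counts = np.bincount(labels, minlength=k)
    return beats[medoids[int(counts.argmax())]]

# Hypothetical data: one large clean cluster and two small outlier clusters.
rng = np.random.default_rng(1)
beats = np.vstack([rng.normal(0.0, 0.1, (20, 8)),
                   rng.normal(5.0, 0.1, (5, 8)),
                   rng.normal(-5.0, 0.1, (3, 8))])
template = representative_template(beats, k=3)
```

Because the template is a medoid, it is always one of the recorded heartbeats rather than an averaged waveform, which keeps the template morphology physically plausible.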

2.3.2. QRS Localization by Multiview Dynamic Time Warping

To locate the QRS complex, the multiview DTW (MV-DTW) [11] was developed to nonlinearly match each raw heartbeat with the learned representative heartbeat. To reveal more signal characteristics, in addition to the original ECG amplitude series, another two dimensions (the first derivative series and the local angle series) were also extracted, constructing a three-view heartbeat representation which is more robust to motion artifacts, compared to the original single-view time series. To generate the angle dimension, for each sample, we calculated an angle defined by the current sample and its two neighbors (preceding and following the current sample) with a distance of 10 samples to capture the piece-wise fluctuations.
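A minimal sketch of the three-view representation is given below. The paper does not spell out how the angle is computed, so the geometry used here is an assumption: time is normalized to [0, 1], and the angle at each sample is measured between the vectors pointing to its neighbors 10 samples before and after. Neighbors beyond the segment ends are clamped to the first or last sample, which is also an assumption.

```python
import numpy as np

def three_view(beat, gap=10):
    """Return an (n, 3) array: amplitude, first derivative, local angle."""
    x = np.asarray(beat, dtype=float)
    n = len(x)
    idx = np.arange(n)
    t = idx / (n - 1)                       # normalized time axis (assumption)
    left = np.clip(idx - gap, 0, n - 1)     # clamp neighbors at the edges
    right = np.clip(idx + gap, 0, n - 1)
    derivative = np.gradient(x)
    v_left = np.stack([t[left] - t, x[left] - x], axis=1)
    v_right = np.stack([t[right] - t, x[right] - x], axis=1)
    dot = (v_left * v_right).sum(axis=1)
    norms = np.linalg.norm(v_left, axis=1) * np.linalg.norm(v_right, axis=1)
    cosine = dot / (norms + 1e-12)          # small constant avoids division by zero
    angle = np.arccos(np.clip(cosine, -1.0, 1.0))
    return np.stack([x, derivative, angle], axis=1)

# A straight line has an angle of about pi at every interior sample.
line = np.linspace(0.0, 1.0, 50)
views = three_view(line)
```

Sharp deflections such as the R peak yield small angles, whereas flat or slowly drifting stretches yield angles close to pi, so this view emphasizes piece-wise shape rather than absolute amplitude.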

The testing heartbeat $\theta_j$ of length $M$ and the template $T$ of length $N$ (both have three dimensions) are given in (6)–(9), where $j$ is the heartbeat index and $l$ is the view index. MV-DTW-based nonlinear stream matching includes three steps. Firstly, a local distance matrix is generated to evaluate all sample-to-sample distance possibilities, with each element $d_j^{m,n}$ defined as in (10). Secondly, a path distance table is constructed by a dynamic programming strategy as in (11), where each element $D_j^{m,n}$ is the summation of the current local distance $d_j^{m,n}$ and the minimum of the three preceding neighboring path distance elements. Thirdly, the QRS onset $QRS_j^{on}$ and offset $QRS_j^{off}$ in the testing heartbeat $\theta_j$ are located using a backward search method as shown in (12) and (13), referring not only to the QRS boundaries ($QRS_{Temp}^{on}$ and $QRS_{Temp}^{off}$) in the template $T$, but also to the optimal warping path information learned when generating the path distance table $D_j$. Leveraging MV-DTW and the learned high-quality heartbeat template, the QRS boundaries are expected to be automatically located from the very weak and noisy mobile ECG heartbeats.

$$\theta_j = \{\theta_{j,1}, \theta_{j,2}, \theta_{j,3}\} \tag{6}$$

$$T = \{T_1, T_2, T_3\} \tag{7}$$

$$\theta_{j,l} = \{\theta_{j,l}^{m} \mid 0 \leq m \leq M - 1\},\quad \forall l \tag{8}$$

$$T_l = \{T_l^{n} \mid 0 \leq n \leq N - 1\},\quad \forall l \tag{9}$$

$$d_j^{m,n} = \sqrt{\sum_{l=1}^{L} \left(\theta_{j,l}^{m} - T_l^{n}\right)^2},\quad \forall m,\ \forall n,\ L = 3 \tag{10}$$

$$D_j^{m,n} = \begin{cases} d_j^{m,n} + \min\left\{D_j^{m-1,n},\ D_j^{m-1,n-1},\ D_j^{m,n-1}\right\} & \forall m > 0,\ \forall n > 0 \\ d_j^{m,n} & m = 0,\ n = 0 \\ \inf & \text{otherwise} \end{cases} \tag{11}$$

$$QRS_j^{on} = \mathrm{BackSearch}\left(QRS_{Temp}^{on}, D_j\right) \tag{12}$$

$$QRS_j^{off} = \mathrm{BackSearch}\left(QRS_{Temp}^{off}, D_j\right) \tag{13}$$
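The three MV-DTW steps can be sketched as below. The recursion assumes the three preceding neighbors are (m−1, n), (m−1, n−1), and (m, n−1), as in standard DTW, and keeps the infinite boundary cells of (11). The backward search is not detailed in the text, so `map_boundary`, which returns the first test sample aligned with a given template index, is a hypothetical stand-in, and the random three-view template is toy data.

```python
import numpy as np

def path_distance_table(beat, template):
    """beat: (M, L) array, template: (N, L) array; returns the table D of Eq. (11)."""
    diff = beat[:, None, :] - template[None, :, :]
    d = np.sqrt((diff ** 2).sum(axis=-1))   # multiview local distances, Eq. (10)
    M, N = d.shape
    D = np.full((M, N), np.inf)             # boundary cells stay infinite
    D[0, 0] = d[0, 0]
    for m in range(1, M):
        for n in range(1, N):
            D[m, n] = d[m, n] + min(D[m - 1, n], D[m - 1, n - 1], D[m, n - 1])
    return D

def warping_path(D):
    """Trace the optimal path backward from the last cell to (0, 0)."""
    m, n = D.shape[0] - 1, D.shape[1] - 1
    path = [(m, n)]
    while (m, n) != (0, 0):
        steps = [(m - 1, n - 1), (m - 1, n), (m, n - 1)]
        steps = [(i, j) for i, j in steps if i >= 0 and j >= 0]
        m, n = min(steps, key=lambda s: D[s])   # ties favor the diagonal step
        path.append((m, n))
    return path[::-1]

def map_boundary(path, template_index):
    """First test-beat sample aligned with a template sample (assumed rule)."""
    return next(m for m, n in path if n == template_index)

# Toy check: matching a template against itself gives a diagonal path.
rng = np.random.default_rng(0)
template = rng.normal(size=(30, 3))
D = path_distance_table(template, template)
path = warping_path(D)
onset = map_boundary(path, 12)
distortion = D[-1, -1]   # stream-to-stream path distance
```

The last cell of the table is the accumulated stream-to-stream distance, so the same pass that locates the boundaries also yields a per-heartbeat distortion score at no extra cost.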

2.3.3. Heartbeat Distortion Quantization

The mobile ECG is very weak and sensitive to motion artifacts, resulting from the highly challenging nonstandard single-lead configuration (more visualization will be shown in the Results section to illustrate distorted QRS complexes). Therefore, to further enhance the robustness, we propose quantizing the quality of raw heartbeats identified by the SVM classifier and purifying the raw QRS complexes located by the MV-DTW. The raw heartbeat distortion $\pi_j$ is defined by (14), which corresponds to the stream-to-stream path distance calculated by the MV-DTW. Equation (14) is a special case of (11), obtained by evaluating $D_j^{m,n}$ at the last sample of each stream, i.e., $m = M - 1$ and $n = N - 1$ under the indexing of (8) and (9). A large distance results in a high distortion value, and vice versa.

$$\pi_j = D_j^{M-1,\,N-1} \tag{14}$$

2.3.4. Distortion Threshold Learning by Histogram Triangle Search

To perform binary quality labelling (high- or low-quality) for raw heartbeats, a distortion threshold is learned for each subject (using the training data) from raw heartbeat distortion values with a histogram-triangle search method [32]. The histogram of heartbeat distortion values is strongly asymmetric, with its mass concentrated on the left and a long tail to the right: high-quality heartbeats usually concentrate in the left part of the histogram (low distortion values), whereas low-quality raw heartbeats (distorted and false heartbeats) usually spread over a large range due to the highly diverse behaviors caused by motion artifacts.

A global search method was applied to find the threshold $\tau$ that corresponds to a transition point in the histogram, which possesses the maximum perpendicular distance to the histogram hypotenuse as defined in (15), where $b$ is a bin in the histogram, $h(b)$ is a function returning the density value for bin $b$, and $\mathrm{Dis}\{\cdot\}$ is a function returning the distance from a point $(b, h(b))$ to the hypotenuse that is drawn between the maximum point $(b_{max}, h(b_{max}))$ and the rightmost point $(b_{right}, h(b_{right}))$. The learned distortion threshold reflects the natural histogram transition point between high- and low-quality raw heartbeats. An unsupervised learning strategy was used here since manual threshold selection is both inconvenient and subject to nonoptimal selection.

$$\tau = \underset{b_{max} \leq b \leq b_{right}}{\mathrm{argmax}}\ \mathrm{Dis}\left\{(b, h(b)),\ h(b_{max}) \sim h(b_{right})\right\} \tag{15}$$
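A minimal sketch of the triangle search in (15) follows. The number of bins, the use of bin indices as the horizontal coordinate, and returning the bin center as the threshold are assumptions made for illustration; the synthetic distortion values are hypothetical.

```python
import numpy as np

def triangle_threshold(values, bins=50):
    """Return the distortion value at the histogram transition point."""
    hist, edges = np.histogram(values, bins=bins, density=True)
    centers = 0.5 * (edges[:-1] + edges[1:])
    b_max = int(hist.argmax())          # histogram peak
    b_right = len(hist) - 1             # rightmost bin
    if b_max == b_right:
        return float(centers[b_max])    # degenerate case: no right tail
    x1, y1 = b_max, hist[b_max]
    x2, y2 = b_right, hist[b_right]
    b = np.arange(b_max, b_right + 1)
    # Perpendicular distance from (b, h(b)) to the line through the two end points.
    numerator = np.abs((y2 - y1) * b - (x2 - x1) * hist[b] + x2 * y1 - y2 * x1)
    distance = numerator / np.hypot(y2 - y1, x2 - x1)
    return float(centers[b[distance.argmax()]])

# Hypothetical distortions: many clean beats near 1, a few scattered up to 10.
rng = np.random.default_rng(0)
clean = rng.normal(1.0, 0.1, 2000)
distorted = rng.uniform(1.0, 10.0, 200)
tau = triangle_threshold(np.concatenate([clean, distorted]))
```

Because the search needs only the histogram itself, the threshold adapts to each subject's distortion distribution without any labelled low-quality heartbeats.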

2.3.5. QRS Purification

Leveraging the quantized raw heartbeat distortion conditions and the learned distortion threshold, we could then generate raw heartbeat-specific signal quality indices (SQIs) to filter out low-quality raw QRS complexes. To further enhance the robustness, besides the calculated distortion sequence $\pi_j$, a smoothed version $\eta_j$ was also generated as in (16) by a moving average operation with an order of 10 ($A = 10$). Both sequences were compared with the distortion threshold, and the index $SQI_j$ for the $j$-th raw heartbeat was finally defined by (17).

$$\eta_j = \sum_{a=0}^{A-1} \pi_{j-a} / A,\quad \forall j \tag{16}$$

$$SQI_j = \begin{cases} 1 & \text{if } \pi_j \leq \tau \text{ and } \eta_j \leq \tau \\ 0 & \text{otherwise} \end{cases},\quad \forall j \tag{17}$$
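Equations (16) and (17) can be sketched as below. The handling of the first few heartbeats, where fewer than $A$ past values exist, is not specified in the text; averaging over the available values is an assumption. The distortion values and threshold in the example are hypothetical, and a shorter order is used only to keep the example small.

```python
import numpy as np

def signal_quality_indices(distortion, tau, order=10):
    """Return (SQI, eta): binary quality labels and the smoothed distortion."""
    pi = np.asarray(distortion, dtype=float)
    eta = np.empty_like(pi)
    for j in range(len(pi)):
        # Causal moving average over the current and previous heartbeats;
        # the window is shorter at the start of the sequence (assumption).
        eta[j] = pi[max(0, j - order + 1): j + 1].mean()
    sqi = ((pi <= tau) & (eta <= tau)).astype(int)
    return sqi, eta

# One badly distorted heartbeat (9.0) among clean ones, threshold tau = 2.
sqi, eta = signal_quality_indices([1.0, 1.0, 1.0, 9.0, 1.0, 1.0], tau=2.0, order=3)
# sqi is [1, 1, 1, 0, 0, 0]
```

The smoothed sequence makes the purification deliberately strict: heartbeats that follow a severe artifact are rejected as well, even if their own distortion is low.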

2.4. Stage III: QRS Duration Calibration

2.4.1. QRS Duration Estimation

After purifying the raw ECG QRS complexes, we could obtain heartbeat-specific QRS duration estimates based on their boundaries. To improve estimation accuracy, we further averaged the estimates over the one-minute segment of interest in each trial. Therefore, we had fifteen averaged mobile ECG QRS estimates in each of the training and testing sessions for each subject.

2.4.2. Mobile QRS Duration to Chest QRS Duration Calibration

Considering that QRS durations estimated using different lead configurations (such as lead II-wrists and lead IV-chest) may also be different [36], a bias parameter was learned for each subject, using the training data, to transform the mobile ECG QRS duration estimates into the chest-ECG-based ones. The bias parameter was calculated as the difference between the average QRS duration with the chest ECG and the average QRS duration with the mobile ECG, and this difference was used as an adjusting factor for the QRS duration with the mobile ECG. More complex calibration models (nonlinear or higher-order) were not considered, not only to prevent an overfitted model but also to fairly reflect how well mobile-ECG-based estimates can mimic chest-ECG-based ones.
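The bias calibration can be sketched in a few lines. The function names and the QRS durations (in ms) are hypothetical; the sketch assumes the bias is added to each mobile ECG estimate.

```python
import numpy as np

def learn_bias(mobile_train, chest_train):
    """Per-subject bias: mean chest QRS duration minus mean mobile QRS duration."""
    return float(np.mean(chest_train) - np.mean(mobile_train))

def calibrate(mobile_estimates, bias):
    """Shift mobile ECG QRS duration estimates towards the chest ECG scale."""
    return np.asarray(mobile_estimates, dtype=float) + bias

# Hypothetical training averages for one subject, in ms.
bias = learn_bias(mobile_train=[80.0, 82.0, 84.0], chest_train=[90.0, 92.0, 94.0])
calibrated = calibrate([81.0, 83.0], bias)   # bias = 10.0 -> [91.0, 93.0]
```

A single additive offset cannot change the within-subject ordering of the estimates, so any agreement with the chest ECG after calibration still reflects information carried by the mobile ECG itself.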

3. Results and Discussion

In this section, we give detailed experimental results and a discussion according to the signal processing flow in Figure 1. Except for some learning steps, we use all testing data to visualize the results to take into account the generalization ability of the algorithm.

3.1. Experimental Setup

The mobile ECG dataset, with the ECG signal acquired from a very convenient area (the ear [22]), was used. The subtle and motion-artifact-sensitive mobile ECG from eight subjects was preprocessed using a sixth-order Butterworth bandpass filter (2–30 Hz) for baseline wander suppression and power line interference removal. Each subject had two thirty-minute recordings, used for training and testing, respectively. The raw and filtered signals are shown in Figure 2, indicating the low amplitude and the noisy characteristics of the mobile ECG, which pose a great challenge for cardiac pattern extraction.
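The preprocessing filter can be sketched with SciPy as follows. The sampling rate of 250 Hz is an assumption (it is not stated in the text), as is the choice of zero-phase filtering. Note that `scipy.signal.butter` with `order=3` and a bandpass type produces a sixth-order filter, which is one possible reading of the filter order given above. The test signal is synthetic.

```python
import numpy as np
from scipy.signal import butter, sosfiltfilt

def bandpass_ecg(signal, fs, low=2.0, high=30.0, order=3):
    """Butterworth bandpass (2-30 Hz by default), applied forward and backward."""
    sos = butter(order, [low, high], btype="bandpass", fs=fs, output="sos")
    # sosfiltfilt gives zero phase distortion (assumption: offline processing).
    return sosfiltfilt(sos, signal)

fs = 250.0                                   # assumed sampling rate in Hz
t = np.arange(0.0, 10.0, 1.0 / fs)
cardiac = np.sin(2 * np.pi * 10.0 * t)       # in-band component
raw = (cardiac
       + 2.0 * np.sin(2 * np.pi * 0.3 * t)   # baseline wander
       + 0.5 * np.sin(2 * np.pi * 60.0 * t)) # power line interference
filtered = bandpass_ecg(raw, fs)
```

Second-order sections are used instead of transfer-function coefficients because they are numerically better behaved when the lower cut-off is small relative to the sampling rate.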

3.2. Heartbeat Identification

The heartbeat locations are determined in the first stage of the proposed framework and are used to segment the raw heartbeats for the following QRS localization and purification steps. An example of the heartbeat identification results is given in Figure 3, where the red dots correspond to the identified mobile ECG heartbeats, indicating that most of the mobile ECG heartbeats were successfully located (e.g., segment 2 in Figure 3c).

At the same time, we can observe that during some segments (e.g., segment 1 in Figure 3b), there may be many highly distorted heartbeats and some false heartbeats, due to the weakness of the signal, exercise stress, and head movements. Firstly, a nonstandard mobile ECG signal is very weak, and its peak-to-peak voltage is only around 2% to 5% of that of a traditional chest ECG signal. Secondly, it is impractical to ask monitor wearers to stay strictly still, and thus, there are always some motion artifacts. Thirdly, we asked the participants to ride a bike to introduce exercise stress and more variability in the heartbeat morphologies/intervals. Fourthly, to make the system suited to real-world application scenarios, we further deliberately introduced twenty-second head movements in the second minute (the duration used in algorithm evaluation) of each trial. All the above factors usually result in low-quality raw heartbeats (distorted or false ones) and significantly impact the QRS duration estimation. Therefore, we propose purifying the raw heartbeats to improve the robustness of the ear-worn system.

3.3. Representative Heartbeat Learned by Clustering

A representative heartbeat was learned with the k-medoid clustering method, which will be used as a template by the MV-DTW approach, for both QRS location and purification purposes. Figure 4 shows an example of the learning results, where three-dimensional raw heartbeats are grouped into three clusters to differentiate their behaviors. The medoid that represents the highest number of raw heartbeats was selected as the representative heartbeat, i.e., the medoid in cluster 3 in this example. One thing worth noting is that three dimensions of each heartbeat were concatenated and fed into the k-medoid algorithm to more effectively visualize each dimension. Figure 4 also shows that the raw ECG heartbeat actually possesses a very low signal-to-noise ratio and is, thus, highly sensitive to noise and motion artifacts. However, the k-medoid clustering approach successfully learned a three-dimensional representative heartbeat with much better morphologies compared with other raw heartbeats.

3.4. QRS Located by Multiview DTW

MV-DTW was applied to nonlinearly match a heartbeat with the learned representative high-quality heartbeat in order to determine point-to-point relations between them, as shown in Figure 5. The optimal warping path, which encodes the nonlinear matching results, was generated in the path distance table using a dynamic programming strategy. The QRS boundaries (the offset of one QRS complex and the onset of the following QRS complex, in an R-peak-to-R-peak heartbeat) in the test heartbeat were robustly located using a backward search method referring to the boundary locations in the template. This multiview DTW will be shown to be more tolerant to motion artifacts than the traditional single-view DTW in the performance summary below, leveraging amplitude/derivative/angle information, which reveals meaningful point/pair/piece-wise signal characteristics.

3.5. Distortion Quantization and Threshold Learning

MV-DTW also generates a stream-to-stream path distance, which can be used to quantize the quality condition of a raw heartbeat. With these heartbeat-specific distortion values, a threshold is needed to label these heartbeats as high- or low-quality ones. The threshold learning process is shown in Figure 6. The asymmetric histogram of MV-DTW distances (Figure 6a), with its mass on the left and a long tail to the right, reflects the fact that high-quality heartbeats usually concentrate in a low-distortion area, while low-quality ones (distorted or false heartbeats) spread over a larger range due to the diversity induced by motion artifacts. Therefore, the transition point in Figure 6a can be determined using the global search method shown in Figure 6b, and the corresponding DTW distance was used as the distortion threshold.

3.6. Heartbeat Quality Labelling and Purification

Based on the heartbeat distortion values and the learned threshold, the heartbeat-specific SQIs were generated as shown in Figure 7, where (a), (b), and (c) correspond to a one-minute mobile ECG segment, heartbeat-specific distortion values, and SQIs, respectively. The time-varying heartbeat-specific quality conditions can be effectively reflected by the distortion values generated by MV-DTW, which were used to generate the SQIs for heartbeat purification. In this way, the ECG segments with severe motion artifacts (e.g., segment 1 in Figure 7d) can be filtered out and the high-quality segments (e.g., segment 2 in Figure 7e) are retained. It is worth noting that the purification is relatively strict in filtering out suspicious raw heartbeats and may leave far fewer remaining heartbeats. The sacrifice of heartbeats with moderate-quality conditions is worthwhile, because the impact of low-quality heartbeats can be suppressed to a large degree, which will be demonstrated in the performance summary below.

3.7. Mobile QRS to Chest QRS Calibration

Due to the QRS duration difference between different lead configurations mentioned before, we introduced a calibration model which was learned based on the training data of each subject. This model can effectively calibrate the mobile ECG QRS duration estimates to predict the standard chest ECG QRS duration estimates, as shown in Figure 8, where the prediction results without and with calibration are visualized in Figure 8(a1,a2,b1,b2), respectively. With calibration, the correlation coefficient improved from 46.6% (without calibration) to 91.2%, and the distribution in the BA plot was significantly improved, resulting in a smaller mean error and also a much lower standard deviation.

3.8. Performance Summary

The mobile-ECG-QRS-based chest QRS prediction performance is summarized in Table 1, where the last row corresponds to the proposed framework, and the other rows give some simpler options for comparison purposes. Several interesting observations can be made. Firstly, approaches with MV-DTW (rows 5–8) showed much better performance compared with those with DTW (rows 1–4), indicating that the MV-DTW can effectively capture more nonlinear patterns in sequence matching and distortion quantization. Secondly, SQI-based purification can effectively contribute to performance improvement, showing that the strict purification operation successfully suppressed most of the unreliable QRS duration estimates due to distorted and false raw heartbeats. Thirdly, the model calibration method was also significant in compensating the ear-lead-to-chest-lead QRS duration bias, resulting in a further performance enhancement. With the proposed framework that has MV-DTW, SQI-based purification, and model calibration as shown in Figure 1, the prediction performance is very encouraging, and the correlation coefficient, mean error ± standard deviation, mean absolute error, and root mean absolute error are 91.2%, 0.4 ± 2.6 ms, 1.7 ms, and 2.6 ms, respectively.
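For reference, agreement metrics of this kind can be computed as in the sketch below. The last metric is implemented as the root mean square error, on the assumption that this is what "root mean absolute error" denotes; the use of the sample standard deviation and the example QRS durations (in ms) are also assumptions.

```python
import numpy as np

def agreement_metrics(predicted, reference):
    """Agreement between predicted and reference QRS durations (in ms)."""
    predicted = np.asarray(predicted, dtype=float)
    reference = np.asarray(reference, dtype=float)
    error = predicted - reference
    return {
        "correlation": float(np.corrcoef(predicted, reference)[0, 1]),
        "mean_error": float(error.mean()),
        "sd_error": float(error.std(ddof=1)),   # sample standard deviation
        "mae": float(np.abs(error).mean()),
        "rmse": float(np.sqrt((error ** 2).mean())),
    }

# Hypothetical calibrated mobile estimates versus chest ECG references.
metrics = agreement_metrics([90.0, 95.0, 100.0, 105.0], [91.0, 94.0, 101.0, 104.0])
```

The mean error and its standard deviation correspond to the bias and spread visualized in a Bland-Altman plot, whereas the correlation coefficient captures how well the estimates track changes in the reference.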

3.9. Future Studies

It will be promising to further enhance the system with more effective pattern extraction [37,38,39,40] studies. The proposed approaches could also be generalized to other applications or signals [41,42,43,44] for event detection, template signal learning, and signal quality purification. These promising possibilities will deepen the current research and broaden its impact in the era of big medical data.

4. Conclusions

In this study, focusing on one major impediment faced by pervasive out-of-clinic ECG-based heart health tracking, i.e., the low level of convenience, we designed and validated a highly convenient cardiac management system by leveraging a novel machine learning framework for noisy mobile ECG signal analysis. We took a special interest in ECG QRS duration tracking, which carries a large amount of medical information and relates to many heart diseases [23,24,25,26,27,28,29]. Firstly, raw heartbeat locations were identified by an SVM classifier. Secondly, QRS boundaries were located with a novel MV-DTW approach, referring to a high-quality heartbeat template learned by a k-medoid clustering method. At the same time, the MV-DTW path distance was used to quantize the distortion conditions of raw heartbeats, which were then compared with a distortion threshold learned by a histogram triangle method to generate heartbeat-specific signal quality indices for purification purposes. Finally, the estimated mobile ECG QRS durations were transformed to the commonly used standard chest ECG QRS durations. Promising experimental results were demonstrated, indicating the effectiveness of the proposed framework. This study will greatly advance machine-learning-enabled ECG data mining towards smart medical decision support.

Author Contributions

Conceptualization, Q.Z. and D.Z.; methodology, Q.Z.; software, Q.Z.; validation, Q.Z.; formal analysis, Q.Z.; investigation, Q.Z.; writing—original draft preparation, Q.Z.; writing—review and editing, Q.Z. and D.Z.; visualization. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

Not applicable.

Conflicts of Interest

The authors declare no conflict of interest.

References

1. WHO. The 10 Leading Causes of Death in the World. 2014. Available online: http://www.who.int/mediacentre/factsheets/fs310/en/ (accessed on 20 May 2023).
2. Maron, B.J.; Friedman, R.A.; Kligfield, P.; Levine, B.D.; Viskin, S.; Chaitman, B.R.; Okin, P.M.; Saul, J.P.; Salberg, L.; Van Hare, G.F. Assessment of the 12-Lead ECG as a Screening Test for Detection of Cardiovascular Disease in Healthy General Populations of Young People (12–25 Years of Age). Circulation 2014, 130, 1303–1334.
3. Alhussein, M.; Muhammad, G.; Hossain, M.S. EEG pathology detection based on deep learning. IEEE Access 2019, 7, 27781–27788.
4. Amin, F.-E.; Hussain, M.; Ali, Z.; Busaleh, M.; Al Sultan, S.A. Development of a Secure Cloud-based Breast Cancer Diagnosis System. In Proceedings of the 2022 6th International Conference on Cloud and Big Data Computing, Birmingham, UK, 18–20 August 2022; pp. 42–48.
5. Abdul, W.; Ali, Z.; Ghouzali, S.; Alfawaz, B.; Muhammad, G.; Hossain, M.S. Biometric security through visual encryption for fog edge computing. IEEE Access 2017, 5, 5531–5538.
6. Wang, R.; Lai, J.; Zhang, Z.; Li, X.; Vijayakumar, P.; Karuppiah, M. Privacy-preserving federated learning for internet of medical things under edge computing. IEEE J. Biomed. Health Inform. 2023, 27, 854–865.
7. Andreu-Perez, J.; Perez-Espinosa, H.; Timonet, E.; Kiani, M.; Girón-Pérez, M.I.; Benitez-Trinidad, A.B.; Jarchi, D.; Rosales-Pérez, A.; Gatzoulis, N.; Reyes-Galaviz, O.F. A generic deep learning based cough analysis system from clinically validated samples for point-of-need COVID-19 test and severity levels. IEEE Trans. Serv. Comput. 2021, 15, 1220–1232.
8. Ahmed, J.; Nguyen, T.N.; Ali, B.; Javed, M.A.; Mirza, J. On the physical layer security of federated learning based IoMT networks. IEEE J. Biomed. Health Inform. 2022, 27, 691–697.
9. Zhang, Q.; Frick, K. All-ECG: A Least-Number of Leads ECG Monitor for Standard 12-Lead ECG Tracking During Motion. In Proceedings of the 6th Annual IEEE EMB Strategic Conference on Healthcare Innovations and Point-Of-Care Technologies (IEEE HI-POCT 2019), Bethesda, MD, USA, 20–22 November 2019; pp. 103–106.
10. Pantelopoulos, A.; Bourbakis, N.G. A survey on wearable sensor-based systems for health monitoring and prognosis. IEEE Trans. Syst. Man Cybern. Part C (Appl. Rev.) 2010, 40, 1–12.
11. Zhang, Q.; Zhou, D.; Zeng, X. A Novel Framework for Motion-tolerant Instantaneous Heart Rate Estimation By Phase-domain Multi-view Dynamic Time Warping. IEEE Trans. Biomed. Eng. 2017, 64, 2562–2574.
12. Zhang, Q.; Zhou, D.; Zeng, X. Hear the heart: Daily cardiac health monitoring using Ear-ECG and machine learning. In Proceedings of the 8th IEEE Ubiquitous Computing, Electronics and Mobile Communication Conference (IEEE UEMCON), New York, NY, USA, 19–21 October 2017; pp. 448–451.
13. Zou, J.; Zhang, Q. eyeSay: Eye Electrooculography Decoding with Deep Learning. In Proceedings of the 2021 IEEE International Conference on Consumer Electronics (ICCE), Las Vegas, NV, USA, 10–12 January 2021; pp. 1–3.
14. Nanhore, S.D.; Bartere, M.M. Mobile phone sensing system for health monitoring. Int. J. Sci. Res. 2013, 2, 252–255.
15. Hsieh, J.-H.; Hung, K.-C.; Lin, Y.-L.; Shih, M.-J. A speed-and power-efficient SPIHT design for wearable quality-on-demand ECG applications. IEEE J. Biomed. Health Inform. 2017, 22, 1456–1465.
16. Zhang, Q. Deep Learning of Electrocardiography Dynamics for Biometric Human Identification in era of IoT. In Proceedings of the 9th IEEE Annual Ubiquitous Computing, Electronics and Mobile Communication Conference (IEEE UEMCON), New York, NY, USA, 8–10 November 2018; pp. 885–888.
17. Oliver, N.; Flores-Mangas, F.; De Oliveira, R. Towards wearable physiological monitoring on a mobile phone. In Mobile Health Solutions for Biomedical Applications; IGI Global: Hershey, PA, USA, 2009; pp. 208–243.
18. Almotiri, S.H.; Khan, M.A.; Alghamdi, M.A. Mobile health (m-health) system in the context of IoT. In Proceedings of the 2016 IEEE 4th International Conference on Future Internet of Things and Cloud Workshops (FiCloudW), Vienna, Austria, 22–24 August 2016; pp. 39–42.
19. Zhang, Q.; Zhu, S. Real-time Activity and Fall Risk Detection for Aging Population Using Deep Learning. In Proceedings of the 9th IEEE Annual Ubiquitous Computing, Electronics & Mobile Communication Conference (IEEE UEMCON), New York, NY, USA, 8–10 November 2018; pp. 1055–1059.
20. Zhang, Q. Deep Learning of Biomechanical Dynamics in Mobile Daily Activity and Fall Risk Monitoring. In Proceedings of the 6th Annual IEEE EMB Strategic Conference on Healthcare Innovations and Point-of-Care Technologies (IEEE HI-POCT 2019), Bethesda, MD, USA, 20–22 November 2019.
21. Maji, S.; Burke, M.J. Establishing the Input Impedance Requirements of ECG Recording Amplifiers. IEEE Trans. Instrum. Meas. 2020, 69, 825–835.
22. Zhang, Q.; Zeng, X.; Hu, W.; Zhou, D. A Machine Learning-empowered System for Long-term Motion-tolerant Wearable Monitoring of Blood Pressure and Heart Rate with Ear-ECG/PPG. IEEE Access 2017, 5, 10547–10561.
23. Teodorescu, C.; Reinier, K.; Uy-Evanado, A.; Navarro, J.; Mariani, R.; Gunson, K.; Jui, J.; Chugh, S.S. Prolonged QRS duration on the resting ECG is associated with sudden death risk in coronary disease, independent of prolonged ventricular repolarization. Heart Rhythm 2011, 8, 1562–1567.
24. Van Huysduynen, B.H.; van Straten, A.; Swenne, C.A.; Maan, A.C.; van Eck, H.J.R.; Schalij, M.J.; van der Wall, E.E.; de Roos, A.; Hazekamp, M.G.; Vliegen, H.W. Reduction of QRS duration after pulmonary valve replacement in adult Fallot patients is related to reduction of right ventricular volume. Eur. Heart J. 2005, 26, 928–932.
25. Kurl, S.; Mäkikallio, T.; Rautaharju, P.; Kiviniemi, V.; Laukkanen, J.A. The duration of QRS complex in resting electrocardiogram is a predictor of sudden cardiac death in men. Circulation 2012, 125, 2588–2594.
26. Yokokawa, M.; Kim, H.M.; Good, E.; Crawford, T.; Chugh, A.; Pelosi, F.; Jongnarangsin, K.; Latchamsetty, R.; Armstrong, W.; Alguire, C. Impact of QRS duration of frequent premature ventricular complexes on the development of cardiomyopathy. Heart Rhythm 2012, 9, 1460–1464.
27. Gold, M.R.; Thébault, C.; Linde, C.; Abraham, W.T.; Gerritse, B.; Ghio, S.; Sutton, M.S.J.; Daubert, J.-C. The effect of QRS duration and morphology on cardiac resynchronization therapy outcomes in mild heart failure: Results from the REsynchronization reVErses Remodeling in Systolic left vEntricular dysfunction (REVERSE) study. Circulation 2012, 126, 822–829.
28. Holm, H.; Gudbjartsson, D.F.; Arnar, D.O.; Thorleifsson, G.; Thorgeirsson, G.; Stefansdottir, H.; Gudjonsson, S.A.; Jonasdottir, A.; Mathiesen, E.B.; Njølstad, I. Several common variants modulate heart rate, PR interval and QRS duration. Nat. Genet. 2010, 42, 117–122.
29. Iuliano, S.; Fisher, S.G.; Karasik, P.E.; Fletcher, R.D.; Singh, S.N.; Department of Veterans Affairs Survival Trial of Antiarrhythmic Therapy in Congestive Heart Failure. QRS duration and mortality in patients with congestive heart failure. Am. Heart J. 2002, 143, 1085–1091.
30. Zhang, Q.; Zhou, D.; Zeng, X. A novel machine learning-enabled framework for instantaneous heart rate monitoring from motion-artifact-corrupted electrocardiogram signals. Physiol. Meas. 2016, 37, 1945.
31. Jain, A.K. Data clustering: 50 years beyond K-means. Pattern Recognit. Lett. 2010, 31, 651–666.
32. Zack, G.; Rogers, W.; Latt, S. Automatic measurement of sister chromatid exchange frequency. J. Histochem. Cytochem. 1977, 25, 741–753.
33. Zhang, Q.; Zhou, D.; Zeng, X. Highly wearable cuff-less blood pressure and heart rate monitoring with single-arm electrocardiogram and photoplethysmogram signals. Biomed. Eng. Online 2017, 16, 23.
34. Arthur, D.; Vassilvitskii, S. K-means++: The advantages of careful seeding. In Proceedings of the Eighteenth Annual ACM-SIAM Symposium on Discrete Algorithms, New Orleans, LA, USA, 7–9 January 2007; pp. 1027–1035.
35. Kaufman, L.; Rousseeuw, P.J. Partitioning around Medoids (Program Pam); John Wiley & Sons, Inc.: Hoboken, NJ, USA, 1990; pp. 68–125.
36. Tsouri, G.R.; Ostertag, M.H. Patient-specific 12-lead ECG reconstruction from sparse electrodes using independent component analysis. IEEE J. Biomed. Health Inform. 2014, 18, 476–482.
37. Wong, J.; Piuri, V.; Scotti, F.; Zhang, Q. Efficient IoT Big Data Streaming with Deep-Learning-Enabled Dynamics. IEEE Internet Things J. 2023, 10, 4770–4782.
38. Gangadharan, K.; Zhang, Q. Deep Transferable Intelligence for Spatial Variability Characterization and Data-efficient Learning in Biomechanical Measurement. IEEE Trans. Instrum. Meas. 2023, 72, 2509812.
39. Stauffer, J.; Zhang, Q. SpikeBASE: Spiking Neural Learning Algorithm with Backward Adaptation of Synaptic Efflux. IEEE Trans. Comput. 2022, 71, 2707–2716.
40. Zou, J.; Zhang, Q. eyeSay: Brain Visual Dynamics Decoding with Deep Learning & Edge Computing. IEEE Trans. Neural Syst. Rehabil. Eng. 2022, 30, 2217–2224.
41. Manogaran, G.; Shakeel, P.M.; Fouad, H.; Nam, Y.; Baskar, S.; Chilamkurti, N.; Sundarasekar, R. Wearable IoT smart-log patch: An edge computing-based Bayesian deep learning network system for multi access physical monitoring system. Sensors 2019, 19, 3030.
42. Azimi, M.; Eslamlou, A.D.; Pekcan, G. Data-driven structural health monitoring and damage detection through deep learning: State-of-the-art review. Sensors 2020, 20, 2778.
43. Singh, S.P.; Wang, L.; Gupta, S.; Goli, H.; Padmanabhan, P.; Gulyás, B. 3D deep learning on medical images: A review. Sensors 2020, 20, 5097.
44. Bhattacharya, D.; Sharma, D.; Kim, W.; Ijaz, M.F.; Singh, P.K. Ensem-HAR: An ensemble deep learning model for smartphone sensor-based human activity recognition for measurement of elderly health monitoring. Biosensors 2022, 12, 393.

Figure 1. System diagram of the novel machine learning framework, which identifies the heartbeats with the support vector machine (SVM) in stage I, localizes the QRS complexes with the multiview dynamic time warping (MV-DTW) in the first part of stage II, purifies the QRS complexes in the second part of stage II, and then transforms the mobile-ECG-based QRS durations to the commonly used standard chest-ECG-based ones in stage III. The blocks above the dashed line are for training, and those under the line are for testing, so the HB ID model, HB template learning, and distortion threshold learning are only included in the training phase. Notes: Orange blocks: supervised learning steps; blue blocks: unsupervised learning steps; green blocks: pattern recognition steps; HB: heartbeat; ID: identification; SVM: support vector machine; DTW: dynamic time warping.

Figure 2. The mobile ECG (c) has distinguishable morphologies, but is much weaker and more sensitive to motion artifacts than the traditional chest ECG (a). P, Q, R, S, and T are characteristic points of an ECG heartbeat. The on and off positions of the QRS complex are labeled in the zoomed-in chest ECG (b), and the corresponding zoomed-in mobile ECG (d) demonstrates that the mobile ECG has a very low bio-potential and is sensitive to noise.

Figure 3. An example of the SVM-based heartbeat identification results (a), corresponding to the second minute of data of one trial in the testing session of a subject (subject 6 is selected as an example), and showing the necessity of dealing with many highly distorted heartbeats and some false heartbeats. Two segments are further visualized in more detail in (b,c). The red dots are detected heartbeat R peak locations.

Figure 4. An example of a representative heartbeat learned by k-medoid clustering, corresponding to the training session of subject 6, where the medoid in cluster 3 is selected as a high-quality template since it represents the highest number of raw heartbeats (51%) among three medoids. Dark to blue lines: different dimensions of the three-dimensional raw heartbeats; bold lines: different dimensions of the medoids; all dimensions in all raw heartbeats are scaled to be between 0 and 1. Notes: Amp—amplitude; Deri—derivative.
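The template selection summarized in the Figure 4 caption can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: each heartbeat is assumed to be resampled to a common length and flattened into one tuple, Euclidean distance is assumed as the dissimilarity, and a simple alternating assign/update k-medoids variant with a naive initialization stands in for the PAM procedure. The names `k_medoids` and `select_template` are hypothetical.

```python
import math

def k_medoids(points, k, max_iter=100):
    """Alternating k-medoids; returns (medoid indices, cluster label per point)."""
    medoids = list(range(k))  # assumption: naive initialization with the first k points
    for _ in range(max_iter):
        # Assignment step: each point joins the cluster of its nearest medoid.
        labels = [
            min(range(k), key=lambda c: math.dist(p, points[medoids[c]]))
            for p in points
        ]
        # Update step: the new medoid minimizes the summed distance to its members.
        updated = []
        for c in range(k):
            members = [i for i, label in enumerate(labels) if label == c]
            if not members:  # keep the old medoid if a cluster becomes empty
                updated.append(medoids[c])
                continue
            updated.append(min(
                members,
                key=lambda i: sum(math.dist(points[i], points[j]) for j in members),
            ))
        if updated == medoids:
            break
        medoids = updated
    return medoids, labels

def select_template(points, medoids, labels):
    """Pick the medoid of the largest cluster and report the share it represents."""
    sizes = [labels.count(c) for c in range(len(medoids))]
    best = max(range(len(medoids)), key=lambda c: sizes[c])
    return points[medoids[best]], sizes[best] / len(points)

# Toy one-dimensional "heartbeats" (illustrative values, not study data).
beats = [(0.0,), (20.0,), (1.0,), (2.0,), (3.0,), (4.0,), (21.0,), (22.0,)]
medoid_idx, beat_labels = k_medoids(beats, k=2)
template, share = select_template(beats, medoid_idx, beat_labels)
```

Choosing the medoid of the most populated cluster mirrors the caption's criterion, where the selected medoid represents the highest number of raw heartbeats.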

Figure 5. An example of the multiview-DTW-based nonlinear stream matching, corresponding to a raw heartbeat in the testing session of subject 6, and showing that the QRS offset/onset in the three-dimensional test stream is robustly located using a backward search method referring to the locations in the template stream. To determine the off and on boundaries of the QRS complex of the mobile ECG in the bottom, the corresponding preknown off and on locations in the chest ECG template on the left side are firstly mapped to the optimal warping path in the middle graph and then mapped to the mobile ECG in the bottom. Notes: the yellow/blue colors correspond to the highest/lowest path distance values, respectively. Amp—amplitude; Deri—derivative.
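The boundary mapping described in the Figure 5 caption can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes each sample is a tuple of views (for example amplitude and derivatives), a Euclidean local cost across the views, the standard three-step DTW recursion, and a backward scan of the warping path to read off the test index matched to a known template boundary. The names `multiview_dtw` and `map_boundary` are hypothetical.

```python
import math

def multiview_dtw(template, test):
    """Return (path distance, warping path) for two sequences of view tuples."""
    n, m = len(template), len(test)
    inf = math.inf
    # acc[i][j]: minimal accumulated cost of aligning template[:i+1] with test[:j+1]
    acc = [[inf] * m for _ in range(n)]
    for i in range(n):
        for j in range(m):
            cost = math.dist(template[i], test[j])  # Euclidean cost over all views
            if i == 0 and j == 0:
                best = 0.0
            else:
                best = min(
                    acc[i - 1][j] if i > 0 else inf,
                    acc[i][j - 1] if j > 0 else inf,
                    acc[i - 1][j - 1] if i > 0 and j > 0 else inf,
                )
            acc[i][j] = cost + best
    # Backtrack from the end of both sequences to recover the optimal path.
    i, j = n - 1, m - 1
    path = [(i, j)]
    while (i, j) != (0, 0):
        candidates = []
        if i > 0 and j > 0:
            candidates.append((acc[i - 1][j - 1], i - 1, j - 1))
        if i > 0:
            candidates.append((acc[i - 1][j], i - 1, j))
        if j > 0:
            candidates.append((acc[i][j - 1], i, j - 1))
        _, i, j = min(candidates)
        path.append((i, j))
    path.reverse()
    return acc[n - 1][m - 1], path

def map_boundary(path, template_index):
    """Scan the path backward and return the test index matched to template_index."""
    for t_idx, s_idx in reversed(path):
        if t_idx == template_index:
            return s_idx
    raise ValueError("template index is not on the warping path")

# Toy single-view streams (illustrative values): the peak is shifted by one sample.
template = [(0.0,), (0.0,), (1.0,), (0.0,), (0.0,)]
test = [(0.0,), (0.0,), (0.0,), (1.0,), (0.0,)]
distance, path = multiview_dtw(template, test)
peak_in_test = map_boundary(path, 2)
```

The same `distance` value is what a distortion measure could be built on: a heartbeat that needs heavy warping or matches poorly accumulates a larger path distance.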

Figure 6. An example of the histogram triangle-based distortion threshold learning (a), corresponding to the training session of subject 6, and showing that the transition point (green dot on the left side) is learned to determine an appropriate distortion threshold (0.25 with normalization). The right plot (b) is based on the distance from each point on the right side of the red envelope to the blue line in the left plot, so each distance corresponds to a line perpendicular to the blue line. The green dot in the right plot marks the maximum distance, meaning that there is a sharp transition in the red envelope in the left plot and that this transition is a good threshold.
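The perpendicular-distance search described in the Figure 6 caption can be sketched as follows. This is a minimal illustration of the general triangle method, assuming a histogram whose peak lies to the left of a long tail; the bin count and the helper names `histogram` and `triangle_bin` are hypothetical, and the toy counts are not study data.

```python
import math

def histogram(values, bins):
    """Equal-width histogram; returns (counts, lowest value, bin width)."""
    lo, hi = min(values), max(values)
    if hi == lo:
        raise ValueError("values must not all be equal")
    width = (hi - lo) / bins
    counts = [0] * bins
    for v in values:
        # The maximum value is clamped into the last bin.
        counts[min(int((v - lo) / width), bins - 1)] += 1
    return counts, lo, width

def triangle_bin(counts):
    """Return the bin farthest (perpendicularly) from the peak-to-tail line."""
    peak = max(range(len(counts)), key=lambda k: counts[k])
    last = max(k for k, c in enumerate(counts) if c > 0)
    if last <= peak:
        raise ValueError("expected a tail extending to the right of the peak")
    # Line through (peak, counts[peak]) and (last, counts[last]).
    dx, dy = last - peak, counts[last] - counts[peak]
    norm = math.hypot(dx, dy)

    def distance(k):
        # Perpendicular distance from the point (k, counts[k]) to that line.
        return abs(dy * (k - peak) - dx * (counts[k] - counts[peak])) / norm

    return max(range(peak, last + 1), key=distance)

# Toy distortion histogram: many clean heartbeats, a thin tail of distorted ones.
toy_counts = [50, 30, 10, 4, 3, 2, 2, 1, 1, 1]
threshold_bin = triangle_bin(toy_counts)
```

A threshold value can then be read from the selected bin, for example as `lo + (threshold_bin + 0.5) * width` when the bin center is used; that convention is an assumption here.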

Figure 7. An example of the SQI-based raw heartbeat purification (a), corresponding to the testing session of subject 6, and indicating that segments highly distorted by motion artifacts have been successfully filtered out while other segments with a high quality are retained. Red/green curves in (b): the smoothed distortion sequence and the distortion threshold, respectively. Two segments are further visualized to demonstrate the high-noise condition (d) and the relatively normal-noise condition (e).
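The purification step summarized in the Figure 7 caption can be sketched as follows. This is a minimal illustration under stated assumptions: the smoothing is taken to be a centered moving average, the window length is arbitrary, and the signal quality index (SQI) is taken to be binary (1 when the smoothed distortion is at or below the threshold). The names `moving_average` and `purify` are hypothetical.

```python
def moving_average(seq, window):
    """Centered moving average; the window shrinks at the sequence edges."""
    half = window // 2
    smoothed = []
    for i in range(len(seq)):
        segment = seq[max(0, i - half): i + half + 1]
        smoothed.append(sum(segment) / len(segment))
    return smoothed

def purify(distortions, threshold, window=3):
    """Return (per-heartbeat SQI, indices of the heartbeats that are kept)."""
    smoothed = moving_average(distortions, window)
    sqi = [1 if s <= threshold else 0 for s in smoothed]
    kept = [i for i, quality in enumerate(sqi) if quality == 1]
    return sqi, kept

# Toy per-heartbeat distortion values with a burst of motion artifacts in the middle.
toy_distortions = [0.1, 0.1, 0.1, 0.9, 0.9, 0.9, 0.1, 0.1, 0.1]
sqi, kept = purify(toy_distortions, threshold=0.25)
```

Because the comparison uses the smoothed sequence, heartbeats adjacent to a distorted burst are also rejected in this sketch, which is a design choice of the illustration rather than a documented property of the framework.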

Figure 8. Correlation (Corr.) and Bland-Altman (BA) plots between predicted chest-QRS durations (QRSpred) and real chest-QRS durations (QRSchest), without (a) and with (b) calibration (Cali.) model learning. Results are based on the testing sessions of all subjects; MDTW: MV-DTW. (a1,b1) give the correlation plots, and (a2,b2) give the BA plots.
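The effect of the calibration model compared in Figure 8 can be illustrated with a simple regression sketch. The form of the regression model is not specified in this section, so an ordinary least-squares line from mobile QRS duration to chest QRS duration is assumed purely for illustration; the names `fit_linear_calibration` and `calibrate` are hypothetical and the numbers are toy values, not study data.

```python
def fit_linear_calibration(mobile_ms, chest_ms):
    """Least-squares fit of chest = slope * mobile + intercept (durations in ms)."""
    n = len(mobile_ms)
    mean_x = sum(mobile_ms) / n
    mean_y = sum(chest_ms) / n
    sxx = sum((x - mean_x) ** 2 for x in mobile_ms)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(mobile_ms, chest_ms))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

def calibrate(mobile_ms, slope, intercept):
    """Map mobile QRS durations to estimated chest QRS durations."""
    return [slope * x + intercept for x in mobile_ms]

# Toy training pairs: mobile-ECG QRS durations and chest-ECG references, in ms.
train_mobile = [80.0, 90.0, 100.0, 110.0]
train_chest = [95.0, 100.0, 105.0, 110.0]
slope, intercept = fit_linear_calibration(train_mobile, train_chest)
predicted_chest = calibrate([120.0], slope, intercept)
```

A fit of this kind removes a systematic offset and scale difference between the two measurement sites, which is one plausible reason calibrated rows in a comparison would show a smaller mean error.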

Table 1. Ear-QRS-based Chest QRS Estimation Performance Summary on Testing Sessions of All Subjects.

APPROACHES	CR	ME	STD	MAE	RMSE
DTW	−2.1%	10.9	23.8	15.3	26.1
DTW + SQI	−4.9%	8.0	24.8	16.4	26.0
DTW + Cal.	48.5%	2.2	8.2	5.8	8.4
DTW + SQI + Cal.	54.1%	2.7	7.8	5.1	8.2
MV-DTW	28.3%	−3.6	6.2	5.4	7.1
MV-DTW + SQI	46.6%	−5.8	5.1	6.2	7.7
MV-DTW + Cal.	84.1%	−0.3	3.4	2.3	3.4
MV-DTW + SQI + Cal. (Proposed)	91.2%	0.4	2.6	1.7	2.6

Notes. Cal.: model calibration; CR: correlation coefficient; ME: mean error; STD: standard deviation; MAE: mean absolute error; RMSE: root mean square error; unit of error: ms.
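The metrics reported in Table 1 can be computed as in the following sketch. The sign convention (error = predicted minus reference) and the use of the population standard deviation are assumptions; the name `summarize` is hypothetical and the sample values are toy numbers, not study data.

```python
import math
import statistics

def summarize(pred_ms, ref_ms):
    """Return CR, ME, STD, MAE, and RMSE for paired duration estimates in ms."""
    errors = [p - r for p, r in zip(pred_ms, ref_ms)]
    me = statistics.fmean(errors)
    std = statistics.pstdev(errors)  # assumption: population standard deviation
    mae = statistics.fmean([abs(e) for e in errors])
    rmse = math.sqrt(statistics.fmean([e * e for e in errors]))
    # Pearson correlation coefficient between predictions and references.
    mean_p, mean_r = statistics.fmean(pred_ms), statistics.fmean(ref_ms)
    cov = sum((p - mean_p) * (r - mean_r) for p, r in zip(pred_ms, ref_ms))
    var_p = sum((p - mean_p) ** 2 for p in pred_ms)
    var_r = sum((r - mean_r) ** 2 for r in ref_ms)
    cr = cov / math.sqrt(var_p * var_r)
    return {"CR": cr, "ME": me, "STD": std, "MAE": mae, "RMSE": rmse}

# Toy predicted and reference chest QRS durations, in ms.
metrics = summarize([99.0, 103.0, 98.0, 104.0], [98.0, 101.0, 99.0, 102.0])
```

With the population standard deviation, RMSE² = ME² + STD² holds exactly, which gives a quick consistency check on a results row; for the proposed method, 0.4² + 2.6² ≈ 6.92, whose square root is approximately 2.6 ms.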

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

© 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

Share and Cite

MDPI and ACS Style

Zhang, Q.; Zhou, D. Machine Learning Electrocardiogram for Mobile Cardiac Pattern Extraction. Sensors 2023, 23, 5723. https://doi.org/10.3390/s23125723

