Article

An Algorithm for Initial Localization of Feature Waveforms Based on Differential Analysis Parameter Setting and Its Application in Clinical Electrocardiograms

1 School of Information Science and Technology, Fudan University, Shanghai 200433, China
2 Institute for Six Sector Economy, Fudan University, Shanghai 200433, China
3 Institute of Science and Technology for Brain-Inspired Intelligence, Fudan University, Shanghai 200433, China
4 Shanghai Xiazhi Information Technology Co., Ltd., Shanghai 200232, China
5 Digital Medical Research Center, School of Basic Medical Sciences, Fudan University, Shanghai 200032, China
6 Shanghai Key Laboratory of Medical Imaging Computing and Computer Assisted Intervention, Shanghai 200032, China
7 Innovative Center for New Drug Development of Immune Inflammatory Diseases, Ministry of Education, Fudan University, Shanghai 200032, China
8 Shanghai Engineering Research Center of AI Technology for Cardiopulmonary Diseases, Zhongshan Hospital, Fudan University, Shanghai 200032, China
9 Department of Information Engineering and Computer Science, Feng Chia University, Taichung City 40724, Taiwan
* Authors to whom correspondence should be addressed.
Electronics 2024, 13(15), 2996; https://doi.org/10.3390/electronics13152996
Submission received: 26 June 2024 / Revised: 9 July 2024 / Accepted: 16 July 2024 / Published: 29 July 2024

Abstract
In a biological signal analysis system, signals of the same type may exhibit significant variations in their feature waveforms. Biological signals are typically weak, which increases the complexity of their analysis. Furthermore, clinical biomedical signals are susceptible to various interferences originating from the human body itself, including muscle movements, respiration, and heartbeat. These factors further escalate the complexity and difficulty of signal analysis. Therefore, precise and targeted preprocessing is often required before analyzing clinical biomedical signals to enhance the accuracy and reliability of subsequent feature extraction and classification. Here, we establish an effective and practical algorithm model that integrates preprocessing with the initial localization of target feature waveforms, achieving four objectives: (1) determining the periodic positions of target feature waveforms; (2) preserving the original amplitude and shape of target feature waveforms while eliminating negative interference; (3) reducing or eliminating interference from other feature waveforms in the input signal; and (4) decreasing noise in the input signal, such as the baseline drift, powerline interference, and muscle artifacts commonly found in biological signals. We validated the algorithm on clinical electrocardiogram (ECG) data and the authoritative open-source MIT-BIH ECG database, demonstrating its effectiveness and reliability.

1. Introduction

Biomedical signals have widespread applications in physiology and medicine, effectively assisting in the diagnosis, monitoring, and study of various physiological activities in the human body [1]. Firstly, these signals carry crucial physiological information, such as cardiac rhythm [2]. Analyzing these signals yields an understanding of disease states, overall health conditions, and specific physiological activities [3], thus aiding medical diagnosis and scientific research. Secondly, biomedical signal analysis is pivotal in clinical monitoring [4]; for instance, ECGs are used to detect cardiac abnormalities [5,6,7,8,9]. Therefore, effective analysis of biomedical signals provides vital support and guidance for medical research and clinical practice [10,11]. However, analyzing biomedical signals poses multiple challenges [12]. Firstly, these signals are often subject to various interferences and noise from the environment, equipment, and physiological activities themselves, such as muscle movements and power supply interference [13,14], which can adversely affect signal quality and interpretation. Secondly, biomedical signals typically exhibit non-stationary, nonlinear characteristics [15], and their rich information content makes their analysis complex. Thirdly, these signals are often sampled at high frequencies, resulting in large data volumes that require effective data processing methods to extract useful information. Lastly, preprocessing is crucial for noise reduction, interference removal, and accurate feature extraction [16]; effective preprocessing significantly enhances the accuracy and reliability of subsequent analyses.
Traditional preprocessing methods for biomedical signals typically involve two stages: noise reduction and waveform localization [17,18]. These stages often require the implementation of multiple algorithms in coordination, resulting in low efficiency and high complexity. In the noise reduction stage, researchers commonly employ various filters to eliminate noise of different types and frequency ranges [19,20]. The waveform localization process involves multiple methods and algorithms, often integrated with machine learning and neural networks for data training and localization [21]. For instance, methods such as finite impulse response (FIR) filters [22], derivative methods [23], and VMD-based methods [24] are used to extract temporal and morphological features from electrocardiogram signals.
In contrast to traditional methods, our algorithm integrates these two stages. However, in the waveform localization stage, most studies have failed to effectively eliminate negative interference and interference from other feature waves [22,23,25,26,27,28,29]. This complex localization process limits the achievement of efficient research and means that researchers need to master more algorithms and methods [30]. Compared to other biomedical signals, clinical biomedical signals are often subjected to more interference and noise, which increases the difficulty of preprocessing and complicates the preprocessing process. The problem itself arises from two aspects: firstly, bioelectric signals possess unique characteristics; secondly, preprocessing is an elimination process, while feature wave positioning is an enhancement process, making their integration difficult.
Based on the above analysis and identified issues, we have established an easily implementable algorithm. Using a recursive approach, we established this algorithmic model, integrating preprocessing with the initial localization of characteristic waves. The algorithm effectively meets the denoising requirements during preprocessing while maintaining the shape and amplitude of characteristic waveforms. It also effectively attenuates residual signals and eliminates negative interference, thereby reducing interference from other waveforms and directions during subsequent feature wave extraction. Additionally, the algorithm efficiently suppresses short pulse signals connected to characteristic waveforms, thereby enhancing the accuracy of locating the starting positions of characteristic waveforms. We have also proposed a parameter setting method based on differential analysis in statistics. We compared the performance of raw and processed data through Pearson correlation analysis and variance analysis. The validation results demonstrate that this algorithm achieves good results in both clinical and open-source electrocardiogram databases. Figure 1 summarizes our study using clinical ECG data as an example.

2. Methods

2.1. Mathematical Formulation of the Algorithm

In a digital signal processing system, the input image can be abstracted as a two-variable function denoted by f(x, y). Here, (x, y) represents the pixel position, and f(x, y) represents the signal intensity at that position. After passing through the processing system, the output remains a two-variable function; image processing therefore fundamentally involves operations on a function.
Spatial domain methods directly manipulate pixels in an image. Assuming the output function is g ( x ,   y ) , the signal intensity at position ( x ,   y ) is closely related to the pixel values at that position and its neighborhood in the input image. The mathematical model of spatial domain methods can be represented by Formula (1).
g(x, y) = T[f(x, y)],
In Formula (1), T: a signal processing system; (x, y): the pixel position; f(x, y): the signal intensity at position (x, y) (the magnitude of the input); g(x, y): the output signal intensity at position (x, y).
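As a concrete illustration of Formula (1), the sketch below implements a hypothetical operator T as a 3 × 3 neighborhood mean; this operator and the toy image are our own example, but any spatial-domain system that computes g(x, y) from local pixel values of f fits the same template.

```python
import numpy as np

def T(f, x, y):
    """A simple spatial-domain operator: the 3x3 neighborhood mean.
    An illustrative stand-in for the generic system T in Formula (1)."""
    h, w = f.shape
    x0, x1 = max(0, x - 1), min(h, x + 2)
    y0, y1 = max(0, y - 1), min(w, y + 2)
    return f[x0:x1, y0:y1].mean()

f = np.arange(25, dtype=float).reshape(5, 5)  # toy input image f(x, y)
g = np.array([[T(f, x, y) for y in range(5)] for x in range(5)])
print(g[2, 2])  # prints 12.0: mean of the 3x3 block centred at (2, 2)
```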
We consider signals with the following characteristics: sustained gradient ascent and descent with the longest duration in the time domain are treated as target feature waveforms, whereas signals with discrete or brief gradient changes are regarded as interference or non-feature signals. Additionally, non-feature signals may exhibit negative interference. Our objective is to ensure that the output signal preserves the trend and magnitude of the target feature signals while removing negative interference and attenuating non-feature and interference signals. The algorithm model we have established addresses these issues, as formulated in Formula (2).
d_n = y_1, n = 1;  d_n = α·d_(n−1) + |y′_n|, n ≥ 2,
where y′_i = (y_i − y_(i−1)) / (x_i − x_(i−1)) for i ≥ 2 and y′_1 = y_1,
In Formula (2), y_i: the amplitude of the i-th input signal; x_i: the index of the i-th input signal; d_i: the amplitude of the i-th output signal; α: a weight, α ∈ (0.9, 1).
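A minimal Python sketch of Formula (2), assuming unit sample spacing when no x indices are supplied; the function name feature_localize and the toy input are our own illustration, not the authors' code.

```python
import numpy as np

def feature_localize(y, x=None, alpha=0.94):
    """Sketch of Formula (2): recursive smoothing plus absolute
    differential feedback. alpha lies in (0.9, 1) per the paper;
    0.94 is the value the paper selects for clinical Lead II data."""
    y = np.asarray(y, dtype=float)
    if x is None:
        x = np.arange(len(y), dtype=float)       # unit sample spacing
    d = np.empty_like(y)
    d[0] = y[0]                                  # d_1 = y_1
    for n in range(1, len(y)):
        slope = (y[n] - y[n - 1]) / (x[n] - x[n - 1])  # y'_n
        d[n] = alpha * d[n - 1] + abs(slope)     # alpha*d_{n-1} + |y'_n|
    return d

# toy input with a QRS-like ramp and a small negative deflection
sig = np.array([0.0, 0.0, 0.3, 0.9, 0.4, 0.0, -0.1, 0.0])
out = feature_localize(sig)
print(out)  # non-negative output that tracks the rising/falling ramp
```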

2.2. Meaning and Purpose

To facilitate the explanation of the meaning and purpose of the algorithm, we present Formula (2) in a logical form, as shown in Formula (3).
Y = y1 + y2,
In Formula (3), Y denotes d_n; y1 denotes α·d_(n−1); y2 denotes |y′_n|.
We view the output function of Formula (2) as a superposition of two function results. One is the function y 1 , and the other is the function y 2 . The discrete form of function y 1 is shown in Formula (4).
y1_i = α·d_(i−1), i ≥ 2,
In Formula (4), y1_i: the value of the i-th output of this component; α: the weight, α ∈ (0, 1).
In Formula (4), corresponding to the function y 1 , the operation is a self-repeating process built on recursive principles. Based on the previous output influencing the current output, when 0 < α < 1 , the current output signal undergoes attenuation. Due to its recursive nature, the output depends on adjacent previous outputs; thus, even if significant changes occur between adjacent sample points, the output signal remains relatively smooth. As α approaches 1, the smoothing effect becomes more pronounced because the output relies more heavily on previous outputs. Larger values of α can more effectively suppress signal fluctuations, reduce noise interference, and stabilize the signal. When α is a fixed value, the output becomes a fixed proportion of the previous output, helping to reduce signal fluctuations and noise. Therefore, in Formula (2), we provide a more effective range for α .
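To make the attenuation concrete: with the differential term omitted, Formula (4) multiplies each output by α, so an isolated disturbance decays geometrically and only α^k of it survives after k samples. A toy sketch:

```python
alpha = 0.94
d = [1.0]                      # a unit impulse fed in at n = 1
for _ in range(49):
    d.append(alpha * d[-1])    # Formula (4) alone: d_n = alpha * d_{n-1}
print(d[-1])                   # 0.94**49, under 5% of the original impulse
```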
The discrete form of y 2 is shown in Formula (5).
y2_i = |data_i − data_(i−1)|,
In Formula (5), y2_i: the i-th predicted value; data_i: the i-th actual observed value.
Formula (5) describes the function y2, which aims to maintain the stability of feature waveforms. When the input signal exhibits continuous gradient changes (whether upward or downward), Formula (5) captures and preserves these changes, integrating them into the output sequence. This ensures that the output signal maintains trends consistent with the input signal's continuous variations: if the input signal continues to increase, the output sequence will also continue to rise. Setting aside the removal of negative components in Formula (2), function y2, i.e., the second part of Formula (3), enables the output sequence to follow the overall trend in the input signal, while the recursive function y1 keeps this tracking smooth.

2.3. Parameter Setting Based on Difference Analysis

Parameter setting is crucial for the algorithm model. In this study, based on the empirical range of the weight parameters of the recursive logic algorithm and the design requirements of each part of our algorithm model, we employed methods derived from statistical differential analysis to further determine the value of the weight constant α .
The algorithm model developed in this study aims to smooth and suppress noise and non-feature signals while preserving the key features of the signal to be extracted, thereby highlighting its main characteristics. Therefore, in setting the parameter, we need to consider both aspects comprehensively, striving to minimize interference signals while maintaining the integrity of the main feature signal. Based on the above analysis, we first propose the strategy for parameter settings, as shown in Figure 2.
As shown in Figure 2, our algorithm simultaneously achieves two primary objectives: on one hand, it preserves the variation trends in the feature waveforms to be extracted from the original data, while reducing noise and attenuating non-feature waveforms to minimize interference in subsequent feature extraction processing. Post-processing ensures that non-feature signals exhibit minimal fluctuations around the baseline while maintaining the integrity of the shape of the main feature signals. The setting of parameter α plays a crucial role in balancing these two objectives. In various existing signal processing algorithms, parameters are typically set based on empirical estimates. In contrast, we employ a method based on statistical differential analysis to determine the parameter α , aiming to enhance the stability and adaptability of the model.
By performing differential analysis on the observed data (raw data) and predicted data (output after model processing), we can obtain detailed information about their differences. In statistics, if the p-value (significance level) of two datasets is greater than or equal to 0.05, it indicates that their differences are not statistically significant. In the context of our algorithm, this means that the trend in the feature waveforms to be extracted has been preserved, thereby confirming the successful setting of parameter α .
For datasets where the differential analysis results are statistically significant, the significance arises from the reduction in non-feature signal components. To mitigate the interference of these signals on parameter setting, adjustments are necessary. Specifically, we adjust these signals to their corresponding processed values and then conduct the differential analysis again. By iteratively refining the observed and predicted datasets, we adjust the value of parameter α, typically with a precision of 0.01, until it meets the statistical condition (p ≥ 0.05), thus achieving the optimal setting for parameter α.
As a decay factor, the parameter α typically ranges within the interval (0, 1). When α approaches 0, the influence of the absolute difference between the current data point and the previous data point diminishes, indicating that the model becomes smoother and places more emphasis on historical data. Conversely, when α approaches 1, the model becomes more flexible and focuses more on recent data changes.
For algorithm models that require both rapid responsiveness to data changes and high stability, or when the data exhibits high autocorrelation, higher values of α can better capture the dynamic features of the data. In our configuration, we constrain α to the range (0.9, 1) with a precision of 0.01. This ensures that the model can easily respond to changes in the latest data points while avoiding over-reliance on them, thereby retaining a certain level of influence from historical data. This approach helps maintain relative stability when handling rapidly changing data, balancing the weights between historical and recent data.
Based on the above analysis, we first set the parameter α within the range of (0.9, 1). In this study, we chose to initialize α to 0.94. Next, we use the preset α value to conduct a differential analysis between the observed data set and the predicted data set. The specific steps of the differential analysis are shown in Figure 3. Since the observed data (raw data) and predicted data (model output data) are paired samples, we need to use a paired sample test to assess the differences between these two data sets. By calculating the differences between observed values and predicted values, denoted as d, we assess whether d follows a normal distribution. Given that d does not follow a normal distribution, we opt to use a paired sample Wilcoxon test for the differential analysis of these two data sets.
We observed the p-value of the test results to determine whether it is greater than 0.05. If the p-value is greater than 0.05, it indicates that our parameter settings are successful, effectively attenuating interference from non-feature signals and noise while preserving the feature waveforms of the original signal. If the p-value is not greater than 0.05, it may be due to variations in interfering signals affecting the test results. In such cases, it is necessary to reprocess the observed data set.
These steps involve two procedures. 1. Resetting negative values: first, the negative values in the observed data set are reset. Since the algorithm model has already eliminated negative interference, we reset the negative values in the observed data set to the corresponding data values in the output signal. 2. Resetting signals near the baseline: secondly, signals in the observed data set that are close to the baseline are reset to the corresponding predicted values. This step helps reduce the impact of differences caused by the smoothing of signals near the baseline after model processing. The purpose of these steps is to eliminate differences caused by changes in interference signals, ensuring consistency between the observed data set and the predicted data set, so that we can effectively evaluate whether the parameter settings have achieved optimal results.
Finally, we conduct a paired-sample Wilcoxon test on the reset observed signals and predicted signals. We iteratively adjust the α value with a precision of 0.01 until the obtained p-value exceeds 0.05. Once this criterion is met, the current α value is confirmed as the final parameter setting for the algorithm, indicating the successful configuration of parameters.
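The tuning loop described above can be sketched as follows. The helper names (reset_observed, tune_alpha), the 0.63 reset fraction taken from the worked example in this section, and the use of SciPy's shapiro, ttest_rel, and wilcoxon are our assumptions about one reasonable implementation, not the authors' code; model(y, alpha) stands for any implementation of Formula (2).

```python
import numpy as np
from scipy.stats import shapiro, ttest_rel, wilcoxon

def reset_observed(obs, pred, frac=0.63):
    """Reset step: negative samples and samples whose amplitude does not
    exceed frac * max amplitude are replaced by the predicted values, so
    only the feature-wave regions remain to be compared."""
    obs = obs.copy()
    mask = (obs < 0) | (np.abs(obs) <= frac * np.abs(obs).max())
    obs[mask] = pred[mask]
    return obs

def tune_alpha(y, model, alphas=np.round(np.arange(0.90, 1.00, 0.01), 2)):
    """Try candidate alpha values at 0.01 precision; accept the first
    whose paired test against the reset observed data gives p >= 0.05."""
    y = np.asarray(y, dtype=float)
    for alpha in alphas:
        pred = model(y, alpha)
        obs = reset_observed(y, pred)
        d = obs - pred
        if np.allclose(d, 0):            # identical pairs: trivially no difference
            return alpha, 1.0
        _, p_norm = shapiro(d)           # normality test on the differences d
        if p_norm >= 0.05:
            _, p = ttest_rel(obs, pred)  # d normal: paired t-test
        else:
            _, p = wilcoxon(obs, pred)   # d non-normal: paired Wilcoxon (the paper's case)
        if p >= 0.05:                    # no significant difference: alpha accepted
            return alpha, p
    return None, None                    # no candidate met the criterion
```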
We utilized clinical ECG data samples from Lead II obtained by Inno-12-U ECG equipment at Zhongshan Hospital, Fudan University, as observed data d 1 . Each sample had a sampling frequency of 500 Hz and a duration of 10 s. Based on the differential analysis, we conducted differential tests under different α values. Initially, we set α to 0.94 as a preset value. Subsequently, after running the algorithm model, we obtained predicted values y 1 . We computed the difference d between d 1 and y 1 , and conducted a normality test on d. The results, as shown in Table 1, indicate a significance level of Sig. (p-value) = 0.000 < 0.05, indicating that data d are not normally distributed.
Since the data were not normally distributed, we needed to conduct a paired sample Wilcoxon test on two sets of data— d 1 and y 1 . The test results are shown in Table 2. The significant test for both data sets shows Sig. (p-value) = 0.000 < 0.05, indicating that y 1 has undergone significant changes, resulting in a significant difference between y 1 and d 1 .
At this point, we cannot confirm whether the setting of parameter α meets the requirements to preserve the main features and trends of d 1 . Therefore, we need to reset d 1 . First, we reset the data points in d 1 whose amplitudes do not exceed 0.63 times the maximum amplitude of the observed data to the corresponding data values in y 1 . The reset observed data are denoted as d 2 . Subsequently, we conducted a Wilcoxon test between d 2 and y 1 . If the resulting p-value is greater than 0.05, it indicates that there is no significant difference between the model-processed data and the reset observed data, thus preserving the main features and trends in the feature waves in the observed data.
To further validate the importance of parameter setting and its impact on the results, we conducted a differential analysis using α values of 0.93 and 0.95 on the two data sets. According to the test results in Table 3, when α = 0.94, p = 0.084 > 0.05, indicating no significant difference between the predicted data and the corresponding reset observed data. However, when α = 0.93 and α = 0.95, the p-values are p = 0.001 and p = 0.000, respectively, both less than 0.01, demonstrating a highly significant difference between the predicted data and the corresponding reset observed data. Therefore, for such data, we set the algorithm parameter α to 0.94.

2.4. Absolute Value Differential Feedback—y2

The algorithm’s other core component, function y 2 , is primarily designed to reflect the slope changes between adjacent data points. By calculating the absolute difference between adjacent data points, this function aims to preserve the trend in feature waveforms in the raw signal. When assessing the magnitude of differences between data points, we choose to ignore their negative direction. This approach makes the model smoother and more capable of effectively handling potential outliers or noise in the data. Through this feedback mechanism based on absolute differences, the model exhibits a buffering effect against data fluctuations, reducing sensitivity to outliers and thereby enhancing the model’s stability. We illustrate the impact of this approach on clinical ECG Lead II signals in Figure 4.
From Figure 4, it is evident that when the algorithm processes data without using absolute values (bottom plot), especially in the presence of significant noise and interference from other feature waveforms in the data, it is more susceptible to negative data influences. This results in the algorithm’s output (bottom plot) significantly deviating from the main trend in the observed data (top plot), while also noticeably reducing the smoothing effect on noise.
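The effect shown in Figure 4 can be reproduced on a toy signal: running the model with a signed differential term instead of the absolute value lets a negative deflection drive the output below the baseline. A sketch using our own toy data, not the clinical signal:

```python
import numpy as np

def model(y, alpha=0.94, use_abs=True):
    """Formula (2) with and without the absolute value on the
    differential term, mirroring the comparison in Figure 4."""
    d = np.empty(len(y))
    d[0] = y[0]
    for n in range(1, len(y)):
        slope = y[n] - y[n - 1]            # unit spacing: y'_n
        fb = abs(slope) if use_abs else slope
        d[n] = alpha * d[n - 1] + fb
    return d

# toy beat with a negative S-wave-like deflection
y = np.array([0.0, 0.1, 0.8, -0.4, 0.0, 0.05, 0.0])
with_abs = model(y, use_abs=True)
without = model(y, use_abs=False)
print((with_abs >= 0).all())   # True: no negative interference remains
print((without < 0).any())     # True: signed feedback lets negatives through
```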

3. Case Study

3.1. Clinical ECG Signals

We validated the algorithmic model using ECG signals as an example. An ECG is a critical diagnostic tool that records the electrical activity of the heart over time. Doctors assess the heart’s health and identify conditions such as arrhythmias, myocardial ischemia, and myocardial infarction by analyzing the feature waveforms of the ECG. ECGs play a pivotal role in clinical diagnosis. The algorithmic model established in this study emphasizes the QRS complex of the ECG waveform while smoothing out the remaining waveforms near the baseline. This provides a solid foundation for the subsequent localization and extraction of ECG feature waveforms. Figure 5 illustrates the results of the study.
We selected 10 typical clinical ECG data samples from Zhongshan Hospital, Fudan University. Each data segment lasts for 10 s, and was collected using clinical Lead II, with a sampling frequency of 500 Hz. The research results are shown in Figure 5. To enhance the clarity of the visualization, we selected samples from the 2000th to the 3500th data points in each data segment to display.
From Figure 5, it can be observed that after the model processing, the amplitude of each data segment remains largely unchanged, successfully removing the negative components of the raw data. The positions of feature waveforms after model processing align closely with those in the raw data, making the start and end points of each waveform more clearly visible. The residual non-feature waveforms exhibit lower oscillation frequencies and smoother waveforms after passing through the model, with oscillations centered around the baseline.

3.2. MIT-BIH ECG Signals

The MIT-BIH Arrhythmia Database is a publicly available ECG database established through collaboration between the Massachusetts Institute of Technology (MIT) and the Beth Israel Hospital (BIH) in Boston. We selected 10 data records with unique IDs from the MLII lead of this database, each recorded at a sampling frequency of 360 Hz. As depicted in Figure 6, to emphasize the results, we visualized segments ranging from the 2000th to the 3500th sampling point of each record.
From Figure 6, it is evident that after model processing, the periodic positions and amplitudes of the targeted feature waveforms in each data segment remain consistent with the raw data. Additionally, other feature waveforms and noise exhibit minor fluctuations near the baseline. Post model processing, the feature waveforms became more prominent with unchanged amplitudes. The periodic positions of these feature waveforms were more distinct, eliminating directional interference and aiding in the precise localization of subsequent feature waveforms. Moreover, the oscillation frequency of non-feature waveforms is significantly reduced, thereby enhancing the model’s ability to resist interference in subsequent data processing.

4. Results and Discussion

In this study, we first analyzed the feature waveform of the ECG signals to identify that their trend is primarily manifested through the slope. Subsequently, we employed a differential logic that was markedly different from typical differential algorithms, as depicted in Figure 7, and mitigated negative interference by taking the absolute value. Finally, we needed to effectively integrate these two steps, considering their interrelationships and impacts. To achieve this, we introduced a parameter to balance these two components. Ultimately, based on the signal characteristics and research objectives, we determined the parameter settings using scientific differential analysis methods. Through case studies, we validated the correctness of our algorithmic logic.
In Figure 7, panel (a) presents clinical electrocardiogram data from Zhongshan Hospital, Fudan University, sampled at 500 Hz for 10 s, with amplitudes ranging from −0.1 to 0.3. Panel (b) displays the results of applying a classic differentiation filter to the ECG signal in (a) [31]. Firstly, the amplitude range shifts from −0.03 to 0.05, indicating significant changes compared to the original signal. Secondly, there are alterations in the shape of the characteristic wave signals. Additionally, the short-duration pulses before the characteristic waves are noticeably enhanced, potentially interfering with the accurate localization of their onsets, thereby reducing accuracy. Moreover, this preprocessing method fails to eliminate negative interference effectively. Lastly, compared to the original signal, the noise is significantly amplified. Panel (c) illustrates the output after the original signal has been processed using Formula (2), successfully overcoming the aforementioned issues and achieving the anticipated goal of initial localization of characteristic wave signals.
The example results demonstrate that the output of the algorithm model effectively reduces fluctuations in other signals near the baseline while preserving the trend and amplitude of the feature waveforms. The recursive part y 1 in Formula (3) determines the current output based on previous output signals, thereby reducing the magnitude of waveform changes near the baseline and minimizing high-frequency interference. Additionally, the absolute value differentiation feedback component y 2 in the algorithm preserves the trend of data slope changes, thereby maintaining the trajectory of the feature waveform variations. In the algorithm, we eliminated negative interference in the differentiation feedback component, ensuring symmetry with the negative direction in the raw data. This helps maintain the accurate positioning of feature waveforms and eliminates negative interference and short-term pulses commonly found at the beginning of feature waveforms, further enhancing the accuracy of waveform onset localization. Therefore, the output of the algorithm model not only preserves the shape of feature waveforms but also reduces interference from non-feature waveforms. Compared to general preprocessing and feature extraction algorithms that may alter the shape and amplitude of signals, our algorithm output is more intuitive and reliable.
Due to the recursive logic we employed, which operates on an accumulation principle, combined with our removal of negative interference and short-term pulses at the front of feature waveforms, we introduced an absolute value operation. However, this absolute value operation, while achieving the aforementioned functionalities, tends to cause the output signal to continuously increase, contradicting our initial goal of accurately positioning feature waveforms. Therefore, to balance the impacts of these factors, we introduced a crucial parameter α to adjust the balance in this integration process.
Based on statistical analysis, we conducted variance and Pearson similarity comparisons between the datasets before and after processing, as shown in Table 4 and Table 5. The validation metrics in Table 4 and Table 5 aim to evaluate the model's ability to suppress non-feature waveforms. After model processing, the variance of the output data decreases, indicating reduced fluctuations and smoother data compared to the raw data. Since the validation data segments are near the baseline, we excluded feature waveform data during the comparisons: based on the maximum position of each feature waveform period and time constraints, we set the values within each period's feature waveform to the mean of the remaining data after removing the feature waveforms. After applying the same processing to both the raw and result data, we calculated and compared their variances. The Pearson correlation coefficients in Table 4 and Table 5 were used to verify the similarity between the raw and result data. This validation focuses on whether the trends in the feature waveforms are consistent. We used the Pearson similarity method to analyze the residual signals after removing non-feature waveforms, based on the parameter settings in Section 2. All obtained correlation coefficients were greater than 0.8, indicating a significant correlation between the result data and the raw data in terms of feature waveforms and confirming that their characteristics are effectively preserved. The differential and consistency analyses show that the algorithmic model successfully maintained the trends in the feature waveforms while reducing interference from non-feature waveforms, resulting in smooth fluctuations near the baseline in the processed output signals. Compared to empirical estimation methods, our proposed parameter setting method significantly enhances reliability and applicability.
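The two validation checks can be sketched as follows; the smoothing stand-in and toy data are our own, and only the metrics themselves (variance before and after, Pearson correlation with the r > 0.8 criterion) follow the paper.

```python
import numpy as np
from scipy.stats import pearsonr

def validation_metrics(raw, out):
    """Sketch of the checks behind Tables 4-5: variance before/after
    (suppression of non-feature fluctuation) and Pearson correlation
    (preservation of the feature-wave trend)."""
    var_raw, var_out = np.var(raw), np.var(out)
    r, _ = pearsonr(raw, out)
    return var_raw, var_out, r

# toy example: a noisy ramp and a smoothed version standing in for model output
rng = np.random.default_rng(1)
raw = np.linspace(0, 1, 200) + rng.normal(0, 0.05, 200)
out = np.convolve(raw, np.ones(5) / 5, mode="same")
v0, v1, r = validation_metrics(raw, out)
print(v1 < v0)   # variance reduced after smoothing
print(r > 0.8)   # trend preserved (the paper reports r > 0.8 throughout)
```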
In the future, we can further explore how to more accurately localize feature waveforms based on the findings of this study, thereby further simplifying the iterative steps and complexity of ECG signal processing systems.

5. Conclusions

This paper proposes a mathematical model based on recursive and differential methods, using statistical differential analysis to set key parameters within the model. In this model, the absolute value function is employed to eliminate negative interference in the differential feedback. The primary aim of the model is to preliminarily locate the period positions of feature waveforms in ECG signals while maintaining their amplitude and trends. Simultaneously, it effectively attenuates interference from other signals, including other feature waveforms, bringing their fluctuations closer to the baseline. This highlights the feature waveform, providing a more reliable foundation for subsequent feature extraction tasks, thereby enhancing the accuracy and efficiency of the extraction process. The study validates the effectiveness of this model using clinical and open-source data.
This algorithm simplifies the preprocessing and feature extraction of ECG signals, reducing their complexity and tedium and thereby enhancing the stability and reliability of subsequent biological signal analysis. Moreover, the algorithm provides a scientific parameter-setting method that does not rely solely on empirical estimation, offering a valuable reference for researchers with limited experience.

Author Contributions

Conceptualization, T.X. and E.H.; methodology, T.X., B.W. and E.H.; software, T.X. and E.H.; validation, T.X., M.L. and C.-C.C.; formal analysis, Y.S.; resources, M.L., Y.D. and Y.S.; data curation, T.X. and E.H.; writing—original draft preparation, T.X.; writing—review and editing, Y.S. and L.Z.; funding acquisition, Y.S. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by the Shanghai Municipal Commission of Economy and Informatization, grant number 2020-RGZN-02052. The APC was funded by Yaojie Sun.

Data Availability Statement

All subjects completed the Exemption Application Form for Informed Consent. The study was conducted in accordance with the Declaration of Helsinki, and the protocol was approved by the Ethics Committee of Zhongshan Hospital Fudan University (Approval No.: B2021-649).

Acknowledgments

We acknowledge the support of the Shanghai Municipal Commission of Economy and Informatization Special Project for Artificial Intelligence Innovation and Development in 2020 and the financial assistance provided by the corresponding author, Yaojie Sun, for this article.

Conflicts of Interest

Author Enruo Huang was employed by the company Shanghai Xiazhi Information Technology Co., Ltd. The remaining authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Figure 1. Overview diagram of the algorithm model. The data in the figure are clinical ECG data for ten typical symptoms randomly selected from Zhongshan Hospital, Fudan University, Shanghai, China. Each segment is 10 s long with a sampling frequency of 500 Hz. Different colors represent the individual data sets, and a randomly selected band from each data set is highlighted with an arrow.
Figure 2. Parameter setting process flowchart.
Figure 3. Parameter setting steps based on differential analysis.
Figure 4. Comparison of differential feedback with absolute values. Top figure: clinical ECG data, 500 Hz, 10 s. Middle figure: output after Formula (2). Bottom figure: output of Formula (2) without absolute values. The red box compares the algorithm's removal of negative interference with the case where negative interference is present.
Figure 5. Clinical ECG data: ten typical conditions. Top image: raw data. Bottom image: data after model processing. The red box contrasts the initial positioning of feature waveforms before and after algorithmic processing, while the pink circle highlights the comparison of non-characteristic waveforms before and after processing. Subfigures (a–j) depict data randomly selected from ten typical disease categories in the clinical dataset.
Figure 6. Ten different ECG data samples from MIT-BIH Database Lead II. Top image: MIT-BIH data; bottom image: data after model processing. (a) from the 90th second; (b) from the 990th second; (c) from the 790th second; (d) from the 900th second; (e) from the 790th second; (f) from the 680th second; (g) from the 480th second; (h) from the 400th second; (i) from the 270th second; (j) from the 200th second. The red box contrasts the initial positioning of feature waveforms before and after algorithmic processing, while the pink circle highlights the comparison of non-characteristic waveforms before and after processing.
Figure 7. Comparison with typical differential-based analysis in ECG. The short-term pulses before characteristic waveforms remain prominent, indicated by the red arrow, while non-characteristic waveforms also remain noticeable, indicated by the black arrow. Subfigures (ac) respectively represent the original data, results after typical differential filtering algorithm processing, and results after processing with the algorithm proposed in this study.
Table 1. Test of normality for the residuals of observational and predicted values.

             Kolmogorov–Smirnov a                  Shapiro–Wilk
             Statistic    df      Sig.             Statistic    df      Sig.
D_value      0.163        5000    0.000            0.810        5000    0.000

a Lilliefors significance correction. D_value: d1 − y1, α = 0.94.
Table 2. Result of paired-sample Wilcoxon test for two groups of data d1 and y1. d1: raw data; y1: output data from the algorithm model when α = 0.94.

y1 − d1                      N          Mean Rank    Sum of Ranks
Negative Ranks               192 a      347.70       66,759.00
Positive Ranks               4791 b     2577.93      12,350,877.00
Ties                         17 c
Total                        5000
Z                            −60.480 d
Asymp. Sig. (2-tailed)       0.000

a y1 < d1. b y1 > d1. c y1 = d1. d Based on negative ranks. y1 − d1: the difference between the output processed by the algorithm and the original data.
Table 3. Result of paired-sample Wilcoxon test for the predicted data and the reset observational data corresponding to different values of α.

                                N          Mean Rank    Sum of Ranks
α = 0.94   Negative Ranks       52 a       48.47        2520.50
           Positive Ranks       59 b       62.64        3695.50
           Ties                 4889 c
           Total                5000
α = 0.95   Negative Ranks       27 d       26.59        718.00
           Positive Ranks       83 e       64.90        5387.00
           Ties                 4890 f
           Total                5000
α = 0.93   Negative Ranks       70 g       60.04        4203.00
           Positive Ranks       41 h       49.10        2013.00
           Ties                 4889 i
           Total                5000

Predicted data − reset raw data     α = 0.94     α = 0.95     α = 0.93
Z                                   −1.729 j     −6.963 j     3.222 k
Asymp. Sig. (2-tailed)              0.084        0.000        0.001

a Predicted data α = 0.94 < reset raw data α = 0.94. b Predicted data α = 0.94 > reset raw data α = 0.94. c Predicted data α = 0.94 = reset raw data α = 0.94. d Predicted data α = 0.95 < reset raw data α = 0.95. e Predicted data α = 0.95 > reset raw data α = 0.95. f Predicted data α = 0.95 = reset raw data α = 0.95. g Predicted data α = 0.93 < reset raw data α = 0.93. h Predicted data α = 0.93 > reset raw data α = 0.93. i Predicted data α = 0.93 = reset raw data α = 0.93. j Based on negative ranks. k Based on positive ranks.
Table 4. Consistency and variability analysis results for 10 typical clinical ECG conditions. Raw: raw data; Result: output signals after model processing.

Example data               (a)                        (b)                        (c)                        (d)                        (e)
                           Raw          Result        Raw          Result        Raw          Result        Raw          Result        Raw          Result
Variance                   6.32 × 10⁻⁴  2.21 × 10⁻⁴   1.20 × 10⁻⁴  6.23 × 10⁻⁵   4.18 × 10⁻³  3.28 × 10⁻⁴   3.63 × 10⁻³  2.28 × 10⁻⁴   1.10 × 10⁻³  1.26 × 10⁻⁴
Correlation coefficient    0.996                      0.999                      0.995                      0.996                      0.956

Example data               (f)                        (g)                        (h)                        (i)                        (j)
                           Raw          Result        Raw          Result        Raw          Result        Raw          Result        Raw          Result
Variance                   3.60 × 10⁻⁵  2.34 × 10⁻⁵   4.21 × 10⁻³  3.01 × 10⁻⁵   6.04 × 10⁻⁴  6.89 × 10⁻⁷   2.41 × 10⁻³  1.29 × 10⁻⁴   2.58 × 10⁻³  2.20 × 10⁻⁵
Correlation coefficient    0.933                      0.984                      0.990                      0.997                      0.990
Table 5. Consistency and variability analysis results for 5 sets from the MIT-BIH database.

Example data               (a)                        (b)                        (c)                        (d)                        (e)
                           Raw          Result        Raw          Result        Raw          Result        Raw          Result        Raw          Result
Variance                   2.68 × 10⁻²  3.07 × 10⁻⁴   8.66 × 10⁻²  5.64 × 10⁻²   4.82 × 10⁻³  9.84 × 10⁻⁴   2.60 × 10⁻²  9.28 × 10⁻⁴   6.80 × 10⁻³  2.15 × 10⁻³
Correlation coefficient    0.998                      0.994                      0.998                      0.997                      0.999

