Optimal Heart Sound Segmentation Algorithm Based on K-Mean Clustering and Wavelet Transform

Xu, Xingchen; Geng, Xingguang; Gao, Zhixing; Yang, Hao; Dai, Zhiwei; Zhang, Haiying

doi:10.3390/app13021170


by Xingchen Xu ¹,², Xingguang Geng ¹,², Zhixing Gao ¹,², Hao Yang ¹,², Zhiwei Dai ¹,² and Haiying Zhang ¹,²,*

¹ Institute of Microelectronics of the Chinese Academy of Sciences, Beijing 100029, China

² University of Chinese Academy of Sciences, Beijing 100049, China

* Author to whom correspondence should be addressed.

Appl. Sci. 2023, 13(2), 1170; https://doi.org/10.3390/app13021170

Submission received: 23 October 2022 / Revised: 8 December 2022 / Accepted: 11 January 2023 / Published: 15 January 2023


Abstract:

The accurate localization of S1 and S2 is essential for heart sound segmentation and classification. However, current direct heart sound segmentation algorithms have poor noise immunity and low accuracy. Therefore, this paper proposes a new optimal heart sound segmentation algorithm based on K-means clustering and Haar wavelet transform. The algorithm includes three parts. Firstly, this method uses the Viola integral method and Shannon’s energy-based algorithm to extract the function of the envelope of the heart sound energy. Secondly, the time–frequency domain features of the acquired envelope are extracted from different dimensions and the optimal peak is searched adaptively based on a dynamic segmentation threshold. Finally, K-means clustering and Haar wavelet transform are implemented to localize S1 and S2 of heart sounds in the time domain. After validation, the recognition rate of S1 reached 98.02% and that of S2 reached 96.76%. The model outperforms other effective methods that have been implemented. The algorithm has high robustness and noise immunity. Therefore, it can provide a new method for feature extraction and analysis of heart sound signals collected in clinical settings.

Keywords:

cardiovascular diseases; feature extraction; localization; PCG; segmentation; integral waveform

1. Introduction

Heart disease is the leading cause of death worldwide. Cardiovascular disease is the number one causative factor threatening human life and health [1]. Heart sounds carry early pathological information about cardiovascular diseases [2], which is important for early warning and detection of cardiovascular diseases. Each contraction and relaxation of the heart constitutes one cardiac cycle. A normal heart sound is divided into the first heart sound (S1) and the second heart sound (S2) in each cardiac cycle, as shown in Figure 1. By extracting the features of S1 and S2, early warning and diagnostic modeling of cardiovascular diseases can be achieved [3]. How to accurately segment the heart sound signals is the first and most important step in heart sound modeling. However, heart sound segmentation is a difficult problem due to the non-stationarity of heart sounds and the influence of background noise (including real-world noise and other artificial noise) [4]. In addition, due to the physiological and pathological information in the cardiac cycle, in some pathological PCG signals it is challenging to locate both S1 and S2 [5].

At present, studies on heart sound localization using segmentation are mainly divided into two types: labeled segmentation and unlabeled segmentation. With labeled segmentation, other more mature synchronization signals are usually used as reference signals. Heart sound segmentation can be achieved by correlating heart sound signals according to their time-domain or time–frequency-domain feature labels in the mature signals. For example, El-Segaier et al. proposed to segment heart sounds with reference to corresponding ECG signals [6]. On this basis, Springer et al. and Min et al. used the relationship between the R-peak of the ECG signal and S1 in the heart sound to implement localization of S1 and S2 [7,8]. This method required acquisition of multiple signals, which resulted in certain limitations in practice. Unlabeled segmentation analyzes the heart sound directly, and does not depend on a reference signal. It is the mainstream analysis method at present. It mainly extracts the heart sound envelope based on time domain or frequency domain. The time-domain features of S1 and S2 have been classified and detected. Kurtosis of Hilbert envelopes of heart sound signals were analyzed using a zero-frequency filter to locate PCG signals [9]. Akram et al. proposed a homomorphic envelope method to extract the envelope features of PCG signals [10]. Shivhare et al. extracted the heart sound envelopes using Shannon energy and S-transform [11]. Robustness of detection and classification using stationary wavelet transform and Hilbert phase envelope was used to verify heart sounds and murmurs under normal and pathological conditions [12]. The team also proposed PCG signal decomposition based on empirical wavelet transform to distinguish heart sounds and heart murmurs [13]. The main objective of the above-mentioned study using envelope extraction methods was to highlight the basic heart sound and minimize the noise, which resulted in good accuracy in case of clean PCG signals. 
However, these algorithms suffered from a high number of burrs in the extracted envelopes and had poor noise immunity. In addition, an unsupervised low-complexity algorithm for detecting S1 and S2 using empirical mode decomposition (EMD) was proposed. In this technique, PCG signals were decomposed into some specific functions called intrinsic mode functions (IMF), to extract the basic characteristics of the signal in the time domain [14]. On this basis, ensemble empirical mode decomposition (EEMD) combined with kurtosis was used to identify heart sound features [15]. Banerjee et al. used variational mode decomposition combined with Shannon energy to analyze PCG signals [16]. The main limitation of this technology was the choice of IMF for heart sound segmentation under different types of spectral characteristics such as heart murmurs, background noise and so on. Furthermore, Zaeemzadeh et al. used Shannon energy and K-means clustering to segment heart sounds [17]. We noticed that K-means is a non-deterministic, unsupervised, numerical, iterative method of clustering [18]. In K-means each cluster is represented by the mean value of objects in the cluster. A group of n objects is divided into k clusters so that intercluster similarity is low and intracluster similarity is high. Similarity is measured in terms of mean value of objects in a cluster. K-means clustering can be used to determine the spacing between S1 and S2 and between S2 and S1 of a normal cardiac cycle.

From these, a new optimal heart sound segmentation algorithm based on K-means clustering and wavelet transform is proposed in this paper. The algorithm consists of three parts. Firstly, the envelope energy function of PCG signals is calculated using the Viola integral method and Shannon’s energy-based algorithm. Secondly, the time–frequency domain feature is used to obtain the optimal peak. Finally, K-means clustering is used to obtain the standard time interval. Haar wavelet transform is used to detect the error point. Then the localization of S1 and S2 is achieved.

2. Methods

The flowchart in Figure 2 illustrates the operating principle of our heart sound segmentation method. Firstly, PCG signals were band-pass filtered. Next, this method used the Viola integral method and Shannon’s energy-based algorithm to calculate the function of the envelope of the heart sound energy. The maximum or peak in the energy curve indicates the likelihood of a heart sound event. The dynamic threshold method was used to obtain the optimal peak value so that we could preliminarily detect the positions of S1 and S2. In view of the inevitably large number of missed and false detections in the preliminary detection, the error correction algorithm characterized by time-domain spacing was designed. Finally, in this study we used K-means clustering and Haar wavelet transform in data processing to segment heart sound signals. Automatic and accurate positioning of S1 and S2 was realized.

2.1. Pre-Processing

Before further processing, a fourth-order Butterworth bandpass filter was used on the signal to retain the PCG signal with a frequency between 20 and 400 Hz, since the main information of the heart sound is concentrated in the low frequency.

2.2. Envelope Extraction Based on Viola Integral and Shannon Energy

The envelope energy function of PCG signals was calculated using the Viola integral method and Shannon’s energy-based algorithm. This method mainly consists of two parts. The first part uses the Viola integral to extract the envelope of the pre-processed audio signal. The second part uses first-order Shannon energy to highlight the low amplitude peak. Compared with the envelope obtained using a Hilbert envelope, we obtained a smoother heart sound envelope using this method.

The Viola integral method was first applied in face detection and recognition. Viola et al. proposed a fast detection algorithm based on an adaptive method and successfully applied it to the fast detection of images [19]. The algorithm is simple, fast, and real-time. Based on this, Yan et al. applied this method to heart sound analysis and achieved good results [20].

We used the Viola integral to extract the envelopes of heart sounds, which requires a time scale L_{T} that is related to the duration of S1 and S2. Numerous studies report that the duration of S1 is usually greater than 0.1 s, so L_{T} can be estimated as

L_{T} = 0.5 \times 0.1 \times fs

where fs is the sampling frequency of the PCG signal. The mean sequence of the signal, {\bar{X}}_{T} (m), is expressed as

{\bar{X}}_{T} (m) = \frac{1}{2 L_{T} + 1} \sum_{k = m - L_{T}}^{m + L_{T}} X_{T} (k)

(1)

The Viola integral envelope is calculated using

E_{T} (m) = \frac{1}{2 L_{T} + 1} \sum_{k = m - L_{T}}^{m + L_{T}} {[X_{T} (k) - {\bar{X}}_{T} (m)]}^{2}

(2)

where m = L_{T}, L_{T} + 1, \dots, M - 1 - L_{T}, and M is the number of samples in the signal.

The normalized sequence is the new envelope E (m), which is defined by

E (m) = \frac{E_{T} (m)}{\max (| E_{T} (m) |)}

(3)
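Equations (1) to (3) amount to a normalized moving variance, which can be sketched in a few lines of plain Python. This is a minimal illustration rather than the authors' implementation: the function name `viola_envelope` and its `s1_duration` parameter are hypothetical, and the window is assumed to be symmetric around m with 2 L_T + 1 samples, as the 1 / (2 L_T + 1) factor implies.

```python
def viola_envelope(x, fs, s1_duration=0.1):
    """Normalized moving-variance envelope in the spirit of Equations (1)-(3)."""
    lt = round(0.5 * s1_duration * fs)      # time scale L_T in samples
    width = 2 * lt + 1
    env = []
    for m in range(lt, len(x) - lt):        # m = L_T, ..., M - 1 - L_T
        window = x[m - lt : m + lt + 1]
        mean = sum(window) / width                                 # Equation (1)
        env.append(sum((v - mean) ** 2 for v in window) / width)   # Equation (2)
    peak = max(abs(v) for v in env)
    # Equation (3): scale so that the largest value equals 1.
    return [v / peak for v in env] if peak > 0 else env


# Hypothetical test signal: silence, a burst of oscillation, silence.
signal = [0.0] * 20 + [1.0, -1.0] * 10 + [0.0] * 20
envelope = viola_envelope(signal, fs=100)
```

The loop recomputes every window from scratch for readability; a running (integral) sum would give the same values in linear time.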

Shannon energy attenuates high-intensity components so that low-amplitude peaks stand out [21]. The Shannon energy formula of order n can be written as Equation (4). Here we chose the first order. According to Shannon’s energy formula, when X (t) = 1, \log X (t) = 0, so the energy falls to zero exactly at the maximum of the normalized envelope. This results in a double-peak problem. To eliminate it, the sequence X (t) is obtained by multiplying the new envelope by the empirical parameter c = 0.7 before Shannon’s energy formula is applied.

X (t) = c \times E (m)

E (t) = - X {(t)}^{n} \log X {(t)}^{n}

(4)

To eliminate small spikes, the Shannon energy is converted into the normalized average Shannon energy, which is called the Shannon energy envelope (SEE) [22]. The width of the smoothing window, N, is the number of samples in 20 ms.

E_{s} (t) = \frac{1}{N} \sum_{i = 1}^{N} E (t) \log E (t)

(5)

After normalization, it is shown as

P (t) = \frac{E_{s} (t) - Mean (E_{s} (t))}{Std (E_{s} (t))}

(6)
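The scaling, Shannon energy, smoothing and normalization steps can be sketched as follows. Several choices here are assumptions rather than details taken from the paper: the function name is hypothetical, the smoothing is a trailing moving average of the first-order Shannon energy (the usual definition of average Shannon energy, which differs from the summand printed in Equation (5)), and 0 log 0 is taken as 0.

```python
import math


def shannon_energy_envelope(env, fs, c=0.7, window_s=0.02):
    """Normalized average Shannon energy of a [0, 1] envelope."""
    x = [c * e for e in env]                              # X(t) = c * E(m)
    # First-order Shannon energy; 0 * log(0) is treated as 0 (assumption).
    energy = [-v * math.log(v) if v > 0 else 0.0 for v in x]
    n = max(1, round(window_s * fs))                      # N samples in 20 ms
    smooth = []
    for t in range(len(energy)):
        seg = energy[max(0, t - n + 1) : t + 1]           # trailing window (assumption)
        smooth.append(sum(seg) / len(seg))
    mean = sum(smooth) / len(smooth)
    std = math.sqrt(sum((v - mean) ** 2 for v in smooth) / len(smooth))
    # Equation (6): zero mean, unit standard deviation.
    if std == 0:
        return [0.0] * len(smooth)
    return [(v - mean) / std for v in smooth]


# Hypothetical envelope with four identical triangular bursts.
p = shannon_energy_envelope([0.0, 0.5, 1.0, 0.5, 0.0] * 4, fs=100)
```

With c = 0.7 the energy at an envelope value of 1 is -0.7 log 0.7, which is positive, so the peak no longer collapses to zero.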

Figure 3 compares the original signal with the extracted envelope. Compared with the Hilbert envelope, the improved Viola integral yields a smoother heart sound envelope.

2.3. Adaptive Optimal-Peak Finding Based on Dynamic Spacing and Dynamic Threshold Segmentation

Adaptive optimal-peak searching based on dynamic threshold segmentation included three parts. The first part determined the initial step length, i.e., the cardiac period implied by the heart rate, by applying the Fourier transform to the envelope after a procedure similar to empirical mode decomposition (peak detection followed by third-order spline interpolation). The second part obtained the dynamic threshold by extracting the upper and lower envelopes. The third part determined the optimal peak points iteratively: starting from the initial step length, the optimal peak points were obtained using the dynamic distance and the dynamic threshold.

On the one hand, the extracted envelope was processed using a method similar to empirical mode decomposition to obtain the processed signal X (n). In the time domain, the interval between one S1 and the next S1 is an instantaneous period of the PCG signal. Using the fast Fourier transform (FFT), the fundamental frequency of the PCG signal envelope was obtained and the initial step length T_{ini} was determined.
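A rough sketch of how an initial step length could be read from the spectrum of the envelope is shown below. It is an illustration under stated assumptions: a direct discrete Fourier transform is used instead of an FFT to stay within the standard library, the function name is hypothetical, and the search band `f_min` to `f_max` (0.5 to 3 Hz, about 30 to 180 beats per minute) is an assumed bound that the paper does not specify.

```python
import cmath
import math


def initial_step_length(envelope, fs, f_min=0.5, f_max=3.0):
    """Return the dominant period of the envelope, in samples."""
    n = len(envelope)
    mean = sum(envelope) / n
    centered = [v - mean for v in envelope]     # remove the DC component
    best_k, best_mag = None, -1.0
    for k in range(1, n // 2):
        freq = k * fs / n                       # frequency of bin k in Hz
        if not (f_min <= freq <= f_max):
            continue
        coeff = sum(v * cmath.exp(-2j * math.pi * k * i / n)
                    for i, v in enumerate(centered))
        if abs(coeff) > best_mag:
            best_k, best_mag = k, abs(coeff)
    if best_k is None:
        raise ValueError("no frequency bin inside the search band")
    return n / best_k                           # period T_ini in samples


# Hypothetical envelope: a 1.2 Hz oscillation sampled at 50 Hz for 10 s.
fs = 50
env = [1.0 + math.cos(2 * math.pi * 1.2 * i / fs) for i in range(500)]
t_ini = initial_step_length(env, fs)
```

Dividing `t_ini` by `fs` gives the period in seconds, about 0.83 s for this synthetic envelope.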

On the other hand, for the extracted envelope, the upper envelope and lower envelope were extracted using a similar empirical mode decomposition. The upper and lower envelopes were averaged to obtain the dynamic threshold, as shown in Figure 4. The purple line represents the dynamic threshold.

We set the minimum distance between two peaks as T_{\min}. Only peak points whose amplitudes were greater than the dynamic threshold were retained, so that points with relatively small envelope peaks (i.e., false positive points) were eliminated.

J (P (i)) = \begin{cases} T_{\min} = T_{ini} \times 0.6 \\ P_{x} (i + 1) - P_{x} (i) > T_{\min} \\ P_{y} (i) > Thre \end{cases}

(7)
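The conditions in Equation (7) can be illustrated with a small peak selector. This is a sketch, not the paper's code: `select_peaks` is a hypothetical name, the threshold is passed as one value per sample, and the rule of keeping the taller of two peaks that are closer than T_min is an assumption, since the text does not say which one is discarded.

```python
def select_peaks(envelope, threshold, t_ini, ratio=0.6):
    """Indices of local maxima above the threshold and more than T_min apart."""
    t_min = ratio * t_ini                           # T_min = 0.6 * T_ini
    candidates = [
        i for i in range(1, len(envelope) - 1)
        if envelope[i - 1] < envelope[i] >= envelope[i + 1]   # local maximum
        and envelope[i] > threshold[i]                         # P_y(i) > Thre
    ]
    kept = []
    for i in candidates:
        if not kept or i - kept[-1] > t_min:        # spacing condition
            kept.append(i)
        elif envelope[i] > envelope[kept[-1]]:
            kept[-1] = i                            # keep the taller peak (assumption)
    return kept


# Hypothetical envelope with five bumps, one of them below the threshold.
env = [0.0] * 30
env[3], env[5], env[8], env[15], env[26] = 1.0, 2.0, 0.3, 1.5, 1.2
peaks = select_peaks(env, threshold=[0.5] * 30, t_ini=10)
```

Here the bump at index 8 is rejected by the threshold and the one at index 3 loses to its taller neighbor at index 5.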

In the formula, P_{x} is the abscissa of the envelope peak point, P_{y} is the ordinate of the envelope peak point, P_{y} (i) is a local maximum, and Thre is the dynamic threshold. The method stated above is based on the selection of a dynamic amplitude threshold. Next, dynamic segmentation in time was used to search for peaks. Starting from the previous peak point, the time interval T_{dy} was obtained from the difference between the abscissas of the peak points, and scaling it gave the dynamic time offset T_{off}. Finally, the dynamic time offset was used to find the maximum and second-maximum peaks in each quasi-periodic segmentation interval. These peak points are the initial S1 and S2.

J (P (i)) = \begin{cases} T_{dy} = diff (P_{x} (i)) \\ T_{off} = Para \times T_{dy} \\ P_{y} (i) > Thre \end{cases}

(8)

The parameter Para in the formula was set to 0.3 and to 0.5 to obtain two different dynamic spacings in the peak search, and the resulting peak points were then merged. This had the advantage of preventing omissions and improving accuracy.

In a PCG signal, envelope peak points are easily missed because of artificial jitter or low-frequency noise interference. Here we propose an algorithm for handling missing points. The average spacing \bar{T} between the most recently obtained peak points is calculated. The upper threshold is defined as T_{u} = 1.3 \times \bar{T}. When the distance between two envelope peaks is greater than T_{u}, missing points exist. The lower threshold is defined as T_{l} = 0.3 \times \bar{T}. When the distance between two envelope peaks is less than T_{l}, redundant points exist.
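The two spacing thresholds can be applied to a list of peak positions as in the sketch below. The function name and the string labels are hypothetical, and the lower threshold is taken to be 0.3 times the average spacing, paired with the upper threshold of 1.3 times the average spacing.

```python
def classify_gaps(peaks, upper=1.3, lower=0.3):
    """Label each spacing between consecutive peaks."""
    if len(peaks) < 2:
        raise ValueError("need at least two peaks")
    gaps = [b - a for a, b in zip(peaks, peaks[1:])]
    t_bar = sum(gaps) / len(gaps)                   # average spacing
    t_u, t_l = upper * t_bar, lower * t_bar         # upper and lower thresholds
    labels = []
    for g in gaps:
        if g > t_u:
            labels.append("missing")                # a peak was probably skipped
        elif g < t_l:
            labels.append("redundant")              # an extra peak was detected
        else:
            labels.append("ok")
    return t_bar, labels


# Hypothetical peak positions in samples.
t_bar, labels = classify_gaps([0, 10, 20, 22, 30, 40, 60, 70])
```

In this example the average spacing is 10 samples, so the spacing of 2 is flagged as redundant and the spacing of 20 as missing.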

The third iteration is complete at this point. \bar{T} was then updated at each further iteration until the average distance between peak points was the same as in the previous iteration. Eventually, a series of optimal envelope peak points was obtained. To prevent an endless loop, the maximum number of iterations was set to six based on our experience. The next step was to classify each peak point as S1 or S2.

2.4. Segmentation Algorithm Based on K-Mean Clustering and Haar Wavelet Transform

The ideas mentioned above merely obtain the peak points of S1 and S2, but do not classify the types of S1 and S2. The following approach is proposed to classify the peak points using the time-domain feature. A PCG signal is a quasi-periodic signal according to PCG signal characteristics. In a normal cardiac cycle, the distance from S1 to S2 is invariably larger than that from S2 to S1. In this study, K-means clustering was used to determine the spacing between S1 and S2 and between S2 and S1 of a normal cardiac cycle. Moreover, Haar wavelet transform was used to remove error points.

Based on previous work, K-means clustering was used to classify the distances S (i) between peak points into four categories: the spacing from S1 to S2, the spacing from S2 to S1, the spacing with missing points, and the spacing with redundant points. If there were missing points, the spacing was larger than normal. For the same reason, if there were redundant points, the spacing was smaller than normal. The four cluster centers were found by minimizing the Euclidean distance between each spacing and its cluster center.

Here, the centers of the two largest clusters were taken as the distances from S1 to S2 and S2 to S1. The distances were regarded as masks.

The distance S (i) between peak points is defined by

S (i) = diff (P)

(9)

The K-means clustering procedure is described by the following formula:

\begin{cases} dis (c_{k}) = \sum_{S_{i} \in c_{k}} {| S_{i} - \mu_{k} |}^{2} \\ S_{k} (i) = \min (dis (c_{k})) \\ \mu_{k} = mean (S_{k} (i)) \end{cases}

(10)
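The assign-and-update loop behind Equation (10) can be sketched for one-dimensional spacings as follows. This is a generic Lloyd-style K-means, not the authors' implementation; the function name, the deterministic initialization from the sorted data, and the iteration cap are assumptions made for the example.

```python
def kmeans_1d(values, k=4, max_iter=100):
    """Cluster scalar spacings into k groups; returns (centers, labels)."""
    ordered = sorted(values)
    # Deterministic start: spread the initial centers over the sorted data (assumption).
    centers = [ordered[(2 * j + 1) * len(ordered) // (2 * k)] for j in range(k)]
    labels = [0] * len(values)
    for _ in range(max_iter):
        # Assignment step: nearest center by squared Euclidean distance.
        labels = [min(range(k), key=lambda j: (v - centers[j]) ** 2) for v in values]
        # Update step: each center becomes the mean of its members.
        new_centers = []
        for j in range(k):
            members = [v for v, lab in zip(values, labels) if lab == j]
            new_centers.append(sum(members) / len(members) if members else centers[j])
        if new_centers == centers:
            break
        centers = new_centers
    return centers, labels


# Hypothetical spacings in seconds: two regular groups, two short gaps, two long gaps.
spacings = [0.30, 0.31, 0.29, 0.32, 0.50, 0.52, 0.49, 0.51, 0.05, 0.06, 1.0, 1.05]
centers, labels = kmeans_1d(spacings)
counts = [labels.count(j) for j in range(4)]
# The two most populated clusters are taken as the regular spacings.
regular = sorted(centers[j] for j in sorted(range(4), key=lambda j: -counts[j])[:2])
```

Selecting the two most populated clusters mirrors the statement that the centers of the two largest clusters are used as the S1 to S2 and S2 to S1 distances.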

A drawing of the K-means clustering effect is shown in Figure 5. The four colors represent four chosen categories. Black dots represent their respective clustering centers. Red cross-symbols are redundant anchor points. Magenta squares are the category of spacing from S1 to S2. Blue asterisks and purple rhombi are the types of spacing from S2 to S1. There are redundant points in the diagram but no missing points.

Then we filtered the envelope peaks whose distances between peak points were within the range of clustering and labeled them. For example, S1 was label 1 and S2 was label 2. There were two solutions for the remaining peak points that were not within this range. First, if the number between two peak points was even or odd with regularity, we needed to fill in the label. Second, in case the data was not ideal, the method for finding error peak points was based on wavelet transform. We divided the odd and even peak point intervals into a column array each. If this column array appeared with a large step interval, it indicated an adverse event resulting in a descending or ascending platform when there was an error point. The high platform was the interval between S1 and S2, and the low platform was the interval between S2 and S1. The high platform and the low platform were symmetric. In this work we used Haar wavelet transform to acquire the step points of the high and low platforms, namely the error points.

Firstly, we applied a smooth fit to the discrete points to get a curve f (t). Next, a discrete wavelet transform with the Haar function was used to extract the approximate curve. Finally, the mutation point locations were calculated using the Haar continuous wavelet transform. Among the many orthogonal functions, the Haar wavelet function is simple to construct and easy to calculate.

In this paper, the fifth-level Haar discrete wavelet transform was selected to obtain the fifth-level approximation, A_{5}, which can accentuate the mutation position of the curve f (t). Ultimately, the step position of curve A_{5} was obtained using the Haar continuous wavelet transform with a scale of 2⁵ and a step size of 2. We then selected the correlation coefficient at a scale of 3, because the relevance was the largest and the step point was the most obvious at scale 3. At this point, we had detected the error points and completed the labels of the peak points.
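The idea of locating a step with a Haar wavelet can be shown with a much simplified sketch. It computes only a single-scale Haar response (the sum of the next `scale` samples minus the sum of the previous `scale` samples) directly on the interval sequence, and it skips the smooth fit and the fifth-level approximation described above. The function name is hypothetical and the default scale of 3 simply echoes the scale mentioned in the text.

```python
def haar_step_position(values, scale=3):
    """Return (index, response) of the strongest step at one Haar scale."""
    best_t, best_w = None, 0.0
    for t in range(scale, len(values) - scale + 1):
        left = sum(values[t - scale : t])       # negative half of the Haar wavelet
        right = sum(values[t : t + scale])      # positive half of the Haar wavelet
        w = abs(right - left) / (2 * scale) ** 0.5
        if w > best_w:
            best_t, best_w = t, w
    return best_t, best_w


# Hypothetical interval sequence: a high platform followed by a low platform.
intervals = [0.5] * 8 + [0.3] * 8
step_index, strength = haar_step_position(intervals)
```

The response is largest where the window straddles the platform change, so `step_index` points at the first sample of the low platform; a constant sequence returns `None`.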

When the entire classification was completed, the cycle was interrupted and the iteration finished.

2.5. Boundary Detection

After the peak points were detected, the boundaries of S1 and S2 were located by moving forward and backward from each peak point along the horizontal axis, within 0.3 s, until the amplitude fell below 20% of the peak amplitude. The boundary did not exceed 0.3 s from the peak. If no point with an amplitude below 20% of the peak was found, the point with the smallest amplitude within the range was selected as the boundary. Figure 6 shows the localization of S1 and S2 in one heart sound cycle, together with the durations of S1 and S2. The envelope was extracted from the PCG signal using the Viola integral method and Shannon’s energy-based algorithm.
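The boundary rule can be sketched as a bounded search in both directions from a peak. The function name is hypothetical, and the sketch assumes a non-negative envelope with a positive peak so that 20% of the peak amplitude is a meaningful level.

```python
def find_boundaries(envelope, peak, fs, fraction=0.2, max_s=0.3):
    """Return (start, end) indices of the sound around one envelope peak."""
    limit = round(max_s * fs)                   # at most 0.3 s in each direction
    level = fraction * envelope[peak]           # 20% of the peak amplitude

    def search(step):
        if step < 0:
            stop = max(0, peak - limit)
        else:
            stop = min(len(envelope) - 1, peak + limit)
        idxs = range(peak + step, stop + step, step)
        for i in idxs:
            if envelope[i] < level:
                return i                        # first point below the level
        # Fallback: the smallest amplitude inside the search range.
        return min(idxs, key=lambda i: envelope[i]) if len(idxs) else peak

    return search(-1), search(+1)


# Hypothetical envelope with a single peak at index 4.
env = [0.0, 0.1, 0.5, 0.8, 1.0, 0.7, 0.3, 0.15, 0.05, 0.0]
start, end = find_boundaries(env, peak=4, fs=100)
```

The duration of the sound is then `(end - start) / fs` seconds.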

2.6. Complexity of the Proposed Algorithm

Compared with neural networks, our method performed better because it did not suffer from over-fitting or increased complexity. The computational complexity of the algorithm for extracting the envelope is O (n^{2} + mn + 1). For the K-means clustering algorithm, computation of the distance between two attribute values takes O (m^{2} n + m^{2} S^{3}) steps, and for each iteration the order of computations is O (nkm_{r} + nkm_{c} S), where n is the total number of elements, m is the total number of attributes, m_{r} is the number of numeric attributes, m_{c} is the number of categorical attributes, S is the average number of distinct categorical values, and k is the number of clusters. If there are p iterations, the computational cost of this algorithm is O (m^{2} n + m^{2} S^{3} + pn (km_{r} + km_{c} S)), which is linear with respect to the number of data objects. Finally, the forward hierarchical Haar wavelet transform is O (n) and the inverse wavelet transform is O (n^{2}). Summing these terms, the complexity of the proposed algorithm is O (2 n^{2} + (m^{2} + m + 1) n + 1 + m^{2} S^{3} + pn (km_{r} + km_{c} S)). By comparison, a decision-tree algorithm is O (n^{m}), which grows exponentially with the number of attributes m and is an order of magnitude higher than K-means clustering. In this way, the proposed method reduces the computational complexity compared with other algorithms.

3. Results

3.1. Dataset

The dataset used in this study is from the PhysioNet/Computing in Cardiology Challenge 2016 training folder. This dataset is the MIT heart sound database. A total of 409 PCG recordings were made at nine different recording positions and orientations from 121 subjects. Each subject contributed several recordings. The subjects were divided into five groups: (1) normal control; (2) murmurs relating to mitral valve prolapse (MVP); (3) innocent or benign murmurs (Benign); (4) aortic disease (AD); and (5) other miscellaneous pathological conditions (MPC). These recordings were performed in an uncontrolled environment. More specifically, they were either performed during in-home visits or in the hospital. Furthermore, they also included stethoscope movements, breathing and intestinal sounds [23].

3.2. Evaluation Criteria

To evaluate the performance of the methods, the sensitivity Se and positive predictive value PPV were used as evaluation criteria. Sensitivity is the proportion of all positive cases that are correctly identified, whereas the positive predictive value is the proportion of true positive cases among all cases detected as positive. The overall performance of the algorithm is measured via the accuracy Acc. The respective formulas are defined by

\begin{matrix} Se = \frac{TP}{TP + FN} \times 100 \% \\ PPV = \frac{TP}{TP + FP} \times 100 \% \\ Acc = \frac{TP}{TP + FP + FN} \times 100 \% \end{matrix}

(11)
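Equation (11) translates directly into code. The counts in the example are hypothetical and are not results from the paper.

```python
def segmentation_metrics(tp, fp, fn):
    """Sensitivity, positive predictive value and accuracy, in percent."""
    se = 100.0 * tp / (tp + fn)
    ppv = 100.0 * tp / (tp + fp)
    acc = 100.0 * tp / (tp + fp + fn)
    return se, ppv, acc


# Hypothetical counts: 90 correct detections, 5 false detections, 5 misses.
se, ppv, acc = segmentation_metrics(tp=90, fp=5, fn=5)
```

Because true negatives are not defined for event detection, the accuracy here has only TP in the numerator, as in Equation (11).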

In the formulas, true positives (TP) are the S1 and S2 sounds that are correctly detected in the PCG signals, false positives (FP) are the S1 and S2 detections that are false, and false negatives (FN) are the S1 and S2 sounds in the heart sounds that the algorithm failed to detect.

3.3. Segmentation Results

Thirty different heart sound signals were randomly selected. Their duration varied from 30 s to 36 s, in which the normal heart sound accounted for 1/5 and abnormal signal accounted for 4/5. Abnormal heart sound included murmurs relating to mitral valve prolapse (MVP), innocent or benign murmurs (Benign), aortic disease (AD), and other miscellaneous pathological conditions (MPC), each of which accounted for 1/5. This study proposes an adaptive PCG signal localization method based on K-means clustering and Haar wavelet transform. The results are shown in Table 1.

The recognition rate of S1 and S2 reached 98.02% and 96.76%, respectively.

Table 2 summarizes previously reported studies on heart sound segmentation and compares our algorithm with the most advanced recent algorithms for extracting heart sound features. Our algorithm has lower computational complexity and higher accuracy. To a certain extent, our model can be compared with Shukla’s work [9]. The envelope extracted by the algorithm in that reference is not smooth enough, which complicates subsequent identification, and the zero-frequency filter it introduces does not support real-time operation. The algorithm proposed in this paper can later be embedded into a real-time processing hardware system.

Figure 7 shows the segmentation of a heart sound with harmless or benign murmurs. It depicts S1 and S2 and divides a cardiac cycle into four segments, namely S1 duration, systole, S2 duration, and diastole.

4. Discussion

Cardiovascular disease is the leading cause of death worldwide, and heart sounds directly reflect cardiac mechanical and physiological activity. This paper proposed a new optimal heart sound segmentation algorithm based on K-means clustering and wavelet transform, which lays the foundation for identifying cardiovascular disease. Furthermore, we expect that it will be used in medical devices for early warning of heart diseases in the future.

The innovations of this paper are as follows: Traditional segmentation algorithms need to make use of other signals to label heart sound characteristics. This kind of method needs to collect a variety of signals. In addition, the alignment and synchronization of PCG signals with other signals is tedious work, which has certain limitations in practical application. This paper solves the limitations of traditional segmentation algorithms. The envelope extraction algorithms commonly used nowadays are simple and effective, but have poor anti-noise ability, which is not conducive to the location of PCG signals. To solve a series of problems in the literature, such as poor anti-noise ability of the envelope and selection of intrinsic function, this paper proposes use of the Viola integral and Shannon energy to obtain a smaller envelope burr, suppress noise and retain the advantages of high efficiency. The model proposed in this research can identify S1 and S2 in PCG signals using adaptive peak seeking, with an accuracy of 98.02% for S1 and 96.76% for S2. The algorithm was evaluated for different pathologic heart sounds to test the validity of the algorithm on pathological PCG signals. This paper solves the problem of locating pathological heart sound signals. In addition, the algorithm presented in this paper has lower computational complexity compared with other algorithms in the literature. However, the algorithm also has some shortcomings. The K-means clustering algorithms only work well on complete datasets. The dataset used in this article can miss feature values due to technical limitations [26]. Consequently, there is still room for improvement in our proposed algorithm.

In the future, we will continue to explore heart murmurs on this basis, hoping to investigate the following aspects: On the one hand, we will utilize our own developed multi-dimensional kineticardiography acquisition equipment, such as seismocardiogram and phonocardiogram, to extract PCG signals and features. Moreover, we will apply the algorithm to hardware implementation and medical equipment. On the other hand, the segmentation algorithm of heart sounds can achieve a good segmentation effect for normal heart sounds and abnormal heart sounds with moderate murmurs. However, for abnormal heart sounds with high murmurs, the algorithm in this paper cannot achieve good results yet. Finally, we could improve a novel framework for clustering mixed numerical and categorical data with missing values to raise the accuracy of segmentation algorithms. K-CMM can efficiently cluster mixed datasets with missing values when the number of missing values increases in the datasets [27]. The issue deserves further study. Although there are still many problems in the field of PCG signal processing, researchers’ continuous efforts and doctors’ needs must lead to new developments and opportunities for PCG signal processing.

Author Contributions

Conceptualization, X.X., H.Z. and H.Y.; methodology, X.X. and Z.G.; software, X.X. and X.G.; data analysis, X.X.; writing—original draft preparation, X.X.; writing—review and editing, X.X. and X.G.; final approval, X.X., X.G., Z.G., Z.D., H.Y. and H.Z. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by the Institute of Microelectronics of the Chinese Academy of Sciences; grant number E1SA02E.

Institutional Review Board Statement

Ethical review and approval were waived for this study because it was an analysis of a dataset of cardiac sound signals that is available online. We did not collect any information from volunteers.

Informed Consent Statement

Informed consent was obtained from all subjects involved in the study.

Data Availability Statement

The dataset analyzed in this study is described in Ref. [23].

Conflicts of Interest

The authors declare no conflict of interest.

References

1. World Health Statistics 2021: Monitoring Health for the SDGs, Sustainable Development Goals; World Health Organization: Geneva, Switzerland, 2021.
2. Yuenyong, S.; Nishihara, A.; Kongprawechnon, W.; Tungpimolrut, K. A framework for automatic heart sound analysis without segmentation. Biomed. Eng. Online 2011, 10, 1–23.
3. Deng, S.W.; Han, J.Q. Towards heart sound classification without segmentation via autocorrelation feature and diffusion maps. Future Gener. Comput. Syst. 2016, 60, 13–21.
4. Zhang, W.; Han, J. Towards heart sound classification without segmentation using convolutional neural network. In Proceedings of the 2017 Computing in Cardiology (CinC), Rennes, France, 24–27 September 2017; pp. 1–4.
5. Zhang, W.; Han, J.; Deng, S. Heart sound classification based on scaled spectrogram and partial least squares regression. Biomed. Signal Process. Control 2017, 32, 20–28.
6. El-Segaier, M.; Lilja, O.; Lukkarinen, S.; Sörnmo, L.; Sepponen, R.; Pesonen, E. Computer-based detection and analysis of heart sound and murmur. Ann. Biomed. Eng. 2005, 33, 937–942.
7. Springer, D.B.; Tarassenko, L.; Clifford, G.D. Logistic regression-HSMM-based heart sound segmentation. IEEE Trans. Biomed. Eng. 2015, 63, 822–832.
8. Min, S.D.; Shin, H. A localization method for first and second heart sounds based on energy detection and interval regulation. J. Electr. Eng. Technol. 2015, 10, 2126–2134.
9. Shukla, S.; Singh, S.K.; Mitra, D. An efficient heart sound segmentation approach using kurtosis and zero frequency filter features. Biomed. Signal Process. Control 2020, 57, 101762.
10. Akram, M.U.; Shaukat, A.; Hussain, F.; Khawaja, S.G.; Butt, W.H. Analysis of PCG signals using quality assessment and homomorphic filters for localization and classification of heart sounds. Comput. Methods Programs Biomed. 2018, 164, 143–157.
11. Shivhare, V.K.; Sharma, S.N.; Shakya, D.K. Detection of heart sounds S1 and S2 using optimized S-transform and back-Propagation Algorithm. In Proceedings of the 2015 IEEE Bombay Section Symposium (IBSS), Mumbai, India, 10–11 September 2015; pp. 1–6.
12. Varghees, V.N.; Ramachandran, K.I. Heart murmur detection and classification using wavelet transform and Hilbert phase envelope. In Proceedings of the 2015 Twenty First National Conference on Communications (NCC), Mumbai, India, 27 February–1 March 2015; pp. 1–6.
13. Varghees, V.N.; Ramachandran, K.I. Effective heart sound segmentation and murmur classification using empirical wavelet transform and instantaneous phase for electronic stethoscope. IEEE Sens. J. 2017, 17, 3861–3872.
14. Charleston-Villalobos, S.; Aljama-Corrales, A.T.; Gonzalez-Camarena, R. Analysis of simulated heart sounds by intrinsic mode functions. In Proceedings of the 2006 International Conference of the IEEE Engineering in Medicine and Biology Society, New York, NY, USA, 30 August–3 September 2006; pp. 2848–2851.
15. Papadaniil, C.D.; Hadjileontiadis, L.J. Efficient heart sound segmentation and extraction using ensemble empirical mode decomposition and kurtosis features. IEEE J. Biomed. Health Inform. 2013, 18, 1138–1152.
16. Banerjee, S.; Mishra, M.; Mukherjee, A. Segmentation and detection of first and second heart sounds (s1 and s2) using variational mode decomposition. In Proceedings of the 2016 IEEE EMBS conference on biomedical engineering and sciences (IECBES), Kuala Lumpur, Malaysia, 4–8 December 2016; pp. 565–570.
17. Zaeemzadeh, A.; Nafar, Z.; Setarehdan, S.K. Heart sound segmentation based on recurrence time statistics. In Proceedings of the 2013 20th Iranian Conference on Biomedical Engineering (ICBME), Tehran, Iran, 18–20 December 2013; pp. 215–218.
18. Yadav, J.; Sharma, M. A Review of K-mean Algorithm. Int. J. Eng. Trends Technol. 2013, 4, 2972–2976.
19. Viola, P.; Jones, M.J. Robust Real-Time Face Detection. Int. J. Comput. Vis. 2004, 57, 137–154.
20. Yan, Z.; Jiang, Z.; Miyamoto, A.; Wei, Y. The moment segmentation analysis of heart sound pattern. Comput. Methods Programs Biomed. 2010, 98, 140–150.
Zeng, W.; Lin, Z.; Yuan, C.; Wang, Q.; Liu, F.; Wang, Y. Detection of heart valve disorders from PCG signals using TQWT, FA-MVEMD, Shannon energy envelope and deterministic learning. Artif. Intell. Rev. 2021, 54, 6063–6100. [Google Scholar] [CrossRef]
Beyramienanlou, H.; Lotfivand, N. Shannon’s energy based algorithm in ECG signal processing. Comput. Math. Methods Med. 2017, 2017, 8081361. [Google Scholar] [CrossRef] [PubMed]
Liu, C.; Springer, D.; Li, Q.; Moody, B.; Juan, R.A.; Chorro, F.J.; Castells, F.; Roig, J.M.; Silva, I.; Johnson, A.E.W.; et al. An open access database for the evaluation of heart sound algorithms. Physiol. Meas. 2016, 37, 2181. [Google Scholar] [CrossRef] [PubMed]
Roquemen-Echeverri, V.; Jacobs, P.G.; Heitner, S.; Schulman, P.M.; Wilson, B.; Mahecha, J.; Mosquera-Lopez, C. An AI-Powered Tool for Automatic Heart Sound Quality Assessment and Segmentation. In Proceedings of the 2021 IEEE International Conference on Bioinformatics and Biomedicine (BIBM), Houston, TX, USA, 9–12 December 2021; pp. 3065–3074. [Google Scholar]
Fernando, T.; Ghaemmaghami, H.; Denman, S.; Sridharan, S.; Hussain, N.; Fookes, C. Heart sound segmentation using bidirectional LSTMs with attention. IEEE J. Biomed. Health Inform. 2019, 24, 1601–1609. [Google Scholar] [CrossRef] [PubMed] [Green Version]
Gao, K.; Khan, H.A.; Qu, W. Clustering with Missing Features: A Density-Based Approach. Symmetry 2022, 14, 60. [Google Scholar] [CrossRef]
Dinh, D.T.; Huynh, V.N.; Sriboonchitta, S. Clustering mixed numerical and categorical data with missing values. Inf. Sci. 2021, 571, 418–442. [Google Scholar] [CrossRef]

Figure 1. Normal PCG signal.

Figure 2. The proposed method framework for heart sound segmentation.

Figure 3. The original signal and the extracted envelope. (a) The original PCG signal waveform. (b) Viola integral waveform. (c) Improved Viola integral envelope. (d) Hilbert envelope. (c,d) Comparison of two envelope details.

Figure 4. The dynamic threshold. The blue line is the improved Viola integral envelope. The red and yellow lines are its upper and lower envelopes. The purple line is the dynamic threshold.

Figure 5. Illustration of the K-means clustering result.

Figure 6. The boundaries of a heart sound. (a) S1 and S2 and their boundaries marked on the envelope. (b) S1 and S2 and their boundaries marked on the original heart sound signal.

Figure 7. Segmentation of a heart sound.

Table 1. Performance of segmentation of heart sounds taken in different conditions.

PCG Signal	Type	FP	FN	TP	Se	PPV	Acc	Std
Total S1	Normal	6	0	212	100%	97.25%	97.25%	±2.46%
	Aortic	3	0	158	100%	98.14%	98.14%	±4.22%
	Benign	3	0	208	100%	98.58%	98.58%	±2.54%
	Miscell.	2	3	206	98.56%	99.04%	97.63%	±3.65%
	MVP	3	0	208	100%	98.58%	98.58%	±1.70%
	Total	17	3	992	99.70%	98.51%	98.02%	±2.93%
Total S2	Normal	9	0	213	100%	95.95%	95.95%	±2.58%
	Aortic	1	3	158	99.37%	99.37%	97.53%	±4.19%
	Benign	8	0	204	100%	96.22%	96.22%	±3.34%
	Miscell.	3	3	208	98.58%	98.58%	97.20%	±3.38%
	MVP	6	0	204	100%	97.14%	97.14%	±3.40%
	Total	27	6	987	99.40%	97.34%	96.76%	±3.35%
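
The percentage columns of Table 1 can be recomputed from the FP, FN and TP counts. The sketch below is illustrative only: it assumes the usual detection-style definitions Se = TP/(TP + FN), PPV = TP/(TP + FP) and Acc = TP/(TP + FP + FN), which are not restated in this part of the article, and the names `Counts`, `sensitivity`, `ppv` and `accuracy` are hypothetical helpers. It is applied to the "Normal" S1 row and the "Total" S2 row.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Counts:
    """Detection counts for one row of Table 1."""
    fp: int  # false positives
    fn: int  # false negatives
    tp: int  # true positives


def sensitivity(c: Counts) -> float:
    """Se in percent: share of true sounds that were detected."""
    return 100.0 * c.tp / (c.tp + c.fn)


def ppv(c: Counts) -> float:
    """PPV in percent: share of detections that were correct."""
    return 100.0 * c.tp / (c.tp + c.fp)


def accuracy(c: Counts) -> float:
    """Acc in percent, assumed here to have no true-negative term."""
    return 100.0 * c.tp / (c.tp + c.fp + c.fn)


# Counts copied from Table 1.
normal_s1 = Counts(fp=6, fn=0, tp=212)
total_s2 = Counts(fp=27, fn=6, tp=987)

for name, c in [("Normal S1", normal_s1), ("Total S2", total_s2)]:
    print(f"{name}: Se={sensitivity(c):.2f}%  "
          f"PPV={ppv(c):.2f}%  Acc={accuracy(c):.2f}%")
```

Under these assumed definitions, both rows reproduce the Se, PPV and Acc values listed in the table to two decimal places.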

Table 2. Comparison of different algorithms for heart sound segmentation.

Method (Authors, Year)	Classifier	Computational Complexity	Se/PPV/Sp/Acc (%)
HE, kurtosis, and ZFF (Shukla et al., 2020) [9]		low	Se = 98.61, PPV = 99.11, Acc = 98.07
DWT and HE (Akram et al., 2018) [10]	SVM	low	Se = 87.68, Sp = 87.18, Acc = 87.42
EWT, SEE, and IP (Varghees et al., 2017) [13]	Decision tree	low	Se = 98.00, PPV = 97.40, Acc = 95.50
EEMD and kurtosis (Papadaniil et al., 2013) [15]		low	Acc = 83.05
db4 wavelet, SE, and MFCC (Roquemen-Echeverri et al., 2021) [24]	MLP	high	Acc = 93.00
SHAP, ∆, and ∆² (Fernando et al., 2019) [25]	LSTMs and CNN	high	Se = 95.4, Sp = 96.2, PPV = 91.10, Acc = 96.00
Our proposal		low	Se = 99.55, PPV = 97.93, Acc = 97.39

Abbreviations: ensemble empirical mode decomposition (EEMD), empirical wavelet transform (EWT), Shannon entropy envelope (SEE), instantaneous phase (IP), discrete wavelet transform (DWT), homomorphic envelogram (HE, in [10]), Hilbert envelope (HE, in [9]), zero-frequency filter (ZFF), SHapley Additive exPlanations (SHAP), long short-term memory (LSTM), convolutional neural network (CNN), Daubechies-4 wavelet (db4), Shannon energy (SE), Mel-frequency cepstral coefficients (MFCC), multi-layer perceptron (MLP), support vector machine (SVM).

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

© 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
