Deep Contrastive Learning-Based Model for ECG Biometrics

Ammour, Nassim; Jomaa, Rami M.; Islam, Md Saiful; Bazi, Yakoub; Alhichri, Haikel; Alajlan, Naif

doi:10.3390/app13053070


by Nassim Ammour ¹, Rami M. Jomaa ², Md Saiful Islam ², Yakoub Bazi ¹,*, Haikel Alhichri ¹ and Naif Alajlan ¹

¹ Computer Engineering Department, College of Computer and Information Sciences, King Saud University, Riyadh 11543, Saudi Arabia

² Computer Science Department, College of Computer and Information Sciences, King Saud University, Riyadh 11543, Saudi Arabia

* Author to whom correspondence should be addressed.

Appl. Sci. 2023, 13(5), 3070; https://doi.org/10.3390/app13053070

Submission received: 30 November 2022 / Revised: 16 February 2023 / Accepted: 22 February 2023 / Published: 27 February 2023

(This article belongs to the Section Computing and Artificial Intelligence)


Abstract:

The electrocardiogram (ECG) signal has been shown to be promising as a biometric. The analysis of ECG signals is considered a good solution for increasing biometric security levels, mainly due to its inherent robustness against presentation attacks. In this work, we present a deep contrastive learning-based system for ECG biometric identification. The proposed system consists of three blocks: a feature extraction backbone based on the short-time Fourier transform (STFT), a contrastive learning network, and a classification network. We evaluated the proposed system on the Heartprint dataset, a new multi-session ECG biometrics dataset. The experimental analysis shows promising capabilities of the proposed method. In particular, it yields an average top-1 accuracy of 98.02% on a new dataset built by gathering 1539 ECG records from 199 subjects collected in multiple sessions with an average interval between sessions of 47 days.

Keywords:

ECG biometric; biometric identification; contrastive learning; deep learning

1. Introduction

Recently, ECG has proven its capability as an emerging biometric modality for authentication and identification purposes. One of the key points that encourages researchers to investigate ECG biometrics is that it is viewed as a secure biometric modality [1,2]. The ECG biometric is an indicator for human liveness, making it more secure and distinctive compared to other traditional biometrics. The inherent liveness property of ECG inspires many works to consider it as the main biometric modality [3,4] or as a complementary modality that, when fused with other biometrics modalities, increases the system capabilities in detecting presentation attacks (PA) [2,5].

One of the early works utilizing ECG as a biometric modality is that of Biel et al. [6], who presented ECG for human identification. Subsequently, several studies presenting ECG as a possible secure and accurate biometric modality were surveyed in review papers [7,8,9]. ECG signals have shown their capabilities when used as a sole modality [3,10] or when fused with other biometric modalities such as fingerprint [2,5,11,12], iris [13], face [14], and sound [15].

Recently, deep learning techniques have shown promising capabilities in ECG biometrics [16]. In particular, convolutional neural networks (CNN) are the main deep learning techniques that have been used for ECG biometric authentication. Examples include the works based on vanilla architectures in [4,16], multi-resolution models [17,18], and residual models [19]. In the same way, recurrent neural networks (RNN) such as Long Short-Term Memory (LSTM) [3,20,21] or Gated Recurrent Unit (GRU) [22] have also been successfully applied for ECG biometric authentication.

Ihsanto et al. [19] proposed a deep learning-based framework for ECG biometric authentication. Their method starts by detecting beats in the ECG signals and ends by identifying the subject through classification of the signal. The beats are detected either manually or automatically by applying Hamilton’s method [23,24]. The Residual Depthwise Separable Convolutional Neural Network (RDSCNN) model is then applied to classify the signal. Their method achieved accuracies of 98.89% and 97.92% with automatic beat detection on ECG-ID and MIT-BIH, respectively. The accuracy improved to 100% with manual beat detection on both datasets.

Tirado-Martin and Sanchez-Reillo [25] proposed a deep learning-based model, called BioECG, for ECG biometrics. The proposed model incorporates CNN and LSTM networks for classifying ECG signals. This study was developed and examined on a private dataset that represents potential scenarios in biometric recognition, in which data were acquired on different days, in various subject positions, and/or during different physical activities. Labati et al. [16] proposed a CNN-based approach for ECG biometrics, called Deep-ECG. They evaluated their model on the PTB Diagnostic ECG Database, with an identification accuracy of 100%.

The Siamese network is a well-known deep learning architecture that is composed of two or more identical subnetworks. Several studies have used Siamese networks for ECG biometric verification [4,26]. For example, Ivanciu et al. [26] developed a system composed of two Siamese convolutional networks and evaluated it on the ECG-ID database. In their method, the R-peaks are detected from the input ECG and then used for training the Siamese network. Their system achieved an overall accuracy of 86.47%. Ibtehaz et al. [4] proposed an ECG biometric system called EDITH, which utilizes CNN and Siamese architectures. They showed that a single heartbeat is enough for authentication with an accuracy of 96–99.75%. Using multiple beats (3 to 6 beats), they enhanced the performance to 100%. In their method, the R-peaks are detected from the input ECG signal using a deep learning network (MultiResUNet). Then, the heartbeats are extracted and analyzed further using other deep learning networks. They used a Siamese network to compute the similarity between templates and probe signals, where two identical subnetworks were employed to compare the similarity of two input samples.

It is important to note that all previous works were applied and evaluated on medical ECG datasets or on datasets that were captured using medical sensors. Naturally, it would be better to use ECG datasets that are specifically captured for the purpose of biometric authentication.

In this paper, we present a new method for ECG biometric identification based on deep learning techniques. The proposed method is based on the contrastive deep learning approach. Specifically, it is composed of three main blocks: feature extraction, contrastive optimization, and classification. In the feature extraction step, we rely on the Short Time Fourier Transform (STFT), which splits the signal into segments and then applies the Fast Fourier Transform (FFT) to each segment. This allows us to better capture the spectral information in ECG signals at different temporal intervals. We evaluated the proposed system using a recent multi-session ECG biometrics dataset called Heartprint. The experimental results show promising capabilities of the method for ECG biometric identification.

The remainder of the paper is organized as follows: the dataset and proposed method are presented in the second section. In the third section, the experimental results of the proposed system are reported, analyzed, and discussed. Finally, the conclusions and future works are presented.

2. Materials and Methods

2.1. Dataset

In the literature, several ECG datasets are available [27]. Although these datasets are used for evaluating ECG biometric systems, as presented in Table 1, most of them were collected for medical purposes or with medical ECG devices. This implies that the signals are of high quality, with little noise and long durations, and were recorded while subjects were relaxed and concentrating. Such conditions are not available when evaluating biometric systems. Therefore, there is a great need for ECG datasets that are designed and collected for biometric purposes. Recently, such a dataset, called Heartprint [28], became available, and we selected it for our work.

2.1.1. Dataset Description

The ECG signals in the Heartprint dataset were collected using the ReadMyHeart ECG device by DailyCare BioMedical, Inc., Zhongli City, Taiwan (https://the-gadgeteer.com/2007/08/27/dailycare_readmyheart_100/ accessed on 30 November 2022), as shown in Figure 1. A 15 s ECG signal is captured at a sampling rate of 250 Hz via two electrodes touched by the thumbs of both hands. The dataset was built by gathering 1539 ECG records captured from 199 subjects in multiple sessions with an average interval of 47 days, as shown in Table 2.

For the experiments in this work, we consider the records of the first two sessions, with two records from each subject per session, which gives 796 records in total (i.e., 398 records in each session).

2.1.2. Data Preprocessing

Generally, ECG signals may contain different types of noise, such as power-line interference, baseline wander, and patient–electrode motion artifacts. In the preprocessing stage, we removed the noise by using a fourth-order band-pass Butterworth filter with cut-off frequencies of 0.25 and 40 Hz. Then, we detected the R-peaks of the records by employing an efficient curvature-based method [29,30]. Finally, we generated windows of ECG signals around each R-peak by segmenting the whole ECG record with a fixed length (i.e., 0.5 s), as illustrated in Figure 2. These windows are presented to the biometric system as instances (samples) [31,32].
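The windowing step can be sketched as follows. This is a minimal illustration rather than the authors' code: it assumes the R-peak sample indices are already available (the curvature-based detector of [29,30] and the Butterworth filter are not reproduced), that each window is centered on its R-peak, and that beats too close to the record boundaries are dropped. The function name `segment_beats` and the toy peak positions are hypothetical.

```python
def segment_beats(signal, r_peaks, fs=250, window_s=0.5):
    """Cut fixed-length windows around R-peaks (assumed centered on each peak)."""
    # Samples taken on each side of the peak; with fs=250 and 0.5 s this is 62,
    # so each window holds 124 samples (an even-length approximation of 0.5 s).
    half = int(round(window_s * fs)) // 2
    windows = []
    for r in r_peaks:
        start, stop = r - half, r + half
        # Skip beats whose window would extend past the record (assumption).
        if start < 0 or stop > len(signal):
            continue
        windows.append(signal[start:stop])
    return windows


# Toy 15 s record at 250 Hz with hypothetical R-peak positions.
record = [0.0] * (15 * 250)
peaks = [30, 200, 410, 3740]
beats = segment_beats(record, peaks)
```

In this toy call the first and last peaks are discarded because their windows would cross the record boundaries, so only two windows are returned.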

2.2. Proposed Method

Every human can be characterized by a set of unique physiological, behavioral, and morphological attributes that can be captured and quantified. As society becomes smarter, accurate identification becomes an increasingly important and challenging task. In this context, we propose a deep learning technique with a contrastive optimization architecture for effectively identifying people through their ECG signals. Figure 3 presents the flowchart of the proposed system, which is composed of three main blocks, i.e., feature extraction, contrastive optimization, and classification.

2.2.1. Features Extraction Backbone

In the features extraction block (see Figure 4), we aim to extract rich discriminative features that highly represent the input ECG signals. It is well known that pre-trained CNN architectures provide very effective deep feature representations. Although these architectures are designed to work for images, it is possible to convert the 1D ECG signal into an image [2,33].

Given a raw ECG heartbeat signal (a time series) as input, the backbone feature extraction block first converts the 1D ECG signal into a 2D image using the STFT operation followed by a set of convolutional layers. Given a subject $i$, the conversion sub-block takes as input the time-series ECG signals of dimension $n_{i} \times 740$, where $n_{i}$ represents the number of heartbeats for the $i$th subject and 740 is the length of a heartbeat. Then, we randomly select five beats from the $n_{i}$ beats to create an input with a fixed size of $5 \times 740$. We apply the STFT to the $5 \times 740$ signal to generate a spectral image by extracting the time-frequency spectrogram, which is then passed through trainable convolutional layers. As a result, the conversion sub-block takes a signal of size $5 \times 740$ as input and outputs an image of size $3 \times 32 \times 32$.
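To make the STFT step concrete, the following is a minimal sketch of a magnitude spectrogram for a single 740-sample beat, written with a naive DFT so that it needs only the standard library. The window length (64), hop size (32), and Hann window are assumptions, since the text does not state the STFT parameters, and the trainable convolutional layers that map the spectrogram to a $3 \times 32 \times 32$ image are not shown.

```python
import cmath
import math


def stft_magnitude(signal, win_len=64, hop=32):
    """Magnitude spectrogram: one row per time frame, one column per frequency bin."""
    # A Hann window tapers each segment to reduce spectral leakage.
    window = [0.5 - 0.5 * math.cos(2 * math.pi * n / (win_len - 1))
              for n in range(win_len)]
    frames = []
    for start in range(0, len(signal) - win_len + 1, hop):
        seg = [signal[start + n] * window[n] for n in range(win_len)]
        # Naive DFT over the non-negative frequencies (bins 0 .. win_len/2).
        row = []
        for k in range(win_len // 2 + 1):
            acc = sum(seg[n] * cmath.exp(-2j * math.pi * k * n / win_len)
                      for n in range(win_len))
            row.append(abs(acc))
        frames.append(row)
    return frames


# Synthetic "beat": a pure tone that falls exactly on frequency bin 8.
beat = [math.sin(2 * math.pi * 8 * n / 64) for n in range(740)]
spec = stft_magnitude(beat)
peak_bin = max(range(len(spec[0])), key=lambda k: spec[0][k])
```

With these assumed parameters, each 740-sample beat yields 22 time frames of 33 frequency bins; in practice an FFT routine would replace the naive DFT.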

The second sub-block is a pre-trained EfficientNet module used to extract highly discriminative features. One of the key features of the EfficientNet model is that it performs scaling in all three dimensions (depth, width, and image resolution) in a principled way [34]. The authors of EfficientNet demonstrated that scaling ConvNets in only one of the three dimensions (depth, width, or image resolution) leads to the saturation of the network’s gain as the network gets bigger. To overcome this, they proposed a compound scaling mechanism that uniformly scales the three dimensions of the network using fixed scaling coefficients. Additionally, they confirmed the significance of balancing a network in all dimensions by proposing the baseline network EfficientNet-B0 and scaling up this baseline to obtain the other EfficientNet variants (i.e., B1–B7). These architectures significantly outperform other ConvNet architectures on ImageNet classification while reducing the number of trainable parameters and the inference time, which are important properties for real-time applications such as biometric systems. Furthermore, these models learn transferable features and can achieve remarkable results on a variety of image classification problems.
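The compound scaling rule can be illustrated with a short sketch. It assumes the base coefficients reported in the EfficientNet paper [34] (depth 1.2, width 1.1, resolution 1.15), each raised to a compound coefficient `phi`; the function names are hypothetical, and the released B1–B7 variants may use rounded settings rather than these exact powers.

```python
# Base coefficients from the EfficientNet paper [34]; they are chosen so that
# ALPHA * BETA**2 * GAMMA**2 is approximately 2.
ALPHA, BETA, GAMMA = 1.2, 1.1, 1.15


def compound_scale(phi):
    """Return (depth, width, resolution) multipliers for compound coefficient phi."""
    return ALPHA ** phi, BETA ** phi, GAMMA ** phi


def flops_multiplier(phi):
    """Approximate growth in FLOPS: linear in depth, quadratic in width and resolution."""
    depth, width, resolution = compound_scale(phi)
    return depth * width ** 2 * resolution ** 2


baseline = compound_scale(0)      # B0: no scaling
one_step = flops_multiplier(1)    # roughly doubles the cost per unit of phi
```

Each increment of `phi` therefore scales all three dimensions together while roughly doubling the computational cost.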

Figure 5 shows the mobile inverted bottleneck layer, which is the main building block used in EfficientNet-B0. This layer is composed of an inverted residual block combined with squeeze-and-excitation (SE) blocks. The input feature map is projected into a higher dimensional space using the inverted residual, then a depth-wise convolution operation is applied in the new space. Then, a point-wise convolution (1 × 1 convolution) operation with linear activation is applied to project the new feature map back to a low dimensional space. In the end, a residual connection from the input is added to the output of the point-wise convolution, which generates an output feature map. The model learns the weights of the channels of a feature map using the SE blocks. In the SE block, the input feature map is converted to a feature vector of size $c$ (the number of channels) and then fed to two consecutive FC layers. Later, the output of the FC layers, which is a vector of size $c$, is used to scale the channels according to their importance. Moreover, other aspects were considered while designing the baseline network by taking into account accuracy and real-world latency on mobile devices. By applying compound scaling to the baseline network, a family of seven EfficientNet models was generated (EfficientNet-B1 to B7).
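The squeeze-and-excitation computation described above can be sketched in a few lines. This is an illustrative toy, not the EfficientNet implementation: it assumes global average pooling for the squeeze, a ReLU after the first FC layer and a sigmoid after the second, and it omits biases; the weights `w1` and `w2` are hypothetical.

```python
import math


def squeeze_excite(feature_map, w1, w2):
    """feature_map: list of c channels, each a 2D list (h x w).
    w1: (c_reduced x c) weights, w2: (c x c_reduced) weights; biases omitted."""
    # Squeeze: global average pooling gives one value per channel (vector of size c).
    squeezed = [sum(sum(row) for row in ch) / (len(ch) * len(ch[0]))
                for ch in feature_map]
    # Excitation: FC -> ReLU -> FC -> sigmoid (activations are assumptions).
    hidden = [max(0.0, sum(w * s for w, s in zip(row, squeezed))) for row in w1]
    gates = [1.0 / (1.0 + math.exp(-sum(w * h for w, h in zip(row, hidden))))
             for row in w2]
    # Scale: multiply every channel by its gate, a value in (0, 1).
    scaled = [[[v * g for v in row] for row in ch]
              for ch, g in zip(feature_map, gates)]
    return scaled, gates


# Two 2x2 channels with hypothetical FC weights.
fmap = [[[1.0, 1.0], [1.0, 1.0]], [[2.0, 2.0], [2.0, 2.0]]]
w1 = [[0.5, 0.5]]
w2 = [[1.0], [-1.0]]
scaled, gates = squeeze_excite(fmap, w1, w2)
```

Here the first channel receives a gate above 0.5 and the second a gate below 0.5, so the block amplifies one channel relative to the other.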

2.2.2. Contrastive Learning for Effective Leveraging of Cluster Distinction

Recently, the contrastive learning strategy has produced significant improvements in self-supervised representation learning [35]. The key idea of contrastive learning is to pull an anchor toward samples of the same class, called positive samples, in the embedding space, and to push it away from samples of different classes, called negative samples.

Contrastive representation learning can be intuitively understood as learning by comparing an anchor with a positive sample and a negative one. To apply it to a given input batch of data, we need to create positive and negative pairs for the contrastive learning process.

Pair generation $\mathrm{Pair}(\cdot)$: For each feature sample $w = \mathrm{ftr}(x)$, we randomly generate a set of positive samples $w^{+} = \mathrm{Pair}(w)$ from the same class, and a set of negative samples $w^{-} = \mathrm{Pair}(w)$ from the other classes.
Encoder network $\mathrm{Enc}(\cdot)$: This network receives the generated pairs $w^{+} / w^{-}$ separately and maps them to a pair of representation vectors $r = \mathrm{Enc}(w) \in \mathbb{R}^{512}$ normalized to the unit hypersphere in $\mathbb{R}^{512}$.
Projection network $\mathrm{Proj}(\cdot)$: Used only in the contrastive training phase and discarded afterwards, this multi-layer perceptron block maps the vector $r$ to a representation vector $z = \mathrm{Proj}(r) \in \mathbb{R}^{128}$. With the aim of measuring distances using the inner product in the projection space, the output of the projection network is normalized to the unit hypersphere in $\mathbb{R}^{128}$.
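The pair generation and normalization steps listed above might be sketched as follows. This is a simplified illustration under stated assumptions: one positive and one negative partner are drawn per anchor (the text speaks of sets of samples), pairs are represented as index triples, and the names `make_pairs` and `l2_normalize` are hypothetical. The encoder and projection networks themselves are not modeled.

```python
import math
import random


def l2_normalize(v):
    """Project a vector onto the unit hypersphere."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def make_pairs(labels, rng):
    """For each anchor index, draw one positive and one negative partner.
    Returns (anchor, other, y) triples with y=1 for same class, y=0 otherwise."""
    pairs = []
    for i, yi in enumerate(labels):
        pos = [j for j, yj in enumerate(labels) if yj == yi and j != i]
        neg = [j for j, yj in enumerate(labels) if yj != yi]
        if pos:
            pairs.append((i, rng.choice(pos), 1))
        if neg:
            pairs.append((i, rng.choice(neg), 0))
    return pairs


# Four samples from two hypothetical subjects.
labels = ["s1", "s1", "s2", "s2"]
pairs = make_pairs(labels, random.Random(0))
unit = l2_normalize([3.0, 4.0])
```

After normalization every representation has unit length, so distances between representations depend only on their direction.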

For a pair of ECG input data $\{x_{i}, x_{j}\}$ and their representations $\{z_{i}, z_{j}\}$, we can define the contrastive loss function $\mathcal{L}_{CL}$ as [36]:

$$\mathcal{L}_{CL}(x_{i}, x_{j}) = 0.5 \left[ y \cdot d(x_{i}, x_{j})^{2} + (1 - y) \cdot \max\left(m - d(x_{i}, x_{j}), 0\right)^{2} \right] \quad (1)$$

where $d(x_{i}, x_{j})$ is the distance between the representations of the input data $x_{i}$ and $x_{j}$ in the latent space:

$$d(x_{i}, x_{j}) = \| z_{i} - z_{j} \|_{2} \quad (2)$$

In Equation (1), the ground truth label $y = 1$ indicates that the pair of signals is similar (from the same class), and $y = 0$ indicates a non-similar pair. The hyper-parameter $m$ is the margin representing the threshold for non-similar pairs of signals, which is chosen by the user.
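Equations (1) and (2) translate directly into code. The sketch below evaluates the loss for a single pair of representation vectors; the default margin of 1.0 and the two-dimensional toy vectors are assumptions for illustration, since the margin used in the experiments is not stated.

```python
import math


def euclidean(z_i, z_j):
    """Equation (2): L2 distance between two representation vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(z_i, z_j)))


def contrastive_loss(z_i, z_j, y, margin=1.0):
    """Equation (1): y=1 for a similar pair, y=0 for a non-similar pair."""
    d = euclidean(z_i, z_j)
    return 0.5 * (y * d ** 2 + (1 - y) * max(margin - d, 0.0) ** 2)


z_a, z_b = [1.0, 0.0], [0.0, 1.0]               # unit vectors at distance sqrt(2)
loss_similar = contrastive_loss(z_a, z_b, y=1)   # penalizes the distance
loss_far_neg = contrastive_loss(z_a, z_b, y=0)   # beyond the margin: no penalty
loss_near_neg = contrastive_loss(z_a, z_a, y=0)  # identical negatives: full penalty
```

Similar pairs are penalized by their squared distance, while non-similar pairs contribute only when they lie closer than the margin.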

3. Results and Discussions

To illustrate the capabilities of the proposed method, we conducted multiple experiments using the Heartprint dataset.

3.1. Experimental Setup and Performance Evaluation

We carried out the experiments on the first and the second sessions of the dataset with the same subjects. For the experiments, we divided this dataset into three parts: 50% for training, 20% for validation, and 30% for testing. At each training epoch, we evaluated the loss function on the validation set. For training our models, we utilized Google Colaboratory (Colab), which is a cloud computing service provided by Google. It offers hardware accelerators such as GPUs and TPUs, which can be used to speed up the training process. In our case, the training time for all experiments was around 45 min.

3.2. Experimental Results

After the training process, we tested the model using the test set. Since the problem is a classification problem, we report the top-1 accuracy as the main result, followed by the top-5 accuracy (top-k accuracy counts a prediction as correct when the true class is among the k highest-scoring predicted classes). The results are presented in Table 3 and Table 4.
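As a small illustration of the top-k metric, the sketch below computes it from per-class scores. The scores and labels are toy values invented for the example, and the function name `top_k_accuracy` is hypothetical.

```python
def top_k_accuracy(scores, labels, k):
    """Fraction of samples whose true class is among the k highest-scoring classes."""
    hits = 0
    for row, true in zip(scores, labels):
        # Class indices sorted from highest to lowest score.
        ranked = sorted(range(len(row)), key=lambda c: row[c], reverse=True)
        if true in ranked[:k]:
            hits += 1
    return hits / len(labels)


# Three samples, three classes (toy classifier outputs).
scores = [[0.70, 0.20, 0.10],
          [0.35, 0.40, 0.25],
          [0.20, 0.30, 0.50]]
labels = [0, 0, 1]
top1 = top_k_accuracy(scores, labels, k=1)
top2 = top_k_accuracy(scores, labels, k=2)
```

Only the first sample is ranked correctly at top-1, whereas all three true classes appear among the two highest scores, which is why top-k accuracy can only grow with k.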

In the first experiment, we evaluated the system on the first session of the dataset. We obtained an overall accuracy of 98.55% as shown in Table 3. The top-k accuracy starts from 98.20% for the top-1 accuracy and increases to 99.54% for the top-5 accuracy. As can be seen, the model provides good classification results, which motivate the use of ECG as a potential modality for biometrics applications.

In the second experiment, we used the second session of the dataset for evaluation. The model performs well also and an accuracy of 97.49% was obtained, as illustrated in Table 4. The top-k accuracy begins with 97.40% for top-1 accuracy and reaches 99.08% for the top-5 accuracy. These results again confirm the findings of the previous experiment.

In the third experiment, we used the first session for training the model and the second session of the dataset for testing (cross-session scenario). An overall accuracy of 47.24% was obtained, as illustrated in Table 5 and Figure 6. The model does not perform well, and the accuracy drops considerably, probably due to a data shift after a long period of time. The top-k accuracy begins at 46.06% for top-1 accuracy and reaches 56.53% for top-5 accuracy. These results call for the development of models suited to acquisitions made over long time periods. A possible way to increase the accuracy is to use domain adaptation methods to reduce the discrepancy between the distributions of training and testing sessions acquired over long periods.

To assess the robustness of the model in terms of training sample size, in the fourth experiment we use 50% of the first session data as a training set and 50% of the dataset as a test set. In this experiment, with a reduced set of training samples, the model performs well with an accuracy of 98.66%, as illustrated in Table 6. The top-k accuracy begins with 97.66% for top-1 accuracy and reaches 98.91% for top-5 accuracy.

For further analysis, we evaluate the method with additional metrics. We use the receiver operating characteristic (ROC) curve, which plots the True Positive Rate (TPR) against the False Positive Rate (FPR). The area under the ROC curve (AUC) summarizes the performance of the classifier across all possible thresholds. The ROC curve and its AUC show how well the proposed deep model with classifier is performing, as illustrated in Figure 7.

Figure 8 illustrates the AUC-ROC curve for the cross-session experiment. In this experiment, we used the data of session 1 to train the model, and we used the data of session 2 to test the model’s performance. For the cross-sessions, we can see that the model does not perform well. A lot of misclassifications have occurred for multiple classes confirming the need to use domain adaptation methods to cope with the data-shift problem.

For multiclass models, the binary classification metric must be adapted to the multiclass task by reducing the multiclass output to binary ones. The One versus Rest (OvR) strategy evaluates the model performance by comparing each class (positive class) against all the others (negative class) at the same time. The One versus One (OvO) strategy compares all possible two-class combinations of the dataset. The performance measurements of the proposed model are illustrated in Table 7.
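The OvR reduction can be sketched as follows. This toy example computes each binary AUC through its rank interpretation (the probability that a positive sample scores higher than a negative one, with ties counted as one half) and then takes an unweighted average over classes. The macro averaging, the toy scores, and the function names are assumptions; the text does not state how the per-class values were aggregated. Every class must have at least one positive and one negative sample.

```python
def binary_auc(pos_scores, neg_scores):
    """AUC as the probability that a positive outranks a negative (ties count 0.5)."""
    wins = 0.0
    for p in pos_scores:
        for n in neg_scores:
            wins += 1.0 if p > n else 0.5 if p == n else 0.0
    return wins / (len(pos_scores) * len(neg_scores))


def ovr_macro_auc(scores, labels):
    """One-versus-Rest: treat each class as positive against all others, then average."""
    n_classes = len(scores[0])
    aucs = []
    for c in range(n_classes):
        pos = [row[c] for row, y in zip(scores, labels) if y == c]
        neg = [row[c] for row, y in zip(scores, labels) if y != c]
        aucs.append(binary_auc(pos, neg))
    return sum(aucs) / n_classes


# Five samples, three classes (toy classifier outputs).
scores = [[0.8, 0.1, 0.1],
          [0.6, 0.3, 0.1],
          [0.2, 0.5, 0.3],
          [0.3, 0.3, 0.4],
          [0.1, 0.2, 0.7]]
labels = [0, 0, 1, 1, 2]
macro_auc = ovr_macro_auc(scores, labels)
```

An OvO variant would instead apply `binary_auc` to every pair of classes, restricting the samples to those two classes.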

To show the impact of the contrastive learning module on the performance of the proposed model, we conducted experiments to classify the different session datasets using only the STFT feature extraction. The obtained results are illustrated in Table 8. They show that the contrastive learning module improves the model performance, which motivates its use as additional learning information to boost the classification capabilities of the model. Indeed, contrastive learning plays a significant role in learning the similarities between ECG records of the same class and discovering the dissimilarities between ECG records of different classes, which explains the high overall accuracy of the system. The reported results show a promising performance and encourage us to refine the system architecture and design to further improve the top-1 accuracy.

4. Conclusions

In this research work, we proposed an end-to-end deep learning system for biometric identification using ECG signals. The ECG signal inherent to an individual is highly secure and very hard to forge. The proposed system includes three main blocks. The first block converts a 1D ECG signal into a 2D image suitable for feature extraction via convolutional networks and extracts representative features using an EfficientNet backbone. The second block is a contrastive learning module used to boost the dissimilarity between the different classes. The last block is a classification layer located at the head of the network. The experimental results reveal the promising capabilities of the proposed solution in terms of classification accuracy. For future developments, we plan to investigate advanced fusion strategies based on deep learning methods to improve the classification results, especially for the cross-session scenario. In addition, we plan to rely on domain adaptation methods to reduce the discrepancy between training and testing sessions acquired over long periods.

Author Contributions

Methodology, N.A. (Nassim Ammour); software, R.M.J.; formal analysis, M.S.I. and Y.B.; investigation, Y.B.; resources, H.A.; data curation, M.S.I. and H.A.; writing—original draft preparation, N.A. (Nassim Ammour); writing—review and editing, R.M.J. and Y.B.; supervision, N.A. (Naif Alajlan); funding acquisition, N.A. (Naif Alajlan). All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by the National Science, Technology and Innovation Plan, King Abdulaziz City for Science and Technology, Saudi Arabia, grant number 13-INF2168-02.

Institutional Review Board Statement

The study was conducted according to the guidelines of the Declaration of Helsinki, and approved by the Institutional Review Board of King Saud University (E-22-6748).

Informed Consent Statement

Not applicable.

Data Availability Statement

Not applicable.

Conflicts of Interest

The authors declare no conflict of interest. The funders had no role in the design of the study; in the collection, analyses, or interpretation of data; in the writing of the manuscript, or in the decision to publish the results.

References

1. Odinaka, I.; Lai, P.-H.; Kaplan, A.D.; O’Sullivan, J.A.; Sirevaag, E.J.; Rohrbaugh, J.W. ECG Biometric Recognition: A Comparative Analysis. IEEE Trans. Inf. Forensics Secur. 2012, 7, 1812–1824.
2. Jomaa, R.M.; Mathkour, H.; Bazi, Y.; Islam, M.S. End-to-End Deep Learning Fusion of Fingerprint and Electrocardiogram Signals for Presentation Attack Detection. Sensors 2020, 20, 2085.
3. Kim, B.-H.; Pyun, J.-Y. ECG Identification for Personal Authentication Using LSTM-Based Deep Recurrent Neural Networks. Sensors 2020, 20, 3069.
4. Ibtehaz, N.; Chowdhury, M.E.H.; Khandakar, A.; Kiranyaz, S.; Rahman, M.S.; Tahir, A.; Qiblawey, Y.; Rahman, T. EDITH: ECG Biometrics Aided by Deep Learning for Reliable Individual AuTHentication. IEEE Trans. Emerg. Top. Comput. Intell. 2021, 6, 928–940.
5. Komeili, M.; Armanfard, N.; Hatzinakos, D. Liveness Detection and Automatic Template Updating Using Fusion of ECG and Fingerprint. IEEE Trans. Inf. Forensics Secur. 2018, 13, 1810–1822.
6. Biel, L.; Pettersson, O.; Philipson, L.; Wide, P. ECG Analysis: A New Approach in Human Identification. IEEE Trans. Instrum. Meas. 2001, 50, 808–812.
7. Uwaechia, A.N.; Ramli, D.A. A Comprehensive Survey on ECG Signals as New Biometric Modality for Human Authentication: Recent Advances and Future Challenges. IEEE Access 2021, 9, 97760–97802.
8. Hong, S.; Zhou, Y.; Shang, J.; Xiao, C.; Sun, J. Opportunities and Challenges of Deep Learning Methods for Electrocardiogram Data: A Systematic Review. Comput. Biol. Med. 2020, 122, 103801.
9. Hassan, Z.; Jamil, S.O.G. and M. Review of Fiducial and Non-Fiducial Techniques of Feature Extraction in ECG Based Biometric Systems. Indian J. Sci. Technol. 2016, 9, 850–855.
10. Belo, D.; Bento, N.; Silva, H.; Fred, A.; Gamboa, H. ECG Biometrics Using Deep Learning and Relative Score Threshold Classification. Sensors 2020, 20, 4078.
11. Jomaa, R.M.; Islam, M.S.; Mathkour, H. Enhancing the Information Content of Fingerprint Biometrics with Heartbeat Signal. In Proceedings of the 2015 World Symposium on Computer Networks and Information Security (WSCNIS), Hammamet, Tunisia, 19–21 September 2015; pp. 1–5.
12. Jomaa, R.M.; Islam, M.S.; Mathkour, H.; Al-Ahmadi, S. A Multilayer System to Boost the Robustness of Fingerprint Authentication against Presentation Attacks by Fusion with Heart-Signal. J. King Saud Univ. Comput. Inf. Sci. 2022, 34, 5132–5143.
13. Regouid, M.; Touahria, M.; Benouis, M.; Costen, N. Multimodal Biometric System for ECG, Ear and Iris Recognition Based on Local Descriptors. Multimed. Tools Appl. 2019, 78, 22509–22535.
14. Israel, S.A.; Scruggs, W.T.; Worek, W.J.; Irvine, J.M. Fusing Face and ECG for Personal Identification. In Proceedings of the 32nd Applied Imagery Pattern Recognition Workshop, Washington, DC, USA, 15–17 October 2003; pp. 226–231.
15. Bugdol, M.D.; Mitas, A.W. Multimodal Biometric System Combining ECG and Sound Signals. Pattern Recognit. Lett. 2014, 38, 107–112.
16. Donida Labati, R.; Muñoz, E.; Piuri, V.; Sassi, R.; Scotti, F. Deep-ECG: Convolutional Neural Networks for ECG Biometric Recognition. Pattern Recognit. Lett. 2019, 126, 78–85.
17. Chu, Y.; Shen, H.; Huang, K. ECG Authentication Method Based on Parallel Multi-Scale One-Dimensional Residual Network with Center and Margin Loss. IEEE Access 2019, 7, 51598–51607.
18. Zhang, Q.; Zhou, D.; Zeng, X. HeartID: A Multiresolution Convolutional Neural Network for ECG-Based Biometric Human Identification in Smart Health Applications. IEEE Access 2017, 5, 11805–11816.
19. Ihsanto, E.; Ramli, K.; Sudiana, D.; Gunawan, T.S. Fast and Accurate Algorithm for ECG Authentication Using Residual Depthwise Separable Convolutional Neural Networks. Appl. Sci. 2020, 10, 3304.
20. Salloum, R.; Kuo, C.-C.J. ECG-Based Biometrics Using Recurrent Neural Networks. In Proceedings of the 2017 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), New Orleans, LA, USA, 5–9 March 2017; pp. 2062–2066.
21. Jyotishi, D.; Dandapat, S. An LSTM-Based Model for Person Identification Using ECG Signal. IEEE Sens. Lett. 2020, 4, 8.
22. Lynn, H.M.; Pan, S.B.; Kim, P. A Deep Bidirectional GRU Network Model for Biometric Electrocardiogram Classification Based on Recurrent Neural Networks. IEEE Access 2019, 7, 145395–145405.
23. Hamilton, P.S.; Tompkins, W.J. Quantitative Investigation of QRS Detection Rules Using the MIT/BIH Arrhythmia Database. IEEE Trans. Biomed. Eng. 1986, BME-33, 1157–1165.
24. Hamilton, P. Open Source ECG Analysis. In Proceedings of the Computers in Cardiology, Memphis, TN, USA, 22–25 September 2002; pp. 101–104.
25. Tirado-Martin, P.; Sanchez-Reillo, R. BioECG: Improving ECG Biometrics with Deep Learning and Enhanced Datasets. Appl. Sci. 2021, 11, 5880.
26. Ivanciu, L.; Ivanciu, I.-A.; Farago, P.; Roman, M.; Hintea, S. An ECG-Based Authentication System Using Siamese Neural Networks. J. Med. Biol. Eng. 2021, 41, 558–570.
27. Pinto, J.R.; Cardoso, J.S.; Lourenço, A. Evolution, Current Challenges, and Future Possibilities in ECG Biometrics. IEEE Access 2018, 6, 34746–34776.
28. Islam, M.S.; Alhichri, H.; Bazi, Y.; Ammour, N.; Alajlan, N.; Jomaa, R.M. Heartprint: A Dataset of Multisession ECG Signal with Long Interval Captured from Fingers for Biometric Recognition. Data 2022, 7, 141.
29. Islam, M.S.; Alajlan, N. An Efficient QRS Detection Method for ECG Signal Captured from Fingers. In Proceedings of the 2013 IEEE International Conference on Multimedia and Expo Workshops (ICMEW), San Jose, CA, USA, 15–19 July 2013; pp. 1–5.
30. Islam, M.S.; Alajlan, N. Augmented-Hilbert Transform for Detecting Peaks of a Finger-ECG Signal. In Proceedings of the 2014 IEEE Conference on Biomedical Engineering and Sciences (IECBES), Kuala Lumpur, Malaysia, 8–10 December 2014; pp. 864–867.
31. AlDuwaile, D.A.; Islam, M.S. Using Convolutional Neural Network and a Single Heartbeat for ECG Biometric Recognition. Entropy 2021, 23, 733.
32. Alduwaile, D.; Islam, M.S. Single Heartbeat ECG Biometric Recognition Using Convolutional Neural Network. In Proceedings of the 2020 International Conference on Advanced Science and Engineering (ICOASE), Duhok, Iraq, 23–24 December 2020; pp. 145–150.
33. Al Rahhal, M.M.; Bazi, Y.; Almubarak, H.; Alajlan, N.; Al Zuair, M. Dense Convolutional Networks with Focal Loss and Image Generation for Electrocardiogram Classification. IEEE Access 2019, 7, 182225–182237.
34. Tan, M.; Le, Q.V. EfficientNet: Rethinking Model Scaling for Convolutional Neural Networks. In Proceedings of Machine Learning Research, Proceedings of the 36th International Conference on Machine Learning, ICML 2019, Long Beach, CA, USA, 9–15 June 2019; Chaudhuri, K., Salakhutdinov, R., Eds.; JMLR: Cambridge, MA, USA, 2019; Volume 97, pp. 6105–6114.
35. Chen, T.; Kornblith, S.; Norouzi, M.; Hinton, G. A Simple Framework for Contrastive Learning of Visual Representations. Int. Conf. Mach. Learn. 2020, 119, 1597–1607.
36. Hadsell, R.; Chopra, S.; LeCun, Y. Dimensionality Reduction by Learning an Invariant Mapping. In Proceedings of the 2006 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR’06), New York, NY, USA, 17–22 June 2006; Volume 2, pp. 1735–1742.

Figure 1. ECG data collection using the ReadMyHeart device.

Figure 2. A fragment of an ECG signal composed of 11 beats, with the generated windows around the R-peaks enclosed by red rectangles.

Figure 3. Flowchart of the proposed system. The input ECG signal is fed into the feature extraction block, then passed into the contrastive optimization block and, finally, the classification block confirms the identity of the subject.

Figure 4. The feature extraction backbone, which is composed of two sub-blocks: a 1D ECG signal to 2D image conversion block and a 2D feature map generation block.

Figure 5. The main block of EfficientNet models, which is the mobile inverted bottleneck layer (BN: batch normalization layer; FC: fully connected layer).

Figure 6. Comparison of the top-k accuracy of the proposed system across three experiments: session 1, session 2, and cross-session.

Figure 7. ROC-AUC curve.

Figure 8. ROC-AUC curve for the cross-sessions experiment.

Table 1. Summary of the recent ECG biometrics works that are based on deep learning techniques. Acc. for accuracy, EER for equal error rate.

Study (Year)	Method	Dataset	Performance
Ibtehaz et al. [4] (2021)	Deep learning MultiResUNet	ECG-ID; MIT-BIH Arrhythmia; PTB Diagnostic ECG Database; MIT-BIH NSRDB	Acc. 96–99.75% with single beat; Acc. 100% with 3–6 beats
Ivanciu et al. [26] (2021)	Siamese NN	ECG-ID	Acc. 86.47%
Ihsanto et al. [19] (2020)	Residual Depthwise Separable CNN (RDSCNN)	ECG-ID; MIT-BIH	ECG-ID: Acc. 98.89%; MIT-BIH: Acc. 97.92%
Tirado-Martin et al. [25] (2021)	CNN and LSTM	Private dataset	EER = 0–5.31; EER = 0–1.35
Labati et al. [16] (2019)	CNN	PTB Diagnostic ECG Database	Identification Acc. 100%

Table 2. Number of ECG records per session in the Heartprint dataset.

Session	Date	# Subjects	# ECG Records per Subject (Minimum–Maximum)	Total
Session 1	Jan-2012	199	2–6	476
Session 2	Jun-2012	199	2–5	464
Session-3R	Mar-2022	109	3–6	365
Session-3L	Mar-2022	78	3–3	234
Total		199	4–11	1539

Table 3. The accuracy of the proposed system on the first session.

Metric	Accuracy (%)
Overall Accuracy	$98.55 \pm 0.20$
Top-1	$98.20 \pm 0.20$
Top-2	$98.87 \pm 0.31$
Top-3	$99.25 \pm 0.25$
Top-4	$99.33 \pm 0.21$
Top-5	$99.54 \pm 0.24$
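
For reference, a top-k figure such as those in Table 3 counts a test sample as correct when the true subject is among the k highest-scoring classes. The following is a minimal Python sketch of that metric, not the authors' evaluation code; the function name `top_k_accuracy` and the toy scores are illustrative assumptions.

```python
from typing import Sequence


def top_k_accuracy(scores: Sequence[Sequence[float]],
                   labels: Sequence[int],
                   k: int) -> float:
    """Fraction of samples whose true label is among the k top-scoring classes.

    `scores[i][c]` is the classifier score of class c for sample i, and
    `labels[i]` is the true class index of sample i (hypothetical layout).
    """
    if len(scores) != len(labels):
        raise ValueError("scores and labels must have the same length")
    hits = 0
    for row, true_label in zip(scores, labels):
        # Class indices ordered from highest to lowest score.
        ranked = sorted(range(len(row)), key=lambda c: row[c], reverse=True)
        if true_label in ranked[:k]:
            hits += 1
    return hits / len(labels)


# Toy example with 4 samples and 3 classes (made-up numbers).
toy_scores = [
    [0.70, 0.20, 0.10],
    [0.35, 0.45, 0.20],
    [0.10, 0.30, 0.60],
    [0.50, 0.40, 0.10],
]
toy_labels = [0, 2, 1, 0]

for k in (1, 2, 3):
    print(f"top-{k} accuracy: {top_k_accuracy(toy_scores, toy_labels, k):.2f}")
```

By construction the value cannot decrease as k grows, which matches the monotone trend of the Top-1 to Top-5 rows.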

Table 4. The accuracy of the proposed system on the second session.

Metric	Accuracy (%)
Overall Accuracy	$97.49 \pm 0.10$
Top-1	$97.40 \pm 0.12$
Top-2	$98.20 \pm 0.06$
Top-3	$98.70 \pm 0.16$
Top-4	$98.91 \pm 0.21$
Top-5	$99.08 \pm 0.16$

Table 5. The accuracy of the proposed system on the cross-session scenario.

Metric	Accuracy (%)
Overall Accuracy	$47.24 \pm 0.10$
Top-1	$46.06 \pm 0.12$
Top-2	$50.46 \pm 0.16$
Top-3	$53.22 \pm 0.15$
Top-4	$54.90 \pm 0.31$
Top-5	$56.53 \pm 0.36$

Table 6. The accuracy of the proposed system on the first session dataset with 50% for training and 50% for testing.

Metric	Accuracy (%)
Overall Accuracy	$98.66 \pm 0.20$
Top-1	$97.66 \pm 0.11$
Top-2	$98.41 \pm 0.18$
Top-3	$98.66 \pm 0.17$
Top-4	$98.74 \pm 0.21$
Top-5	$98.91 \pm 0.16$

Table 7. ROC AUC scores of the proposed system on the first session dataset.

	Macro	Weighted by Prevalence
One-vs-One ROC AUC scores	$0.999976$	$0.999979$
One-vs-Rest ROC AUC scores	$0.999976$	$0.999979$
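
To make the two averaging columns of Table 7 concrete, the sketch below computes a one-vs-rest ROC AUC per class and then averages it either uniformly (macro) or weighted by class prevalence. It is a plain-Python illustration under stated assumptions, not the authors' pipeline: the helper names `binary_auc` and `one_vs_rest_auc` and the toy data are hypothetical, each class is assumed to have at least one positive and one negative sample, and the one-vs-one variant is not shown.

```python
from typing import Dict, Sequence, Tuple


def binary_auc(pos: Sequence[float], neg: Sequence[float]) -> float:
    """AUC as the probability that a positive scores above a negative.

    Ties contribute 0.5 (the Mann-Whitney formulation of the ROC area).
    """
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def one_vs_rest_auc(scores: Sequence[Sequence[float]],
                    labels: Sequence[int]) -> Tuple[float, float]:
    """Return (macro, prevalence-weighted) one-vs-rest AUC.

    `scores[i][c]` is the score of class c for sample i; labels are class
    indices into each score row (hypothetical layout).
    """
    classes = sorted(set(labels))
    per_class: Dict[int, float] = {}
    for c in classes:
        # Treat class c as positive and every other class as negative.
        pos = [row[c] for row, y in zip(scores, labels) if y == c]
        neg = [row[c] for row, y in zip(scores, labels) if y != c]
        per_class[c] = binary_auc(pos, neg)
    macro = sum(per_class.values()) / len(classes)
    weighted = sum(per_class[c] * list(labels).count(c)
                   for c in classes) / len(labels)
    return macro, weighted


# Toy example with 4 samples and 3 classes (made-up numbers).
toy_scores = [
    [0.70, 0.20, 0.10],
    [0.35, 0.45, 0.20],
    [0.10, 0.30, 0.60],
    [0.50, 0.40, 0.10],
]
toy_labels = [0, 2, 1, 0]

macro_auc, weighted_auc = one_vs_rest_auc(toy_scores, toy_labels)
print(f"macro: {macro_auc:.4f}, weighted by prevalence: {weighted_auc:.4f}")
```

The two averages coincide only when all classes are equally frequent or all per-class AUCs are equal, which is why near-identical macro and weighted values indicate uniformly high per-class separability.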

Table 8. Sensitivity with respect to the contrastive loss module (overall accuracy, %).

Contrastive learning	Session 1	Session 2	Session 1 to 2	Session 2 to 1
Without	96.61	95.35	39.61	41.96
With	98.20	97.40	46.06	47.9

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

© 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

Share and Cite

MDPI and ACS Style

Ammour, N.; Jomaa, R.M.; Islam, M.S.; Bazi, Y.; Alhichri, H.; Alajlan, N. Deep Contrastive Learning-Based Model for ECG Biometrics. Appl. Sci. 2023, 13, 3070. https://doi.org/10.3390/app13053070
