1. Introduction
Biometric traits, both physical and behavioral, are widely used for the recognition of human identity [
1]. Several common types of traits are currently used for this, and the benefits and drawbacks of their adoption are well known to date, as stated by Maltoni et al. [
2]. The Electrocardiogram (ECG), proposed as a biometric trait more than two decades ago, has been validated as a suitable trait for human recognition since the commonly agreed seminal work by Biel et al. [
3].
The ECG is a signal that enables us to observe the electrical activity of the heart through electrodes placed inside the body or (more commonly) on its surface. The heart comprises excitable cardiac cells that produce Action Potentials (APs) in the muscle cells, forcing them to contract. These APs produce voltage variations on the electrodes that can be recorded [
4]. The time-based representation of the magnitude of the recorded voltage variations results in a series of waves whose shapes and timings underpin their traditional use in disease diagnosis. Indeed, it is well known in medical practice that specific pathologies often manifest on electrocardiograms [
5].
Taking into account the specific characteristics of a biometric trait identified by Maltoni et al. [
2], the high
universality of ECG traits is commonly accepted: as a physiological trait, the ECG is present in every living person. It is also commonly accepted that physiological traits are less prone to
circumvention than behavioral traits [
6]. Moreover, ECG can be recorded for long periods of time without requiring subjects to show explicit behavior. This is a positive factor for the
collectability and
acceptability of this trait. On the negative side, the common use of pairs of electrodes (also referred to as
leads) to record the ECG generally works against these same two characteristics. Another factor that potentially reduces acceptability is that the ECG can reveal pathologies, even in an environment specifically designed for recognition.
The research studies performed on ECG-based biometrics focus mainly on distinctiveness. In general terms, the quantification of a physiological trait must represent the uniqueness of each physical subject. The results of most works to date clearly show that this premise holds, but they do not guarantee perfect identification/verification metrics, revealing that further research needs to be performed.
Another characteristic that represents a source of uncertainty for ECG-based biometrics is
permanence. It is well known that some physiological traits, such as fingerprints, have high permanence over time. Unfortunately, this does not hold for all physiological traits, and the ECG is one of the exceptions [
7,
8,
9,
10]. However, there exist mechanisms, such as re-enrollment policies, that can mitigate the negative impact of low permanence for any biometric trait.
In this survey, we focus on ECG-based biometric recognition. Two different types of problems are generally taken into account when referring to this term:
biometric identification and
biometric verification. According to Wayman [
11], the significant amount of development in biometrics in recent decades has created some confusion about the appropriate meaning of both terms. In fact, their use shows a growing amount of ambiguity in written documents such that clarification is clearly needed. At the time of writing, the recommended meanings for “biometric identification” and “biometric verification” according to ISO/IEC 2382-37: 2022 “Information technology —Vocabulary —Part 37: Biometrics” [
12] are the following:
Biometric identification: The process of searching against a biometric enrollment database to find and return the biometric reference identifier(s) attributable to a single individual.
Biometric verification: The process of confirming a biometric claim through comparison.
Although the presented meaning for the term identification seems to be commonly accepted, many contemporary works use other terms with the meaning presented above for biometric verification. In particular, a number of works covered by this survey use authentication as a synonym, although the aforementioned recommendation discourages its use as an alternative to verification.
In the more than two decades since the first agreed-upon article on ECG-based biometric recognition, many different techniques have been applied to the problem. Early works commonly treat it as solvable with probabilistic classifiers, interpret it as a nearest-neighbor classification problem, or apply well-known Machine Learning (ML) classification algorithms. With these approaches, feature extraction is commonly treated as an important subject, since well-selected features are known to have a positive impact on recognition metrics. Most of these studies use fiducial features, as they are also commonly used in clinical diagnosis; this requires the delineation of the ECG, for which several techniques already existed at the time. Treating the problem of feature selection and/or reduction has likewise been common since early work; it is only once the problem starts to be approached with learned-feature-based methods that feature selection and/or reduction stops being specifically addressed. Over time, non-fiducial features have become common, and since then most works that use explicit features have used non-fiducial ones. In recent years, it has become more common to find papers using different Neural Networks (NNs) to solve the recognition problem; in these papers, features are either engineered or learned, depending on the case.
1.1. Prior Work
There exist several survey papers that address different aspects of ECG-based biometric recognition to date. The first survey that specifically focused on the subject is the work by Odinaka et al. [
13]. Although it was a comprehensive paper that mentioned most works up to the publication date, it is clearly outdated now. This work also identified a fundamental obstacle to comparative studies of recognition techniques: the need for commonly available ECG databases. In fact, this particular work compared different techniques on a single, private ECG database. Fratini et al. [
14] surveyed most works related to ECG-based biometric identification and identity verification (commonly referenced as authentication, especially in the older literature). The work proposed a unified framework to compare different techniques for the first time. It explicitly identified the impossibility of comparing performance metrics when the works used private ECG databases and listed the most widely used publicly available databases at the time. That work is almost ten years old at the time of writing. Pinto et al. [
15] published a survey specifically focused on ECG-based biometrics. In their work, a comprehensive list of articles was referenced together with a short list of 12 publicly available ECG databases, some of which were available on PhysioNet [
16]. The survey by Rathore et al. [
17] was also directly related to our subject, but with a specific focus on different sensing modalities to quantify cardiac activity. Its taxonomy for the cardiac sensing domain broadened the spectrum of sensing modalities beyond the registration of electrical activity. Although the subject was not treated in depth, we should also mention the work by Dargan and Kumar [
18]. It surveyed different types of biometric traits, and for ECG-based biometrics, it presented four articles as representative of the main findings in the area up to the date of writing. Finally, it is also worth mentioning the work by Uwaechia and Ramli [
19]. It presented the main findings in terms of ECG classification for biometrics, showing the increasing number of works that used NNs for feature extraction and/or classification. It also included a short selection of commonly used ECG databases.
Table 1 summarizes previous works that surveyed biometrics using ECG.
Although they referenced some ECG databases, the ECG-based biometric recognition surveys mentioned above did not systematically compile a list of the available databases. Since these databases are a fundamental resource for the development of verifiable studies on the subject, Merone et al. [
6] reviewed the ECG databases mentioned in works focused on different subjects at the time. They found that only 5 out of the 15 identified databases mentioned in those papers were public. In addition, only two of the five publicly available databases were specifically conceived for biometric recognition. Furthermore, the survey by Flores et al. [
20] identified up to eight freely available ECG databases. However, although of interest to our subject, the focus of the work was on cardiovascular diseases.
Table 2 shows the aforementioned works that survey ECG databases.
Finally, even though they are only marginally related to the main subject of ECG-based biometric recognition, it is worth mentioning some surveys on ECG analysis which address valuable techniques. The survey by Berkaya et al. [
22] presented the works in terms of ECG analysis techniques. Although biometric identification was mentioned, it did not focus on ECG-based biometrics. The work enumerated a list of 21 multi-purpose databases that were available on PhysioNet [
16], together with references to articles which had used them. The authors of this work consider the survey in [
23] also worth mentioning, as it presented ECG analysis in light of the change of interest that seemed to be occurring from traditional signal processing approaches to ML and Deep Learning (DL) techniques. It is also worth mentioning, despite being centered on analysis techniques, the survey by Merdjanovska and Rashkovska [
21]. It also focused on the importance of available ECG databases and included a list of 45 publicly available databases for clinical applications. Finally, the work by Pereira et al. [
24] specifically surveyed ECG data acquisition methods, given their impact on biometric recognition.
Table 3 shows the works mentioned that survey ECG analysis and acquisition.
1.2. Paper Contributions
This work offers a detailed survey of existing studies on ECG-based biometric recognition systems for identification and identity verification. It also offers a detailed survey on the ECG databases available and mentioned in the referenced studies. To the best of our knowledge, a comprehensive survey that covered both the existing articles and the databases used was still missing. This work intends to fill that gap, so that readers interested in the subject can obtain a global understanding of the state of the art and easily find specific papers by using different criteria. This survey contributes to the available literature on ECG-based recognition as follows:
1.3. Paper Organization
This survey is organized according to the two main goals pursued: first, the main ECG recognition techniques used in the existing literature are described, including the most relevant references; then, a comprehensive survey of the ECG databases available and used in the aforementioned works is presented.
Section 2.1 describes the different techniques used for feature extraction.
Section 2.2 presents the classification techniques used in the literature.
Section 3 describes the information that characterizes the databases for their survey. Finally, the tables containing the exhaustive compilation of existing papers and ECG databases can be found in
Appendix A and
Appendix B, respectively.
2. ECG-Based Biometric Recognition Studies
The time period covered by this survey starts in 2001, the year of publication of the seminal article by Biel et al. [
3], and extends to the first half of 2024.
Table A1 contains an extensive compilation of ECG-based biometric recognition works during this period. The works are listed in reverse chronological order, and apart from the year of publication, other relevant data are summarized for each one. The databases used are identified for each work to help the reader judge whether replicability and/or result comparison is possible. Moreover, since in many cases the studies did not use the records of all available individuals, the number of individuals actually used is provided (when mentioned), as this factor is commonly known to affect recognition metrics. The metrics obtained for identification and/or verification are also provided, so that the obtained performance can be compared.
A large number of studies, especially early ones, explicitly use fiducial features, while newer ones commonly use engineered non-fiducial features or even learn them. The authors consider this a relevant decision for each article, and the information provided in
Table A1 also includes whether fiducial or non-fiducial features are used, as well as the feature reduction technique, if applied. Another relevant decision in each published study is the selection of the classifier used, which is also provided in
Table A1.
Feature extraction and classification techniques have been identified in previous works as two of the most relevant factors to categorize studies. For this reason, the following sections treat them in detail and provide taxonomies for each one.
2.1. Extraction of ECG Features
There are a large number of features that can be extracted from ECGs. The literature accepts the separation of these features into two main groups:
fiducial and
non-fiducial, as shown in
Figure 1.
Fiducial features, also called
handcrafted or engineered features, are extracted from the ECG by direct observation and represent specific identifiable elements of the heartbeat that are based on time, amplitude, or morphology. Time-based features refer to the time distance between pairs of well-known ECG points: peaks and onset/offset points of the P and T waves, and the QRS complex.
Non-fiducial features, on the other hand, include all those features that are not identified as fiducial. Using alternative features overcomes the main problem of fiducial ones, namely the uncertainty introduced by the delineation algorithms used [
25]. A third group that is sometimes identified in the literature is called
hybrid (or partially fiducial) features. Hybrid methods combine the non-fiducial extraction of features with the need to identify at least one fiducial point (typically the R peaks).
ECG delineation, also known in the early literature as
ECG segmentation (not to be confused with the segmentation stage in ECG processing), is the processing of the ECG to extract
fiducial features by direct observation of three main types of elements: time, amplitude, and morphology. The upper branch of
Figure 1 shows these three types of fiducial features, while the lower one shows non-fiducial features. Obtaining specific features by ECG delineation is useful not only for biometrics but also for other applications, such as diagnosis in clinical settings.
Table 4 provides a nonexhaustive list of the advantages and disadvantages of using fiducial and non-fiducial features in ECG-based biometric recognition.
2.1.1. Fiducial Features
The left heartbeat in
Figure 2 shows eight common time-based features used in different studies. The durations of the P and T waves, the QRS complex, the PR and ST segments, and the widths of the PT and QT intervals have typically been used. Another frequently used feature for biometric recognition is the R-R interval. Some of the aforementioned intervals, specifically the R-R interval and the durations of the P and T waves, vary considerably when the heart rate changes. For this reason, some works rely on normalized time-based features that represent a percentage of a heartbeat instead of an interval in seconds [
3,
28,
29,
30,
31,
32].
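The normalization idea described above can be sketched in a few lines. The fiducial point names and times below are hypothetical illustrative values (in seconds), not taken from any surveyed study; each interval is simply expressed as a fraction of the R-R interval so that it becomes largely independent of heart rate.

```python
# Sketch: time-based fiducial intervals normalized by heartbeat duration.
# Point times are hypothetical illustrative values in seconds.

def normalized_intervals(points, rr_interval):
    """Express each interval as a fraction of the R-R interval."""
    intervals = {
        "PR": points["R"] - points["P"],
        "QT": points["T_off"] - points["Q"],
        "QRS": points["S"] - points["Q"],
    }
    return {name: dur / rr_interval for name, dur in intervals.items()}

beat = {"P": 0.10, "Q": 0.24, "R": 0.28, "S": 0.32, "T_off": 0.60}
features = normalized_intervals(beat, rr_interval=0.80)
```

A subject measured at rest and during light activity would yield similar normalized values even though the raw intervals shrink as the heart rate rises.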
Several studies state that ECG amplitudes differ among individuals [
31,
34]. Although this fact has been validated, intraclass variance is still a subject of study. The right heartbeat in
Figure 2 shows four common amplitude-based features. Their quantification relies on prior removal of baseline wander from the ECG, since all the measurements are referenced to the position of the R peak. Other amplitude measures include the ST segment amplitude [
3], the amplitudes of the first and second derivatives of the ECG [
35,
36], slope and angle features [
37], and even the ratios of the described features, representing alternative ways of normalization.
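Since amplitude measurements depend on first removing baseline wander, a minimal sketch may help. This is one generic approach (a moving-median estimate of the drift), not the method of any cited work; the window length and the synthetic signal are assumed values for illustration.

```python
import numpy as np

# Sketch: baseline-wander removal via a moving-median drift estimate,
# so amplitudes can be measured against a common baseline.
# Window length (in samples) is an assumed illustrative value.

def remove_baseline(ecg, win=101):
    ecg = np.asarray(ecg, dtype=float)
    half = win // 2
    padded = np.pad(ecg, half, mode="edge")
    baseline = np.array([np.median(padded[i:i + win]) for i in range(len(ecg))])
    return ecg - baseline

t = np.linspace(0, 2, 500)
drift = 0.5 * t                                          # slow baseline drift
spikes = np.where(np.arange(500) % 125 == 0, 1.0, 0.0)   # crude stand-in R peaks
clean = remove_baseline(drift + spikes)
```

Because R peaks are sparse, the median tracks the drift while ignoring them, and the corrected signal retains the peak amplitudes nearly unchanged.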
Both time- and amplitude-based features result from more or less direct observation of the ECG, and their values can be inferred from it directly. A third, indirect alternative is to obtain features by processing the ECG or parts of it, which is sometimes not trivial. A simple method quantifies the slopes between waves and the angles in the QRS complex [
32]. Another technique is based on the calculation of the average value of certain parts of the ECG, for example, the QRS complex, with respect to multiple other parts of the heartbeat [
38,
39,
40,
41,
42,
43,
44].
Table 5 shows a list of fiducial features that have been used in the biometric recognition literature to date [
45,
46].
As mentioned above, some studies take into account that most fiducial feature values are a function of heart rate. Although this can be easily noticed for time-based features, amplitude- and morphology-based features can also be affected by the heart rate, as it fluctuates with physical activity, strong emotions, or drug consumption. Many
normalization strategies have been proposed: normalization on a per-feature basis [
31], full heartbeat resampling [
47], partial QT interval normalization [
36,
45,
48,
49], R centering by using different resampling ratios on the left and right sides of R [
50], and independent wave resampling and recombination [
39].
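The "full heartbeat resampling" strategy mentioned above admits a compact sketch: every segmented beat is stretched or compressed to a fixed number of samples, removing heart-rate-driven duration differences before feature extraction. The target length and the synthetic beats are illustrative assumptions, not values from the cited works.

```python
import numpy as np

# Sketch: resample every segmented beat to a fixed length so that
# heart-rate-driven duration differences disappear.

def resample_beat(beat, target_len=250):
    beat = np.asarray(beat, dtype=float)
    old = np.linspace(0.0, 1.0, num=len(beat))
    new = np.linspace(0.0, 1.0, num=target_len)
    return np.interp(new, old, beat)

fast_beat = np.sin(np.linspace(0, np.pi, 180))   # shorter beat (higher rate)
slow_beat = np.sin(np.linspace(0, np.pi, 320))   # longer beat (lower rate)
a = resample_beat(fast_beat)
b = resample_beat(slow_beat)
```

After resampling, the two beats of different durations become directly comparable sample by sample.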
The extraction of fiducial features is a mature subject, for which a large number of studies exist based on different principles: heuristics using different transforms [
51,
52,
53,
54,
55,
56,
57,
58,
59,
60], windowing algorithms [
61,
62,
63], time-domain methods [
64,
65], mathematical morphology-based approaches [
66,
67,
68], Machine Learning techniques [
69,
70,
71], or algorithms based on Neural Networks [
72]. The work by Wasimuddin et al. [
23] can serve as a reference for a comprehensive study of this task. However, even if the subject can be considered mature, the task is still not fully solved to date, given the complexity and variability of ECGs, the different environments for ECG acquisition, and the absence of a formal scheme for signal acquisition [
23,
73,
74]. A comparison of several techniques for the extraction of fiducial features has been performed in [
25], concluding that heuristic methods produce less precise results than wavelet-based and statistical methods.
Heart Rate Variability (HRV), characterized by the pattern of heartbeat occurrence times, is itself a subject of study for diagnostic purposes, as it can reveal problems in the nervous system [
4]. Multiple causes can produce HRV, among which are ectopic beats and noise bursts. This phenomenon can cause variability in the determination of fiducial points. The work by Beraza and Romero [
25] revealed a relevant amount of variability in time-based fiducial points when delineating ECGs, and showed that different algorithms led to substantially different delineations. To mitigate the effects of HRV, a number of studies have been carried out. An important milestone was the work by Martínez et al. [
52], which used wavelets to represent the ECG and achieved better results than heuristic-based algorithms. Other alternatives include the work by Akhbari et al. [
75], who used a multi-hidden Markov model; Lee et al. [
27], who applied a curvature-based vertex selection technique; Akhbari et al. [
76], whose work was based on a switching Kalman filter; or Bae et al. [
77], who used a one-dimensional bilateral filter detector. Another well-known technique to mitigate the effects of HRV consists in clustering heartbeat morphologies [
4]. This method helps to take into account the presence of ectopic heartbeats and other variability sources, so that the fiducial characteristics can eventually be extracted separately for each cluster. For a more in-depth analysis of the subject, the authors refer the reader to the surveys on ECG analysis and acquisition in
Table 3.
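The heartbeat-morphology clustering idea mentioned above can be sketched as follows. This is a toy greedy scheme, not the algorithm of any cited work: beats are z-scored and grouped by normalized correlation against running cluster templates, so that ectopic shapes end up in their own cluster and fiducial extraction can proceed per cluster. The threshold and the synthetic beats are assumed values.

```python
import numpy as np

# Toy sketch: group beats by correlation with cluster templates so that
# morphologically different (e.g. ectopic) beats form separate clusters.

def cluster_beats(beats, threshold=0.9):
    clusters = []                       # each: {"template": ..., "members": [...]}
    for idx, beat in enumerate(beats):
        b = (beat - beat.mean()) / (beat.std() + 1e-12)   # z-score the beat
        for cl in clusters:
            if np.mean(b * cl["template"]) >= threshold:  # normalized correlation
                cl["members"].append(idx)
                break
        else:
            clusters.append({"template": b, "members": [idx]})
    return clusters

x = np.linspace(0, np.pi, 100)
normal = np.sin(x)
ectopic = np.sin(2 * x)                 # clearly different morphology
beats = np.array([normal, normal * 1.1, ectopic, normal * 0.9, ectopic * 1.2])
clusters = cluster_beats(beats)
```

Because z-scoring removes scale, rescaled normal beats join the first cluster while the ectopic shapes form a second one.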
2.1.2. Non-Fiducial Features
Several techniques to obtain non-fiducial features have been devised to date, and all can be identified in terms of the principles on which they are based, as shown in
Figure 1.
Autocorrelation-based features are extracted from fixed-length windows of
N samples taken from the discrete time-based ECG signal segments, which are used to obtain their normalized Autocorrelation (AC). Heuristics are usually applied to determine
N. According to Plataniotis et al. [
78] and Lee [
79], a time interval in the range of 5 to 10 s is a good choice for biometric recognition. Extraction of the AC series is commonly followed by a dimensionality reduction step. The Discrete Cosine Transform (DCT) is typically used for this [
80,
81,
82,
83,
84,
85,
86] (this combination is known as the Autocorrelation/Discrete Cosine Transform (AC/DCT) method), since it has been commonly used for signal processing and data compression [
87], but Linear Discriminant Analysis (LDA) has also been used [
78,
82,
83,
88].
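A minimal sketch of the AC/DCT method described above: a fixed-length ECG window is reduced to its normalized autocorrelation, and a DCT keeps only the first coefficients as the feature vector. The maximum lag and coefficient count are illustrative choices, not values from any specific surveyed work; the DCT-II is written out explicitly to keep the sketch dependency-free.

```python
import numpy as np

# Sketch of the AC/DCT method: normalized autocorrelation of a window,
# followed by a DCT for dimensionality reduction.

def dct_ii(x):
    """Plain (unnormalized) DCT-II, written out to avoid dependencies."""
    n = len(x)
    k = np.arange(n)[:, None]
    i = np.arange(n)[None, :]
    return (np.cos(np.pi * k * (2 * i + 1) / (2 * n)) * x).sum(axis=1)

def ac_dct_features(window, max_lag=60, n_coeffs=20):
    window = np.asarray(window, dtype=float) - np.mean(window)
    ac = np.correlate(window, window, mode="full")[len(window) - 1:]
    ac = ac[:max_lag] / ac[0]              # normalized autocorrelation
    return dct_ii(ac)[:n_coeffs]           # compact feature vector

rng = np.random.default_rng(0)
window = np.sin(np.linspace(0, 40 * np.pi, 1000)) + 0.1 * rng.standard_normal(1000)
feats = ac_dct_features(window)
```

Note that the autocorrelation makes the representation independent of where heartbeats fall inside the window, which is why no fiducial alignment is needed.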
Alternatively, time-based ECGs can be projected into
n-dimensional spaces by means of a
time-delay technique. This technique is based on considering the sequences of samples of ECG windows, x[n], together with the delayed sequences of samples obtained through time shifting, x[n + τ]. The first works using this technique for ECG-based biometrics were performed by Fang and Chan [
89,
90]. In these works, time delay was used to derive three time-shifted ECG sequences from 5 s windows (also called
epochs), using delays between 4 ms and 36 ms. Those three sequences were then used to build a normalized three-dimensional trajectory in the
phase space that could be used as a feature. Similar alternative techniques have also been devised, using three different leads instead of time-shifted ECG sequences. Feature reduction in the phase space is performed by discretizing the state space with a cubic grid and annotating the cells crossed by the trajectories to construct a
coarse-grained structure. Recognition is then based on structure comparison methods [
40,
91,
92].
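The time-delay phase-space construction above can be sketched compactly. This is a hedged illustration, not the pipeline of the cited works: three delayed copies of a window form a 3D trajectory, which is normalized to the unit cube and discretized on a coarse cubic grid; the set of occupied cells is the "coarse-grained structure", and structures are compared by set overlap. Delay, grid size, and the overlap measure (Jaccard) are assumed choices.

```python
import numpy as np

# Sketch: time-delay embedding of an ECG window into 3D phase space,
# discretized on a cubic grid to obtain a coarse-grained structure.

def coarse_grained_cells(signal, delay=4, grid=16):
    x = np.asarray(signal, dtype=float)
    n = len(x) - 2 * delay
    traj = np.stack([x[:n], x[delay:n + delay], x[2 * delay:n + 2 * delay]], axis=1)
    traj = (traj - traj.min()) / (np.ptp(traj) + 1e-12)   # fit unit cube
    cells = np.minimum((traj * grid).astype(int), grid - 1)
    return {tuple(c) for c in cells}                      # occupied cells

def structure_similarity(cells_a, cells_b):
    """Jaccard overlap between two coarse-grained structures."""
    return len(cells_a & cells_b) / len(cells_a | cells_b)

t = np.linspace(0, 4 * np.pi, 800)
a = coarse_grained_cells(np.sin(t))
b = coarse_grained_cells(np.sin(t) + 0.01)   # near-identical signal
c = coarse_grained_cells(np.sin(2 * t))      # different rhythm
```

Because the trajectory is normalized before discretization, constant offsets and uniform gain changes leave the structure unchanged.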
Another well-known alternative for obtaining non-fiducial ECG features is to rely on
frequency-based features. Several studies have been performed in this area. Coefficients of the Fourier transform were used as features by Saechia et al. [
93], the Mel-Frequency Cepstrum Coefficients (MFCCs) [
94] were used in two studies [
95,
96], other authors used procedures similar to the Hilbert–Huang transform over the Empirical Mode Decomposition (EMD) of ECG segments (epochs) or a modified version of it to obtain features [
97,
98], and Loong et al. [
99] used the first 40 points of the linear predictive coding (LPC) spectrum as features for overlapped 5 s ECG windows.
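As a simple illustration of frequency-based features, the sketch below keeps the magnitudes of the first Fourier coefficients of an ECG window as a spectral-envelope descriptor. The sampling rate, window length, and coefficient count are assumed values for the example, not parameters from the cited studies.

```python
import numpy as np

# Sketch: magnitudes of the leading Fourier coefficients of an ECG
# window as a simple frequency-based feature vector.

def fourier_features(window, n_coeffs=30):
    window = np.asarray(window, dtype=float)
    spectrum = np.abs(np.fft.rfft(window - window.mean()))
    return spectrum[:n_coeffs] / (np.linalg.norm(spectrum) + 1e-12)

fs = 250                                # assumed sampling rate (Hz)
t = np.arange(0, 5, 1 / fs)             # 5 s window, as discussed in the text
window = np.sin(2 * np.pi * 1.2 * t)    # stand-in signal with a 1.2 Hz rhythm
feats = fourier_features(window)
```

With a 5 s window the frequency resolution is 0.2 Hz, so the 1.2 Hz component falls exactly in bin 6 of the feature vector.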
Finally, there exist studies
based on other principles in addition to the basic ones presented in the taxonomy in
Figure 1. In some cases, the research work has been directed to extracting features in multi-lead scenarios [
89,
100], others have used feature extraction techniques such as Cepstral Analysis [
101,
102], Piecewise Linear Representation (PLR) [
103], Pulse Active Ratio [
104], autoregressive coefficients and mean of power spectral density (PSD) [
105], compressed ECG [
106], wavelet coefficients [
107], and even combinations of known methods such as wavelet decomposition, frequency analysis, and correlation coefficients [
108,
109].
2.1.3. Features in Neural Networks
Although NNs and Deep Neural Networks (DNNs) were seldom used in early works, more recent studies have incorporated them frequently. Typical cases include those where a large number of ECG signals was available, or niche applications such as continuous identification or identity verification [
110,
111,
112,
113,
114,
115,
116,
117,
118,
119]. Their use commonly avoids the need to devise specific methods to analyze ECGs and extract features in a process commonly referred to as “feature engineering” [
120]. It should be noted that in some cases, these works mention the need for a large number of ECG signals for training purposes [
121] and even the need to extend the ECG datasets by data augmentation [
74,
120,
122,
123], using Deep Generative Models (DGMs) [
124,
125] or even auxiliary classifier Generative Adversarial Networks (GANs) [
126].
As NNs are used more often for ECG-based biometric recognition, a new classification has emerged that distinguishes the ways in which features are obtained:
manual feature-based methods and methods based on learned features [
127].
Manual feature-based methods are those that generate both fiducial and non-fiducial features, as previously presented in this section. They assume that all ECGs share common characteristics that can be reflected in distinctive features. Unfortunately, this assumption does not always hold for all ECGs. For example, not all heartbeats of a subject can be considered similar, especially when there is a pathology, such as an ectopic heartbeat arrhythmia [
128]. It can be verified by inspecting
Table A1 that most of the early biometric recognition studies use sets of individuals who do not suffer from heart pathologies or, for lack of other alternatives, selected individuals with minor arrhythmias. When pathologically affected individuals are used, performance tends to decrease [
104,
107,
113].
Learned feature-based methods are conceived to overcome the proven fact that not all ECGs share common characteristics [
127,
129]. They try to learn the structural and hierarchical characteristics of actual ECGs in order to extract representative features by generalization [
127]. Although learned feature-based methods have NNs, and especially Convolutional Neural Networks (CNNs), as their natural environment for development, other ML techniques have also been used [
112,
130,
131,
132,
133,
134,
135,
136,
It should be mentioned that studies using autoencoders for feature extraction have appeared only very recently [
117,
121,
138,
139,
140].
2.1.4. Learned Features and Fiducial Points
It is worth drawing attention to the fact that most studies that use non-fiducial or learned features still rely on a segmentation stage that requires at least a minimum number of fiducial points [
32,
39,
40,
48,
85,
89,
102,
107,
108,
111,
112,
115,
129,
136,
141,
142,
143,
144,
145,
146,
147,
148,
149,
150,
151,
152,
153,
154,
155,
156,
157,
158]. This underscores the continued significance of a delineation stage that, even if not used to generate fiducial features directly, assists a subsequent feature extraction stage. As an example, in one of the first works in this sense, Irvine et al. [
129] used Principal Component Analysis (PCA) for feature extraction from the samples of each heartbeat. This first required segmenting the ECG signal into heartbeats by identifying a number of fiducial points through delineation.
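The PCA-on-heartbeats idea can be sketched as follows. This is a synthetic illustration of the general technique, not the pipeline of [129]: segmented, equal-length beats are stacked as rows, and their projections onto the top principal components serve as a compact feature vector. The beat template, noise level, and component count are assumed values.

```python
import numpy as np

# Sketch: PCA (via SVD) over segmented, equal-length heartbeats;
# each beat is summarized by its leading principal-component scores.

def pca_beat_features(beats, n_components=3):
    beats = np.asarray(beats, dtype=float)
    centered = beats - beats.mean(axis=0)
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return centered @ vt[:n_components].T   # one feature row per beat

rng = np.random.default_rng(1)
template = np.sin(np.linspace(0, np.pi, 120))            # idealized beat shape
beats = template + 0.05 * rng.standard_normal((40, 120))  # 40 noisy beats
feats = pca_beat_features(beats)
```

Note that this only works once delineation has produced aligned, equal-length beats, which is precisely the dependence on fiducial points discussed above.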
2.2. Classification Techniques
A large number of classifier types have been used to date in the ECG-based biometric recognition literature.
Figure 3 shows a concise taxonomy of the ones that are most commonly used in the area.
GMCs learn the characteristics of each class by using training data, thus enabling the determination of potential feasible input data corresponding to each one. The seminal work by Biel et al. [
3] used the Soft Independent Modelling of Class Analogy (SIMCA) classifier, which models each class independently and thus belongs to this category. The works in [
28,
31,
35,
44,
47,
148,
153,
159,
160,
161,
162,
163] used LDACs, which are also GMCs. Additional studies using GMCs include [
161], which used the naive Bayes classifier, and [
78,
108,
161], which used the Log-Likelihood Ratio (LLR).
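A toy sketch of the generative idea behind GMCs may be useful here. It is not the model of any cited work: each class is fit with a Gaussian (shared identity covariance, so the rule reduces to nearest class mean), and a query is assigned to the class with the highest log-likelihood. The feature values are invented for illustration.

```python
import numpy as np

# Toy GMC sketch: per-class Gaussian with identity covariance, so the
# log-likelihood comparison reduces to a nearest-class-mean decision.

def fit_class_means(X, y):
    return {c: X[y == c].mean(axis=0) for c in np.unique(y)}

def gmc_predict(means, query):
    # log-likelihood under N(mean, I) differs only by -||query - mean||^2 / 2
    scores = {c: -0.5 * np.sum((query - m) ** 2) for c, m in means.items()}
    return max(scores, key=scores.get)

X = np.array([[0.18, 0.44], [0.19, 0.46],    # subject A training vectors
              [0.25, 0.38], [0.24, 0.40]])   # subject B training vectors
y = np.array(["A", "A", "B", "B"])
means = fit_class_means(X, y)
pred = gmc_predict(means, np.array([0.185, 0.45]))
```

Classifiers such as LDA refine this picture by estimating a shared, non-identity covariance from the training data.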
Support Vector Machine (SVM) is another commonly used type of classifier for ECG-based biometrics. It projects feature vectors into a high-dimensional space in order to find the hyperplanes that represent the boundaries between classes. Nonlinear classification is possible through kernel functions. To the best of the authors’ knowledge, the first work using the SVM classifier in ECG-based biometrics was the one by Li and Narayanan [
102]. Since then, many other works that use SVMs have been published [
44,
107,
152,
156,
157,
164,
165,
166,
167,
168,
169,
170,
171,
172,
173,
174].
Dissimilarity-based classifiers compute dissimilarity metrics between a feature vector that must be recognized and a set of labeled vectors obtained during the training phase. A subset of
k labeled vectors is determined on the basis of the minimum dissimilarity with the vector to be recognized, and finally, a single (or multiple) label is predicted.
k-NN is the most common ML algorithm that behaves as described. This type of classifier has been used in a large number of works on the subject from 2001 to date, as can be verified by inspecting
Table A1: [
32,
38,
39,
40,
45,
82,
83,
85,
89,
104,
105,
113,
117,
129,
136,
139,
143,
150,
151,
153,
156,
157,
163,
164,
170,
175,
176,
177,
178,
179,
180,
181,
182,
183,
184,
185,
186,
187,
188]. Many of the referenced works implemented the simplest form of the
k-NN classifier, where k = 1. Indeed, they rarely mention
k-NN, denoting it as the
nearest-neighbor criterion classifier instead. One of the works that explicitly mentioned using
k-NN with a larger k was the one by Gürkan et al. [
176]. It is worth mentioning that not all works that used the
k-NN classifier were conceived to use the complete training set of feature vectors to obtain predictions. Some of the works used a subset of
prototype features, derived from the training set, thus achieving different advantages [
104,
136,
142].
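The dissimilarity-based scheme described above, in its simplest k = 1 form, fits in a few lines. The enrolled templates and labels below are toy values (loosely modeled on normalized PR/QRS/QT intervals), not data from any database; Euclidean distance stands in for whatever dissimilarity metric a given work uses.

```python
import numpy as np

# Sketch of 1-NN identification: the predicted identity is the label of
# the enrolled feature vector at minimum Euclidean distance.

def nearest_neighbor_identify(enrolled, labels, query):
    dists = np.linalg.norm(np.asarray(enrolled) - np.asarray(query), axis=1)
    return labels[int(np.argmin(dists))]

enrolled = [[0.18, 0.08, 0.45],    # subject A template
            [0.22, 0.10, 0.40],    # subject B template
            [0.15, 0.07, 0.50]]    # subject C template
labels = ["A", "B", "C"]
who = nearest_neighbor_identify(enrolled, labels, [0.21, 0.10, 0.41])
```

Replacing the enrolled set with a smaller set of prototype vectors, as some of the cited works do, changes only the `enrolled`/`labels` contents, not the decision rule.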
Using
Neural Network (NN) classifiers and/or feature extractors represents a relevant trend in this area. Several works have used NNs for ECG-based biometrics since the early days of the field. In the period between 2002 and 2019, we found 16 works that used them [
30,
48,
93,
99,
105,
111,
112,
115,
138,
145,
147,
155,
157,
166,
189,
190]. By inspecting
Table A1, it can be verified that the number of works that use NNs has recently increased significantly. As an example, since 2020, 30 out of the 44 works identified on ECG-based biometrics have used NNs, either alone or together with other types of classifiers and/or feature extractors [
118,
121,
122,
125,
127,
158,
171,
173,
174,
182,
191,
192,
193,
194,
195,
196,
197,
198,
199,
200,
201,
202,
203,
204,
205,
206,
207,
208,
209].
The works in the literature use different types of NNs.
Figure 3 shows the most commonly used ones: CNNs, DNNs, RNNs, SNNs, and LSTM-NNs.
CNNs are, by far, the most commonly used NNs in this area [
112,
115,
118,
121,
122,
127,
158,
173,
191,
193,
195,
196,
200,
202,
203,
204,
205,
206,
207,
209]. In some works, features are arranged into 2D matrices so that 2D CNNs can be used as classifiers. Occasionally, spectrogram images of ECGs have been used, as in the works by da Silva Luz et al. [
112], Ammour et al. [
203], Aleidan et al. [
210]. In other cases, plain 2D images of ECGs were used [
194,
204,
205], as well as Gramian angular fields [
206] or other engineered 2D features [
155,
192,
193].
DNNs have recently been incorporated into ECG-based biometrics, with a limited number of references to date [
125,
174,
191,
194].
Another interesting type of NN is the
SNN. SNNs have also been used in some works, such as those by Ibtehaz et al. [
158], Tirado-Martin and Sanchez-Reillo [
192], Prakash et al. [
204].
Since an ECG is a time-based signal, several works have used
LSTM-NNs as classifiers for ECG recognition, as in Prakash et al. [
118], Tirado-Martin and Sanchez-Reillo [
192], Jyotishi and Dandapat [
199].
RNNs have also been used, as in the works by Salloum and Kuo [
111], Kim et al. [
122]. Continuing with the consideration of the recognition of a time-based signal, the publication by Aslan and Choi [
208] uses
GINs.
Worth mentioning also are the works by Sepahvand and Abdali-Mohammadi [
182], Chee and Ramli [
197], which use plain fully connected NNs, attaining good performance metrics. Another interesting line of work uses
PNNs, as in the publications by Ghofrani and Bostani [
105], Li et al. [
198]. Finally, it is also worth noting that
Autoencoders have started to be used in this area, both for automated feature extraction and together with other types of classifiers [
117,
121,
138,
139].
In addition to the most common classification techniques described above, ECG recognition studies have used a wider variety of classifiers. Decision Tree (DT) classifiers were used by Aziz et al. [
168], Hwang et al. [
170], Bhuva and Kumar [
173]. The Random Forest (RF) classifier was used in the works by Fatimah et al. [
172], Carvalho and Brás [
211]. LDA and Quadratic Discriminant Analysis (QDA) classifiers have also been used by Zhang and Wei [
161] and Sarkar et al. [
153], respectively.
The evolution over time in the use of each classification technique is summarized with horizontal bars in
Figure 4. The leftmost end of each bar indicates the year of the earliest identified work using that classification technique, and the rightmost end indicates the date of the most recent study available so far. NN-based classification is treated as a single group, with works dating from 2002; within this group, the most relevant techniques are also identified below the group bar.
3. ECG Databases
Research involving ECGs requires the collection of actual recordings, and the most convenient environment in which to obtain them is the clinical one, where electrocardiographs are commonly used on a daily basis. Given this proximity, clinical research is naturally the primary beneficiary of the acquired ECG data. However, there are a number of reasons that favor the publication of sets of ECG records, also called ECG databases or ECG datasets, in the available literature: the replicability of original studies and the comparison of results obtained by applying different techniques to common databases are among the most relevant ones.
Table A2 shows a comprehensive compilation of existing ECG databases that have been mentioned in ECG-based biometric recognition studies. As shown in
Table 2, only three previous works have aimed to list available ECG databases, and none of them has done so comprehensively [
6,
20,
21]. In fact, the most recent survey presented a list of 45 databases, while our work identifies and lists 74. To be widely useful, a database should be easily accessible. In this sense, some of the presented databases are publicly available, others are available only upon request, and some were collected for specific studies and are not available to date. Some databases contain ECGs exclusively, while others store ECGs together with other signals. Databases containing different types of signals are usually intended for multivariable or fusion research, such as emotion recognition or polysomnography studies.
Table A2 has been ordered so that newer databases appear first. This highlights the fact that new databases are still being published and cited to date, which the authors interpret as a sign that there is no common agreement on a set of reference databases to use for study comparison and reproducibility in the area. Inspection of
Table A2 shows that the databases most frequently used in the literature are MIT-BIH (one or more of the ten available), PTB, and ECG-ID. In addition, only 9 of the identified databases contain records from more than 1000 subjects. This is consistent with the fact that although electrocardiograph devices are widely available today, both in clinical and research environments, collecting a large number of ECGs and compiling them into a database requires considerable effort.
The databases have been characterized in terms of the variables most relevant to evaluating the works that use them, as well as for comparison purposes: the ECG
sampling frequencies, the
recorded leads, the
time length of the recordings, and the
number of subjects (not to be confused with the number of available records in the database). The Analog-to-Digital (A/D) conversion of ECG recordings makes it possible to parameterize the signal capture fidelity in terms of quantization noise and captured frequencies. Current digitization devices use a minimum of 12 bits for the conversion of samples. Considering that some ECG waves have an amplitude of a few millivolts and that a dynamic range of at most 1 volt can be achieved (leaving aside the baseline wandering error, which can be removed), this yields a quantization step of approximately 244 µV (1 V/2¹²), which bounds the quantization error at half that value, enough for the dynamics of the signals under study and to identify low-amplitude waves such as P or T [
212]. Indeed, newer databases typically use 12, 15, or 16 bits for quantization.
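The quantization arithmetic above can be checked with a short helper; the 1 V full-scale range and the bit depths are the figures discussed in the text.

```python
def adc_step_uv(full_scale_v, bits):
    """Least-significant-bit (quantization step) size, in microvolts,
    of an ideal ADC with the given full-scale range and resolution.
    The maximum quantization error is half this step."""
    return full_scale_v / (2 ** bits) * 1e6

# 12 bits over a 1 V range -> ~244 µV per step; 16 bits -> ~15 µV per step
```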
Regarding the ECG sampling frequency, the established literature considers that, although a wide range of sampling rates is possible, 500 samples/s is the minimum rate that should be used [
4]. Lower sampling rates can be used, but this factor can decrease performance, as there is a loss of fine details in the recordings [
213]. If lower sampling rates are required, downsampling can be applied, as in the works by Gong et al. [
125], Censi et al. [
212].
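A minimal downsampling sketch follows, assuming a simple moving-average anti-alias filter before decimation; this is an illustration only, and production pipelines would use a proper low-pass design (e.g., scipy.signal.decimate) rather than the method of the cited works.

```python
def downsample(signal, factor):
    """Naive decimation: smooth with a moving average over `factor` samples
    to limit aliasing, then keep every `factor`-th sample
    (e.g., 500 Hz -> 250 Hz with factor=2)."""
    filtered = []
    for i in range(len(signal)):
        window = signal[max(0, i - factor + 1): i + 1]
        filtered.append(sum(window) / len(window))
    return filtered[::factor]
```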
It seems clear that using a higher number of leads (for example, a 12-lead configuration) can improve recognition metrics [
213]. However, single-lead configurations have also been shown to be valid for biometric recognition [
214]. The works referenced in
Table A1 use different configurations with single or multiple leads for recognition. Non-conventional leads, placed on the upper arm, behind the ears, on the wrist, or on the fingers, have also been tested, with promising results [
215,
216,
217]. Unfortunately, using more than one (or two) leads restricts the application of biometric recognition to clinical scenarios, where multi-lead configurations are common. In terms of acceptability,
off-the-person single-lead configurations are much simpler and are perceived as more acceptable in common scenarios compared with multi-lead configurations, also commonly known as
on-the-person configurations. For an in-depth taxonomy of ECG data acquisition systems, the authors refer the reader to the work by Plácido da Silva et al. [
218]. The current literature favors off-the-person techniques that enable better acceptance by individuals, as well as short setup times and simplicity, typically implementing
continuous verification (identity verification in wearable/portable devices) [
115,
156,
173,
219,
220,
221,
222,
223,
224].
Short-term ECG data (less than several minutes) are commonly used for clinical diagnostic purposes, as they cover a time span sufficient for the detection of most cardiac diseases. On the other hand, long-term ECG data are needed for the diagnosis of diseases that manifest intermittently in ECGs, such as paroxysmal ventricular fibrillation or atrial fibrillation. Regarding biometric recognition, it is commonly considered that shorter ECG segments lead to lower recognition metrics. In this sense, Ramos et al. [
225] recently identified 10 s as the minimum ECG segment length that contains enough information for biometric recognition. Determining the optimal ECG segment size is still a pending task, as it appears to depend on several factors, such as the quality of the acquired data, preprocessing, feature extraction, and classification techniques. While most available databases collect short-term and long-term ECG data for different diagnostic purposes, some of the newest databases offer ECG segments between 10 and 20 s long, such as PTB-XL, the Lobachevsky University Electrocardiography Database (LUDB), the Chapman University and Shaoxing People’s Hospital database, Motion Artifact Contaminated ECG, and the ECG-ID database. At the time of writing, it can be concluded that the lengths of the recorded segments in most available databases convey enough information for biometric recognition purposes.
One of the most relevant parameters for deciding which database to use for biometric purposes is the number of different subjects whose ECGs have been recorded. The literature on ECG-based biometric recognition uses readily available databases that were conceived mainly for clinical research. Those databases usually contain records from a number of different subjects typically in the tens or, at most, a few hundreds, as shown in
Table A1. Intuitively, recognition performance metrics should decrease as the number of different subjects increases; this has been validated experimentally with readily available databases in the work by Jekova et al. [
214]. Another important criterion to consider when selecting a database is its use in previous research. Choosing a database that has been previously used for biometric recognition experiments can be positive when comparisons of results are of interest.
4. Discussion
There exist many published studies that involve identification, verification, or both problems at the same time. Most of these studies claim to achieve excellent performance metrics. For identification purposes, the most common performance metric used is Accuracy (ACC). Early studies sometimes express identification performance in terms of other ratios, and newer studies sometimes mention metrics that can quantify the effect of training imbalance, such as precision, recall, or F-score. On the other hand, all studies that treat identity verification use the Equal Error Rate (EER) metric.
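The EER mentioned above can be approximated from genuine and impostor similarity scores by sweeping a decision threshold and locating where the False Rejection Rate and False Acceptance Rate cross. This is a minimal sketch of the standard definition, not the exact evaluation procedure of any cited study.

```python
def equal_error_rate(genuine, impostor):
    """Approximate the EER: for each candidate threshold over the observed
    similarity scores, FRR is the fraction of genuine scores below it and
    FAR the fraction of impostor scores at or above it; return the rate
    at the threshold where the two are closest."""
    best_gap, eer = float("inf"), None
    for t in sorted(set(genuine + impostor)):
        frr = sum(s < t for s in genuine) / len(genuine)
        far = sum(s >= t for s in impostor) / len(impostor)
        if abs(far - frr) < best_gap:
            best_gap, eer = abs(far - frr), (far + frr) / 2
    return eer
```

A perfectly separable score distribution yields an EER of 0; overlapping distributions yield a positive rate.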
The authors consider it extremely relevant that the vast majority of studies use sets with only tens or a few hundreds of individuals in their experiments. Early works typically used a few tens of individuals, while more recent works show a clear trend toward numbers in the hundreds. Only a limited number of studies, especially newer ones, report metrics from experiments with numbers of individuals in the thousands. As the number of individuals increases, studies can evaluate how well their techniques behave in the presence of factors such as class separation or intra- and interclass variability. A contributing factor to the small number of individuals in most studies is the restricted access to extensive databases with records for many individuals. With a few exceptions, such as the PTB-XL database, until recently there were no publicly available databases with records from thousands of individuals. Another relevant factor preventing studies from experimenting with large numbers of individuals is the intended purpose of the recorded ECGs. To date, most databases with records from a high number of individuals exist for clinical purposes. Their records usually show pathologies, with the ECG signal sometimes behaving far from a Normal Sinus Rhythm (NSR), potentially resulting in a negative impact on recognition metrics. In this sense, ECG-based biometric recognition techniques should take into account that aside from specific cardiac conditions that require clinical intervention, many healthy subjects also present sporadic irregularities (such as ectopic beats) that recognition techniques should tolerate. It is also important to note that in a large number of studies, the ECG sets are selected from healthy individuals when possible, or from individuals who show specific homogeneous pathologies, such as arrhythmias.
From the study of the available papers and databases used, the authors consider that further work is needed to determine the effects of existing pathologies on recognition metrics.
With the increasing number of studies that use NN-based techniques, the composition of ECG databases, both in terms of number of individuals and length of records, becomes increasingly important. On the one hand, it can be shown from the metrics in
Table A1 that NN-based classifiers can obtain excellent recognition metrics with hundreds of individuals. Unfortunately, only a few studies use sets with thousands of individuals. On the other hand, the accepted principle that short records are sufficient for experimentation on ECG-based biometrics poses a challenge for training NN-based classifiers. This has resulted in a recent body of literature that uses data augmentation, DGMs, or GANs to cope with the short recordings available per individual and the need for enough examples to train the models.
To summarize the problem posed by the availability of databases, the work by Barros et al. [
226] is an example of a study that recognizes this fact: “
Existing works on ECG for user authentication do not consider a population size close to a real application”.
Assessing the characteristics of the databases used by studies over time, it is feasible to have an approximate idea of future needs and requirements and anticipate potential trends in future databases. The motivation to validate novel recognition methods with an increasing number of individuals closer to real applications will likely result in new databases with recordings of individuals in the thousands and even tens of thousands. The same motivation will probably result in the availability of databases with higher ratios of healthy versus pathological subjects to try to meet the expected ratios of real applications. This may even result in more initiatives to compile datasets in non-clinical environments. Another factor that may impact the configuration of future databases is the increasing number of studies that use NNs for biometric recognition, as shown in
Figure 4. This will require an increasing number of individuals and longer records for each one, to avoid the need for augmentation techniques to obtain an acceptable amount of training data. Finally, with the emergence of new applications that rely on signals obtained from one or two leads at most, the authors believe that an increasing number of new databases will include recordings obtained with configurations other than those used for medical purposes.
One of the most evident applications of biometric recognition is recognition in clinical environments (also known as
patient identification), where acceptability does not represent an issue, as ECG monitoring is commonly used [
227]. Another interesting application domain with high potential for development is continuous identity verification (also known as
continuous authentication). This type of application seems to be gaining relevance, together with ambient intelligence and wearables [
110,
111,
112,
113,
114,
115,
116,
117,
118,
119,
173]. A tangible example of a continuous identity verification scenario was presented in the work by Lourenço et al. [
215], where ECGs were recorded by electrodes placed on the surface of a steering wheel. Another scenario was shown by Kim et al. [
217]. In this study, multimodal biometrics were used, capturing photoplethysmographic signals and ECGs by means of a wristband.
Another relevant fact arising from the available literature is that most works adopt a batch-based approach to the problem. The work by Wang et al. [
183] is the only one to apply Online ML techniques. Given the state of the art, as well as the available databases, the authors consider that the validation of the ECG as a biometric trait for a realistic number of individuals is still a pending issue. A further challenge for practical applications will be the need to maintain ECG-based biometric systems with a very large number of enrolled individuals and the need to enroll and disenroll them. This represents an important issue if batch-based techniques are used, possibly forcing researchers to use alternative approaches, such as online techniques.
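As a minimal illustration of the online alternative, a subject template can be maintained as a running mean that is updated one feature vector at a time, so enrolling or refreshing a subject never requires batch retraining on all past data. This is a hypothetical scheme for illustration, not the method of Wang et al.

```python
def update_template(template, count, new_feature_vector):
    """Incremental (online) update of a subject's stored template:
    fold one new feature vector into the running mean in O(d) time,
    with no access to previously seen vectors."""
    count += 1
    template = [t + (x - t) / count for t, x in zip(template, new_feature_vector)]
    return template, count
```

Disenrollment under such a scheme amounts to deleting the subject's template, with no effect on other enrolled identities.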
Finally, there are no specific works that focus on the impact on performance of the variation in ECGs over time (
permanence). As a physiological trait, the ECG will change when the physical characteristics that produce it are modified. Some studies try to account for this short-term factor by using different recordings of the same individuals for training and evaluation. Again, this type of study cannot be conducted until databases with long-term records of the same individuals exist. Moreover, to study the effect of aging, the recording intervals should span years. For a more in-depth view of current work on aging in the ECG, the authors refer the reader to the survey by Ansari et al. [
228].
The authors have identified two basic scenarios for which the surveyed ECG databases were originally intended. Historically, the most natural scenario has been the
clinical environment. Common applications here were clinical decision support, research on multiple cardiac conditions (arrhythmia, drug intake, etc.), development of automatic ECG analysis methods, and long-term longitudinal studies. Most databases referenced in the studies were originally intended for use in this scenario. The databases most used in studies in this scenario are the freely available ones: the MIT-BIH group of PhysioNet databases [
16,
229,
230,
231,
232,
233,
234,
235], the PTB Diagnostic ECG Database [
33], and the QT database [
236]. Besides the referenced ones, it is worth mentioning the ECG-ViEW II database [
237], not used for ECG biometric recognition to date but relevant for research related to permanence. Within the clinical scenario, patient identification has emerged as a new application, as explained above.
The focus of this survey is the second scenario:
general biometric recognition using ECGs. Only a few databases originally included biometric recognition among their goals: the Check Your Biosignals Here initiative (CYBHi) database [
238], University of Toronto ECG Database (UofTDB) [
239], ECG-ID database [
240], Chosun University CU-ECG [
171,
208], or Heartprint database [
241]. In any case, various databases not originally designed for biometric applications have also been frequently used for applications in this scenario, such as the PTB Diagnostic ECG Database [
33], the MIT-BIH Arrhythmia Database [
229], or the ECG-ID Database [
240].
5. Conclusions
ECG-based biometric recognition is still a very active area of research, despite studies in the area dating back more than two decades. To date, no other works fully cover both the studies performed on the subject and the ECG databases used in the area. This survey provides a comprehensive overview of both. It enables readers to obtain an understanding of the state of the art, identify the types of recognition techniques applied over time, search for specific studies using different criteria, and become familiar with the available ECG databases in order to choose among them for the evaluation of novel techniques.
The available studies claim to achieve excellent performance metrics in most cases. Unfortunately, the vast majority of these studies use ECG databases with population sizes much smaller than those expected in real applications, thus limiting the validation of their techniques. This survey shows that one factor causing this problem has been the restricted access to databases with a large number of records from many individuals until recently. Another cause is the fact that most of the available databases containing a large number of individuals exist only for clinical purposes.
Another relevant conclusion arising from this survey is related to the evolution of recognition techniques. Although many different methods have been used for ECG-based recognition, most recent studies show a clear tendency to use NNs and, more specifically, CNNs, DNNs, or GINs. However, a thorough comparison of the performance of the shallow and DL classification methods used, across different databases and application scenarios, is still lacking in the literature.
Finally, note that several open research challenges remain in the area. The nature of an actual biometric system, which involves the enrollment and disenrollment of large numbers of individuals, seems better suited to online techniques than to batch approaches; unfortunately, only marginal work has been published in this area to date. Another important research direction is the study of the impact on recognition metrics of the variation in ECGs over time caused by aging and the emergence of pathologies; studies on this subject are also limited in number.