Sleep Apnea Classification Using the Mean Euler–Poincaré Characteristic and AI Techniques

Ramos-Martinez, Moises; Sorcia-Vázquez, Felipe D. J.; Ortiz-Torres, Gerardo; Martínez García, Mario; Mena-Enriquez, Mayra G.; Sarmiento-Bustos, Estela; Mixteco-Sánchez, Juan Carlos; Rentería-Vargas, Erasmo Misael; Valdez-Resendiz, Jesús E.; Rumbo-Morales, Jesse Yoe

doi:10.3390/a17110527


by Moises Ramos-Martinez ¹, Felipe D. J. Sorcia-Vázquez ¹, Gerardo Ortiz-Torres ¹, Mario Martínez García ¹, Mayra G. Mena-Enriquez ², Estela Sarmiento-Bustos ³, Juan Carlos Mixteco-Sánchez ⁴, Erasmo Misael Rentería-Vargas ¹, Jesús E. Valdez-Resendiz ⁵,* and Jesse Yoe Rumbo-Morales ¹

¹ Departamento de Ciencias Computacionales e Ingenierías, Universidad de Guadalajara, Carretera Guadalajara-Ameca Km.45.5, Ameca 46600, Jalisco, Mexico

² Biomedical Sciences Department, University of Guadalajara, Tonalá 45425, Jalisco, Mexico

³ División Académica de Mecánica Industrial, Universidad Tecnológica Emiliano Zapata del Estado de Morelos, Av. Universidad Tecnológica No. 1, Col. Palo Escrito, Emiliano Zapata 62760, Morelos, Mexico

⁴ Natural and Exact Sciences Department, University of Guadalajara, Ameca 46600, Jalisco, Mexico

⁵ School of Engineering and Sciences, Tecnologico de Monterrey, Monterrey 64700, Nuevo Leon, Mexico

* Author to whom correspondence should be addressed.

Algorithms 2024, 17(11), 527; https://doi.org/10.3390/a17110527

Submission received: 10 October 2024 / Revised: 8 November 2024 / Accepted: 13 November 2024 / Published: 15 November 2024

(This article belongs to the Special Issue Artificial Intelligence Algorithms for Medicine (2nd Edition))


Abstract

Sleep apnea is a sleep disorder that disrupts breathing during sleep. This study aims to classify sleep apnea using a machine learning approach and an Euler–Poincaré characteristic (EPC) model derived from electrocardiogram (ECG) signals. An ensemble K-nearest neighbors classifier and a feedforward neural network were implemented using the EPC model as inputs. ECG signals were preprocessed with a polynomial-based scheme to reduce noise, and the processed signals were transformed into a non-Gaussian physiological random field (NGPRF) for EPC model extraction from excursion sets. The classifiers were then applied to the EPC model inputs. Using the Apnea-ECG dataset, the proposed method achieved an accuracy of 98.5%, sensitivity of 94.5%, and specificity of 100%. Combining machine learning methods and geometrical features can effectively diagnose sleep apnea from single-lead ECG signals. The EPC model enhances clinical decision-making for evaluating this disease.

Keywords:

sleep apnea; Euler characteristic; machine learning; random field

1. Introduction

Sleep apnea is a sleep disorder in which breathing decreases for periods lasting from a few seconds to minutes. Many cases of obstructive sleep apnea (OSA) remain undiagnosed. The prevalence is higher in men than in women, and nearly one billion people have OSA worldwide [1]. OSA affects people with cardiovascular disease (arrhythmia, heart failure, coronary syndrome) and increases sympathetic activity that compromises the heart [2,3,4]. OSA also affects people with comorbid hypertension [5].

OSA can be diagnosed using polysomnography; this method examines sleep and respiration parameters using electroencephalograms (EEGs), electrocardiograms (ECGs), and other measures, e.g., pulse oximetry ($S_{p}O_{2}$), airflow measurement, and so on [6,7]. This process is complex and time-consuming because it requires a complete test in a controlled environment to monitor the patient’s sleep and other signals. It is also extremely expensive and therefore infeasible for a large population. As an alternative, methods based on single-lead ECG signals and machine learning techniques have been developed to diagnose OSA.

Studies, including [8,9,10], have demonstrated the capability of ECG signals to identify respiratory events. Furthermore, research results such as in [11,12], indicate that heart rate variability (HRV) signals can be used to measure and detect apnea events. Atrial fibrillation (AF) is widely related to OSA problems, as mentioned in [13,14], where the authors conducted questionnaires and home sleep studies to check this relation.

Researchers have explored two approaches to detect sleep apnea events in the literature: feature engineering-based techniques and deep learning-based methods. In the first category, the ECG signal is analyzed, and different features are extracted from the P-wave (atrial depolarization), QRS complex (ventricular depolarization), and T-wave (ventricular repolarization), which reflect different cardiac cycle phases. For example, Babaeizadeh et al. presented a cost-effective, noninvasive algorithm for detecting sleep apnea using single-lead ECG. By analyzing heart rate variability and QRS complexes, the algorithm classifies apnea episodes with 84.7% accuracy, demonstrating high potential for sleep medicine applications [15]. Yilmaz explored the utility of using a single-lead ECG to determine sleep stages and detect OSA across 30-s epochs throughout the night, potentially replacing more complex polysomnography systems. Data from 18 subjects, including 10 diagnosed with OSA, revealed promising results: support vector machines (SVMs) and quadratic discriminant analysis (QDA) achieved over 82% accuracy in classifying sleep stages and around 88% in detecting OSA. Three features were extracted from the RR-interval (the time interval between two successive R-peaks in an ECG), resulting in high classification performance. This suggests that single-lead ECG, combined with SVM or QDA, could effectively monitor sleep and OSA. However, integrating additional physiological features could further enhance accuracy [16]. Zarei et al. described an algorithm that detects sleep apnea using ECG features and a combined convolutional neural network (CNN) and long short-term memory (LSTM) recurrent network [17]. Using the same combination of networks, Bahrami and Forouzanfar [18] reported accuracy, sensitivity, and specificity of 88.13%, 84.26%, and 92.27%, respectively.

Feature extraction using machine learning (ML) methods was employed to detect OSA events. In the study by Pant et al. [19], wavelet transform was used to extract features from the sub-bands and classify them using various ML methods. In the work of Yang et al., the HRV analysis and ECG-derived respiration (EDR) were used to extract complementary information using a residual group network [20]; their results had an accuracy of 90.3%, a sensitivity of 87.6%, and a specificity of 91.9%. Liu et al. demonstrated a CNN that learned features from the ECG, reaching a sensitivity of 88.2% in classification [21].

Instead of relying on an overall diagnostic analysis of the entire biomedical signal, certain studies focus on determining if a patient is experiencing apnea by analyzing specific indicators. One approach measures the variation in oxygen saturation over a short interval (12 s) [22,23]; other indicators include non-linear methods mentioned in [24]. Maier et al. used Holter-ECG to screen for sleep-disordered breathing (SDB) by comparing it with nocturnal polygraphy (PG) in 50 cardiology patients. The ECG estimates demonstrated strong agreement with PG, exhibiting high specificity (0.96) and good sensitivity (0.77) for detecting significant SDB, promising an effective, cost-efficient method for SDB screening [25].

Some studies have adopted an approach of using fragments of the signal, such as analyzing it minute by minute [26] or using windows of varying sizes [27], to identify occurrences of apnea instead of making an overall diagnosis of OSA, which is generally regarded as the responsibility of physicians.

Various biomedical signals are utilized for the classification and diagnosis of OSA, for instance, the photoplethysmography signal [28], the $S_{p}O_{2}$ signal [24,29,30], the oronasal airflow signal [31], and the ECG signal [23,26]. These works use techniques that involve RR variability [26,32,33], frequency-domain features obtained with the wavelet transform [29,34], and machine learning techniques such as neural networks [17,29,34,35,36].

Artificial intelligence models currently play a huge role in medicine, especially artificial neural networks, which have been used for the classification and detection of diseases such as diabetes [37], heart diseases [38], chronic kidney disease [39], and in oncological diagnostics [40]. Due to the efficiency and utility of these tools, we use neural networks in this work.

Our study makes significant contributions to the field of ECG analysis by introducing an innovative algorithm that integrates geometric models as features and inputs for AI models, specifically designed to classify and detect sleep apnea and heart diseases. This algorithm leverages geometric features that closely match the behavior of the ECG, enhancing the accuracy of classifications. We ensure relevance and continuity in ECG research by addressing the challenges posed by limited and imbalanced datasets and aligning our work with historical benchmarks established in the 2000 challenge [41] using the Apnea-ECG database [42]. Overall, this comprehensive approach not only deepens our understanding of ECG signal behavior but also lays the groundwork for future studies involving larger and more balanced datasets, ultimately improving diagnostic capabilities and patient outcomes.

Existing OSA detection methods often rely on handcrafted features and struggle to achieve high accuracy with single-lead ECG signals. To address these limitations, this article introduces a novel algorithm that uses random algebraic polynomials (RAP) for signal smoothing and creates a non-Gaussian physiological random field (NGPRF) to capture intricate, non-Gaussian ECG features. This innovative framework models critical ECG peaks (P-waves, QRS-complex, T-waves) and extracts their geometrical characteristics as Euler–Poincaré characteristics (EPCs). By capturing unique geometrical forms in the ECG signal, the EPCs provide robust inputs to our classification system. Unlike conventional approaches, we integrate these advanced geometric descriptors with machine learning classifiers, including ensemble K-nearest neighbors (EKNN) and feedforward multi-layer neural networks (FNNs), to significantly improve OSA detection accuracy. This novel combination of RAP-based signal processing and advanced classification methods bridges existing gaps, making single-lead ECG-based OSA detection more accurate and potentially adaptable for real-time and diverse clinical applications.

This article is organized as follows: In Section 2, we describe the mathematics involved. In Section 3, we describe the data used. In Section 4, we discuss the methodology applied to the cases. Then, in Section 5, we present the results and discussion. Finally, a conclusion and future works are provided in Section 6.

2. Mathematical Framework

This section summarizes the mathematical background needed to follow the main steps of the methodology described in [43].

2.1. Polynomization

The first step of the method involves pre-processing to smooth the signal. In this step, we model the ECG signals using RAP to reduce noise and construct the NGPRF. The RAP is denoted by $φ(X, ξ)$ for each ECG cycle $j = 1, \dots, N$. The elemental waves in the ECG are the P-waves, the QRS complex, and the T-waves. Each cycle is divided at a threshold $R(j)$, resulting in two intervals, $I_{1} = [1, R(j)]$ and $I_{2} = [R(j), E(j)]$, with $E(j)$ representing the final index in each cycle. Additionally, we define the RAP of order $D$ representing the P-waves as follows:

$$φ_{B}(S, \bar{Ω}, X(j)) = \sum_{k=0}^{D} a_{k}(j, \bar{Ω}) X(j)^{k}, \tag{1}$$

where $S = (Ω, F, P)$ represents a complete probability space, $Ω$ denotes the sample space, $F$ denotes the $σ$-algebra of events, and $P$ denotes the probability measure on $F$. Moreover, $a_{k}(j, \bar{ω})$ denotes a sequence of independent and identically distributed random variables for each cycle. The selected interval includes both the R-peak and the P-wave, represented as $X(j) = [1, R(j)]$, where $R(j)$ indicates the index of the R-peak.

2.2. Building Non-Gaussian Physiological Random Field

This framework defines a two-dimensional NGPRF, $Φ(X, ξ)$, within a complete probability space $S$. It models physiological signals across cardiac cycles through the mapping $θ : ξ \to R^{D}$. Here, $X \in D \subset R^{2}$, and $θ$ denotes a two-dimensional NGPRF defined over $N$ cardiac cycles as follows:

$$Φ(X_{1}, X_{2}, \bar{ξ}) = \sum_{j=1}^{N} φ_{B}(S, \bar{ξ}, X(j)) \, δ(j - X_{2}) \tag{2}$$

where $φ_{B}(S, \bar{ξ}, X(j))$ is the random polynomial that represents the ECG behavior and $δ$ is the Dirac delta distribution. According to Equation (2), the polynomials $φ_{B}(S, \bar{ξ}, X(j))$ are used to decompose each ECG signal at a fixed point into an NGPRF along the coordinate $X_{2}$.

2.3. Excursion Set

In random field theory, the excursion set is a central concept. Consider $θ(X, ξ)$ as an NGPRF, where $X \in D \subset R^{2}$ is defined within a set $D$. The excursion set is a geometric entity defined as the collection of points where the field exceeds a certain threshold value. Specifically, for a given threshold $λ$, the excursion set is given by the following:

$$A_{λ}(θ, D) = \{X \in D : θ(X, ξ) \geq λ\} \tag{3}$$

The set $A_{λ}$ of $θ(X, ξ)$ above a threshold $λ$ consists of the points in $D \subset R^{2}$ where $θ(X, ξ)$ is at least $λ$. A stratified manifold $D \subset R^{2}$ can be partitioned into a disjoint union of $k$-dimensional manifolds, expressed as $D = \cup_{k=0}^{\dim D} \partial_{k} D$, where each stratum $\partial_{k} D$ represents several $k$-dimensional manifolds. The non-Gaussian properties of $θ$ necessitate the consideration of generalized random fields, expressed as follows:

$$θ(X, ξ) = F(θ_{*}(X, ξ)) = F(θ_{*}^{1}(X, ξ), \dots, θ_{*}^{m}(X, ξ)) \tag{4}$$

where $θ_{*}(X, ξ)$ represents a collection of independent and identically distributed (i.i.d.) Gaussian random fields (GRFs), each defined over a topological space $D$. The function $F : R^{m} \to R$ is a smooth, piecewise twice continuously differentiable function with appropriate boundary conditions. The excursion set of a real-valued non-Gaussian field $θ = F \circ θ_{*}$ above a threshold $λ$ is equivalent to the excursion set for a vector-valued Gaussian field $θ_{*}$ in $F^{-1}[λ, \infty)$, and is given by the following:

$$A_{λ}(θ, D) = A_{λ}(F \circ θ_{*}, D) = \{X \in D : (F \circ θ_{*})(X, ξ) \geq λ\} \tag{5}$$

In our context, let $S_{S} = (s_{1}, \dots, s_{S^{M}})$ denote the set of signatures, where each $s_{k}$ represents a signed unit weight associated with the sub-Gaussian random field $θ_{s_{k}}(X, ξ)$. Furthermore, we define $S_{M} = \sum_{k=1}^{M} (-1)^{k} s_{k}$. We characterize the NGPRF, defined by Equation (2), as follows:

$$θ(X, ξ, S_{M}) = (s_{1} θ_{s_{1}}(X, ξ), \dots, s_{S_{M}} θ_{s_{S_{M}}}(X, ξ)), \tag{6}$$

where $s_{S_{M}} θ_{s_{S_{M}}}(X, ξ)$ represents a set of i.i.d.-signed GRFs with $X \in R^{2}$, $ξ \in Ω$, and $s_{2n} = +1$, $s_{2n+1} = -1$.

2.4. Euler–Poincaré Characteristic

Let $φ(A_{λ}(θ, D))$ represent the EPC of the set $A_{λ}(θ, D)$. For the expected value $E\{φ(A_{λ}(θ, D))\}$ over multiple trials to be computable, random field theory imposes specific regularity conditions on $θ(X, ξ)$. These conditions ensure that the realizations of $θ$ are smooth and that the boundary $\partial D$ is smooth as well. Thus, $D$ is regarded as a regular $C^{2}$ domain within a compact subset of $R^{2}$, bounded by a smooth, one-dimensional $C^{2}$ manifold $\partial D$.

Consider $θ(X, ξ)$, where $X = [χ_{1}, χ_{2}] \in R^{2}$, as a stationary, non-isotropic random field. Define ${\dot{θ}}_{i}(X, ξ) = \frac{\partial θ(X, ξ)}{\partial χ_{i}}$ and ${\ddot{θ}}_{ij}(X, ξ) = \frac{\partial^{2} θ(X, ξ)}{\partial χ_{i} \partial χ_{j}}$, where $i, j = 1, 2$.

The moduli of continuity of ${\dot{θ}}_{j}(X, ξ)$ and ${\ddot{θ}}_{jk}(X, ξ)$ inside $D$ are given by the following:

$$ξ_{j}(h) = \sup_{\| X - Y \| < h, \, ξ} |{\dot{θ}}_{j}(X, ξ) - {\dot{θ}}_{j}(Y, ξ)|,$$

and

$$ξ_{jk}(h) = \sup_{\| X - Y \| < h, \, ξ} |{\ddot{θ}}_{jk}(X, ξ) - {\ddot{θ}}_{jk}(Y, ξ)|,$$

respectively.

To ensure that the realizations of $θ(X, ξ)$ are sufficiently smooth, the following conditions must be met:

C1: The maximum of $ξ_{j}(h)$ and $ξ_{jk}(h)$ should be asymptotically negligible, specifically $o(h^{N})$, as $h \to 0$.
C2: The Hessian $\ddot{θ}$, with entries ${\ddot{θ}}_{jk}$, must have a finite variance conditional on $θ$, where $\dot{θ}$ is the gradient with entries ${\dot{θ}}_{j}$.
C3: The densities of $θ$ and $\dot{θ}$ must be uniformly bounded for all $X \in D$.

At a boundary point $X$ on $\partial D$:

${\dot{θ}}_{⊥}$ is the gradient of $θ$ in the direction perpendicular to $\partial D$ .
${\dot{θ}}_{D}$ is the gradient in the tangent direction within the plane of $\partial D$ .
${\ddot{θ}}_{D}$ denotes the $1 \times 1$ -Hessian matrix in the tangent plane.
r is the $1 \times 1$ matrix representing the internal curvature of $\partial D$ .

Using the sign function, denoted as $\mathrm{sign}$, and following Knuth’s notation [44], which assigns a value of 1 to a true expression in parentheses and 0 to a false one, the EPC is then given by the following:

$$\begin{aligned} Ψ(A_{λ}(θ, D)) ={} & \sum_{X \in D} (θ \geq λ)(\dot{θ} = 0) \, \mathrm{sign}[\det(-\ddot{θ})] \\ & + \sum_{X \in \partial D} (θ \geq λ)({\dot{θ}}_{D} = 0)({\dot{θ}}_{⊥} \leq 0) \, \mathrm{sign}[\det(-{\ddot{θ}}_{D} - {\dot{θ}}_{⊥} r)], \end{aligned} \tag{7}$$

with a probability of one. In $R^{2}$, integral geometry characterizes $φ(A_{λ}(θ, D))$ as the difference between the number of connected components and the number of voids in $A_{λ}(θ, D)$. Moreover, the expected value of $φ(A_{λ}(θ, D))$ over multiple realizations is as follows:

$$\begin{aligned} E\{Ψ(A_{λ}(θ, D))\} ={} & \int_{D} E\{(θ \geq λ) \det(-\ddot{θ}) \mid \dot{θ} = 0\} \, β(0) \, dX \\ & + \int_{\partial D} E\{(θ \geq λ)({\dot{θ}}_{⊥} < 0) \det(-{\ddot{θ}}_{D} - {\dot{θ}}_{⊥} r) \mid {\dot{θ}}_{D} = 0\} \, β_{D}(0) \, dX, \end{aligned} \tag{8}$$

where $β(\cdot)$ is the density of $\dot{θ}$ and $β_{D}(\cdot)$ is the density of ${\dot{θ}}_{D}$. It is shown that, under the condition that the boundary of $D$ consists of a finite number of piecewise filtered segments, the expected EPC of the excursion set of a mean-zero, unit-variance random field is given by the following:

$$E\{Ψ(A_{λ}(θ, D))\} = \sum_{j=0}^{N} ρ_{j}(λ) L_{j}(D) \tag{9}$$

where $ρ_{j}(λ) = (2π)^{-(j+1)/2} H_{j-1}(λ) \exp(-λ^{2}/2)$ is the intensity of the EPC per unit volume, $H_{j}$ denotes the $j$-th Hermite polynomial, and $L_{j}(D)$ represents the Lipschitz–Killing curvatures.
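As a rough numerical illustration of Equation (9), the sketch below evaluates the densities $ρ_{j}(λ)$ and the weighted sum for a Gaussian field. It makes several assumptions that the text does not state: $H_{j}$ is taken to be the probabilists' Hermite polynomial, the $j = 0$ term is taken to be the Gaussian upper-tail probability (a common convention), and the Lipschitz–Killing curvatures passed to `expected_epc` are hypothetical inputs. The function names are illustrative only.

```python
import math

import numpy as np
from numpy.polynomial.hermite_e import hermeval


def ec_density(j: int, lam: float) -> float:
    """EPC density rho_j(lambda) for a mean-zero, unit-variance Gaussian field."""
    if j == 0:
        # Assumed convention for j = 0: P(Z >= lambda) for a standard normal Z.
        return 0.5 * math.erfc(lam / math.sqrt(2.0))
    coeffs = np.zeros(j)
    coeffs[j - 1] = 1.0  # selects the probabilists' Hermite polynomial He_{j-1}
    hermite = float(hermeval(lam, coeffs))
    return (2.0 * math.pi) ** (-(j + 1) / 2.0) * hermite * math.exp(-lam ** 2 / 2.0)


def expected_epc(lam: float, lk_curvatures) -> float:
    """Sum over j of rho_j(lambda) * L_j(D), as in Equation (9)."""
    return sum(ec_density(j, lam) * lk for j, lk in enumerate(lk_curvatures))


if __name__ == "__main__":
    # Hypothetical curvatures L_0, L_1, L_2 for a two-dimensional domain.
    for lam in (-1.0, 0.0, 1.0):
        print(lam, expected_epc(lam, [1.0, 20.0, 100.0]))
```

The sketch covers the Gaussian case only; the non-Gaussian field of Equation (4) would require densities that depend on the function $F$.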

3. Data Used

The Apnea-ECG database [42] was used in this study. It contains 70 records, divided into two groups: a training set of 35 records, including 20 severe cases, 5 mild cases, and 10 non-apnea cases, and a validation set consisting of the remaining 35 records. All ECG signals in the dataset are annotated with QRS complexes. The data are derived from a population aged between 27 and 60 years, with all records digitized at 100 samples per second. The length of the records ranges from 7 to 10 h.

4. Methodology

Figure 1 represents the proposed methodology used for classifying sleep apnea. The methodology consists of the following five steps:

Pre-processing: As part of this step, the ECG signals were filtered using random polynomials. This process is referred to as polynomization.
Random field transformation: The filtered signal was used to build the NGPRF, collapsing c number of cycles into a geometric structure.
Geometrical property: Each NGPRF was sectioned at several levels, denoted by $λ$ , known as excursion sets.
Feature extraction: For each excursion set, a value was calculated that represented the difference between the number of connected components and the number of holes, capturing key aspects of the set’s geometric structure. This feature is referred to as the Euler–Poincaré characteristic.
Classifier selection: An EKNN model is proposed to classify OSA from the EPC models. The training dataset was used for learning, and the test dataset was used to validate the model. Also, a feedforward multi-layer neural network is proposed for the binary classification.

4.1. Pre-Processing

The pre-processing stage, referred to as ‘polynomization’, proceeds as follows. First, each ECG cycle was divided into two intervals. Because our focus was on the part containing the P-waves and R-peaks, only one interval was selected. The function for this process is presented in Equation (1).

We identified the R-peaks using the Pan–Tompkins algorithm [45]. Afterward, we split the ECG from R-peak to R-peak using the middle points, $m_{k}$, and a fixed threshold. Each ECG cycle preserved the main waves, i.e., P, Q, R, S, and T. Every cycle $N_{j}$ was segmented into two sections, $P_{1}(C) = [m_{k}, R_{k+1}]$ and $P_{2}(C) = [R_{k+1}, m_{k+1}]$. The result of this process is a set of polynomials of the appropriate order $D$. In this work, we only used the $P_{1}(C)$ polynomials.
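The segmentation into $P_{1}(C)$ and $P_{2}(C)$ can be sketched as follows. This is a minimal illustration, assuming the R-peak indices are already available (for example from a Pan–Tompkins detector, which is not implemented here) and ignoring the fixed threshold mentioned in the text. The function name `split_cycles` is hypothetical.

```python
import numpy as np


def split_cycles(ecg, r_peaks):
    """Split an ECG into P1 = [m_k, R_{k+1}] and P2 = [R_{k+1}, m_{k+1}] sections.

    ecg      : 1-D array of samples.
    r_peaks  : increasing sample indices of the detected R-peaks (assumed given).
    """
    ecg = np.asarray(ecg, dtype=float)
    r = np.asarray(r_peaks, dtype=int)
    mid = (r[:-1] + r[1:]) // 2  # m_k lies halfway between R_k and R_{k+1}
    p1, p2 = [], []
    for k in range(len(mid) - 1):
        p1.append(ecg[mid[k]: r[k + 1] + 1])       # contains the P-wave and R-peak
        p2.append(ecg[r[k + 1]: mid[k + 1] + 1])   # contains the R-peak and T-wave
    return p1, p2


if __name__ == "__main__":
    signal = np.arange(100)
    p1, p2 = split_cycles(signal, [10, 30, 50, 70])
    print(len(p1), p1[0][0], p1[0][-1], p2[0][-1])
```

Both sections include the R-peak sample, which mirrors the closed intervals used in the definitions of $P_{1}(C)$ and $P_{2}(C)$.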

We used the FIT criterion, based on the normalized root mean square error (NRMSE) and given in Equation (10), to identify the order of the random polynomials that best approximated the data.

$$FIT = 100 \left(1 - \frac{\| P_{1}(C) - \sum_{k=0}^{D_{1}(C)} {\hat{a}}_{k}(k, \bar{ω}) X_{j}(k)^{k} \|}{\| P_{1}(C) - {\bar{P}}_{1}(C) \|}\right) \tag{10}$$

where $P_{1}(C)$ represents the section of the ECG containing the P-waves and R-peaks, ${\bar{P}}_{1}(C)$ is the mean of each interval $P_{1}$, and the inverse of the Vandermonde matrix is used to calculate the vector of estimated coefficients ${\hat{a}}_{k}$ [46]. Our goal is to find the maximum FIT. To do this, we evaluated a group of samples (100,000) from each of the 35 patients in the training set using polynomial orders ranging from 15 to 25 ($D_{1}(C) = 15, \dots, 25$). To select the order of the polynomials for the sections of the ECG from the R-peaks to the T-waves, we used Equation (10) with $P_{2}(C)$ in place of $P_{1}(C)$. We also used the integral absolute error (IAE) criterion. The results for $P_{1}(C)$ and $P_{2}(C)$ under both criteria are shown in Figure 2a,b.
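The order selection based on Equation (10) can be illustrated with a short sketch. It is not the authors' implementation: the coefficients are estimated here by least squares with `numpy.polynomial.Polynomial.fit` (which rescales the abscissa internally) rather than by explicitly inverting a Vandermonde matrix, and the function names are hypothetical. A constant segment would make the denominator zero and is not handled.

```python
import numpy as np
from numpy.polynomial import Polynomial


def fit_percent(segment, order):
    """FIT criterion of Equation (10) for one ECG section and one polynomial order."""
    y = np.asarray(segment, dtype=float)
    x = np.arange(y.size, dtype=float)
    poly = Polynomial.fit(x, y, deg=order)    # least-squares polynomial of given order
    residual = np.linalg.norm(y - poly(x))    # numerator of Equation (10)
    spread = np.linalg.norm(y - y.mean())     # denominator of Equation (10)
    return 100.0 * (1.0 - residual / spread)


def best_order(segment, orders=range(15, 26)):
    """Return the order with the highest FIT and the FIT value for every order."""
    scores = {d: fit_percent(segment, d) for d in orders}
    return max(scores, key=scores.get), scores


if __name__ == "__main__":
    t = np.linspace(0.0, 2.0 * np.pi, 200)
    section = np.sin(t) + 0.3 * np.sin(5.0 * t)  # synthetic stand-in for an ECG section
    order, scores = best_order(section)
    print(order, round(scores[order], 2))
```

A FIT of 100 corresponds to a perfect reconstruction, while a FIT of 0 means the polynomial does no better than the mean of the section.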

The optimal polynomial order is 15 according to the FIT criterion and 21 based on the IAE criterion. It is later shown that the order of the polynomials significantly influences the final results, particularly during the classification process. Algorithm 1 shows the entire procedure used for selecting the orders and obtaining the polynomials $P_{1}(C)$ and $P_{2}(C)$. However, we only use the P-wave polynomials $P_{1}(C)$ in the following steps.

Algorithm 1 Algorithm for pre-processing the ECG signals

Input: ECG
Output: Polynomials $P_{1}(C)$ and/or $P_{2}(C)$
Initialization;
1: Search the locations $M_{j}$, $j = 1, \dots, N_{c}$;
2: Locate the middle points $m_{j}$;
3: Find the best order for $P_{1}(C)$ and $P_{2}(C)$ using Equation (10);
4: Estimate the polynomials $I_{1}(c)$, $c \in [m_{j}, M_{j+1}]$, and $I_{2}(c)$, $c \in [M_{j+1}, m_{j+1}]$;
5: Assess the results;

4.2. Transforming into a Random Field and Geometrical Approach

Once the ECG signal is filtered using the polynomization method, we combine the results of each cardiac cycle to form the random field (see Figure 3). The NGPRF can be expressed as Equation (2).

The excursion set, defined in Equation (3), utilizes a threshold level $λ$ taken from the range [−1, 1]. This range is used because we normalize the signal in the pre-processing stage.

To ensure homogeneity across the random fields, any empty spaces are filled with zeros. Figure 4 and Figure 5 show the behaviors of the excursion set at threshold level $λ = 0$, where we note the R-peaks and the P-waves of each type of patient. We developed Algorithm 2, which measures the length of the vectors containing the P-waves and R-peaks to build the non-Gaussian random field. The algorithm first determines the maximum length of the cycles. If the length of any cycle is less than this maximum, the algorithm fills the empty spaces with zeros to ensure homogeneity.

Algorithm 2 Process to construct the non-Gaussian random physiological field

Input: RAPs $P_{1}(C)$, empty vector $A = []$
Output: $Φ(X, \bar{Ω}, S^{M})$
Initialization;
1: for $i = 1$ to $M_{c}$ do
2: $M(i) = length(P_{1}(C)_{i})$
3: end for
4: $Val_{M} = \max\{M\}$, $Val_{M} \in R$
5: for $i = 1$ to $M_{c}$ do
6: if ($M(i) < Val_{M}$) then
7: $Φ([i, 1 : Val_{M}], \bar{Ω}) = [zeros, P_{1}(C)_{i}]$
8: else
9: $Φ([i, 1 : Val_{M}], \bar{Ω}) = P_{1}(C)_{i}$
10: end if
11: end for
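The padding step of Algorithm 2 and the thresholding of Equation (3) can be sketched in a few lines. This is an illustrative sketch, assuming each cycle is a 1-D array already normalized to [−1, 1]; the function names `build_field` and `excursion_set` are hypothetical.

```python
import numpy as np


def build_field(cycles):
    """Stack cycles of unequal length into a 2-D field, padding with leading zeros."""
    max_len = max(len(c) for c in cycles)
    field = np.zeros((len(cycles), max_len))
    for i, cycle in enumerate(cycles):
        cycle = np.asarray(cycle, dtype=float)
        # Zeros come first, matching [zeros, P1(C)_i] in Algorithm 2.
        field[i, max_len - cycle.size:] = cycle
    return field


def excursion_set(field, lam):
    """Boolean mask of the points where the field is at least lam (Equation (3))."""
    return field >= lam


if __name__ == "__main__":
    field = build_field([[0.1, 0.5, 1.0], [0.4, 0.9]])
    print(field)
    print(excursion_set(field, 0.5).astype(int))
```

Note that the padded zeros belong to the excursion set for every threshold $λ \leq 0$, so the padding itself contributes to the geometry at those levels.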

4.3. Extracting the Feature

In this work, feature extraction relies on the excursion set to calculate the Euler–Poincaré characteristic (EPC), represented as $φ(A_{λ}(Φ, D))$. The approach involves transforming the excursion set into a binary image, where each value is set to 1 if it exceeds the threshold level $λ$, and 0 otherwise. To compute the EPC values, we applied Gray’s algorithm [47]. The EPC is a scalar quantity that represents the difference between the number of connected components and the number of holes within those components [48].
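One common way to compute the EPC of a binary image is to count 2 × 2 pixel patterns ("bit-quads"), in the spirit of Gray's algorithm. The sketch below is an assumption-laden illustration rather than the authors' code: the connectivity (8 by default) and the threshold grid used in `epc_curve` are not specified in the text, and all function names are hypothetical.

```python
import numpy as np


def euler_characteristic(binary, connectivity=8):
    """Components minus holes of a 2-D binary image, via bit-quad counting."""
    img = np.pad(np.asarray(binary, dtype=bool), 1).astype(np.int64)
    # The four pixels of every 2 x 2 window of the zero-padded image.
    a, b = img[:-1, :-1], img[:-1, 1:]
    c, d = img[1:, :-1], img[1:, 1:]
    total = a + b + c + d
    q1 = np.count_nonzero(total == 1)                # one foreground pixel
    q3 = np.count_nonzero(total == 3)                # three foreground pixels
    qd = np.count_nonzero((total == 2) & (a == d))   # two diagonal pixels
    if connectivity == 8:
        return (q1 - q3 - 2 * qd) // 4
    return (q1 - q3 + 2 * qd) // 4                   # 4-connectivity


def epc_curve(field, thresholds):
    """EPC of the excursion set {field >= lam} for every threshold lam."""
    return np.array([euler_characteristic(field >= lam) for lam in thresholds])


if __name__ == "__main__":
    ring = np.array([[1, 1, 1], [1, 0, 1], [1, 1, 1]])
    print(euler_characteristic(ring))  # one component, one hole
    rng = np.random.default_rng(0)
    field = rng.uniform(-1.0, 1.0, size=(20, 30))
    # Assumed grid: 101 equally spaced thresholds in [-1, 1].
    print(epc_curve(field, np.linspace(-1.0, 1.0, 101)).shape)
```

Evaluating the curve on 101 thresholds yields a feature vector whose length matches the 101 predictors used later for classification, although the exact grid used by the authors is not stated.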

After that, the EPC is obtained for all apnea and non-apnea patients in the training set. Figure 6 shows the EPC for each apnea patient; the blue lines depict the results obtained with 15th-order polynomials, while the red lines indicate the behaviors associated with 21st-order polynomials. Figure 7 shows the borderline apnea patients (patients 21–25) and non-apnea patients (26–35), where we have the results of both polynomials, i.e., the blue lines denote 15th-order and the red lines denote 21st-order.

4.4. Classification

The last step of the methodology involves searching for the best classifier. To predict the type of apnea, we use a matrix $X$ with 102 columns; columns 1 to 101 contain the stacked EPC values, and the last column represents the target value, where sleep apnea is denoted as 0 and non-apnea as 1. This matrix is used to train the classifier:

$$\begin{bmatrix} Z_{1}^{1} & Z_{2}^{1} & \dots & Z_{n}^{1} & Y_{1} \\ Z_{1}^{2} & Z_{2}^{2} & \dots & Z_{n}^{2} & Y_{2} \\ \vdots & \vdots & \ddots & \vdots & \vdots \\ Z_{1}^{m} & Z_{2}^{m} & \dots & Z_{n}^{m} & Y_{m} \end{bmatrix}$$

where $Z_{j}^{k}$ denotes the predictors, with $j = 1, \dots, 101$, and $Y_{k}$ denotes the class of each patient.

In our study, we utilized Classification Learner, a Matlab toolbox, to model several classifiers. We chose several models to provide a baseline comparison across varying levels of complexity and interpretability. Our objective was to evaluate both simple and sophisticated models to gauge the difficulty of the OSA classification problem. The results are shown in Table 1, applying EPC models derived from 15th-order polynomials as inputs for the classifiers. The computations were conducted on an AMD Ryzen 5 processor with 16 GB of RAM, utilizing the parallel pool option. We used k-fold cross-validation with $k = 5$ in all cases.

With the results shown previously, we applied an ensemble of K-nearest neighbors (EKNN); this approach can help to improve the prediction with non-informative features [49]. In Figure 6 and Figure 7, we observed all the EPCs from all the patients; analyzing these data presents a challenge, so we employed classifiers to reveal information that aids in their classification.

Figure 8 presents a diagram of the ensemble KNN classifier, where the EPC models from both healthy and unhealthy patients serve as the training data.

We selected 30 learners, each trained on a different random subspace. For each learner, we randomly chose 65 of the predictor variables; 35 observations were available for each of the training and test phases. In the next section, we discuss the results and findings of this model in detail. Table 2 shows the parameters used to train the EKNN model using 101 features as inputs. Recall that the Apnea-ECG database provides two sets, one for training and one for testing, each with 35 records and unbalanced classes.
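The random-subspace idea behind the EKNN can be sketched with NumPy alone. This is not the Matlab Classification Learner model used in the study: the number of neighbors `k`, the Euclidean distance, the majority vote, and the synthetic data are all assumptions for illustration, and the function names are hypothetical. Only the 30 learners and the 65 features per learner come from the text.

```python
import numpy as np


def fit_subspace_knn(X, y, n_learners=30, n_features=65, seed=0):
    """Draw one random feature subset per learner; KNN itself needs no training."""
    rng = np.random.default_rng(seed)
    X = np.asarray(X, dtype=float)
    subspaces = [rng.choice(X.shape[1], size=n_features, replace=False)
                 for _ in range(n_learners)]
    return {"X": X, "y": np.asarray(y, dtype=int), "subspaces": subspaces}


def predict_subspace_knn(model, X_new, k=3):
    """Majority vote over learners; each learner is a k-NN on its own subspace."""
    X_new = np.asarray(X_new, dtype=float)
    rows = np.arange(X_new.shape[0])
    votes = np.zeros((X_new.shape[0], 2))            # binary labels 0 / 1
    for cols in model["subspaces"]:
        train, test = model["X"][:, cols], X_new[:, cols]
        dist = np.linalg.norm(test[:, None, :] - train[None, :, :], axis=2)
        nearest = np.argsort(dist, axis=1)[:, :k]    # indices of the k closest records
        pred = (model["y"][nearest].mean(axis=1) > 0.5).astype(int)
        votes[rows, pred] += 1
    return votes.argmax(axis=1)


if __name__ == "__main__":
    rng = np.random.default_rng(1)
    # Synthetic stand-in for 35 records with 101 EPC features each.
    X = np.vstack([rng.normal(0.0, 1.0, (25, 101)), rng.normal(3.0, 1.0, (10, 101))])
    y = np.array([0] * 25 + [1] * 10)                # 0 = apnea, 1 = non-apnea
    model = fit_subspace_knn(X, y)
    print((predict_subspace_knn(model, X) == y).mean())
```

Because every learner sees a different subset of thresholds, features that carry little information affect only some of the votes, which is the motivation the text gives for using an ensemble.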

4.5. Neural Network Classifier

This work uses a feedforward multi-layer neural network to classify apnea or non-apnea patients. In this architecture, data move in one direction from the input layer through one or more hidden layers to the output layer. FNNs are commonly used for classification tasks, such as determining whether a patient has sleep apnea. The network is fully connected, meaning each neuron in one layer connects to every neuron in the next layer. The model is trained by adjusting the weights between these connections to minimize the difference between the predicted and actual labels using a process called backpropagation. In binary classification, the output typically comes from a single neuron using a sigmoid activation function to produce a probability.

The main equation governing the computation in a neural network is the weighted sum of the inputs at each neuron, followed by an activation function. For neuron j, the pre-activation output

z_{j}

is given by the following:

z_{j} = \sum_{i = 0}^{n} ω_{j i} x_{i} + b_{j}

(11)

where

ω_{j i}

denotes the weights,

x_{i}

denotes the input values, and

b_{j}

denotes the bias. The weighted sum

z_{j}

is then passed through an activation function

f (z_{j})

to introduce non-linearity. This nonlinearity is essential for the network to model complex patterns in the data. The final output from the network depends on both the network architecture and the activation function choice.

Figure 9 shows the neural network used in this work to classify apnea, where an input layer, two hidden layers, and an output layer were implemented. A Swish activation function was implemented in the input layer, a ReLU activation function was implemented in the first hidden layer, a tanh activation function was used in the second hidden layer, and a softmax was used in the output layer.
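The forward pass through such a network can be sketched directly from Equation (11). The example below uses NumPy rather than TensorFlow, and the layer widths (64, 32, 16) and the random weights are hypothetical, since the actual sizes are given in Table 3. Only the order of the activation functions (Swish, ReLU, tanh, softmax) follows the description above.

```python
import numpy as np


def swish(z):
    return z / (1.0 + np.exp(-z))  # z * sigmoid(z)


def relu(z):
    return np.maximum(z, 0.0)


def softmax(z):
    e = np.exp(z - z.max(axis=1, keepdims=True))  # shifted for numerical stability
    return e / e.sum(axis=1, keepdims=True)


def init_params(sizes, seed=0):
    """Random weights and zero biases for consecutive fully connected layers."""
    rng = np.random.default_rng(seed)
    return [(rng.normal(0.0, 0.1, (m, n)), np.zeros(n))
            for m, n in zip(sizes[:-1], sizes[1:])]


def forward(x, params):
    """Apply z = x W + b (Equation (11)) and the activation, layer by layer."""
    activations = [swish, relu, np.tanh, softmax]
    a = np.asarray(x, dtype=float)
    for (weights, bias), activation in zip(params, activations):
        a = activation(a @ weights + bias)
    return a


if __name__ == "__main__":
    # 101 EPC features in, two class probabilities out; hidden widths are assumed.
    params = init_params([101, 64, 32, 16, 2])
    probabilities = forward(np.ones((5, 101)), params)
    print(probabilities.shape, probabilities.sum(axis=1))
```

With a two-unit softmax output, each row of the result is a pair of class probabilities that sums to one; training by backpropagation is omitted from this sketch.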

Table 3 summarizes the parameter values of the FNN, which was built in Python using the TensorFlow library.

Below are the results obtained using two types of polynomial orders to smooth the signal. They are also compared with other models.

5. Results and Discussion

A machine learning model was developed to detect cases of sleep apnea using an ensemble K-nearest neighbors (EKNN) classifier. The first phase of the study involved training the model using a dataset that included 20 cases of severe sleep apnea (1 h of activity), 5 mild cases (<1 h), and 10 non-apnea cases. The results of this phase are presented as a confusion matrix in Figure 10. The model’s accuracy was 80%, with a sensitivity of 82% and a specificity of 71%. These values were computed from the confusion matrix using the following formulas:

$$Accuracy = \frac{TP + TN}{P + N} \times 100\% = \frac{23 + 5}{35} \times 100\% = 80\%$$

$$Sensitivity = \frac{TP}{TP + FN} \times 100\% = \frac{23}{23 + 5} \times 100\% = 82\%$$

$$Specificity = \frac{TN}{TN + FP} \times 100\% = \frac{5}{5 + 2} \times 100\% = 71\%$$

where $P$ and $N$ represent positive and negative cases, respectively; $TP$ and $TN$ denote true positives and true negatives; and $FP$ and $FN$ represent false positives and false negatives.
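The three formulas can be checked with a few lines of code. The sketch below simply re-evaluates them with the counts quoted above (TP = 23, TN = 5, FP = 2, FN = 5); the function name is hypothetical.

```python
def confusion_metrics(tp, tn, fp, fn):
    """Accuracy, sensitivity, and specificity (in percent) from confusion-matrix counts."""
    total = tp + tn + fp + fn
    return {
        "accuracy": 100.0 * (tp + tn) / total,
        "sensitivity": 100.0 * tp / (tp + fn),
        "specificity": 100.0 * tn / (tn + fp),
    }


if __name__ == "__main__":
    metrics = confusion_metrics(tp=23, tn=5, fp=2, fn=5)
    print({name: round(value, 1) for name, value in metrics.items()})
    # {'accuracy': 80.0, 'sensitivity': 82.1, 'specificity': 71.4}
```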

Figure 10 shows the correct predictions of the trained model, which achieved 92% accuracy in apnea cases and 50% accuracy in non-apnea cases. In this work, we used 15th- and 21st-order polynomials to smooth the signal; both orders produced the same results on the training data, so Figure 10 represents the results for the two polynomial orders in this phase.

In the testing phase, we used the validation data to obtain the results presented in Figure 11 and Figure 12; one set of results uses a 15th-order polynomial and the other uses a 21st-order polynomial. Figure 11 shows that patients with apnea were correctly classified 100% of the time, while those without apnea were correctly classified 88% of the time. In addition, the accuracy, sensitivity, and specificity were calculated with the above-mentioned formulas, yielding results of 97%, 96%, and 100%, respectively.

Figure 12 shows the results using the 21st-order polynomial, where the patients with apnea were correctly classified 100% of the time, and those without apnea were correctly classified 75% of the time. Additionally, the accuracy, sensitivity, and specificity were calculated using the formulas mentioned above, yielding results of 94%, 93%, and 100%, respectively.

A new method for classifying sleep apnea is proposed, which involves performing a geometric transformation and extracting information through the Euler–Poincaré characteristic model from Holter ECG and one lead (lead $V_{3}$). Using a 15th-order polynomial in the pre-processing phase, the model achieves an average accuracy of 89%, a sensitivity of 89%, and a specificity of 86%.

The results obtained with the feedforward neural network are presented for the training phase in Figure 13 and for the test phase in Figure 14. The summary of these results is shown in Table 4.

The FNN has a better response than the EKNN. We used a 15th-order polynomial EPC model to obtain the FNN classifier model because it had better results compared to the 21st-order polynomial. The next step is to compare the two approaches with several cases in the literature.

Table 5 compares our approach with various cases from the 2000 challenge to modern techniques that employ deep learning machines.

Heart rate variability (HRV) analysis has been employed in the literature. For example, [26] used time-domain features of the HRV, achieving a performance of 92.6%. In [34], the authors analyzed the frequency-domain features of the T-wave using the wavelet transform and reached 92.3% accuracy. In [32], the spectral analysis of HRV features achieved an accuracy of 89.36%.

Other works used oxygen saturation ($S_{p} O_{2}$) data to classify OSA. In [29], three different algorithms using the wavelet transform were compared, and the results yielded an accuracy of 88%.

Decision methods vary across studies. The most used classifiers include random forest, support vector machines, linear or quadratic regression, neural networks, and KNN. In [50], a linear discriminant using frequency-domain features extracted from the Hilbert transform achieved 90% accuracy. In [30], a decision tree combined with blood oxygen saturation levels classified OSA and achieved an accuracy of 93.03%, a sensitivity of 92.35%, and a specificity of 93.52%. In [36], a neural network was used to classify RR interval data and multi-frequency filters, achieving a specificity of 99%.

In [51], a random forest classifier used features from the power spectrum and discrete wavelet transform. The best features were selected as inputs with the aid of the particle swarm optimization method. The reported accuracy was 99.22%.

The authors of [35] and [17] used deep neural networks. A multi-layer feed-forward neural network (FNN) using information from the ECG, $S_{p} O_{2}$, and BMI yielded an accuracy of 97.8% [35]. In [17], the authors proposed an algorithm that used deep learning layers (convolutional neural networks and long short-term memory) to extract features and classify the OSA. It achieved an accuracy of 97.21%, a sensitivity of 94.41%, and a specificity of 98.94% on the Apnea-ECG database.

The related works and each study’s performance measures, signals used, techniques, and decision methods are summarized in Table 5. Our approach stands out in terms of performance measures (sensitivity, specificity, and accuracy); we also employed a geometrical approach, which is unique among the studies reviewed. We used a different multi-layer FNN and the EPC model as features, obtaining a similar performance to Li et al. [35].

Appendix A outlines our application of the Synthetic Minority Oversampling Technique (SMOTE) to address dataset imbalance in the database used. This enhancement increases the robustness of detecting sleep apnea and heart disease from ECG data.

6. Conclusions and Future Works

This study introduced a novel method for detecting obstructive sleep apnea (OSA) using single-lead ECG signals by leveraging the Euler–Poincaré characteristic (EPC) to represent ECG peaks geometrically. The RAP-based methodology for smoothing the ECG signal and extracting EPC provided a rich feature set that captured the intricate patterns of ECG peaks. We then applied an ensemble K-nearest neighbors (EKNN) classifier to these EPC features, outperforming traditional methods by combining multiple KNN models for improved accuracy. This approach highlights the effectiveness of using geometric representations and advanced ensemble techniques to enhance OSA detection from ECG data.

In conclusion, using the EKNN as a classifier during the training phase, we achieved an accuracy of 80%, with a sensitivity of 82% and a specificity of 71%. In the test phase, we attained an accuracy of 97%, a sensitivity of 96%, and a specificity of 100%.

For the FNN, as a binary classifier, we recorded a perfect accuracy of 100%, with a sensitivity of 96% and a specificity of 100% in the training phase. In the test phase, we achieved an accuracy of 97%, a sensitivity of 93%, and a specificity of 100%.

In future works, the NGRF will be built using smooth signals from the P-waves, QRS-peaks, and T-waves to obtain a model that represents the behavior of the full ECG instead of only one section (P-R section).

Our framework works with the EPC as a direct input to the classifier. In future works, we will include different features in various domains (time, frequency, geometric, and statistical) and larger datasets to classify various diseases.

Author Contributions

Conceptualization, M.R.-M. and G.O.-T.; methodology, F.D.J.S.-V. and M.M.G.; validation, E.S.-B. and J.C.M.-S.; formal analysis, J.E.V.-R. and E.M.R.-V.; investigation, M.R.-M. and M.G.M.-E.; writing—review and editing, J.Y.R.-M. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Data Availability Statement

The database Apnea-ECG, described in [42], is available on PhysioNet at https://physionet.org/content/apnea-ecg/1.0.0/ (accessed on 22 August 2024).

Acknowledgments

We thank the Centro Universitario de los Valles of the University of Guadalajara for the access and use of the research laboratories.

Conflicts of Interest

The authors declare no conflicts of interest.

Abbreviations

The following abbreviations are used in this manuscript:

Accu	accuracy
BMI	body mass index
CNN	convolutional neural network
CTM	central tendency measure
DWT	discrete wavelet transform
ECG	electrocardiogram
EDR	ECG-derived respiration
EEG	electroencephalogram
EKNN	ensemble k-nearest neighbors
EMG	electromyogram
EPC	Euler–Poincaré characteristic
FENet	frequency extraction network
FNN	feedforward neural network
HRV	heart rate variability
IAE	integral absolute error
KNN	k-nearest neighbors
LSTM	long short-term memory
LZ	Lempel–Ziv
ML	machine learning
NGPRF	non-Gaussian physiological random field
NGRF	non-Gaussian random field
NRMSE	normalized root mean square error
OSA	obstructive sleep apnea
PG	polygraphy
PSO	particle swarm optimization
RFC	random forest classifier
ROC-AUC	receiver operating characteristic-area under the curve
SBD	sleep-disordered breathing
Sen	sensitivity
Spec	specificity
$S_{p} O_{2}$	pulse oximetry

Appendix A. Results Obtained with Balance Data Using SMOTE

This study utilized the datasets from the Apnea-ECG database on sleep apnea. However, due to the inherent class distribution, a significant imbalance was observed, with the minority class being underrepresented. To address this issue and improve the model’s performance, particularly in detecting the minority class, we employed the synthetic minority over-sampling technique (SMOTE) [52].

SMOTE generates synthetic samples by interpolating between existing minority class instances, thereby increasing the representation of the minority class in the dataset. This approach helps to balance the class distribution, mitigating the risk of the model becoming biased toward the majority class. A new and balanced dataset was created by incorporating the synthetic samples generated by SMOTE, which was then used for model training and evaluation. This strategy aimed to enhance the model’s ability to accurately identify instances of sleep apnea, especially those belonging to the previously underrepresented class.
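The interpolation step can be sketched in plain Python as follows. This is a simplified illustration of the SMOTE idea from [52], not the implementation used in this study: the function name `smote`, the neighbour count `k`, the seed, and the toy two-dimensional samples are all assumptions, and the sketch requires at least two minority samples.

```python
import math
import random

def smote(minority, n_new, k=5, seed=0):
    """Create n_new synthetic samples by interpolating between minority neighbours."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # Rank the other minority samples by Euclidean distance to the base sample
        others = [s for s in minority if s is not base]
        others.sort(key=lambda s: math.dist(base, s))
        neighbour = rng.choice(others[:k])  # one of the k nearest neighbours
        gap = rng.random()                  # interpolation factor in [0, 1)
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbour)])
    return synthetic

# Toy minority class with two features per sample (illustrative values only)
minority = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
new_samples = smote(minority, n_new=6, k=2)
print(new_samples)
```

Each synthetic sample lies on the line segment between a minority sample and one of its nearest minority neighbours, so the new points stay inside the region already occupied by the minority class.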

The strategy criteria applied to this dataset are as follows:

Data preparation: Load the ECG dataset, separate features and labels, and scale the features using the StandardScaler.
Data balancing: Apply SMOTE to handle class imbalance, ensuring balanced data for model training. In this step, features from the EPC model are randomly copied to generate new data for the minority class.
Train–test split: Split the balanced dataset into training and testing sets, resulting in two groups of 52 patients each, with 26 patients from the apnea group and the others from the no-apnea group.
Model building and training: Create a neural network model, compile it with the Adam optimizer, and train it on the training set with one-hot encoded labels, using the same FNN with parameters described in Table 3.
Model evaluation: Evaluate the model’s performance on the test set and generate predictions.
Performance metrics: Calculate and visualize confusion matrices, and compute detailed metrics such as accuracy, sensitivity, specificity, precision, and F1 score for both training and test sets.

To handle class imbalance, the SMOTE algorithm generates 34 new instances for the non-apnea class, each with 101 features. Then, the results are split into two sets (train and test), each containing 52 elements, with 26 apnea and 26 non-apnea patients, achieving a balance ratio of 50.

Figure A1 shows the confusion matrix for the training set with an accuracy of 92.31%, a sensitivity of 100%, and a specificity of 84.62%.

Figure A2 shows the confusion matrix for the test set with an accuracy of 84.62%, a sensitivity of 80.77%, and a specificity of 88.46%.

The summary of the metrics for this dataset, using the SMOTE algorithm to balance the data, is presented in Table A1.

Figure A1. Confusion matrix for the training set using the balanced data. Class 0 represents the apnea patients, and class 1 represents the non-apnea patients.

Figure A2. Confusion matrix for the test set using the balanced data. Class 0 represents the apnea patients, and class 1 represents the non-apnea patients.

Table A1. Performance metrics summary using SMOTE.

Metric	Training (%)	Test (%)
Accuracy	92.31	84.62
Sensitivity	100.00	80.77
Specificity	84.62	88.46
Precision	86.67	87.50
F1 Score	92.86	84.00
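The entries of Table A1 can be reproduced from confusion-matrix counts. The counts below are not reported in the text; they are inferred here from the stated 26 apnea and 26 non-apnea patients per set together with the reported sensitivity and specificity, and should be read as an assumption. The helper name `summary` is hypothetical.

```python
def summary(tp, fn, tn, fp):
    """Accuracy, sensitivity, specificity, precision, and F1 score in percent."""
    precision = tp / (tp + fp)
    sensitivity = tp / (tp + fn)
    values = {
        "Accuracy": (tp + tn) / (tp + tn + fp + fn),
        "Sensitivity": sensitivity,
        "Specificity": tn / (tn + fp),
        "Precision": precision,
        "F1 Score": 2 * precision * sensitivity / (precision + sensitivity),
    }
    return {name: round(100 * v, 2) for name, v in values.items()}

# Inferred counts (assumption): 26 apnea and 26 non-apnea patients in each set
train = summary(tp=26, fn=0, tn=22, fp=4)
test = summary(tp=21, fn=5, tn=23, fp=3)

for name in train:
    print(f"{name}\t{train[name]}\t{test[name]}")
```

Under these inferred counts, every metric agrees with the corresponding entry in Table A1 to two decimal places.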

References

1. Malhotra, A.; Ayappa, I.; Ayas, N.; Collop, N.; Kirsch, D.; Mcardle, N.; Mehra, R.; Pack, A.I.; Punjabi, N.; White, D.P.; et al. Metrics of sleep apnea severity: Beyond the apnea-hypopnea index. Sleep 2021, 44, zsab030.
2. Baranchuk, A. Sleep apnea, cardiac arrhythmias, and conduction disorders. J. Electrocardiol. 2012, 45, 508–512.
3. Suen, C.; Wong, J.; Ryan, C.M.; Goh, S.; Got, T.; Chaudhry, R.; Lee, D.S.; Chung, F. Prevalence of Undiagnosed Obstructive Sleep Apnea Among Patients Hospitalized for Cardiovascular Disease and Associated In-Hospital Outcomes: A Scoping Review. J. Clin. Med. 2020, 9, 989.
4. Anzai, T.; Grandinetti, A.; Katz, A.R.; Hurwitz, E.L.; Wu, Y.Y.; Masaki, K. Association between central sleep apnea and atrial fibrillation/flutter in Japanese-American men: The Kuakini Honolulu Heart Program (HHP) and Honolulu-Asia Aging Study (HAAS). J. Electrocardiol. 2020, 61, 10–17.
5. Leung, A.A.; Williams, J.V.; Padwal, R.S.; McAlister, F.A. Prevalence, Patient Awareness, Treatment, and Control of Hypertension in Canadian Adults With Common Comorbidities. CJC Open 2024, 6, 1099–1107.
6. Rundo, J.V.; Downey, R., III. Polysomnography. Handb. Clin. Neurol. 2019, 160, 381–392.
7. Gottlieb, D.J.; Punjabi, N.M. Diagnosis and management of obstructive sleep apnea: A review. JAMA 2020, 323, 1389–1400.
8. Khandoker, A.H.; Palaniswami, M.; Karmakar, C.K. Support vector machines for automated recognition of obstructive sleep apnea syndrome from ECG recordings. IEEE Trans. Inf. Technol. Biomed. 2008, 13, 37–48.
9. Varon, C.; Caicedo, A.; Testelmans, D.; Buyse, B.; Van Huffel, S. A novel algorithm for the automatic detection of sleep apnea from single-lead ECG. IEEE Trans. Biomed. Eng. 2015, 62, 2269–2278.
10. Jung, D.W.; Hwang, S.H.; Lee, Y.J.; Jeong, D.U.; Park, K.S. Apnea–hypopnea index prediction using electrocardiogram acquired during the sleep-onset period. IEEE Trans. Biomed. Eng. 2016, 64, 295–301.
11. Mohammadzadeh-Asl, B.; Setarehdan, S.K. Neural network based arrhythmia classification using heart rate variability signal. In Proceedings of the 2006 14th European Signal Processing Conference, Florence, Italy, 4–8 September 2006; pp. 1–4.
12. Asl, B.M.; Sharafat, A.R.; Setarehdan, S.K. An adaptive backpropagation neural network for arrhythmia classification using RR interval signal. Neural Netw. World 2012, 22, 535.
13. Starkey, S.Y.; Jonasson, D.R.; Alexis, S.; Su, S.; Johal, R.; Sweeney, P.; Brasher, P.M.; Fleetham, J.; Ayas, N.; Orenstein, T.; et al. Screening for Obstructive Sleep Apnea in an Atrial Fibrillation Population: What’s the Best Test? CJC Open 2021, 3, 442–449.
14. Sun, D.; Schaff, H.V.; Somers, V.K.; Nishimura, R.A.; Geske, J.B.; Dearani, J.A.; Ommen, S.R. Association of Preoperative Sleep-Disordered Breathing With Functional Status After Septal Myectomy for Obstructive Hypertrophic Cardiomyopathy. CJC Open 2022, 4, 848–853.
15. Babaeizadeh, S.; White, D.P.; Pittman, S.D.; Zhou, S.H. Automatic detection and quantification of sleep apnea using heart rate variability. J. Electrocardiol. 2010, 43, 535–541.
16. Yılmaz, B. Sleep staging and apnea detection from single-lead electrocardiogram. J. Electrocardiol. 2011, 44, e31–e32.
17. Zarei, A.; Beheshti, H.; Asl, B.M. Detection of sleep apnea using deep neural networks and single-lead ECG signals. Biomed. Signal Process. Control 2022, 71, 103125.
18. Bahrami, M.; Forouzanfar, M. Sleep Apnea Detection From Single-Lead ECG: A Comprehensive Analysis of Machine Learning and Deep Learning Algorithms. IEEE Trans. Instrum. Meas. 2022, 71, 1–11.
19. Pant, H.; Dhanda, H.K.; Taran, S. Sleep apnea detection using electrocardiogram signal input to FAWT and optimize ensemble classifier. Measurement 2022, 189, 110485.
20. Yang, Q.; Zou, L.; Wei, K.; Liu, G. Obstructive sleep apnea detection from single-lead electrocardiogram signals using one-dimensional squeeze-and-excitation residual group network. Comput. Biol. Med. 2022, 140, 105124.
21. Liu, H.; Cui, S.; Zhao, X.; Cong, F. Detection of obstructive sleep apnea from single-channel ECG signals using a CNN-transformer architecture. Biomed. Signal Process. Control 2023, 82, 104581.
22. Magalang, U.J.; Dmochowski, J.; Veeramachaneni, S.; Draw, A.; Mador, M.J.; El-Solh, A.; Grant, B.J. Prediction of the Apnea-Hypopnea Index From Overnight Pulse Oximetry. Chest 2003, 124, 1694–1701.
23. Heneghan, C.; Chua, C.P.; Garvey, J.F.; de Chazal, P.; Shouldice, R.; Boyle, P.; McNicholas, W.T. A Portable Automated Assessment Tool for Sleep Apnea Using a Combined Holter-Oximeter. Sleep 2008, 31, 1432–1439.
24. Álvarez, D.; Hornero, R.; Abásolo, D.; del Campo, F.; Zamarrón, C. Nonlinear characteristics of blood oxygen saturation from nocturnal oximetry for obstructive sleep apnoea detection. Physiol. Meas. 2006, 27, 399–412.
25. Maier, C.; Friedrich, J.; Katus, H.; Dickhaus, H. Prospective evaluation of a Holter-ECG derived severity index for screening of sleep disordered breathing. J. Electrocardiol. 2016, 49, 919–924.
26. McNames, J.N.; Fraser, A.M. Obstructive sleep apnea classification based on spectrogram patterns in the electrocardiogram. In Proceedings of the Computers in Cardiology 2000. Vol.27 (Cat. 00CH37163), Cambridge, MA, USA, 24–27 September 2000; pp. 749–752.
27. Gil, E.; Vergara, J.M.; Laguna, P. Study of the relationship between Pulse Photopletismography amplitude decrease events and sleep apneas in children. In Proceedings of the 2006 International Conference of the IEEE Engineering in Medicine and Biology Society, New York, NY, USA, 30 August–3 September 2006; pp. 3887–3890.
28. Gil, E.; Monasterio, V.; Laguna, P.; Vergara, J.M. Pulse Photopletismography Amplitude Decrease Detector for Sleep Apnea Evaluation in Children. In Proceedings of the 2005 IEEE Engineering in Medicine and Biology 27th Annual Conference, Shanghai, China, 17–18 January 2006; pp. 2743–2746.
29. Lee, Y.K.; Bister, M.; Blanchfield, P.; Salleh, Y.M. Automated detection of obstructive apnea and hypopnea events from oxygen saturation signal. In Proceedings of the 26th Annual International Conference of the IEEE Engineering in Medicine and Biology Society, San Francisco, CA, USA, 1–5 September 2004; Volume 1, pp. 321–324.
30. Burgos, A.; Goñi, A.; Illarramendi, A.; Bermudez, J. Real-Time Detection of Apneas on a PDA. IEEE Trans. Inf. Technol. Biomed. 2010, 14, 995–1002.
31. Varady, P.; Micsik, T.; Benedek, S.; Benyo, Z. A novel method for the detection of apnea and hypopnea events in respiration signals. IEEE Trans. Biomed. Eng. 2002, 49, 936–942.
32. de Chazal, P.; Heneghan, C.; Sheridan, E.; Reilly, R.; Nolan, P.; O’Malley, M. Automatic classification of sleep apnea epochs using the electrocardiogram. In Proceedings of the Computers in Cardiology 2000. Vol.27 (Cat. 00CH37163), Cambridge, MA, USA, 24–27 September 2000; pp. 745–748.
33. Maier, C.; Dickhaus, H.; Laguna, P. Amplitude variability extraction from multi-lead electrocardiograms for improvement of sleep apnea recognition. In Proceedings of the Computers in Cardiology, Lyon, France, 25–28 September 2005; pp. 355–358.
34. Raymond, B.; Cayton, R.M.; Bates, R.A.; Chappell, M. Screening for obstructive sleep apnoea based on the electrocardiogram-the computers in cardiology challenge. In Proceedings of the Computers in Cardiology 2000. Vol.27 (Cat. 00CH37163), Cambridge, MA, USA, 24–27 September 2000; pp. 267–270.
35. Li, Z.; Li, Y.; Zhao, G.; Zhang, X.; Xu, W.; Han, D. A model for obstructive sleep apnea detection using a multi-layer feed-forward neural network based on electrocardiogram, pulse oxygen saturation, and body mass index. Sleep Breath. 2021, 25, 2065–2072.
36. Ye, G.; Yin, H.; Chen, T.; Chen, H.; Cui, L.; Zhang, X. FENet: A Frequency Extraction Network for Obstructive Sleep Apnea Detection. IEEE J. Biomed. Health Informatics 2021, 25, 2848–2856.
37. Rashid, M.M.; Askari, M.R.; Chen, C.; Liang, Y.; Shu, K.; Cinar, A. Artificial Intelligence Algorithms for Treatment of Diabetes. Algorithms 2022, 15, 299.
38. Asif, D.; Bibi, M.; Arif, M.S.; Mukheimer, A. Enhancing Heart Disease Prediction through Ensemble Learning Techniques with Hyperparameter Optimization. Algorithms 2023, 16, 8.
39. Mondol, C.; Shamrat, F.M.J.M.; Hasan, M.R.; Alam, S.; Ghosh, P.; Tasnim, Z.; Ahmed, K.; Bui, F.M.; Ibrahim, S.M. Early Prediction of Chronic Kidney Disease: A Comprehensive Performance Analysis of Deep Learning Models. Algorithms 2022, 15, 308.
40. Khoperskov, A.V.; Polyakov, M.V. Improving the Efficiency of Oncological Diagnosis of the Breast Based on the Combined Use of Simulation Modeling and Artificial Intelligence Algorithms. Algorithms 2022, 15, 292.
41. Goldberger, A.L.; Amaral, L.A.; Glass, L.; Hausdorff, J.M.; Ivanov, P.C.; Mark, R.G.; Mietus, J.E.; Moody, G.B.; Peng, C.K.; Stanley, H.E. PhysioBank, PhysioToolkit, and PhysioNet components of a new research resource for complex physiologic signals. Circulation 2000, 101, e215–e220.
42. Penzel, T.; Moody, G.B.; Mark, R.G.; Goldberger, A.L.; Peter, J.H. Apnea-ECG Database. Comput. Cardiol. 2000, 27, 255–258.
43. Ramos-Martinez, M.; Corbier, C.; Alvarado, V.M.; Lopez, G.L. Decomposed Mean Euler-Poincaré Characteristic Model for a Non-Gaussian Physiological Random Field. IEEE Access 2021, 9, 21180–21191.
44. Knuth, D.E. Two notes on notation. Am. Math. Mon. 1992, 99, 403–422.
45. Pan, J.; Tompkins, W.J. A Real-Time QRS Detection Algorithm. IEEE Trans. Biomed. Eng. 1985, BME-32, 230–236.
46. Klinger, A. The Vandermonde Matrix. Am. Math. Mon. 1967, 74, 571–574.
47. Gray, S.B. Local Properties of Binary Images in Two Dimensions. IEEE Trans. Comput. 1971, C-21, 551–561.
48. Richardson, E.; Werman, M. Efficient classification using the Euler characteristic. Pattern Recognit. Lett. 2014, 49, 99–106.
49. Gul, A.; Perperoglou, A.; Khan, Z.; Mahmoud, O.; Miftahuddin, M.; Adler, W.; Lausen, B. Ensemble of a subset of k NN classifiers. Adv. Data Anal. Classif. 2018, 12, 827–840.
50. Corthout, J.; Van Huffel, S.; Mendez, M.O.; Bianchi, A.M.; Penzel, T.; Cerutti, S. Automatic screening of Obstructive Sleep Apnea from the ECG based on Empirical Mode Decomposition and wavelet analysis. In Proceedings of the 2008 30th Annual International Conference of the IEEE Engineering in Medicine and Biology Society, Vancouver, BC, Canada, 20–25 August 2008; pp. 3608–3611.
51. Rajesh, K.; Dhuli, R.; Kumar, T.S. Obstructive sleep apnea detection using discrete wavelet transform-based statistical features. Comput. Biol. Med. 2021, 130, 104199.
52. Chawla, N.V.; Bowyer, K.W.; Hall, L.O.; Kegelmeyer, W.P. SMOTE: Synthetic Minority Over-sampling Technique. J. Artif. Intell. Res. 2002, 16, 321–357.

Figure 1. Methodology to classify sleep apnea. The first stage involves the data or ECG. The second stage involves pre-processing, where the ECG is cleaned. The third stage involves random field conversion. The fourth stage focuses on geometrical properties, where the excursion set is obtained. The fifth stage is feature extraction. Finally, the last stage involves selecting a classifier model (EKNN, SVM, FNN) to differentiate between an apnea patient and a non-apnea patient.

Figure 2. Polynomial order results for the sections of the ECG; sections $P_{1}(C)$ and $P_{2}(C)$. (a) Histogram of the polynomials of P-waves (blue) and T-waves (red) using FIT (Equation (10)). (b) Histogram of the polynomials of P-waves (blue) and T-waves (red) using the IAE criterion.

Figure 3. Random field of the P-waves, Q-peaks, and R-peaks in a short number of cycles (100).

Figure 4. P-waves and R-peaks from the binary image (excursion set) at level $\lambda = 0.2$ for an apnea patient. The binary representation of the P-wave varies depending on the level of u.

Figure 5. P-waves and R-peaks from the binary image (excursion set) at level $\lambda = 0.2$ for a non-apnea patient. Normally, the P-wave follows a straight line at each level u.

Figure 6. EPC values from the sleep apnea patients. Patients from 1 to 20 show OSA events at different times with severe cases. In the majority of these cases, the duration of the OSA event is about 1 h.

Figure 7. EPC values from sleep apnea patients with borderline cases (patients 21–25), where activity was less than an hour, and non-apnea patients (patients 26–35), who served as control cases.

Figure 8. The ensemble KNN classifier example; each dataset is composed of 65 random features. In this setup, 30 KNN classifiers (learners) are activated, and the final prediction is determined by a majority vote.

Figure 9. Feedforward multi-layer neural network architecture proposed for classifying sleep apnea; an input layer using a Swish activation function, two hidden layers—one with a ReLU function and the other with a tanh function—and an output layer equipped with a softmax function.

Figure 10. Confusion matrix for the training phase using 15th- and 21st-order polynomials. Both cases had the same results.

Figure 11. Results of the confusion matrix for the validated data using a 15th-order polynomial.

Figure 12. Results for the test phase using a 21st-order polynomial.

Figure 13. Confusion matrix for the training phase using a 15th-order polynomial and FNN.

Figure 14. Results of the confusion matrix for the test set using a 15th-order polynomial and FNN.

Table 1. Comparing machine learning techniques to classify sleep apnea using EPC models as inputs.

Type of Classifier	Accuracy	Sensitivity	Specificity	F1 Score	MCC
Linear SVM	62.86%	70%	20%	76.36%	−7.75%
Decision Tree	65.71%	78.26%	41.67%	75%	20.94%
KNN	74.29%	78.57%	57.14%	83.02%	31.62%
Logistic Regression	77.14%	86.96%	58.33%	83.33%	47.59%
Ensemble KNN	80%	82.14%	71.43%	86.79%	47.43%

Table 2. Parameters used in the ensemble KNN model.

Parameters	Values
Number of Neighbors (k)	10
Weighting Method	‘distance’
Distance Metric	‘euclidean’
Algorithm for Finding Neighbors	‘auto’
Number of Estimators	30
Ponderation of Models	Uniform
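The ensemble configured in Table 2 and illustrated in Figure 8 can be sketched in plain Python. This is a hedged illustration rather than the authors' implementation: it assumes each learner is a distance-weighted, Euclidean k-NN trained on a random subset of features and that the learners are combined by an unweighted majority vote. The function names, the seed, and the toy data are hypothetical, and the toy call uses smaller settings than the defaults taken from Table 2 and Figure 8.

```python
import math
import random
from collections import Counter, defaultdict

def knn_predict(train_x, train_y, x, k):
    """Distance-weighted k-NN vote using the Euclidean metric."""
    nearest = sorted(zip(train_x, train_y),
                     key=lambda pair: math.dist(pair[0], x))[:k]
    votes = defaultdict(float)
    for sample, label in nearest:
        d = math.dist(sample, x)
        votes[label] += 1.0 / (d + 1e-12)  # closer neighbours weigh more
    return max(votes, key=votes.get)

def ensemble_knn_predict(train_x, train_y, x, k=10, n_estimators=30,
                         n_features=65, seed=0):
    """Uniform majority vote over k-NN learners, each on a random feature subset."""
    rng = random.Random(seed)
    n_total = len(train_x[0])
    ballots = []
    for _ in range(n_estimators):
        idx = rng.sample(range(n_total), min(n_features, n_total))
        sub_train = [[row[i] for i in idx] for row in train_x]
        sub_x = [x[i] for i in idx]
        ballots.append(knn_predict(sub_train, train_y, sub_x, k))
    return Counter(ballots).most_common(1)[0][0]

# Toy data: two well-separated clusters with four features each (illustrative only)
rng = random.Random(1)
class0 = [[rng.gauss(0.0, 0.1) for _ in range(4)] for _ in range(10)]
class1 = [[rng.gauss(5.0, 0.1) for _ in range(4)] for _ in range(10)]
train_x = class0 + class1
train_y = [0] * 10 + [1] * 10

pred_high = ensemble_knn_predict(train_x, train_y, [5.0, 5.0, 5.0, 5.0],
                                 k=3, n_estimators=5, n_features=2)
pred_low = ensemble_knn_predict(train_x, train_y, [0.0, 0.0, 0.0, 0.0],
                                k=3, n_estimators=5, n_features=2)
print(pred_high, pred_low)
```

Training each learner on a different feature subset makes the individual k-NN models less correlated, which is what allows the majority vote to improve on a single k-NN classifier.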

Table 3. Summary of feedforward multi-layer neural network parameters.

Parameter	Description	Value
Input Layer	Number of input features	101 (from EPC model)
Hidden Layers	Number of hidden layers	2
Neurons	Number of neurons per layer	100 (Layer 1), 64 (Layer 2), 64 (Layer 3)
Output Layer	Number of neurons in output layer	1
Activation Functions	Activation function per layer	Swish (Layer 1), ReLU (Layer 2), Tanh (Layer 3), Softmax (Output Layer)
Optimizer	Optimization algorithm	Adam
Epochs	Number of epochs	100
Classes	Number of output classes	2 (binary classification)

Table 4. Accuracy, sensitivity, specificity, and F1 score results of the FNN classifier.

Phase	Accuracy	Sensitivity	Specificity	F1 Score
Training	100%	96%	100%	98%
Test	97%	93%	100%	95%

Table 5. Comparison between our approach and the approaches used in the literature.

Literature Work	Efficiency	Analyzed Signal	Proposed Techniques	Classification
McNames et al. [26]	Accu = 92.6%	ECG	HRV in RR, T and S ECG pulse Energy(WT)	Threshold (5 min window)
Raymond et al. [34]	Accu = 92%	ECG	EDR signal RR signal	Shared mixture classifier (spectral features)
De Chazal et al. [32]	Accu = 89.4% Sens = 84.1% Spec = 90.0%	ECG	RR variability R-wave amplitude (PSD)	Linear and quadratic discriminants (spectral features)
Lee et al. [29]	Accu = 88% Sens = 98% Spec = 92%	$S_{p} O_{2}$	WT ADA, DDA, NA	Transform coefficients Threshold
Maier et al. [33]	Accu = 89.8% Sens = 81.3% Spec = 82.8%	ECG	RR series MAV series	Threshold (120 ms. window)
Corthout et al. [50]	Accu = 90% Sens = 84% Spec = 93%	ECG	EDM+HT, EDM+RAS WA	Linear Discriminant classifier (feature set)
Burgos et al. [30]	Accu = 93.03% Sens = 92.35% Spec = 93.52%	$S_{p} O_{2}$	Desaturation indexes	Bagging with ADTree
Ye et al. [36]	Accu = 99.22% Sens = 99.25% Spec = 99.02%	ECG	R-R intervals	FENet
Rajesh et al. [51]	Accu = 89% Sens = 86% Spec = 92%	ECG	Power Spectrum, Discrete Wavelet Transform, PSO	Random forest classifier
Li et al. [35]	Accu = 97.8% Sens = 98.6% Spec = 93.9%	ECG, $S_{p} O_{2}$, BMI	Feature extraction	Multi-layer FNN
Zarei et al. [17]	Accu = 97.21% Sens = 94.41% Spec = 98.94%	ECG	LSTM-CNN	AHI index
EKNN approach	Accu = 89% Sens = 89% Spec = 86%	ECG	Random Field Euler–Poincaré characteristic	EKNN classifier
FNN approach	Accu = 98.5% Sens = 94.5% Spec = 100%	ECG	Random Field Euler–Poincaré characteristic	Multi-layer FNN

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

© 2024 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
