Article

Optimized Prescreen Survey Tool for Predicting Sleep Apnea Based on Deep Neural Network: Pilot Study

1 Department of Computer Science, Kent State University, Kent, OH 44242, USA
2 Department of Industrial and Management Engineering, Incheon National University, Incheon 22012, Republic of Korea
3 Department of Engineering, Texas A&M University-Corpus Christi, Corpus Christi, TX 78412, USA
4 Department of Medicine, Texas A&M University Health Science Centre, College of Medicine, Bryan, TX 77807, USA
* Author to whom correspondence should be addressed.
Appl. Sci. 2024, 14(17), 7608; https://doi.org/10.3390/app14177608
Submission received: 24 July 2024 / Revised: 22 August 2024 / Accepted: 26 August 2024 / Published: 28 August 2024
(This article belongs to the Special Issue eHealth Innovative Approaches and Applications: 2nd Edition)

Abstract
Obstructive sleep apnea (OSA) is one of the most common sleep-related breathing disorders. It is important to identify, using a data-driven approach, an optimal set of questions among the existing questionnaires that can prescreen OSA with high sensitivity and specificity. The current study proposes reliable models based on machine learning techniques to predict the severity of OSA. A total of 66 participants (45 males and 21 females; average age = 52.4 years; standard deviation ± 14.6) were asked to fill out the questionnaire items. If the value of the Respiratory Disturbance Index (RDI) was more than 30, the participant was diagnosed with severe OSA. Several different modeling techniques were applied, including deep neural networks with scaled principal component analysis (DNN-PCA), random forest (RF), Adaptive Boosting Classifier (ABC), Decision Tree Classifier (DTC), K-nearest neighbors classifier (KNC), and support vector machine classifier (SVMC). Among the participants, 27 were diagnosed with severe OSA (RDI > 30). The area under the receiver operating characteristic curve (AUROC) was used to evaluate the developed models. The AUROC values of the DNN-PCA, RF, ABC, DTC, KNC, and SVMC models were 0.95, 0.62, 0.53, 0.53, 0.51, and 0.78, respectively. The highest AUROC value was found for the DNN-PCA model, with a sensitivity of 0.95, a specificity of 0.75, a positive predictivity of 0.95, an F1 score of 0.95, and an accuracy of 0.95. The DNN-PCA model outperforms the existing screening questionnaires, scores, and other models.

1. Introduction

Sleep apnea is one of the most prevalent sleep disorders, causing significant daytime sleepiness and cardiovascular comorbidities [1,2,3,4]. The repetitive airflow reductions due to sleep apnea can cause recurrent hypoxia and sleep fragmentation [5,6,7]. In the United States, more than 200 million patients are known to suffer from sleep apnea [8]. Sleep apnea can be divided into three major categories: obstructive, central, and mixed apnea [9]. Among them, obstructive sleep apnea (OSA) is the most common; it is characterized by occasional pauses in breathing during sleep caused by the collapse or obstruction of the upper airway. OSA causes loud snoring or choking noises as patients repeatedly attempt to breathe against the collapsed airway to increase the amount of inspired air. Thus, OSA prevents the patient’s brain and body from receiving enough oxygen, which may cause patients to wake up with reduced sleep quality, leading to degraded daytime activities and excessive daytime sleepiness [10].
Polysomnography (PSG) is often regarded as the gold standard for detecting sleep apnea. It monitors physiological signals during sleep in a dedicated laboratory, including the electrocardiogram (ECG), oronasal thermal airflow signal (FlowTh), nasal pressure signal (NPRE), electroencephalogram (EEG), electromyogram (EMG), and blood oxygen saturation (SpO2) [11,12,13]. The laboratory environment makes participants uncomfortable because of the number of sensors and wires, and the examination fees are costly. PSG is also expensive because deriving the apnea–hypopnea index (AHI) requires offline visual scoring by an expert sleep clinician [14]. The AHI is used as a parameter for evaluating sleep apnea and its severity. Consequently, the PSG process is time-demanding and labor-intensive [15,16,17,18] and is vulnerable to human error [19].
Many screening tools have been developed to identify patients at high risk for OSA, including the Berlin questionnaire (BQ), NoSAS, NAMES, No-Apnea, and STOP-BANG. The Berlin questionnaire (BQ) was developed in 1999 by Netzer et al. [20] and is now one of the most popular questionnaires for screening OSA. Patients are classified as having a high or low risk of OSA based on their overall responses to its 10 questions (e.g., Do you snore? How often do you snore?). The NoSAS score was introduced by Marti-Soler et al. [21] to identify individuals at risk of sleep-disordered breathing; this screening tool consists of five significant factors for diagnosing sleep-disordered breathing: neck circumference, obesity, snoring, age, and sex. In addition, Subramanian et al. [22] introduced the NAMES assessment (neck circumference, airway classification, comorbidities, Epworth scale, and snoring) for the detection of OSA; this screening tool combines self-reported historical factors with physical exam findings. Duarte et al. [23] developed a practical model, the No-Apnea screening tool, which consists of only two variables (neck circumference and age); the total score ranges from 0 to 9 points, and a cutoff of ≥3 is used to classify patients at a high risk of having OSA. Chung et al. [24] introduced the STOP-BANG questionnaire to evaluate OSA. The questionnaire includes four yes-or-no questions about snoring (S), tiredness during the daytime (T), observed apnea (O), and high blood pressure (P), supplemented by body mass index, age, neck circumference, and gender (BANG). These questionnaires have been widely adopted in the medical field to evaluate the severity of OSA and remain in active use.
The previously developed screening tools were validated and showed marginal accuracy in detecting OSA. For example, Yegneswaran et al. [24] validated the STOP questionnaire against the AHI from PSG. The sensitivities of the STOP questionnaire at three AHI cutoffs (AHI > 5, >15, and >30) were 65.6%, 74.3%, and 79.5%, respectively. By incorporating body mass index (BMI), age (A), neck circumference (N), and gender (G) into the STOP questionnaire, the sensitivities increased to 83.6%, 92.9%, and 100%, respectively, at the same AHI cutoffs. Maislin et al. [25] developed a screening tool for predicting sleep apnea based on the reported frequency of various sleep apnea symptoms and other sleep disorders, plus age, BMI, and gender; using multiple logistic regression, the area under the receiver operating characteristic curve (AUROC) was 0.79 (p < 0.0001). Duarte et al. [23] validated No-Apnea at three cutoffs (AHI > 5, >15, and >30), and the rates of subjects correctly screened were 78.1%, 68.8%, and 54.4% for AHI-5, AHI-15, and AHI-30, respectively. Overall, the developed questionnaires showed marginal accuracy in detecting the severity of OSA in multiple patients.
Recent research highlights the importance and efficacy of various questionnaire-based screening tools for OSA across diverse patient populations and conditions. Mashaqi et al. (2020) combined nocturnal pulse oximetry with questionnaire-based screening, showcasing a hybrid approach for enhanced accuracy [26]. Delesie et al. (2021) emphasized the value of screening questionnaires in patients with atrial fibrillation, underscoring their utility in cardiovascular contexts [27]. Butt et al. (2021) explored the predictive value of these tools in patients with type 2 diabetes mellitus, suggesting significant clinical relevance [28]. Waseem et al. (2021) demonstrated the diagnostic performance of the STOP-BANG questionnaire across different ethnic groups, highlighting its broad applicability [29]. Zheng et al. (2021) compared six assessment tools for OSA in hypertensive patients, providing insights into tool-specific efficacies [30]. Hwang et al. (2022) conducted a meta-analysis validating the STOP-BANG questionnaire as a preoperative screening tool, reinforcing its preoperative utility [31]. Bernhardt et al. (2022) provided a systematic review and meta-analysis of the diagnostic accuracy of various questionnaires in different clinical cohorts, offering comprehensive evidence of their effectiveness [32]. Bauters et al. (2020) addressed sex-specific performance gaps, particularly in women, indicating a need for tailored screening approaches [33]. Questionnaire-based tools for OSA screening are accessible, cost-effective, and non-invasive, facilitating early intervention, but they may lack the accuracy of polysomnography and exhibit limited sensitivity and specificity, particularly across diverse and sex-specific populations.
OSA prediction has also been attempted using artificial neural networks (ANNs) and deep learning. Kirby et al. (1999) explored the use of a generalized regression neural network (GRNN) to classify OSA patients based on 23 clinical variables [34]. The model achieved a high sensitivity of 98.9% and a specificity of 80%, demonstrating the potential of ANN models in accurately diagnosing OSA from clinical data. Cen et al. (2018) presented an automatic system for OSA event detection using a convolutional neural network (CNN) trained on multi-modal physiological signals [35]. The model achieved an average accuracy of 79.61% across normal, hypopnea, and apnea classes. Karamanli et al. (2016) developed an ANN model using four simple clinical inputs (sex, age, BMI, and snoring status) to diagnose OSA [36]. The model achieved an accuracy of 86.6%, indicating that ANNs can effectively reduce the need for expensive polysomnography (PSG). Perero-Codosero et al. (2019) applied deep learning techniques to analyze ECG and SpO2 signals for OSA detection [37]. The model demonstrated good performance, but the study highlighted challenges in generalizing the results due to the limited sample size and the need for extensive manual feature extraction.
This study addresses the need for a more efficient and accurate method to predict severe OSA. While previous research has explored various machine learning models, many rely on extensive input variables and complex feature engineering, limiting their practical application. This study proposes a streamlined approach using a deep neural network with principal component analysis (DNN-PCA) to optimize and reduce the number of necessary inputs. This not only enhances the model’s accuracy and sensitivity but also makes it more practical and accessible for clinical use, especially in settings with limited resources. By improving both efficiency and accuracy, this research contributes significantly to the field of OSA diagnostics, offering a viable alternative to more cumbersome and less precise models.
Also, prescreening OSA through questionnaires is crucial, even though most existing questionnaires demonstrate high sensitivity but low specificity. This study introduces an integrated screening tool designed to predict severe OSA using responses from questionnaire items. To achieve this objective, we employed a combination of scaled principal component analysis (PCA) and DNN-PCA on data from 66 participants. The questionnaires utilized include demographic items and established forms such as Berlin, NAMES, No-Apnea, No SAS, and STOP-BANG. The proposed methodology is intended to facilitate the easy and quick completion of questionnaire items by users who suspect they have sleep apnea. By doing so, it aims to minimize the time users spend in sleep laboratories or hospitals, thereby streamlining the prescreening process and making it more efficient.
This study’s key contribution lies in its novel integration of DNN with PCA, which not only enhances predictive performance but also simplifies the prescreening process for severe OSA. By effectively reducing the number of input variables while maintaining high accuracy, this approach makes OSA screening more accessible and practical for both patients and clinicians. Additionally, this methodology sets the stage for future advancements in cost-effective and efficient OSA diagnostics, particularly in resource-limited settings.
The remainder of this manuscript is organized into four sections. The proposed method of OSA prediction is introduced in Section 2, followed by the Results and Discussion in Section 3 and Section 4, respectively. The Conclusions are provided in Section 5. The proposed framework consists of three processing steps: (1) unsupervised preprocessing to obtain principal components, (2) feature scaling based on standardization, and (3) classification for the prediction of sleep apnea. This study converts 38 binary or categorical variables into continuous variables using the PCA maximum-attribute filter, which helps minimize data discretization. The proposed method would be particularly useful in identifying severe OSA with acceptable sensitivity and specificity levels, which would also reduce examination costs in sleep testing.

2. Methods

2.1. Subject Data

This retrospective study was conducted after approval by the executive committee of the TORR Sleep Center, Corpus Christi, Texas. Consent waivers for patients were granted because a retrospective chart review was undertaken. For the pilot study, the patient charts from December 2019 were reviewed. As part of routine laboratory procedures, the patients completed screening forms such as the Epworth Sleepiness Scale, Berlin questionnaire (BQ), NoSAS, NAMES, No-Apnea, and STOP-BANG. A total of 66 participants consisting of 45 males and 21 females (average age = 52.4 years; standard deviation ± 14.6) were included. Among these, the severe OSA group comprised 27 participants, representing approximately 40.9% of the total. A total of 30 patients who attended for a second-night polysomnographic study or a split-night study were excluded. Besides the screening tools, baseline demographic data such as age, sex, race, body measurements, snoring, BMI, and blood pressure were reviewed.

2.2. Overall System Architecture

To construct the deep learning model, we extracted 38 variables from various questionnaires, including Berlin, NAMES, No-Apnea, No SAS, and STOP-BANG. This study involved 66 participants, of whom 49 were diagnosed with severe sleep apnea, as indicated by Respiratory Disturbance Index (RDI) values greater than 30, obtained from polysomnography (PSG) examinations in a sleep laboratory.
Figure 1 illustrates the overall analytical architecture, which comprises the following steps:
  • Data Division: The collected data were divided into training (66%) and testing (34%) sets. Additionally, 30% of the training set was designated for validation purposes. It is important to note that there was no overlap between the training and testing sets.
  • Data Transformation: PCA and scaling techniques were applied to convert categorical variables into continuous variables. These continuous variables were then used to develop models for testing.
  • Model Training: The deep neural network (DNN) models were trained using the scaled PCA variables.
  • Result Comparison: The model’s results were compared with ground truth data, which were labeled with RDI values obtained from the sleep laboratory.
This structured approach ensures that the data are effectively transformed and utilized, allowing for robust model training and accurate performance evaluation against the true labels provided by PSG examinations.
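As a concrete illustration of the data division described above, the sketch below reproduces the split with scikit-learn on stand-in data. The variable names (X, y, X_fit, X_val, etc.), the random seed, and the synthetic data are illustrative assumptions, not taken from the paper.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Stand-in data: 66 participants, 38 questionnaire-derived variables, binary severe-OSA label
rng = np.random.default_rng(0)
X = rng.integers(0, 2, size=(66, 38)).astype(float)
y = rng.integers(0, 2, size=66)

# 66% training / 34% testing, with no overlap between the two sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.34, random_state=0)

# 30% of the training set is set aside for validation
X_fit, X_val, y_fit, y_val = train_test_split(X_train, y_train, test_size=0.30, random_state=0)
```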

2.3. Preprocessing

Principal component analysis (PCA) is a simple yet powerful non-parametric method for extracting meaningful information from complex datasets. Typically, PCA efficiently defines new features and reduces dimensionality to uncover hidden or simplified structures suitable for classification algorithms during preprocessing [38]. However, given the unique nature of our dataset, which comprised categorical or binary variables instead of continuous ones, PCA was used to convert these variables rather than reduce or adjust dimensions. Raw collected data often contain noise, making them unsuitable for direct input into a deep neural network (DNN) classifier. Therefore, we utilized the PCA maximum-attribute filter to transform binary or categorical variables into continuous variables. This approach effectively minimized data discretization, resulting in a cleaner dataset for model training. Figure 2 illustrates the first and second principal components derived from the data: Figure 2a depicts a non-scaled PCA, while Figure 2b shows a PCA with a quantile transformer scaler. The combination of a DNN and PCA–quantile transformer scaler provided the best performance, demonstrating the effectiveness of this preprocessing strategy in handling our dataset’s unique characteristics.
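A minimal sketch of this preprocessing step is shown below, assuming scikit-learn's PCA and QuantileTransformer stand in for the PCA maximum-attribute filter and quantile transformer scaler. Keeping all 38 components, the uniform output distribution, and the synthetic survey data are assumptions made only for illustration.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.preprocessing import QuantileTransformer

# Stand-in binary/categorical survey responses (44 training, 22 testing subjects)
rng = np.random.default_rng(1)
X_train = rng.integers(0, 2, size=(44, 38)).astype(float)
X_test = rng.integers(0, 2, size=(22, 38)).astype(float)

# Keep all 38 components: PCA here re-expresses the discrete variables as
# continuous scores rather than reducing dimensionality.
pca = PCA(n_components=38)
scaler = QuantileTransformer(output_distribution="uniform", n_quantiles=44)

Z_train = scaler.fit_transform(pca.fit_transform(X_train))
Z_test = scaler.transform(pca.transform(X_test))
```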

2.4. Classification and Training Method

In this study, the primary focus was on developing and refining a deep learning model based on a simple feed-forward neural network. This neural network was trained using the standard backpropagation algorithm and consisted of two hidden layers. To implement the models, we utilized Keras [39] with a TensorFlow [40] backend, which provided a robust framework for building and training our neural networks. A variety of hyperparameters were meticulously adjusted to optimize the performance of each deep neural network (DNN). Key parameters included the number of hidden layers, the number of neurons within each layer, the choice of activation function, the optimization method, and the regularization technique. Through rigorous experimentation, we identified the best-performing DNN architecture with two hidden layers of 120 neurons each, after testing depths of 2 to 8 layers and 50 to 200 nodes using a trial-and-error approach (grid search method). The output layer consisted of a single neuron designed to produce a regression output, which was critical for our specific application.
The activation function chosen for each layer was the Rectified Linear Unit (ReLU) [41], known for its ability to introduce non-linearity into the model while maintaining computational efficiency. We trained the model using dropout [42] values ranging from 0.1 to 0.5 and, based on testing, selected an optimal dropout value of 0.3, which was then applied to each layer to effectively prevent overfitting. This regularization technique randomly deactivates a fraction of neurons during training, thereby promoting generalization. For optimization, we selected the Adam optimizer [43], which combines the advantages of two other popular optimization methods: AdaGrad and RMSProp. The learning rate for Adam was set at 0.001, a value that balances the speed of convergence with the stability of the training process. Additionally, L2 regularization was incorporated to further prevent overfitting by penalizing large weights, thus encouraging the model to maintain smaller and more generalized weights.
Batch normalization [44] was applied after the first two layers to stabilize and accelerate the training process. This technique normalizes the output of the previous activation layer by adjusting and scaling the activations. By doing so, batch normalization helps to mitigate internal covariate shifts, leading to faster convergence and improved performance. The loss function played a dual role in this context, not only guiding the optimization process by measuring the difference between predicted and actual values but also acting as a regularization term to denote model accuracy. This dual role ensured that the model remained focused on minimizing errors while maintaining generalizability.
The comprehensive architecture of the proposed DNN, encompassing all these elements, is depicted in Figure 3. This architecture represents a carefully balanced network designed to deliver optimal performance for the given task. Each component, from the choice of activation function to the implementation of regularization techniques, was selected based on extensive testing and validation, ensuring that the final model was both robust and effective. Through this meticulous design and training process, the resulting DNN was well equipped to handle the complexities of the dataset, providing accurate and reliable predictions. The integration of advanced techniques such as dropout, batch normalization, and the Adam optimizer, coupled with a well-structured architecture, underscored the efficacy of our approach in leveraging deep learning for complex classification tasks.
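The following Keras sketch is consistent with the architecture described above: two hidden layers of 120 ReLU units, batch normalization, dropout of 0.3, L2 weight penalties, a single output neuron, and the Adam optimizer with a learning rate of 0.001. The exact layer ordering, the L2 coefficient, and the reading of the single "regression output" as a sigmoid trained with binary cross-entropy are assumptions rather than details reported in the paper.

```python
from tensorflow import keras
from tensorflow.keras import layers, regularizers

def build_dnn(n_inputs=38, l2=1e-4):
    """Two hidden layers of 120 ReLU units, batch normalization, dropout 0.3,
    L2 weight penalties, and a single output neuron, trained with Adam (lr = 0.001)."""
    model = keras.Sequential([
        layers.Input(shape=(n_inputs,)),
        layers.Dense(120, activation="relu", kernel_regularizer=regularizers.l2(l2)),
        layers.BatchNormalization(),
        layers.Dropout(0.3),
        layers.Dense(120, activation="relu", kernel_regularizer=regularizers.l2(l2)),
        layers.BatchNormalization(),
        layers.Dropout(0.3),
        layers.Dense(1, activation="sigmoid"),  # single output scored against a decision threshold
    ])
    model.compile(optimizer=keras.optimizers.Adam(learning_rate=0.001),
                  loss="binary_crossentropy",
                  metrics=[keras.metrics.AUC(name="auc"), "accuracy"])
    return model
```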

3. Results

3.1. Performance Parametric

The detection performance of the model is primarily assessed by evaluating accuracy (Ac). However, due to the class imbalance in the dataset, particularly with more subjects diagnosed with severe apnea (n = 49) compared to non-severe apnea (n = 17), relying solely on accuracy can be misleading. This class imbalance can introduce bias into the performance results. For instance, if the model were to classify all subjects as having severe apnea, the accuracy would be calculated as 0.74 (49/(49 + 17)), which might appear acceptable. However, this would fail to reflect the model’s inability to correctly predict non-severe apnea cases. Therefore, additional performance metrics are necessary to provide a more comprehensive evaluation.
To address this, we included three additional metrics: sensitivity (Sn), specificity (Sp), and precision (Pr). Sensitivity (Sn) measures the probability of correctly detecting subjects with severe apnea, providing insight into the model’s ability to identify true positive cases. Specificity (Sp) reflects the probability of correctly identifying subjects with non-severe apnea, highlighting the model’s effectiveness in recognizing true negative cases. Precision (Pr) indicates the likelihood that subjects predicted to have severe apnea are correctly classified, thus focusing on the accuracy of positive predictions. The performance of the model is evaluated using the following parameters: true positive (TP), true negative (TN), false positive (FP), and false negative (FN). These parameters are defined as follows: TP represents the count of correctly predicted severe apnea cases, TN represents the count of correctly predicted non-severe apnea cases, FP indicates the count of incorrectly predicted severe apnea cases, and FN indicates the count of incorrectly predicted non-severe apnea cases.
The formulas used to calculate Sn, Sp, Pr, and Ac are as follows:
Sn = TP/(TP + FN)
Sp = TN/(TN + FP)
Pr = TP/(TP + FP)
Ac = (TP + TN)/(TP + FN + FP + TN)
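The formulas above translate directly into a small helper function; the counts in the usage line are toy values for illustration only, not the study's results.

```python
def confusion_metrics(tp, tn, fp, fn):
    """Sensitivity, specificity, precision, and accuracy from raw confusion counts."""
    sn = tp / (tp + fn)                    # sensitivity
    sp = tn / (tn + fp)                    # specificity
    pr = tp / (tp + fp)                    # precision (positive predictivity)
    ac = (tp + tn) / (tp + fn + fp + tn)   # accuracy
    return sn, sp, pr, ac

# Toy counts for illustration only
print(confusion_metrics(tp=18, tn=3, fp=1, fn=1))  # (0.947..., 0.75, 0.947..., 0.913...)
```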
By utilizing these metrics, we gain a clearer understanding of the model’s performance across different aspects of prediction. Sensitivity ensures that the model is proficient at detecting severe apnea cases, specificity ensures that non-severe apnea cases are not overlooked, and precision ensures that the predictions of severe apnea are reliable. This multifaceted approach to performance evaluation provides a more robust and accurate assessment of the model’s capabilities, especially in the presence of class imbalance.

3.2. Performance Results

The proposed classification algorithm yielded a reliable computational result, as shown in Table 1, with five performance parameters: the area under the receiver operating characteristic (ROC) curve, Sn, Sp, Pr, and Ac. The optimal result, based on a threshold of 0.875, shows an area under the ROC curve (AUC) of 94.7%, an Sn of 94.74%, an Sp of 75%, a Pr of 94.74%, and an Ac of 91.30%. The threshold determined the empirical balance between sensitivity and specificity. The AUC, as well as the common performance metrics (i.e., Sn, Sp, Pr, and Ac), is known to reflect the performance of the algorithm [45]. The ROC curve and the AUC of the DNN model were compared with those of the other classifiers, including random forest (RF) [46], AdaBoost classifier (ABC) [47], K-nearest neighbors classifier (KNC) [48], support vector machine classifier (SVMC) [49], and decision tree classifier (DTC) [50], as shown in Figure 4 and Table 2. The proposed DNN/scaled PCA algorithm was optimal, followed by the SVMC; the top two results are highlighted in bold in Table 2. To analyze the results in more detail, we plotted the confusion matrices shown in Table 3 and Table 4, indicating the best result and the three-fold cross-validation results from scaled PCA processing. As can be seen, the proposed model has a good classification ability for severe sleep apnea.
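A hedged sketch of this baseline comparison is shown below: each off-the-shelf classifier is fit on the transformed features and scored by AUROC on the held-out test set. The hyperparameters are scikit-learn defaults (not reported in the paper), and the arrays here are synthetic stand-ins for the scaled PCA outputs.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier, AdaBoostClassifier
from sklearn.neighbors import KNeighborsClassifier
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier
from sklearn.metrics import roc_auc_score

# Stand-in transformed features/labels (in the study these are the scaled PCA outputs)
rng = np.random.default_rng(2)
Z_train, y_train = rng.normal(size=(44, 38)), rng.integers(0, 2, 44)
Z_test, y_test = rng.normal(size=(22, 38)), rng.integers(0, 2, 22)

baselines = {
    "RF": RandomForestClassifier(random_state=0),
    "ABC": AdaBoostClassifier(random_state=0),
    "KNC": KNeighborsClassifier(),
    "SVMC": SVC(probability=True, random_state=0),
    "DTC": DecisionTreeClassifier(random_state=0),
}

for name, clf in baselines.items():
    clf.fit(Z_train, y_train)
    scores = clf.predict_proba(Z_test)[:, 1]
    print(f"{name}: AUROC = {roc_auc_score(y_test, scores):.3f}")
```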

4. Discussion

To enhance the performance of the proposed PCA-DNN algorithm, extensive tuning of hyperparameters was undertaken. The loss of the algorithm was measured using these hyperparameters on the training data, with the evaluation of results performed on the validation data. Hyperparameters were classified into four categories: real-valued (e.g., learning rate), integer-valued (e.g., number of layers and neurons), binary (e.g., use of early stopping), and categorical (e.g., choice of optimizer) [51]. Key hyperparameters optimized in this study included two hidden layers, each with 120 neurons, 500 training epochs, a patience value of 18, and a batch size of 2.
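The training configuration reported above can be sketched as follows, using Keras early stopping for the patience value. The monitored quantity (validation loss), the build_dnn helper, and the Z_fit/Z_val arrays refer to the earlier sketches in Section 2 and are assumptions, not the authors' released code.

```python
from tensorflow import keras

model = build_dnn(n_inputs=38)

early_stop = keras.callbacks.EarlyStopping(monitor="val_loss", patience=18,
                                           restore_best_weights=True)

history = model.fit(Z_fit, y_fit,
                    validation_data=(Z_val, y_val),
                    epochs=500,
                    batch_size=2,
                    callbacks=[early_stop],
                    verbose=0)
```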
The DNN integrated with a scaled PCA classifier was employed to predict severe apnea. However, the correlation coefficients between various patient parameters and severe apnea revealed unclear relationships. Table 5 lists coefficients with absolute values exceeding 0.36. The highest correlation coefficient (+0.68) is highlighted in bold, while informative parameters with coefficients over an absolute value of 0.52 are italicized and shown in red.
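A correlation screen of this kind can be reproduced with a few lines of pandas, as sketched below; the column names and values are purely illustrative, not the paper's variables or data.

```python
import pandas as pd

# Toy frame: a few questionnaire variables plus the binary severe-OSA label
df = pd.DataFrame({
    "daytime_sleepiness": [2, 0, 3, 1, 3, 0, 2, 1],
    "neck_circumference": [41, 38, 44, 37, 46, 39, 43, 38],
    "snoring":            [1, 0, 1, 0, 1, 0, 1, 1],
    "severe_osa":         [1, 0, 1, 0, 1, 0, 1, 0],
})

# Pearson correlation of each variable with the label, keeping |r| > 0.36
corr = df.corr()["severe_osa"].drop("severe_osa")
print(corr[corr.abs() > 0.36].sort_values(ascending=False))
```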
Our methodology utilizes indirect or limited data, such as demographic information and responses to existing questionnaires, to predict severe sleep apnea. This approach aims to reduce future medical costs and streamline the diagnostic process. Despite its advantages, this study has several limitations, including the lack of input data separation and the absence of longitudinal data. The reliance on survey data presents challenges due to its discrete format. Although scaled PCA was employed to enhance data resolution, incorporating additional input variables would be beneficial. Furthermore, the applicability of our results is confined to subjects with severe sleep apnea. This specific focus may affect the overall accuracy of the predictive model when applied to a broader population. Therefore, while our method shows promise in identifying severe sleep apnea, further research is necessary to improve its generalizability and accuracy for the wider population.
The hyperparameter optimization process involved testing various configurations to determine the most effective combination. Real-valued hyperparameters, such as the learning rate, were adjusted to find an optimal balance between convergence speed and stability. Integer-valued parameters, including the number of layers and neurons, were fine-tuned to ensure the network’s capacity was sufficient to capture complex patterns without overfitting. Binary parameters, like early stopping, were evaluated to prevent unnecessary training and reduce overfitting. Finally, categorical parameters, such as the choice of optimizer, were selected based on their performance in training the model effectively.
In predicting severe sleep apnea, the scaled PCA classifier played a crucial role in transforming the input data into a format that the DNN could process efficiently. This transformation helped in managing the data’s dimensionality and enhancing the model’s predictive capabilities. Despite the unclear relationships indicated by the correlation coefficients, the identification of key parameters with significant correlations provides valuable insights for further model refinement. The use of demographic information and questionnaire responses underscores the potential of non-invasive and cost-effective methods in medical diagnostics. By leveraging such data, our approach can facilitate early detection and intervention for severe sleep apnea, thereby improving patient outcomes and reducing healthcare costs.
However, this study’s reliance on survey data and the absence of longitudinal data highlight areas for improvement. Future research should focus on incorporating a broader range of input variables and longitudinal datasets to enhance the model’s robustness and predictive accuracy. Alternatively, our goal could also be to reduce the number of input variables, thereby increasing the efficiency of data collection in real-world settings while achieving the same or even better model performance. In doing so, it may also be possible to involve a larger number of participants in the study. Additionally, expanding the model’s applicability beyond severe sleep apnea to include a general population would increase its utility in clinical settings. Thus, while our PCA-DNN algorithm demonstrates significant potential in predicting severe sleep apnea, ongoing efforts to address its limitations and improve its generalizability are essential. Further research and development will enable more accurate and comprehensive diagnostic tools, ultimately benefiting a larger patient population.

5. Conclusions

In this paper, we propose an automated methodology to predict severe sleep apnea using a deep neural network model with scaled PCA based on questionnaire items, including demographic information and existing forms such as Berlin, NAMES, No-Apnea, NoSAS, and STOP-BANG. The most significant finding of this study is the early detection of patients who have severe sleep apnea and need additional checkups and appropriate treatment. The proposed method can minimize the effort involved in designing, computing, and selecting appropriate features for survey-based diagnoses. Because the dataset used has a limited number of categories, we applied scaled PCA to convert the categorical/binary input variables into continuous features and trained the DNN on these transformed inputs.
Consequently, this study has demonstrated that integrating a DNN with PCA provides a robust method for predicting severe OSA. By optimizing and reducing the number of input variables, our model achieved high accuracy, sensitivity, and specificity, outperforming existing screening tools and models. The key contribution of this research lies in its ability to streamline the prescreening process, making it more efficient and practical for clinical application, particularly in resource-limited settings. This work not only advances the field of OSA diagnostics but also sets the foundation for future research aimed at further enhancing predictive performance and broadening the model’s applicability to diverse populations.
In future studies, we plan to modify and apply the proposed method to other survey-based datasets with larger populations, such as those for chronic obstructive pulmonary disease and hypopnea. In addition, we plan to employ the proposed method for patients with mild/moderate sleep apnea within large populations. Moreover, auto-fine-tuning methodologies to reduce the model training time and improve its overall performance will be developed and validated for other applications.

Author Contributions

Conceptualization, J.K.; Writing—original draft, J.K.; Writing—review & editing, J.P. (Jaehyun Park), J.P. (Jangwoon Park) and S.S.; Funding acquisition, J.P. (Jaehyun Park). All authors have read and agreed to the published version of the manuscript.

Funding

This work was supported by the Incheon National University Research Grant in 2023.

Institutional Review Board Statement

The Executive Committee of the Torr Sleep Center approved this study (PT #02-2020).

Informed Consent Statement

All subjects gave their informed consent for inclusion before they participated in the study.

Data Availability Statement

The original contributions presented in this study are included in the article; further inquiries can be directed to the corresponding author.

Conflicts of Interest

The authors declare no conflicts of interest.

References

  1. Somers, V.K.; White, D.P.; Amin, R.; Abraham, W.T.; Costa, F.; Culebras, A.; Daniels, S.; Floras, J.S.; Hunt, C.E.; Olson, L.J.; et al. Sleep Apnea and Cardiovascular Disease: An American Heart Association/American College of Cardiology Foundation Scientific Statement from the American Heart Association Council for High Blood Pressure Research Professional Education Committee, Council on Clinical Cardiology, Stroke Council, and Council on Cardiovascular Nursing. In Collaboration with the National Heart, Lung, and Blood Institute National Center on Sleep Disorders Research (National Institutes of Health). J. Am. Coll. Cardiol. 2008, 52, 686–717. [Google Scholar]
  2. Botros, N.; Concato, J.; Mohsenin, V.; Selim, B.; Doctor, K.; Yaggi, H.K. Obstructive sleep apnea as a risk factor for type 2 diabetes. Am. J. Med. 2009, 122, 1122–1127. [Google Scholar] [CrossRef] [PubMed]
  3. Mandal, S.; Kent, B.D. Obstructive Sleep Apnoea and Coronary Artery Disease. J. Thorac. Dis. 2018, 10, S4212. [Google Scholar] [CrossRef]
  4. Sharma, S.; Culebras, A. Sleep apnoea, and stroke. Stroke Vasc. Neurol. 2016, 1, 185–191. [Google Scholar] [CrossRef] [PubMed]
  5. Quan, S.; Gillin, J.C.; Littner, M.; Shepard, J. Sleep-related breathing disorders in adults: Recommendations for syndrome definition and measurement techniques in clinical research. Editorials. Sleep 1999, 22, 662–689. [Google Scholar] [CrossRef] [PubMed]
  6. American Academy of Sleep Medicine. The AASM Manual for the Scoring of Sleep and Associated Events: Rules, Terminology, and Technical Specifications; American Academy of Sleep Medicine: Westchester, IL, USA, 2007. [Google Scholar]
  7. Dempsey, J.A.; Veasey, S.C.; Morgan, B.J.; O’Donnell, C.P. Pathophysiology of sleep apnea. Physiol. Rev. 2010, 90, 47–112. [Google Scholar] [CrossRef]
  8. Zhang, J.; Zhang, Q.; Wang, Y.; Qiu, C. A Real-Time Auto-Adjustable Smart Pillow System for Sleep Apnea Detection and Treatment. In Proceedings of the 12th International Conference on Information Processing in Sensor Networks, Philadelphia, PA, USA, 8–11 April 2013; IPSN ’13; Association for Computing Machinery: New York, NY, USA, 2013; pp. 179–190. [Google Scholar]
  9. de Chazal, P.; Penzel, T.; Heneghan, C. Automated detection of obstructive sleep apnoea at different time scales using the electrocardiogram. Physiol. Meas. 2004, 25, 967. [Google Scholar] [CrossRef]
  10. Kim, J.; ElMoaqet, H.; Tilbury, D.M.; Ramachandran, S.K.; Penzel, T. Time-domain characterization for sleep apnea in oronasal airflow signal: A dynamic threshold classification approach. Physiol. Meas. 2019, 40, 054007. [Google Scholar] [CrossRef]
  11. Patil, S.P.; Schneider, H.; Schwartz, A.R.; Smith, P.L. Adult obstructive sleep apnea: Pathophysiology and diagnosis. Chest 2007, 132, 325–337. [Google Scholar] [CrossRef]
  12. Berry, R.; Brooks, R.; Gamaldo, C.; Harding, S.M.; Lloyd, R.M.; Quan, S.F.; Troester, M.T.; Vaughn, B.V. The AASM Manual for the Scoring of Sleep and Associated Events: Rules, Terminology, and Technical Specifications; Version 2.6.0; American Academy of Sleep Medicine: Darien, IL, USA, 2020. [Google Scholar]
  13. ElMoaqet, H.; Kim, J.; Tilbury, D.; Ramachandran, S.K.; Ryalat, M.; Chu, C.H. Gaussian mixture models for detecting sleep apnea events using single oronasal airflow record. Appl. Sci. 2020, 10, 7889. [Google Scholar] [CrossRef]
  14. Berry, R.B.; Budhiraja, R.; Gottlieb, D.J.; Gozal, D.; Iber, C.; Kapur, V.K.; Marcus, C.L.; Mehra, R.; Parthasarathy, S.; Quan, S.F.; et al. Rules for scoring respiratory events in sleep: Update of the 2007 AASM Manual for the Scoring of Sleep and Associated Events. J. Clin. Sleep Med. 2012, 8, 597–619. [Google Scholar] [CrossRef] [PubMed]
  15. Agarwal, R.; Gotman, J. Computer-assisted sleep staging. IEEE Trans. Biomed. Eng. 2001, 48, 1412–1423. [Google Scholar] [CrossRef] [PubMed]
  16. Flemons, W.W.; Littner, M.R.; Rowley, J.A.; Gay, P.; Anderson, W.M.; Hudgel, D.W.; McEvoy, R.D.; Loube, D.I. Home diagnosis of sleep apnea: A systematic review of the literature: An evidence review cosponsored by the American Academy of Sleep Medicine, the American College of Chest Physicians, and the American Thoracic Society. Chest J. 2003, 124, 1543–1579. [Google Scholar] [CrossRef]
  17. de Almeida, F.R.; Ayas, N.T.; Otsuka, R.; Ueda, H.; Hamilton, P.; Ryan, F.C.; Lowe, A.A. Nasal pressure recordings to detect obstructive sleep apnea. Sleep Breath. 2006, 10, 62–69. [Google Scholar] [CrossRef] [PubMed]
  18. Khandoker, A.H.; Gubbi, J.; Palaniswami, M. Automated scoring of obstructive sleep apnea and hypopnea events using short-term electrocardiogram recordings. IEEE Trans. Inf. Technol. Biomed. 2009, 13, 1057–1067. [Google Scholar] [CrossRef] [PubMed]
  19. Whitney, C.W.; Gottlieb, D.J.; Redline, S.; Norman, R.G.; Dodge, R.R.; Shahar, E.; Surovec, S.; Nieto, F.J. Reliability of scoring respiratory disturbance indices and sleep staging. Sleep 1998, 21, 749–757. [Google Scholar] [CrossRef]
  20. Netzer, N.C.; Stoohs, R.A.; Netzer, C.M.; Clark, K.; Strohl, K.P. Using the Berlin Questionnaire to identify patients at risk for the sleep apnea syndrome. Ann. Intern. Med. 1999, 131, 485–491. [Google Scholar] [CrossRef]
  21. Marti-Soler, H.; Hirotsu, C.; Marques-Vidal, P.; Vollenweider, P.; Waeber, G.; Preisig, M.; Tafti, M.; Tufik, S.B.; Bittencourt, L.; Tufik, S. The NoSAS score for screening of sleep-disordered breathing: A derivation and validation study. Lancet Respir. Med. 2016, 4, 742–748. [Google Scholar] [CrossRef]
  22. Subramanian, S.; Hesselbacher, S.E.; Aguilar, R.; Surani, S.R. The NAMES assessment: A novel combined-modality screening tool for obstructive sleep apnea. Sleep Breath. 2011, 15, 819–826. [Google Scholar] [CrossRef]
  23. Duarte, R.L.; Rabahi, M.F.; Magalhães-da-Silveira, F.J.; de Oliveira-e-Sá, T.S.; Mello, F.C.; Gozal, D. Simplifying the screening of obstructive sleep apnea with a 2-item model, No-Apnea: A cross-sectional study. J. Clin. Sleep Med. 2018, 14, 1097–1107. [Google Scholar] [CrossRef]
  24. Chung, F.; Yegneswaran, B.; Liao, P.; Chung, S.A.; Vairavanathan, S.; Islam, S.; Khajehdehi, A.; Shapiro, C.M. STOP questionnaire: A tool to screen patients for obstructive sleep apnea. Anesthesiology 2008, 108, 812–821. [Google Scholar] [CrossRef] [PubMed]
  25. Maislin, G.; Pack, A.I.; Kribbs, N.B.; Smith, P.L.; Schwartz, A.R.; Kline, L.R.; Schwab, R.J.; Dinges, D.F. A survey screen for prediction of apnea. Sleep 1995, 18, 158–166. [Google Scholar] [CrossRef]
  26. Mashaqi, S.; Staebler, D.; Mehra, R. Combined nocturnal pulse oximetry and questionnaire-based obstructive sleep apnea screening–A cohort study. Sleep Med. 2020, 72, 157–163. [Google Scholar] [CrossRef] [PubMed]
  27. Delesie, M.; Knaepen, L.; Hendrickx, B.; Huygen, L.; Verbraecken, J.; Weytjens, K.; Dendale, P.; Heidbuchel, H.; Desteghe, L. The value of screening questionnaires/scoring scales for obstructive sleep apnoea in patients with atrial fibrillation. Arch. Cardiovasc. Dis. 2021, 114, 737–747. [Google Scholar] [CrossRef]
  28. Butt, A.M.; Syed, U.; Arshad, A. Predictive value of clinical and questionnaire based screening tools of obstructive sleep apnea in patients with type 2 diabetes mellitus. Cureus 2021, 13, e18009. [Google Scholar] [CrossRef] [PubMed]
  29. Waseem, R.; Chan, M.T.; Wang, C.Y.; Seet, E.; Tam, S.; Loo, S.Y.; Lam, C.K.; Hui, D.S.; Chung, F. Diagnostic performance of the STOP-Bang questionnaire as a screening tool for obstructive sleep apnea in different ethnic groups. J. Clin. Sleep Med. 2021, 17, 521–532. [Google Scholar] [CrossRef] [PubMed]
  30. Zheng, Z.; Sun, X.; Chen, R.; Lei, W.; Peng, M.; Li, X.; Zhang, N.; Cheng, J. Comparison of six assessment tools to screen for obstructive sleep apnea in patients with hypertension. Clin. Cardiol. 2021, 44, 1526–1534. [Google Scholar] [CrossRef]
  31. Hwang, M.; Nagappa, M.; Guluzade, N.; Saripella, A.; Englesakis, M.; Chung, F. Validation of the STOP-Bang questionnaire as a preoperative screening tool for obstructive sleep apnea: A systematic review and meta-analysis. BMC Anesthesiol. 2022, 22, 366. [Google Scholar] [CrossRef]
  32. Bernhardt, L.; Brady, E.M.; Freeman, S.C.; Polmann, H.; Réus, J.C.; Flores-Mir, C.; De Luca Canto, G.; Robertson, N.; Squire, I.B. Diagnostic accuracy of screening questionnaires for obstructive sleep apnoea in adults in different clinical cohorts: A systematic review and meta-analysis. Sleep Breath. 2022, 26, 1053–1078. [Google Scholar] [CrossRef]
  33. Bauters, F.A.; Loof, S.; Hertegonne, K.B.; Chirinos, J.A.; De Buyzere, M.L.; Rietzschel, E.R. Sex-specific sleep apnea screening questionnaires: Closing the performance gap in women. Sleep Med. 2020, 67, 91–98. [Google Scholar] [CrossRef]
  34. Kirby, S.D.; Eng, P.; Danter, W.; George, C.F.; Francovic, T.; Ruby, R.R.; Ferguson, K.A. Neural network prediction of obstructive sleep apnea from clinical criteria. Chest 1999, 116, 409–415. [Google Scholar] [CrossRef]
  35. Cen, L.; Yu, Z.L.; Kluge, T.; Ser, W. Automatic system for obstructive sleep apnea events detection using convolutional neural network. Ann. Int. Conf. IEEE Eng. Med. Biol. Soc. 2018, 2018, 3975–3978. [Google Scholar]
  36. Karamanli, H.; Yalcinoz, T.; Yalcinoz, M.A.; Yalcinoz, T. A prediction model based on artificial neural networks for the diagnosis of obstructive sleep apnea. Sleep Breath. 2016, 20, 509–514. [Google Scholar] [CrossRef] [PubMed]
  37. Perero-Codosero, J.M.; Espinoza-Cuadros, F.; Antón-Martín, J.; Barbero-Álvarez, M.A.; Hernández-Gómez, L.A. Modeling obstructive sleep apnea voices using deep neural network embeddings and domain-adversarial training. IEEE J. Sel. Top. Signal Process. 2019, 14, 240–250. [Google Scholar] [CrossRef]
  38. Shlens, J. A tutorial on principal component analysis. arXiv 2014, arXiv:1404.1100. [Google Scholar]
  39. Joshi, D.; Anwarul, S.; Mishra, V. Deep learning using Keras. In Machine Learning and Deep Learning in Real-Time Applications; IGI Global: Pennsylvania, PA, USA, 2020; pp. 33–60. [Google Scholar]
  40. Abadi, M.; Barham, P.; Chen, J.; Chen, Z.; Davis, A.; Dean, J.; Devin, M.; Ghemawat, S.; Irving, G.; Isard, M. Tensorflow: A system for large-scale machine learning. In Proceedings of the 12th USENIX Symposium on Operating Systems Design and Implementation, Savannah, GA, USA, 2–4 November 2016; pp. 265–283. [Google Scholar]
  41. Howard, A.G.; Zhu, M.; Chen, B.; Kalenichenko, D.; Wang, W.; Weyand, T.; Andreetto, M.; Adam, H. Mobilenets: Efficient convolutional neural networks for mobile vision applications. arXiv 2017, arXiv:1704.04861. [Google Scholar]
  42. Srivastava, N.; Hinton, G.; Krizhevsky, A.; Sutskever, I.; Salakhutdinov, R. Dropout: A simple way to prevent neural networks from overfitting. J. Mach. Learn. Res. 2014, 15, 1929–1958. [Google Scholar]
  43. Kingma, D.P.; Ba, J. Adam: A method for stochastic optimization. arXiv 2014, arXiv:1412.6980. [Google Scholar]
  44. Ioffe, S.; Szegedy, C. Batch normalization: Accelerating deep network training by reducing internal covariate shift. arXiv 2015, arXiv:1502.03167. [Google Scholar]
  45. Kim, J.; Chu, C.H. ETD: An extended time delay algorithm for ventricular fibrillation detection. In Proceedings of the 36th Annual International Conference of the IEEE Engineering in Medicine and Biology Society, Chicago, IL, USA, 26–30 August 2014; IEEE: Piscataway, NJ, USA, 2014; pp. 6479–6482. [Google Scholar]
  46. Liaw, A.; Wiener, M. Classification and regression by randomForest. R News 2002, 2, 18–22. [Google Scholar]
  47. Freund, Y.; Schapire, R.E. A decision-theoretic generalization of on-line learning and an application to boosting. In Proceedings of the Second European Conference on Computational Learning Theory, Barcelona, Spain, 13–15 March 1995; pp. 23–37. [Google Scholar]
  48. Viswanath, P.; Sarma, T.H. An improvement to k-nearest neighbor classifier. In Proceedings of the 2011 IEEE Recent Advances in Intelligent Computational Systems, Trivandrum, India, 22–24 September 2011; pp. 227–231. [Google Scholar]
  49. Kotsiantis, S.B.; Zaharakis, I.; Pintelas, P. Supervised machine learning: A review of classification techniques. Emerg. Artif. Intell. Appl. Comput. Eng. 2007, 160, 3–24. [Google Scholar]
  50. Song, Y.Y.; Lu, Y. Decision tree methods: Applications for classification and prediction. Shanghai Arch. Psychiatry 2015, 27, 130–135. [Google Scholar] [PubMed]
  51. Hutter, F.; Kotthoff, L.; Vanschoren, J. Automated Machine Learning: Methods, Systems, Challenges; Springer Nature: New York, NY, USA, 2019; p. 5. [Google Scholar]
Figure 1. The architecture of the deep neural network (DNN)/scaled principal component analysis (PCA) approach.
Figure 2. Two-dimensional plots of the first and second principal components (class 0 indicates non-severe apnea patients, and class 1 indicates severe sleep apnea patients).
Figure 3. The architecture of the proposed DNN.
Figure 4. Comparison of ROC curves (the blue dotted line is the random prediction).
Table 1. Performance result of DNN with scaled PCA.
AUC      Sensitivity   Specificity   Positive Predictivity   Accuracy
0.947    0.9474        0.7500        0.9474                  0.9130
Table 2. Comparison of AUC results (bold indicates the top two results).
Models   AUC      Models   AUC
DNN      0.947    KNC      0.513
RF       0.618    SVC      0.776
ABC      0.526    DTC      0.525
Table 3. Confusion matrix.
                                 Condition Positive   Condition Negative
Predicted Condition Positive     19 (TP)              1 (FP)
Predicted Condition Negative     0 (FN)               3 (TN)
Table 4. Confusion matrix (3-fold cross-validation).
                                 Condition Positive   Condition Negative
Predicted Condition Positive     57 (TP)              2 (FP)
Predicted Condition Negative     1 (FN)               11 (TN)
Table 5. Correlation coefficients of variables (over 0.35) among 38 variables (bold indicates the best Corr. Coeff.).
Variables                    Corr. Coeff.    Variables     Corr. Coeff.
Daytime sleepiness           0.68            Neck          0.38
How loud the snoring is      −0.47           M.Friedman    0.43
How often you snore/week     −0.39           Snoring       0.52