### **1. Introduction**

Automatic sleep stage classification is an active research topic because of its importance for the study of sleep-related disorders. There are currently two classification criteria for sleep stages. According to the Rechtschaffen and Kales (R&K) recommendations, sleep is divided into six stages: the awake stage (Awa), the rapid eye movement stage (REM), sleep stage 1 (S1), sleep stage 2 (S2), sleep stage 3 (S3) and sleep stage 4 (S4) [1]. The other standard was provided by the American Academy of Sleep Medicine (AASM) and defines five sleep stages: Awa, N1 (S1), N2 (S2), N3 (the merging of stages S3 and S4) and REM [2]. Usually, each sleep stage is marked manually by professionals, which is labor-intensive and prone to marking errors. It is therefore important to study methods for automatic sleep stage classification.

According to the characteristics of the adopted features, commonly used automatic detection methods can be divided into two categories. The first is based on statistical features (such as spectral energy) extracted from the one-dimensional EEG signal. The second is based on implicit features learned by training deep-learning classifiers. Hassan et al. computed various spectral features by applying the Tunable-Q factor wavelet transform (TQWT) to sleep-EEG signal segments [3]. With a random forest classifier, they achieved accuracies of 90.38%, 91.50%, 92.11%, 94.80% and 97.50% for 6-stage to 2-stage classification of sleep states on the Sleep-EDF database. Diykh et al. adopted different structural and spectral attributes extracted from weighted undirected networks to automatically classify the sleep stages [4]. Kang et al. presented a statistical framework to estimate whole-night sleep states in patients with obstructive sleep apnea (OSA), the most common sleep disorder [5]. In this framework, they extracted 11 spectral features from 60903 epochs to estimate per-night sleep stages with a 5-state hidden Markov model. Abdulla et al. used the graph modularity of EEG segments as features for an ensemble classifier, which achieved an accuracy of 93.1% on 20265 epochs from the Sleep-EDF database [6].

In [7], Ghimatgar et al. constructed a feature pool through relevance and redundancy analysis of the sleep EEG epochs. Using a random forest classifier and a hidden Markov model, this method was evaluated on three public sleep EEG databases scored according to the R&K and AASM guidelines. They achieved overall accuracies in the ranges of 79.4–87.4% and 77.6–80.4% for six-stage (R&K) and five-stage (AASM) classification, respectively. Taran et al. proposed an optimized flexible analytic wavelet transform (OFAWT) to decompose EEG signals into band-limited bases or sub-bands (SBs) [8]. The experiments yielded classification accuracies of 96.03%, 96.39%, 96.48%, 97.56% and 99.36% for six-stage to two-stage classification, respectively. Sharma et al. computed discriminatory features, namely fuzzy entropy and log energy, from the wavelet decomposition coefficients [9]. This approach yielded accuracies of 91.5% and 88.5% for the six-class classification task using small and large datasets, respectively. Hassan et al. extracted various statistical-moment-based features from EEG signals decomposed by Empirical Mode Decomposition (EMD) and achieved good performance on a small database [10]. They also decomposed EEG signal segments using Ensemble Empirical Mode Decomposition (EEMD) to extract various statistical-moment-based features and achieved 88.07%, 83.49%, 92.66%, 94.23% and 98.15% for 6-state to 2-state classification of sleep stages on the Sleep-EDF database [11]. Sharma et al. adopted Poincaré plot descriptors and statistical measures calculated by the discrete energy separation algorithm (DESA) as features [12]. The classification accuracies for two to six classes on 15136 epochs from the Sleep-EDF database were 98.02%, 94.66%, 92.29%, 91.13% and 90.02%, respectively.

Besides conventional feature extraction methods, some researchers use convolutional neural networks (CNNs) to classify sleep stages from time–frequency images converted from one-dimensional EEG signals. Zhang et al. converted EEG data to a time–frequency representation via the Hilbert–Huang transform and employed an orthogonal convolutional neural network (OCNN) as the classifier [13]. They achieved total classification accuracies of 88.4% and 87.6% on two public datasets, respectively. Similarly, Xu et al. employed multiple CNNs on multi-channel EEG signals to classify the sleep stages [14]. Mousavi [15] fed the raw EEG signals directly to a deep CNN with nine layers followed by two fully connected layers, without feature extraction or selection. This method achieved accuracies of 98.10%, 96.86%, 93.11%, 92.95% and 93.55% for two-class to six-class classification. Long short-term memory (LSTM) is an artificial recurrent neural network (RNN) architecture used in deep learning. It can process not only single data points (such as images) but also entire sequences of data (such as speech or EEG signals). Korkalainen et al. used a combined convolutional and LSTM neural network on a public database and achieved a sleep staging accuracy of 83.7% with a single frontal EEG channel [16]. Michielli et al. proposed a novel cascaded RNN architecture based on LSTM for automated scoring of sleep stages on single-channel EEG signals [17]. The network performed four-class and two-class classification with classification rates of 90.8% and 83.6%, respectively.

Most existing studies adopted only a few epochs or a single database when evaluating their methods, and some did not use k-fold cross-validation, which can cause large fluctuations in the experimental results. Therefore, although published research has achieved positive results in automatic sleep stage classification, the existing methods still need further validation and improvement. In this paper, we propose novel IMBEFs, extracted from LE and DSSM, for automatically detecting the sleep stages with a high degree of accuracy. LE and DSSM are estimated from two sets of coefficients of LSBs and HSBs. The two sets of coefficients come from the WPD of the sleep EEG epoch, computed separately with two wavelet bases. After comparison with various classifiers, Bagged Trees was selected as the most suitable classifier for identifying the sleep stages with this method. In addition, experiments are conducted on three public sleep databases, and the results are compared with state-of-the-art published work in order to fully evaluate and validate the performance of the proposed method.

The paper is organized as follows. Section 2 describes the experimental material and the methodology of the proposed method in detail. Section 3 presents the experimental results. Section 4 discusses the results and findings of this paper. Section 5 draws the conclusions.

### **2. Materials and Methods**

### *2.1. Sleep State Classes*

Under the R&K standard, sleep stages can be grouped into two to six classes, whereas under the AASM standard they can be grouped into two to five classes. The difference arises because the N3 stage of AASM merges the S3 and S4 stages of the R&K standard. The classes considered in this work are described in Tables 1 and 2.
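The relationship between the two standards can be illustrated with a small label mapping. The following is a minimal sketch, assuming per-epoch labels are stored as the stage abbreviations used in this paper; the names `RK_TO_AASM` and `rk_to_aasm` are hypothetical and are not part of the proposed method.

```python
# Hypothetical helper: map per-epoch R&K labels to AASM labels.
# Stage abbreviations follow the text (Awa, S1-S4, REM for R&K; Awa, N1-N3, REM for AASM).
RK_TO_AASM = {
    "Awa": "Awa",
    "S1": "N1",
    "S2": "N2",
    "S3": "N3",  # S3 and S4 are merged into N3 under AASM
    "S4": "N3",
    "REM": "REM",
}


def rk_to_aasm(labels):
    """Convert a sequence of R&K stage labels to the corresponding AASM labels."""
    return [RK_TO_AASM[label] for label in labels]


# One example epoch label per R&K stage.
hypnogram_rk = ["Awa", "S1", "S2", "S3", "S4", "REM"]
hypnogram_aasm = rk_to_aasm(hypnogram_rk)
print(hypnogram_aasm)
```

Because S3 and S4 both map to N3, the six R&K stages collapse to five AASM stages, which is why the largest class count differs between the two standards.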

**Table 1.** The class description considered in this work under the Rechtschaffen and Kales (R&K) standard.


**Table 2.** The class description considered in this work under the American Academy of Sleep Medicine (AASM) standard.


### *2.2. Datasets*

### 2.2.1. Sleep EDF (S-EDF) Database

The S-EDF database has 197 whole-night polysomnography (PSG) sleep recordings, containing EEG, EOG, chin EMG and event markers [18,19]. All the hypnograms (sleep patterns) were manually scored by well-trained technicians according to the R&K criteria. In this study, 34 EEG recordings from 26 subjects aged 25 to 96 years were randomly selected.

### 2.2.2. DREAMS Subjects (DRMS) Database

The DRMS database consists of 20 whole-night PSG recordings from healthy subjects, annotated in sleep stages according to both the R&K criteria and the new standard of the AASM [20]. The data were acquired in a sleep laboratory of a Belgian hospital using a digital 32-channel polygraph (Brainnet™ System of MEDATEC, Brussels, Belgium). The sampling frequency was 200 Hz.

### 2.2.3. ISRUC (Subgroup 3, ISRUC3) Database

The ISRUC3 database is the third subgroup of the ISRUC database [21]. The data were obtained from human adults, including healthy subjects, subjects with sleep disorders and subjects under the effect of sleep medication. Each recording was randomly selected from the PSG recordings acquired by the Sleep Medicine Centre of the Hospital of Coimbra University (CHUC).

The S-EDF database was labeled only under the R&K criteria, and the ISRUC3 database only under the AASM criteria. The DRMS database was labeled under both the R&K and the AASM criteria. The annotations of the S-EDF and DRMS databases were produced visually by a single expert. The ISRUC3 database was scored by two experts, and the labels made by the second expert were used in this paper. For the S-EDF database, the Pz-Oz channel was used, following the recommendations of various studies [3–7]. For the DRMS database, the Cz-A1 channel was adopted, as recommended in [9–12]. For the ISRUC3 database, the C3-A2 channel is the best choice [7]. Table 3 lists the detailed information of the three databases.
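The channel and labeling choices described above can be collected into a small configuration structure. The following is a minimal sketch, assuming a plain Python representation; `DatabaseConfig`, `DATABASES` and `databases_scored_with` are hypothetical names introduced for illustration only, and the values are taken from the text above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DatabaseConfig:
    """Per-database settings as described in the text (hypothetical structure)."""
    name: str
    eeg_channel: str
    scoring_standards: tuple
    annotation: str


DATABASES = {
    "S-EDF": DatabaseConfig("S-EDF", "Pz-Oz", ("R&K",), "single expert"),
    "DRMS": DatabaseConfig("DRMS", "Cz-A1", ("R&K", "AASM"), "single expert"),
    "ISRUC3": DatabaseConfig("ISRUC3", "C3-A2", ("AASM",), "second of two experts"),
}


def databases_scored_with(standard):
    """Return the names of the databases that carry labels for the given standard."""
    return [name for name, cfg in DATABASES.items()
            if standard in cfg.scoring_standards]


rk_databases = databases_scored_with("R&K")
aasm_databases = databases_scored_with("AASM")
print(rk_databases)
print(aasm_databases)
```

Under these settings, R&K experiments can draw on S-EDF and DRMS, while AASM experiments can draw on DRMS and ISRUC3.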


**Table 3.** The specification of the electroencephalograph (EEG) databases included in this study.
