*3.3. Experiments on S-EDF and ISRUC3 Database*

After the experiments on the DRMS database and a comprehensive comparison of the candidate settings, Bagged Trees was selected as the classifier, with *nDSSM* set to 6, *lLE* set to 5, *ωDSSM* set to db1 and *ωLE* set to db4. To further evaluate the performance of the proposed method, we use these parameters in experiments on the S-EDF and ISRUC3 databases.

The classification accuracy and Cohen's Kappa coefficients for the two- to six-class cases on the S-EDF database are shown in Table 26. The confusion matrix of the six-class classification is listed in Table 27 for further analysis.



**Table 27.** The confusion matrix of six-class sleep state classification on the S-EDF database.


Similarly, the proposed method was also tested on the ISRUC3 database. The experimental results are shown in Tables 28 and 29.

**Table 28.** The classification accuracy and Cohen's Kappa coefficient of two- to five-class sleep classification on the ISRUC3 database.


**Table 29.** The confusion matrix for the five-class case on the ISRUC3 database.


As can be seen from Table 28, the classification accuracies for two to five classes are 96.18%, 90.54%, 84.68% and 81.65%, respectively. In the five-class classification, the sensitivities of Awa, REM, N1, N2 and N3 are 90.31%, 83.36%, 57.70%, 81.12% and 87.50%, respectively.
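As a minimal sketch of how figures of this kind relate to a confusion matrix, the following Python computes the overall accuracy, the per-class sensitivity and Cohen's Kappa coefficient. The matrix values, the three-class label set and the function names are hypothetical and chosen only for illustration; they are not the entries of Table 29. Rows are assumed to hold the expert labels and columns the predicted labels.

```python
from typing import Dict, List

Matrix = List[List[int]]


def accuracy(cm: Matrix) -> float:
    """Fraction of epochs on the diagonal (correctly classified)."""
    total = sum(sum(row) for row in cm)
    correct = sum(cm[i][i] for i in range(len(cm)))
    return correct / total


def sensitivities(cm: Matrix, labels: List[str]) -> Dict[str, float]:
    """Per-class sensitivity: correct epochs divided by the row total."""
    return {label: cm[i][i] / sum(cm[i]) for i, label in enumerate(labels)}


def cohen_kappa(cm: Matrix) -> float:
    """Cohen's Kappa: agreement corrected for chance agreement."""
    n = len(cm)
    total = sum(sum(row) for row in cm)
    p_observed = sum(cm[i][i] for i in range(n)) / total
    row_sums = [sum(row) for row in cm]
    col_sums = [sum(cm[i][j] for i in range(n)) for j in range(n)]
    # Chance agreement from the marginal distributions.
    p_expected = sum(r * c for r, c in zip(row_sums, col_sums)) / total ** 2
    return (p_observed - p_expected) / (1 - p_expected)


# Hypothetical 3-class matrix (rows: expert label, columns: prediction).
labels = ["Awa", "N1", "N2"]
cm = [
    [50, 5, 5],
    [10, 20, 10],
    [5, 5, 90],
]

acc = accuracy(cm)
sens = sensitivities(cm, labels)
kappa = cohen_kappa(cm)
print(f"accuracy = {acc:.4f}, kappa = {kappa:.4f}")
for label, value in sens.items():
    print(f"sensitivity({label}) = {value:.4f}")
```

In this toy example the minority class N1 has a much lower sensitivity than the overall accuracy suggests, which is the same pattern discussed for N1 in this paper.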

### **4. Discussion**

Table 30 compares the two- to six-class classification accuracy of various published methods with that of the proposed method on the DRMS database under the R&K standard.


**Table 30.** The accuracy comparison of various published methods on DRMS database under the R&K standard. Highest accuracy in each case is highlighted in bold.

As can be seen from Table 30, when only the DSSMFs are used, the proposed method achieves some improvement in accuracy over the other methods. After the LEFs are added to the DSSMFs, the classification accuracies for two to six classes improve by 1.27%, 1.02%, 1.27%, 1.38% and 0.72%, respectively, compared with our previous study [27].

Table 31 shows that the proposed method improves on existing methods for three- to five-class sleep stage classification on the DRMS database. The N1 sensitivity of the proposed method on the DRMS database is 17.57%, which is higher than the 14.3% of Ghimatgar [7]. Table 32 compares the accuracy of various published methods on the S-EDF database.

**Table 31.** The accuracy comparison of various published methods on the Dreams Subjects database under the AASM standard. Highest accuracy in each case is highlighted in bold.


**Table 32.** The accuracy comparison of various published methods on the S-EDF database under the R&K standard. Highest accuracy in each case is highlighted in bold.


Table 32 shows that, when a large number of samples is used, the accuracy also improves on that of other published methods. For example, the four-class accuracy is 93.87%, compared with 92.1% for Sharma [28] and 93.0% for Shen [27]. In the two-class case, Abdulla et al. [6] report the highest accuracy of 93%; however, they used only 23806 epochs. The sensitivity of S1 in this paper is 19.32%, which is higher than the 18.3% of Ghimatgar [7] and the 15.9% of Shen [27].

The experimental results of the proposed method on the ISRUC3 database are also compared with those of other methods in Table 33.


**Table 33.** The accuracy comparison of the ISRUC3 database with the AASM standard. Highest accuracy in each case is highlighted in bold.

As can be seen from Table 33, compared with Ghimatgar [7], the detection accuracy for two and three classes is improved by more than 2 percentage points. The sensitivity of S1 (N1) in Table 29 is 57.70%, which is higher than the 33% of Ghimatgar [7]. Furthermore, Cohen's Kappa coefficient is also much higher than that of Ghimatgar [7].

It should be noted that the classification of S1 is an enormous challenge for all of the published methods. From a neurophysiological standpoint, S1 (N1) is a transition phase that mixes wakefulness and sleep, so its neural oscillations resemble those of Awa. In the REM state, the cortex shows 40–60 Hz gamma waves, as it does in waking. The S1 state is therefore often misclassified as REM or Awa even during visual inspection by experts [3,11]. This is why many of the S1 epochs are misclassified as REM, Awa or S2 in this work.

In addition, the classification accuracy of S1 (N1) differs between databases. The detection accuracy of N1 on the ISRUC3 database reaches 57.7%, whereas on the DRMS and S-EDF databases it is below 20%. This is also related to the different proportions of S1 epochs in each database: under the same AASM standard, S1 accounts for 12.65% of the ISRUC3 database but only 7.3% of the DRMS database.

Furthermore, under the R&K standard, the sensitivity of S3 on the S-EDF and DRMS databases is low, only 46.11% and 25.71%, respectively. The main reason is that S3 is a transition phase between S2 and S4, and further research should be conducted to improve the S3 detection accuracy. As can be seen in Table 20, a large number of S3 epochs are misclassified as S2 and another large part as S4. Similarly, in Table 27, almost half of the S3 epochs are misclassified as S2 and a small part as S4.

Under the AASM standard, after S3 and S4 are combined into N3, the sensitivity of N3 improves. As shown in Table 25, only 761 N3 epochs were misclassified as N2, whereas in Table 20, 1022 S3 epochs and 231 S4 epochs were misclassified as S2. Therefore, the AASM standard is more suitable than the R&K standard for guiding researchers in annotating sleep stages.
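The effect of combining S3 and S4 into N3 can be illustrated with a small, hedged sketch in Python. It relabels a fixed R&K confusion matrix under a label mapping, so that confusions between S3 and S4 become correct N3 decisions. The matrix values and the helper name `merge_classes` are hypothetical; they are not taken from Tables 20 or 25, and relabeling a fixed set of predictions in this way may differ from how the AASM results in this paper were obtained.

```python
from typing import Dict, List, Tuple

Matrix = List[List[int]]


def merge_classes(
    cm: Matrix, labels: List[str], mapping: Dict[str, str]
) -> Tuple[Matrix, List[str]]:
    """Sum the rows and columns of classes that map to the same new label."""
    new_labels: List[str] = []
    for label in labels:
        target = mapping.get(label, label)
        if target not in new_labels:
            new_labels.append(target)
    index = {label: i for i, label in enumerate(new_labels)}
    size = len(new_labels)
    merged = [[0] * size for _ in range(size)]
    for i, true_label in enumerate(labels):
        for j, pred_label in enumerate(labels):
            row = index[mapping.get(true_label, true_label)]
            col = index[mapping.get(pred_label, pred_label)]
            merged[row][col] += cm[i][j]
    return merged, new_labels


# Hypothetical R&K sub-matrix (rows: expert label, columns: prediction).
rk_labels = ["S2", "S3", "S4"]
rk_cm = [
    [80, 15, 5],
    [30, 40, 30],
    [5, 20, 75],
]

# Assumed R&K -> AASM relabeling for these three stages.
rk_to_aasm = {"S2": "N2", "S3": "N3", "S4": "N3"}
merged, new_labels = merge_classes(rk_cm, rk_labels, rk_to_aasm)

s3_sensitivity = rk_cm[1][1] / sum(rk_cm[1])
n3_sensitivity = merged[1][1] / sum(merged[1])
print(new_labels, merged)
print(f"S3 sensitivity = {s3_sensitivity:.3f}, N3 sensitivity = {n3_sensitivity:.3f}")
```

In this toy matrix the S3 epochs predicted as S4, and the S4 epochs predicted as S3, move onto the diagonal after merging, so the N3 sensitivity is higher than the S3 sensitivity even though no prediction has changed.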
