Article

A Study on ML-Based Sleep Score Model Using Lifelog Data

1 Department of Mathematics, Kwangwoon University, Seoul 01897, Republic of Korea
2 Department of Data Science, Seoul Women’s University, Seoul 01797, Republic of Korea
* Author to whom correspondence should be addressed.
Appl. Sci. 2023, 13(2), 1043; https://doi.org/10.3390/app13021043
Submission received: 14 December 2022 / Revised: 2 January 2023 / Accepted: 9 January 2023 / Published: 12 January 2023
(This article belongs to the Special Issue Advances and Challenges in Big Data Analytics and Applications)

Abstract

The proportion of people suffering from sleep disorders has increased continuously in recent years, and interest in healthy sleep has grown with it. Although there are many health-care industries and services related to sleep, specific and objective evaluation of sleep habits is still lacking. Most of the sleep scores presented in wearable-based sleep health services are calculated based only on the sleep stage ratio, which is insufficient for capturing the multiple dimensions of sleep. In addition, most score generation techniques rely on expert evaluation models whose weights are often chosen from experience rather than derived objectively. Therefore, this study proposes an objective daily sleep habit score calculation method that considers various sleep factors based on user sleep data and gait data collected from wearable devices. A credit rating model built as a logistic regression model is adapted to generate sleep habit scores for good and bad sleep states. A stacking ensemble machine learning model is then used to generate scores for the remaining, intermediate sleep states. The sleep habit score and evaluation model of this study are expected to be in demand not only in health-care and health-service applications but also in the financial and insurance sectors.

1. Introduction

Good sleep is important for health. It is very important for improving the quality of life by enhancing physical recovery, strengthening memory and immunity, and protecting mental health [1,2,3]. However, the rate of sleep disorders is steadily increasing worldwide. According to previous research, about 10–30% of adults suffer from chronic insomnia [4]. Sleep disorders not only lower the quality of life of individuals but also increase social costs. In the United States, insufficient sleep is associated with economic losses estimated at more than $411 billion [5].
Presently, various devices such as wearables and smart scales are being used for sleep health [6,7,8]; in the past, sleep could be tested only in hospitals through expensive polysomnography. Most previous studies of sleep quality scores use the Pittsburgh questionnaire [9,10], but these have limitations because they rely on interviewees’ subjective responses, and research using objective data is lacking. With collected lifelog data, however, it is possible to track health signals daily as well as to identify health trends by week and month.
To produce a good-quality sleep score, it is necessary to calculate the score from multiple sleep dimensions such as sleep efficiency, regularity, duration, and timing [11,12]. This paper focuses on scoring the healthiness of sleep habits across these dimensions and proposes a sleep habit score calculation methodology that combines objective data, a credit evaluation–based model, and machine learning, using data collected with Samsung Galaxy watches.
The results of this study include an objective indicator of sleep health and are expected to be utilized in financial fields as well as digital health care. First, in the health-care industry, our scoring methodology is expected to be used as a comprehensive indicator of sleep health and to help improve sleep by checking and improving one’s sleep habit score every day. In addition, it is expected to help financial and insurance companies develop many insurance products linked to health indicators. In fact, a study by Moore (2002) [13] suggested that sleep health is related to financial information such as income.
This paper is structured as follows. Section 2 describes the proposed methodology. Section 2.1 describes data description and preprocessing methods. Section 2.2 describes the data, features, and target for generating primary sleep habits for good/bad sleep and explains the logistic regression model and methodology for generating scores. Section 2.3 describes the dataset and modeling methods for generating secondary sleep habit scores for intermediate sleep states as well as the methodology for generating these scores. Section 3 presents the overall sleep habit score results, which combine sleep habit scores for good/bad sleep and for intermediate sleep states. Section 4 summarizes this study, including interpretation of the results, considerations, limitations, and significance of the study.

2. Materials and Methods

Figure 1 shows the flow of the proposed method.
Figure 1a shows the sleep habit score generation process for good/bad sleep states; the chart is divided into (A) and (B) according to the sleep state. Figure 1b illustrates processes (A) and (B) of Figure 1a in detail.
As can be seen in Table 4 of Section 2.2.2, a simple but widely used logistic linear model for credit scoring classifies good/bad sleep habits well and yields intuitive weights. Intermediate sleep habit data, in which many factors are mixed, are relatively ambiguous for a model built only on good/bad sleep habits, so a stacking machine learning model, a more complex, nonlinear model, is used to score intermediate sleep habits.
Each step in the summary flowchart of the proposed method is described in detail in subsequent sections. In brief, we perform feature generation on the collected raw data and then perform outlier removal and missing value imputation. The refined data are divided into two major categories according to the sleep state. For data on good/bad sleep states (A), a logistic regression model is used to generate a primary sleep habit score. For data on the intermediate sleep state (B), a sleep habit score is generated using stacking models, with the target defined as the sleep habit score obtained from (A).

2.1. Data Preparation and Preprocessing

The data used in this study are set out in Table 1.
The data were collected from 714 people from 26 November 2020 to 1 January 2022 with Samsung Galaxy watches. Specifically, daily and per-minute sleep data, daily and per-minute step data, and user information (age, gender) were included. The collected data were preprocessed through daily data aggregation, sleep-related feature generation, outlier processing, and missing-value imputation. In daily data aggregation, features are generated by daily aggregation of the sleep data collected per minute. Based on the finding that sleep phase information for the initial 90 min of sleep indicates the quality of sleep, we created sleep phase features for the first 90 min [14,15]. We also generated total daily sleep stage ratio features (REM stage, light stage, deep stage, awake stage) [16,17], a sleep efficiency feature [18], the SRI (sleep regularity index) [19,20], and so on. The SRI feature is calculated through Equation (3), using the definition of δ in Equations (1) and (2), for M daily epochs and N days [20]:
$$\delta\left(s_{i,j},\, s_{i+1,j}\right) = 1 \quad \text{if } s_{i,j} = s_{i+1,j}, \tag{1}$$
$$\delta\left(s_{i,j},\, s_{i+1,j}\right) = 0 \quad \text{if } s_{i,j} \neq s_{i+1,j}, \tag{2}$$
$$\mathrm{SRI} = -100 + \frac{200}{M(N-1)} \sum_{j=1}^{M} \sum_{i=1}^{N-1} \delta\left(s_{i,j},\, s_{i+1,j}\right). \tag{3}$$
where N and M are the number of days and the number of epochs per day, respectively. The function δ returns value 1 if the sleep occurrence (sleep-wake state) is the same at 24 h intervals and returns 0 otherwise. For example, if sleep occurred at 22:00 and ended at 06:00 on Friday and occurred at 22:30 and ended at 08:00 on Saturday, the function δ from 22:30 to 06:00 is 1 and the rest of the time zone is 0. Daily sleep data are used to generate total sleep time [21] and sleep midpoint features [22]. In addition, to generate features for the step information just before sleep, the step data collected per minute is preprocessed and used together with the sleep data to generate features. Some of the feature names and descriptions are summarized in Table 2, and the rest are shown in Appendix Table A1 for readability.
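For concreteness, Equation (3) can be computed directly from a binary sleep-wake matrix, as in the sketch below. The array layout (days × one-minute epochs) and the function name are illustrative assumptions, not taken from the paper.

```python
import numpy as np

def sleep_regularity_index(sleep_wake: np.ndarray) -> float:
    """Compute the SRI of Equation (3).

    sleep_wake: binary matrix of shape (N days, M epochs per day),
    where 1 = asleep and 0 = awake for each epoch.
    """
    n_days, m_epochs = sleep_wake.shape
    # delta(s[i,j], s[i+1,j]) = 1 when the sleep-wake state matches
    # at the same epoch 24 h later, 0 otherwise.
    matches = (sleep_wake[:-1, :] == sleep_wake[1:, :]).sum()
    return -100.0 + 200.0 / (m_epochs * (n_days - 1)) * matches

# Example: three identical days of 1440 one-minute epochs give SRI = 100.
rng = np.random.default_rng(0)
day = (rng.random(1440) < 0.3).astype(int)
print(sleep_regularity_index(np.tile(day, (3, 1))))  # 100.0
```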
Outlier processing removes the following records:
  • Records with a sleep stage value of 0 among the generated sleep stage features.
  • Records with less than 3 h of total sleep per day, since sleep stages are not recorded when sleep is shorter than 3 h.
  • Records with a negative SRI value [23].
Missing-value processing based on sleep habit score will be described in detail after Section 2.2, but it is briefly described in this section as it is included in the overall preprocessing. Missing-value processing is organized into three steps as follows:
  • Set the sleep habit score derived by the logistic regression model as the target and set the related sleep variable as the explanatory variable. (This is detailed in Section 2.2.1.)
  • Process missing values based on the KNN (K-nearest neighbors) machine learning algorithm [24,25]. To derive the optimal k (number of neighbors), support vector regression [26] and random forest [27] models were used to evaluate each candidate k; the k yielding the best average performance across the two evaluators was selected.
  • Fill in missing values using the KNN method with the derived optimal number of neighbors k. Specifically, the data set was divided into training data and test data at a ratio of 8:2, the range of k was set from 2 to 15, and performance was measured for each value of k. The final performance for each k was an equal-weighted ensemble of the support vector regression and random forest results, i.e., each evaluator’s result was multiplied by 0.5 and summed. The experiments showed the best performance at k = 3, so imputation was performed with that value (a minimal sketch of this selection procedure is given below). After daily data aggregation, sleep feature generation, outlier processing, and missing-value preprocessing, a total of 67 variables, such as user identification ID, date, and sleep characteristics, and 16,053 rows of data are used as the analysis data.
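A minimal sketch of the k-selection and imputation procedure described above, assuming `X` is the feature matrix with missing values and `y` is the sleep-habit-score target; all function and variable names are illustrative rather than from the paper.

```python
import numpy as np
from sklearn.impute import KNNImputer
from sklearn.svm import SVR
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import train_test_split
from sklearn.metrics import r2_score

def select_k_and_impute(X, y, k_range=range(2, 16), seed=0):
    """Pick the KNN-imputation k whose imputed features best support
    predicting the sleep habit score, then impute the full matrix.

    The 0.5/0.5 weighting of the SVR and random-forest evaluations mirrors
    the equal-weight ensemble described in the text.
    """
    X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.2, random_state=seed)
    best_k, best_score = None, -np.inf
    for k in k_range:
        imputer = KNNImputer(n_neighbors=k)
        X_tr_imp = imputer.fit_transform(X_tr)
        X_te_imp = imputer.transform(X_te)
        scores = []
        for model in (SVR(), RandomForestRegressor(n_estimators=100, random_state=seed)):
            model.fit(X_tr_imp, y_tr)
            scores.append(r2_score(y_te, model.predict(X_te_imp)))
        weighted = 0.5 * scores[0] + 0.5 * scores[1]  # equal-weight ensemble score
        if weighted > best_score:
            best_k, best_score = k, weighted
    return KNNImputer(n_neighbors=best_k).fit_transform(X), best_k
```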
Sleep health is defined along the dimensions of sleep regularity, sleep duration, sleep timing, and sleep efficiency [11], and these four dimensions are important indicators of sleep habits. Several recent studies have shown that sleep regularity is beneficial to physical and mental health and that irregular sleep increases the risk of developing cardiovascular disease [28,29]. As for sleep duration, many studies have found that sleep that is either too short or too long can negatively impact health and quality of life [28,29]. Additionally, late sleep timing and large sleep variability are associated with poor sleep health, and regular sleep patterns have beneficial effects on health [28,29]. The four dimensions are defined as follows, using the sleep factors and cutoff values from previous studies [28,29].
  • Sleep regularity: standard deviation of weekday sleep midpoint (variability), with a difference of less than 1 h defined as a good sleep state [28,29].
  • Sleep duration: the total daily sleep time, calculated as the difference between the daily sleep end time and sleep start time, where 7 to 9 h is defined as a good sleep state [28,29].
  • Sleep timing: the midpoint of sleep, calculated as the midpoint between the onset and the end of sleep, where between 2 and 4 a.m. is defined as a good sleep state [28,29].
  • Sleep efficiency: the ratio of sleep time excluding awake time to total sleep time, where 85% or more is defined as a good sleep state [28,29].
Based on the cutoff values set above, data are defined as a good sleep state when all four conditions are satisfied, and as bad sleep when three or more of the four conditions are not satisfied. Bad sleep consists of five combinations: (1) bad-sleep regularity, duration, and efficiency, (2) bad-sleep regularity, duration, and timing, (3) bad-sleep regularity, efficiency, and timing, (4) bad-sleep duration, efficiency, and timing, and (5) bad-sleep regularity, duration, timing, and efficiency. The remaining combinations of conditions are taken to define the intermediate sleep state. The sleep habit score is derived using data for 326 good sleep states, 5168 bad sleep states, and 10,559 intermediate sleep states defined in this way.
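As an illustration, this labeling rule can be written as a small function; the column names are hypothetical, and only the four cutoffs come from the definitions above.

```python
import pandas as pd

def classify_sleep_state(row: pd.Series) -> str:
    """Label a day as good / bad / intermediate using the four cutoffs above.

    Expected columns (illustrative names): midpoint_sd_h (weekday sleep midpoint
    standard deviation in hours), total_sleep_h, midpoint_h (clock hours),
    efficiency (0-1).
    """
    conditions = [
        row["midpoint_sd_h"] < 1.0,          # sleep regularity
        7.0 <= row["total_sleep_h"] <= 9.0,  # sleep duration
        2.0 <= row["midpoint_h"] <= 4.0,     # sleep timing (2-4 a.m.)
        row["efficiency"] >= 0.85,           # sleep efficiency
    ]
    satisfied = sum(conditions)
    if satisfied == 4:
        return "good"
    if satisfied <= 1:        # three or more conditions violated
        return "bad"
    return "intermediate"
```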

2.2. Primary Habit Score: Good/Bad Sleep State

The target used in this study has three classes: good, intermediate, and bad sleep. We first set the data consisting of good sleep and bad sleep as the analysis data set, excluding the data classified as intermediate sleep states. Based on the data with two target classes, the primary sleep habit score is derived by applying a traditional credit evaluation model and credit score generation method.

2.2.1. Setting Explanatory Variables (Features) and the Outcome Variable (Target)

The explanatory variables of the data set are divided into continuous variables and categorical variables as follows:
  • Continuous variables: total sleep variability, SRI (Sleep Regularity Index) (2 days, 3 days, 4 days, 5 days, 6 days, 7 days), number and time of naps, sleep midpoint variability, day-of-week information, daily sleep start and end information, information on sleep stages within the first 90 min of sleep, information on steps 2 h before the first start of sleep.
  • Categorical variables: sleep onset between 10 p.m. and 12 a.m. FLAG variable, sleep onset variability within 1 h FLAG variable, total sleep time variability within 1 h FLAG variable.
The categorization for continuous variables for scoring consists of two steps as follows:
  • The first step, fine classing (Leung, 2008; Vejkanchana, 2019) [30,31], is carried out to improve consistency and explanatory power. Through this, representative variables are selected in consideration of the correlations among explanatory variables and the information value (Vejkanchana, 2019) [31], and bins (sections) for each variable are derived.
  • The second step is coarse classing (Leung, 2008; Vejkanchana, 2019) [30,31]; based on the categorization in the first step, a new category is derived by checking the data state. Specifically, for a linear relationship with the occurrence of good sleep, adjacent categories with similar weight-of-evidence (WoE) values (Finlay, 2010; Zdravevski, 2011) [32,33] are integrated so that the WoE value increases or decreases monotonically (Vanneschi, 2018) [34]. In this way, the amount of data on the number of occurrences and nonoccurrence of good sleep for each category is adjusted and categories are integrated based on the WoE value.
Features calculated based on WoE values for the target in this study are summarized in Table 3.
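For reference, the sketch below shows how WoE and the information value can be computed for one coarse-classed feature; the function name and the eps smoothing constant are illustrative assumptions, not part of the paper's implementation.

```python
import numpy as np
import pandas as pd

def woe_iv(binned_feature: pd.Series, good_flag: pd.Series, eps: float = 0.5):
    """Weight of evidence and information value for one coarse-classed feature.

    binned_feature: categorical bin label per record; good_flag: 1 for a good
    sleep state, 0 for a bad one. eps is a small count added to avoid log(0).
    """
    table = pd.crosstab(binned_feature, good_flag).reindex(columns=[0, 1], fill_value=0)
    goods = table[1] + eps
    bads = table[0] + eps
    dist_good = goods / goods.sum()   # share of good-sleep records in each bin
    dist_bad = bads / bads.sum()      # share of bad-sleep records in each bin
    woe = np.log(dist_good / dist_bad)
    iv = ((dist_good - dist_bad) * woe).sum()
    return woe, iv
```

Adjacent bins with similar WoE values would then be merged until the WoE sequence is monotone, as described in the coarse classing step above.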

2.2.2. Defining Good Sleep Habit Labels Using a Logistic Regression Model for the Primary Sleep Habit Score

This study used a logistic regression model [35] to generate the primary sleep habit score. The reasons are: (1) ease of interpretation of the regression coefficients; (2) the model estimates the probability of belonging to a class, so it is commonly used for risk and credibility analyses that require probability calculation; (3) it can serve as a base model. For these reasons, this study uses the logistic regression model to score the good and bad sleep habit status data. The good sleep habit level is expressed as the probability of a good sleep state that satisfies the good sleep conditions. A model for the probability of good sleep occurrence was created using logistic regression with various explanatory variables (Table 3). Logistic regression predicts the likelihood of an event using a linear combination of explanatory variables and is defined by Equation (4) [36]:
$$\ln(\mathrm{odds}) = \ln\!\left(\frac{p}{1-p}\right) = w_1 x_1 + w_2 x_2 + \cdots + w_n x_n. \tag{4}$$
To evaluate the performance of the model, the data were first randomly divided into training data and validation data at a ratio of 7:3, and then three verification metrics commonly used in credit evaluation models were applied: the area under the ROC (receiver operating characteristic) curve [37], the K–S (Kolmogorov–Smirnov) statistic [38], and the Gini coefficient [39]. AUROC (area under ROC) means the area under the ROC curve: the closer the value is to 1, the higher the sensitivity and specificity, so the model can be called a good classification model. In score generation problems such as credit scoring, a model is considered to have good discriminating power when the value is 0.7 or more. The K–S statistic compares the cumulative distribution functions of two groups (in our case, the good sleep state and the bad sleep state) and tests whether they come from the same distribution; here, it refers to the maximum difference between the cumulative good sleep incidence and the cumulative bad sleep incidence. In general, if the K–S statistic is 0.5 or higher, the desired discriminatory power is judged to be secured. The Gini coefficient is used to determine the discriminatory power of the credit rating model using the cumulative default distribution according to the credit score. The metrics calculated in this study are summarized in Table 4.
All index values are higher than the reference value. Therefore, the constructed model predicts the overall probability of occurrence of a good sleep state at an appropriate level.
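A minimal sketch of this evaluation step is shown below, assuming `X_woe` holds the WoE-transformed features and `y` the good/bad labels (placeholder names, not from the paper); the Gini coefficient is obtained from AUROC via the standard relation Gini = 2·AUROC − 1.

```python
import numpy as np
from scipy.stats import ks_2samp
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split

# X_woe: WoE-transformed features; y: 1 = good sleep state, 0 = bad sleep state.
X_train, X_val, y_train, y_val = train_test_split(X_woe, y, test_size=0.3, random_state=0)
y_val = np.asarray(y_val)

clf = LogisticRegression(max_iter=1000).fit(X_train, y_train)
p_val = clf.predict_proba(X_val)[:, 1]  # estimated probability of a good sleep state

auroc = roc_auc_score(y_val, p_val)                            # area under the ROC curve
ks = ks_2samp(p_val[y_val == 1], p_val[y_val == 0]).statistic  # K-S statistic
gini = 2 * auroc - 1                                           # Gini coefficient from AUROC
print(f"AUROC={auroc:.4f}  K-S={ks:.4f}  Gini={gini:.4f}")
```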

2.2.3. Scoring for the Primary Sleep Habit Score

In this study, the sleep habit status is scored using points to double the odds (PDO) [40], a scoring methodology used in constructing credit rating models [41]. If the PDO is set to 20 or 50, the odds double whenever the score increases by 20 or 50 points [42]. Higher scores correspond to lower occurrence probabilities, reflecting the fact that good sleep habits are difficult to achieve. The standard values widely used in credit evaluation models were applied: the basic score was initialized to 100, the PDO was set to 50, and the target odds at the initial score of 100 points were set at the level of 1:20. The score is then calculated using Equations (5)–(8) [43]:
$$\mathrm{Sleep\ Score} = \mathrm{offset} + \mathrm{factor} \times \ln(\mathrm{odds}), \tag{5}$$
$$\mathrm{factor} = \frac{50}{\ln 2}, \tag{6}$$
$$\mathrm{offset} = 100 + \frac{50}{\ln 2} \times \ln 20, \tag{7}$$
$$\mathrm{Sleep\ Score} = 100 + \frac{50}{\ln 2} \times \ln 20 + \frac{50}{\ln 2} \times \ln(\mathrm{odds}). \tag{8}$$
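The sketch below converts a predicted good-sleep probability into a sleep habit score using the PDO parameters above. The constant names are illustrative, and the odds are assumed to be p/(1 − p) with target odds of 1:20 at the 100-point base score, so this is an interpretation of the scheme rather than the paper's exact code.

```python
import numpy as np

PDO = 50            # points to double the odds
BASE_SCORE = 100    # score assigned at the reference odds
BASE_ODDS = 1 / 20  # target odds of 1:20 at the base score

factor = PDO / np.log(2)                          # Equation (6)
offset = BASE_SCORE - factor * np.log(BASE_ODDS)  # Equation (7): 100 + (50/ln 2) * ln 20

def sleep_score(p_good):
    """Map the model's good-sleep probability to a sleep habit score (Equations (5)/(8))."""
    p_good = np.asarray(p_good, dtype=float)
    odds = p_good / (1.0 - p_good)
    return offset + factor * np.log(odds)

# Doubling the odds raises the score by exactly PDO points:
print(sleep_score(1 / 21))  # ~100.0 (odds 1:20)
print(sleep_score(2 / 22))  # ~150.0 (odds 2:20)
```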

2.2.4. Primary Sleep Habit Score Results

The distribution of the primary sleep score calculated in this study is shown in histogram form in Figure 2.
As can be seen from the graphs, the generated good sleep habit score (Figure 2a) is mostly distributed between 1400 and 1600 points, whereas the bad sleep habit score (Figure 2b) is distributed between 750 and 1000 points. The overall data distribution (Figure 2c) appears to follow a normal distribution, as expected. The basic statistical information of the primary sleep habit score generated in this study is summarized in Table 5.
The scorecard for SRI (Sleep Regularity Index) is summarized in Table 6 as follows.
The details for sleep duration, sleep timing, and sleep efficiency are described in the Appendix.

2.3. Second Step: Intermediate Sleep Score

The sleep habit score was first generated using the good and bad sleep habit states. However, this excludes the intermediate sleep habit states that can occur. Therefore, this study generates a score for the intermediate sleep habit state using multi-stacking ensemble models, which are effective in improving predictive performance. The machine learning and deep learning–based stacking ensemble model proposed in this study uses three data sets: a training set, a test set, and a CV (cross-validation) set to prevent the overfitting that commonly occurs with stacking [44,45].

2.3.1. Data Preparation: Training and Test Data Set

The dataset classified into good and bad sleep data is used as the training data, and the primary sleep habit score described in Section 2.2 is set as the training target. The secondary sleep habit score is derived by setting the data classified as the intermediate sleep state as the predictive (test) data, and the sleep habit score for the intermediate sleep state is predicted using the machine learning and deep learning stacking models. Specifically, the stacking machine learning model is trained on 5494 records (good sleep: 326, bad sleep: 5168) and estimates sleep habit scores for the 10,559 test records.

2.3.2. Modeling: Multi-Stacking Ensemble Models Based on Machine Learning and Deep Learning

Figure 3 shows in summary form the machine learning and deep learning-based stacking ensemble model construction and design used in this study.
The machine learning algorithms used for prediction are XGBoost [46], LightGBM [47], CatBoost [48], and the TabNet neural network model (a deep learning model) [49]. The metamodels are linear regression, the Bayesian ridge regressor [50], and the elastic net regressor [51]; the ridge regressor [52] serves as the final prediction model. The stacking ensemble design consists of three steps, presented in Figure 4, Figure 5 and Figure 6. Figure 7 shows the cross-validation process within each individual model.
In summary, in the first step, the good/bad sleep state data (features) and sleep habit scores (target) are predicted using the ML and DL models (LightGBM, XGBoost, CatBoost, TabNet). Stacking the outputs of the ML and DL models composes the data for the metamodels (Figure 4). In the second step, three metamodels, linear regression, Bayesian ridge regression, and elastic net regression, are trained on the data constructed in the first step. Stacking the data predicted by the metamodels composes the data for the final model (Figure 5). In the third and last step, the final prediction model, the ridge regressor, predicts the intermediate sleep habit score, and the prediction error is measured by the mean squared error [53] (Figure 6). Specifically, in the first step, the XGBoost, LightGBM, and CatBoost models derive optimal hyperparameters using the Optuna hyperparameter tuning framework [54]. The hyperparameters for each model are summarized in Table A3.
In Figure 7, to mitigate the overfitting that may occur in the process shown in Figure 4, Figure 5 and Figure 6, each model generates stacking data for metamodel training and testing through cross-validation. Based on the generated data, the metamodels then yield the training and prediction performance.
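As an illustration of this three-step design, the sketch below nests scikit-learn StackingRegressor objects. The hyperparameter values echo a few entries of Table A3, the TabNet base learner is omitted for brevity, and `X_train`, `y_train`, and `X_intermediate` are placeholder names, so this is a simplified sketch under those assumptions rather than the exact pipeline used in the study.

```python
from sklearn.ensemble import StackingRegressor
from sklearn.linear_model import LinearRegression, BayesianRidge, ElasticNet, Ridge
from xgboost import XGBRegressor
from lightgbm import LGBMRegressor
from catboost import CatBoostRegressor

# Step 1: gradient-boosting base learners produce out-of-fold predictions
# (cross-validated inside StackingRegressor, mirroring the CV stacking of Figure 7).
base_models = [
    ("xgb", XGBRegressor(n_estimators=2000, learning_rate=0.02, max_depth=7)),
    ("lgbm", LGBMRegressor(n_estimators=2000, learning_rate=0.008, num_leaves=470)),
    ("cat", CatBoostRegressor(n_estimators=2000, learning_rate=0.0155, depth=7, verbose=0)),
]

# Step 2: three metamodels are trained on the stacked base-model outputs.
meta_models = [
    ("lin", LinearRegression()),
    ("bayes", BayesianRidge()),
    ("enet", ElasticNet()),
]

# Step 3: a ridge regressor combines the metamodel predictions.
level2 = StackingRegressor(estimators=meta_models, final_estimator=Ridge(), cv=5)
stack = StackingRegressor(estimators=base_models, final_estimator=level2, cv=5)

# X_train/y_train: good/bad-sleep features and primary sleep habit scores;
# X_intermediate: features of the intermediate-sleep-state records.
stack.fit(X_train, y_train)
intermediate_scores = stack.predict(X_intermediate)
```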

3. Results

Second Step: Intermediate Sleep Score

Figure 8 shows the distribution of the final sleep habit score calculated in this study, which combines the first-generated (primary) sleep habit scores and the intermediate sleep habit scores. The scores are approximately normally distributed with a mean of about 850, similar to a general scorecard in which scores concentrate around the middle (average). The distribution is close to normal, so the data are not concentrated in a specific score range and are almost symmetric with little skew, which suggests that the score was calculated without distortion. In addition, because the distribution is approximately normal, groups can be compared and the population estimated through inferential statistics, and several kinds of statistical tests become possible. Finally, this makes the scores easy to use and interpret.
Statistical values of sleep characteristics for each sleep state are as follows.
  • For sleep midpoint between 02:00 a.m. and 04:00 a.m. on weekdays, good sleep was 47.12%, bad sleep 12.25%, and intermediate sleep 21.39%.
  • For weekend sleep time between 10:00 p.m. and 12:00 a.m., good sleep was 43.25%, bad sleep 22.50%, and intermediate sleep 29.91%.
  • For steps within 2 h before sleep, good sleep averaged 813 steps, bad sleep 35, and intermediate sleep 156.
  • For the SRI (Sleep Regularity Index) index (2 days), mean values were 87.21 for good sleep, 73.26 for bad sleep, and 80.58 for intermediate sleep.
Figure 9 shows the sleep state probabilities for each section for good and bad sleep states.
Figure 9 shows the following:
  • The higher the SRI value, the higher the probability of a good sleep state.
  • The higher the gait (step) counts within the first 2 h before sleep, the higher the probability of a good sleep state.
  • The greater the weekly total sleep time variability, the higher the probability of a bad sleep state.
Overall, good sleep was mainly distributed in the range 1400–1600 points, intermediate sleep was distributed over 700–900 points, and bad sleep was mainly distributed below 700 points (see Table 7). According to the method proposed in this study, the higher the sleep habit score, the more data are classified as a good sleep state, while the lower the score, the worse the sleep state. Therefore, it is expected that good sleep guides can be elaborated according to the proposed sleep habit score.

4. Discussion

This study presented a model for grading sleep habit level considering various sleep dimensions. First, the quality of sleep was defined as an index indicating the level of sleep habits, and data for good sleep, intermediate sleep, and bad sleep were classified according to the cutoffs of previous studies.
Based on the logistic regression model used in credit rating, a model for estimating the likelihood of occurrence was derived using lifelog factors that affect good and bad sleep. Specifically, we discussed the process of categorizing the various sleep features generated from the lifelog dataset, estimating the probability of occurrence of each sleep state with the logistic regression model, and evaluating the predictive power of the model. The primary sleep habit score was derived by grading and classifying sleep habit levels based on the PDO (points to double the odds) concept using the derived model. Finally, to cover all sleep states, a machine learning algorithm was trained on the derived primary sleep habit scores and used to generate the sleep habit index for intermediate sleep states.
Summarizing the characteristics of the sleep habit score derived in this study, a sleep midpoint between 2 a.m. and 4 a.m., a sleep start time between 10 p.m. and 12 a.m., and walking activity in the evening all increase the probability of receiving a high score. Also, the higher the Sleep Regularity Index (SRI), the higher the probability of good sleep.
Previous studies were reviewed to verify the validity of the methodology, and the results of this study were confirmed to be consistent with them. Halson (2022) [55] reported average SRI values of 81.4 to 88.8, and Windred (2021) [56] found that a higher SRI (94 points) indicated more regular sleep and a lower SRI (34 points) more irregular sleep; this is consistent with our finding that the higher the SRI, the higher the probability of being in a good sleep state. Makarem (2020) [57] investigated the correlation between sleep variability and health and confirmed that high sleep variability has a negative effect on health. Baron (2017) [58] found that higher sleep variability can negatively affect sleep quality, which is also consistent with our results. Buman (2014) [59] suggested that there was no relationship between evening exercise and sleep quality. Stutz (2019) [60] found that vigorous exercise one hour before bedtime could negatively affect sleep onset, total sleep duration, and sleep efficiency, but otherwise found no evidence that evening exercise negatively affects sleep, and in fact rather the opposite. Frimpong (2021) [61] argued that activity 2 to 4 h before bedtime does not affect sleep quality in healthy young and middle-aged adults. These findings are similar to our result that walking activity in the 2 h before sleep increases the probability of being in a good sleep state. In addition, this study generated various features from gait and sleep data; in the study of Kim (2022) [62], various step and sleep features were generated from lifelog data and body weight was predicted from these features. Liang (2019) [63] also generated various sleep features, and medical-grade sleep/wake classification was predicted with a tree-based model. In the study of Han (2018) [42], the PDO was set at 58.43994, which is similar to this study; a study on an optimized PDO setting will be conducted in the future. Studies using stacking machine learning algorithms to improve performance have been presented (Jiang, 2020; Pavlyshenko, 2018) [64,65], and Yu (2022) [66] added CV (cross-validation) to the stacking technique to prevent overfitting, which is similar to the method proposed in this study. Comparing and reviewing the results and methodology of previous studies with this study, most of the results were consistent. Therefore, measuring sleep quality and generating an objective score using lifelog data and machine learning algorithms is recommended. Since this method is based on data from the user’s life pattern, the more data are accumulated over time, the more accurately the quality of sleep can be predicted and the more accurate the generated sleep habit score will be.
The limitations of this study are as follows. Since we created the sleep score by focusing on sleep habits and behaviors (sleep hygiene) rather than sleep quality itself, a record that does not meet the definition of good sleep presented in previous studies may still be judged by expert review to have good sleep quality, and vice versa.
In the future, this research will go beyond rating the sleep habit level to evaluate overall lifestyle, including walking habits and weight habits. We also plan to conduct simulations using an optimization algorithm that goes beyond simple ratings to analyze the optimal combinations of factors for increasing sleep scores. This study used a linear logistic model, but future research will study a technique that calculates weights with a nonlinear model and scores them. In addition, good/bad sleep habits were classified with a simple linear model, from which intuitive weights were obtained; based on this, intermediate sleep habit data in which various factors are mixed were scored using a more complex model, a stacking machine learning model. However, since efficiency increases as the number of steps is reduced, future work will study a model that can be solved end to end in a single step. Lastly, instead of using all of the indicators identified in previous studies, we can consider regularization models such as LASSO that identify which features are actually important and which can be discarded. Through these studies, we expect our research to contribute more substantially to private medical insurance and comprehensive health management.

Author Contributions

Conceptualization, J.K. and M.P.; data curation, J.K.; formal analysis, J.K.; funding acquisition, M.P.; methodology, J.K. and M.P.; supervision, M.P.; validation, J.K.; visualization, J.K.; writing—original draft preparation, J.K. and M.P.; writing—review and editing, M.P. All authors have read and agreed to the published version of the manuscript.

Funding

This work was supported by a research grant from Seoul Women’s University (2021-0423).

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Informed consent was obtained from all subjects involved in the study.

Data Availability Statement

Not applicable.

Conflicts of Interest

The authors declare no conflict of interest.

Appendix A

Table A1. Feature names and descriptions as generated from raw data collected daily/per minute.
Feature | Meaning
BED_TIME_VAR_FLAG_1 | Sleep onset fluctuation within 1 h (FLAG)
BED_TIME_VAR_FLAG_2 | Sleep onset fluctuation within 2 h (FLAG)
BED_TIME_10_TO_12_FLAG | Sleep onset time between 10 p.m. and 12 a.m. status (FLAG)
TST_VAR | Total sleep time variability per day
SRI_2 | Sleep regularity index observed on 2 days
SRI_3 | Sleep regularity index observed on 3 days
SRI_4 | Sleep regularity index observed on 4 days
SRI_5 | Sleep regularity index observed on 5 days
SRI_6 | Sleep regularity index observed on 6 days
SRI_7 | Sleep regularity index observed on 7 days
SLEEP_START_MIN | Daily sleep onset time (minute info) per day
SLEEP_END_MIN | Daily sleep offset time (minute info) per day
SLEEP_MIDPOINT | Midpoint between the onset and offset of sleep
FIRST_LIGHT_MIN | Total time of light sleep phases in earlier (90 min) sleep cycles
FIRST_DEEP_MIN | Total time of deep sleep phases in earlier (90 min) sleep cycles
FIRST_REM_MIN | Total time of rem sleep phases in earlier (90 min) sleep cycles
STAGE_SUM_MIN | Total time of all sleep phases in earlier (90 min) sleep cycles
FIRST_ASR | Percentage of awake sleep phases in earlier (90 min) sleep cycles
FIRST_LSR | Percentage of light sleep phases in earlier (90 min) sleep cycles
FIRST_DSR | Percentage of deep sleep phases in earlier (90 min) sleep cycles
FIRST_RSR | Percentage of rem sleep phases in earlier (90 min) sleep cycles
STEP_INFO_BEFORE_SLEEP_2 | Number of steps taken 2 h before sleep
WEEKDAY | Week information expressed as an integer (0: Sunday)
WEEKEND_FLAG | Weekend status (flag)
HOLIDAY_FLAG | Holiday status (flag)
Table A2. Scorecard for each feature.
Feature | Interval Value | Score
DECIMAL_END_HOUR_MINUTE | ~7.5 | 5
 | 7.5~ | −4
DECIMAL_START_HOUR_MINUTE | ~24.5 | 4
 | 24.5~ | −116
WEEKLY_MIDPOINT_VAR | 0~4 | 118
 | 4~5 | −94
 | 5~5.5 | −59
 | 5.5~ | −155
WEEK_INFO | ~2 | 37
 | 2~6.5 | −10
 | 6.5~11.5 | −49
 | 11.5~16.5 | −12
 | 16.5~24 | −47
 | 24~ | −96
HOLI_INFO | ~2 | 0
 | 2~12.5 | 3
 | 12.5~23.5 | 1
 | 23.5~24.5 | −4
 | 24.5~ | −2
FIRST_LIGHT_MINUTE | 0~29 | 31
 | 29~34 | −28
 | 34~37 | 23
 | 37~39 | −38
 | 39~46 | −5
 | 46~ | 9
FIRST_AWAKE_MINUTE | ~1 | 10
 | 1~3 | 9
 | 3~8 | 0
 | 8~17 | −1
 | 17~ | −12
STAGE_SUM_MINUTE | 0~31 | 30
 | 31~48 | −11
 | 48~56 | 2
 | 56~58 | 25
 | 58~60 | −23
 | 60~ | 15
FIRST_DSR | ~0.005 | 13
 | 0.005~0.05 | −47
 | 0.05~0.145 | −13
 | 0.145~0.275 | 33
 | 0.275~ | −50
FIRST_ASR | ~0.01 | 22
 | 0.01~0.04 | 21
 | 0.04~0.13 | 7
 | 0.13~0.2 | −5
 | 0.2~ | −34
FIRST_LSR | ~0.5 | 50
 | 0.5~0.64 | −21
 | 0.64~0.8 | 10
 | 0.8~0.86 | −4
 | 0.86~0.9 | −32
 | 0.9~ | 16
FIRST_RSR | ~0.01 | −6
 | 0.01~0.05 | 28
 | 0.05~0.18 | 55
 | 0.18~ | −10
TOTAL_SLEEP_TIME_VAR | ~0.4 | 0
 | 0.4~0.6 | 34
 | 0.6~0.8 | −16
 | 0.8~2.2 | 1
 | 2.2~2.6 | 31
 | 2.6~3.6 | −26
 | 3.6~ | 2
STEP_INFO_2 | ~50 | −31
 | 50~1500 | 184
 | 1500~ | 145
Table A3. Table of hyperparameters for each ML model.
Model | Hyperparameter | Value | Hyperparameter | Value
LightGBM | reg_alpha | 1.5486 | subsample | 0.5
 | reg_lambda | 4.5005 | learning_rate | 0.008
 | colsample_bytree | 0.7 | max_depth | 10
 | num_leaves | 470 | min_child_samples | 47
 | min_data_per_groups | 100 | n_estimators | 2000
XGBoost | lambda | 0.008 | alpha | 3.818
 | colsample_bytree | 0.4 | subsample | 0.7
 | learning_rate | 0.02 | min_child_weight | 39
 | n_estimators | 2000 | max_depth | 7
CatBoost | bagging_fraction | 0.7723 | l_leaf_reg | 1.629
 | max_bin | 235 | learning_rate | 0.0155
 | min_data_in_leaf |  | n_estimators | 2000
 | max_depth | 7 | task_type | GPU
Tabnet | max_type | Entmax | n_da | 64
 | n_steps | 2 | gamma | 1
 | n_shared | 3 | lambda_sparse | 9.07 × 10−5
 | patienceScheduler | 9 | epochs | 15
To confirm that the stacking model performs better than single ML models, a classification comparison was carried out on the 15,727 total records (good sleep: 326, bad sleep: 5168, intermediate sleep: 10,559). First, to follow the same process as score generation, only good and bad sleep were included in the learning data: 80% of the good + bad sleep records were used as training data and the remaining 20% as test data. Then, 80% of the 10,559 intermediate sleep records were randomly extracted and added to the test data. The proposed stacking machine learning model was then compared with the XGBoost, LightGBM, CatBoost, and TabNet models, which are regarded as state of the art (SOTA). The compared performances are summarized in Table A4. The F1 score is reported out of 100, with decimals discarded, since only the relative ranking of the models matters.
Table A4. Table of F1 score for each ML model.
ML Model | F1 Score
XGBoost | 89
LightGBM | 87
CatBoost | 88
Tabnet | 85
Stacking Method | 90

References

  1. Lee, H.; Kim, J.; Moon, J.; Jung, S.; Jo, Y.; Kim, B.; Ryu, E.; Bahn, S. A study on the changes in life habits, mental health, and sleep quality of college students due to COVID-19. Work 2022, 73, 777–786. [Google Scholar] [CrossRef]
  2. Heuse, S.; Grebe, J.L.; Esken, F. Sleep Hygiene Behaviour in Students: An Intended Strategy to Cope with Stress. J. Med. Psychol. 2022, 24, 23–28. [Google Scholar] [CrossRef]
  3. Freeman, D.; Sheaves, B.; Waite, F.; Harvey, A.G.; Harrison, P.J. Sleep disturbance and psychiatric disorders. Lancet Psychiatry 2020, 7, 628–637. [Google Scholar] [CrossRef] [PubMed]
  4. Bhaskar, S.; Hemavathy, D.; Prasad, S. Prevalence of chronic insomnia in adult patients and its correlation with medical comorbidities. J. Family Med. Prim. Care 2016, 5, 780–784. [Google Scholar] [CrossRef] [PubMed]
  5. Hafner, M.; Stepanek, M.; Taylor, J.; Troxel, W.M.; Van Stolk, C. Why sleep matters—The economic costs of insufficient sleep: A cross-country comparative analysis. Rand Health Q. 2017, 6, 11. [Google Scholar]
  6. Estrada-Galiñanes, V.; Wac, K. Collecting, exploring and sharing personal data: Why, how and where. Data Sci. 2020, 3, 79–106. [Google Scholar] [CrossRef] [Green Version]
  7. Nyman, J.; Ekbladh, E.; Björk, M.; Johansson, P.; Sandqvist, J. Feasibility of a new homebased ballistocardiographic tool for sleep-assessment in a real-life context among workers. Work 2022. [Google Scholar] [CrossRef] [PubMed]
  8. Wei, Q.; Lee, J.H.; Park, H.J. Novel design of smart sleep-lighting system for improving the sleep environment of children. Technol. Health Care 2019, 27, 3–13. [Google Scholar] [CrossRef] [Green Version]
  9. Smyth, C. The Pittsburgh sleep quality index (PSQI). J. Gerontol. Nurs. 1999, 25, 10. [Google Scholar] [CrossRef]
  10. Carpenter, J.S.; Andrykowski, M.A. Psychometric evaluation of the Pittsburgh sleep quality index. J. Psychosom. Res. 1998, 45, 5–13. [Google Scholar] [CrossRef]
  11. Buysse, D.J. Sleep health: Can we define it? Does it matter? Sleep 2014, 37, 9–17. [Google Scholar] [CrossRef] [PubMed]
  12. Morrissey, B.; Taveras, E.; Allender, S.; Strugnell, C. Sleep and obesity among children: A systematic review of multiple sleep dimensions. Pediatr. Obes. 2020, 15, e12619. [Google Scholar] [CrossRef] [PubMed] [Green Version]
  13. Moore, P.J.; Adler, N.E.; Williams, D.R.; Jackson, J.S. Socioeconomic status and health: The role of sleep. Psychosom. Med. 2002, 64, 337–344. [Google Scholar] [CrossRef] [PubMed] [Green Version]
  14. Nishino, S. The Stanford Method for Ultimate Sound Sleep; Sunmark Publishing: Tokyo, Japan, 2017. [Google Scholar]
  15. Patel, A.K.; Reddy, V.; Araujo, J.F. Physiology, Sleep Stages; StatPearls [Internet]: Florida, FL, USA, 2021. [Google Scholar]
  16. Beattie, Z.; Oyang, Y.; Statan, A.; Ghoreyshi, A.; Pantelopoulos, A.; Russell, A.; Heneghan, C.J.P.M. Estimation of sleep stages in a healthy adult population from optical plethysmography and accelerometer signals. Physiol. Meas. 2017, 38, 1968–1979. [Google Scholar] [CrossRef] [PubMed]
  17. Slyusarenko, K.; Fedorin, I. Smart alarm based on sleep stages prediction. Conf. Proc. IEEE Eng. Med. Biol. Soc. 2020, 2020, 4286–4289. [Google Scholar]
  18. Reed, D.L.; Sacco, W.P. Measuring sleep efficiency: What should the denominator be? J. Clin. Sleep Med. 2016, 12, 263–266. [Google Scholar] [CrossRef]
  19. Phillips, A.J.; Clerx, W.M.; O’Brien, C.S.; Sano, A.; Barger, L.K.; Picard, R.W.; Czeisler, C.A. Irregular sleep/wake patterns are associated with poorer academic performance and delayed circadian and sleep/wake timing. Sci. Rep. 2017, 7, 3216. [Google Scholar] [CrossRef]
  20. Lunsford-Avery, J.R.; Engelhard, M.M.; Navar, A.M.; Kollins, S.H. Validation of the sleep regularity index in older adults and associations with cardiometabolic risk. Sci. Rep. 2018, 8, 14158. [Google Scholar] [CrossRef] [Green Version]
  21. Rosenthal, L.; Roehrs, T.A.; Rosen, A.; Roth, T. Level of sleepiness and total sleep time following various time in bed conditions. Sleep 1993, 16, 226–232. [Google Scholar] [CrossRef]
  22. Randler, C.; Vollmer, C.; Kalb, N.; Itzek-Greulich, H. Breakpoints of time in bed, midpoint of sleep, and social jetlag from infancy to early adulthood. Sleep Med. 2019, 57, 80–86. [Google Scholar] [CrossRef]
  23. Cohen, S.; Fulcher, B.D.; Rajaratnam, S.M.; Conduit, R.; Sullivan, J.P.; St Hilaire, M.A.; Phillips, A.J.K.; Loddenkemper, T.; Kothare, S.V.; McConnell, K.; et al. Sleep patterns predictive of daytime challenging behavior in individuals with low-functioning autism. Autism Res. 2018, 11, 391–403. [Google Scholar] [CrossRef] [PubMed]
  24. Zhang, M.L.; Zhou, Z.H. ML-KNN: A lazy learning approach to multi-label learning. Pattern Recognit. 2007, 40, 2038–2048. [Google Scholar] [CrossRef]
  25. Rashid, W.; Gupta, M.K. A Perspective of Missing Value Imputation Approaches. In Advances in Computational Intelligence and Communication Technology; Springer: Singapore, 2021; pp. 307–315. [Google Scholar]
  26. Hearst, M.A.; Dumais, S.T.; Osuna, E.; Platt, J.; Scholkopf, B. Support vector machines. IEEE Intell. Syst. 1998, 13, 18–28. [Google Scholar] [CrossRef] [Green Version]
  27. Biau, G. Analysis of a random forests model. J. Mach. Learn. Res. 2012, 13, 1063–1095. [Google Scholar]
  28. Dong, L.; Martinez, A.J.; Buysse, D.J.; Harvey, A.G. A composite measure of sleep health predicts concurrent mental and physical health outcomes in adolescents prone to eveningness. Sleep Health 2019, 5, 166–174. [Google Scholar] [CrossRef] [PubMed]
  29. Brindle, R.C.; Yu, L.; Buysse, D.J.; Hall, M.H. Empirical derivation of cutoff values for the sleep health metric and its relationship to cardiometabolic morbidity: Results from the Midlife in the United States (MIDUS) study. Sleep 2019, 42, zsz116. [Google Scholar] [CrossRef] [PubMed]
  30. Leung, K.; Cheong, F.; Cheong, C.; O‘Farrell, S.; Tissington, R. Building a Scorecard in Practice. In Proceedings of the 7th International Conference on Computational Intelligence in Economics and Finance, Taoyuan, Taiwan, 5–7 December 2008. [Google Scholar]
  31. Vejkanchana, N.; Kuacharoen, P. Continuous Variable Binning Algorithm to Maximize Information Value Using Genetic Algorithm. In International Conference on Applied Informatics; Springer: Cham, Switzerland, 2019; pp. 158–172. [Google Scholar]
  32. Finlay, S. Data Pre-Processing. In Credit Scoring, Response Modelling and Insurance Rating; Palgrave Macmillan: London, UK, 2010; pp. 144–159. [Google Scholar]
  33. Zdravevski, E.; Lameski, P.; Kulakov, A. Weight of evidence as a tool for attribute transformation in the preprocessing stage of supervised learning algorithms. IJCNN 2011, 181–188. [Google Scholar]
  34. Vanneschi, L.; Horn, D.M.; Castelli, M.; Popovič, A. An artificial intelligence system for predicting customer default in e-commerce. Expert Syst. Appl. 2018, 104, 1–21. [Google Scholar] [CrossRef]
  35. Dastile, X.; Celik, T.; Potsane, M. Statistical and machine learning models in credit scoring: A systematic literature survey. Appl. Soft Comput. 2020, 91, 106263. [Google Scholar] [CrossRef]
  36. Peng, C.Y.J.; Lee, K.L.; Ingersoll, G.M. An introduction to logistic regression analysis and reporting. J. Educ. Res. 2002, 96, 3–14. [Google Scholar] [CrossRef]
  37. Obuchowski, N.A. Receiver operating characteristic curves and their use in radiology. Radiology 2003, 229, 3–8. [Google Scholar] [CrossRef] [PubMed]
  38. Zeng, G. A comparison study of computational methods of Kolmogorov–Smirnov statistic in credit scoring. Commun. Stat. Simul. Comput. 2017, 46, 7744–7760. [Google Scholar] [CrossRef]
  39. Abdou, H.A.; Pointon, J. Credit scoring, statistical techniques and evaluation criteria: A review of the literature. Intell. Syst. Account. Financ. Manag. 2011, 18, 59–88. [Google Scholar] [CrossRef] [Green Version]
  40. Woo, H.S.; Lee, S.H.; Cho, H. Building credit scoring models with various types of target variables. J. Korean Data Inf. Sci. Soc. 2013, 24, 85–94. [Google Scholar]
  41. Park, I. Developing the osteoporosis risk scorecard model in Korean adult women. J. Health Inform. Stat. 2021, 46, 44–53. [Google Scholar] [CrossRef]
  42. Han, J.T.; Park, I.S.; Kang, S.B.; Seo, B.G. Developing the High-Risk Drinking Scorecard Model in Korea. Osong Public Health Res. Perspect. 2018, 9, 231–239. [Google Scholar] [CrossRef]
  43. Siddiqi, N. Credit Risk Scorecards: Developing and Implementing Intelligent Credit Scoring; Wiley & Sons.: Hoboken, NJ, USA, 2012; Volume 3. [Google Scholar]
  44. Divina, F.; Gilson, A.; Goméz-Vela, F.; García Torres, M.; Torres, J.F. Stacking ensemble learning for short-term electricity consumption forecasting. Energies 2018, 11, 949. [Google Scholar] [CrossRef] [Green Version]
  45. Wang, T.; Zhang, K.; Thé, J.; Yu, H. Accurate prediction of band gap of materials using stacking machine learning model. Comput. Mater. Sci. 2022, 201, 110899. [Google Scholar] [CrossRef]
  46. Chen, T.; Guestrin, C. Xgboost: A Scalable Tree Boosting System. In Proceedings of the 22nd ACM Sigkdd International Conference on Knowledge Discovery and Data Mining, San Francisco, CA, USA, 13–17 August 2016; pp. 785–794. [Google Scholar]
  47. Ke, G.; Meng, Q.; Finley, T.; Wang, T.; Chen, W.; Ma, W.; Ye, Q.; Liu, T.Y. Lightgbm: A highly efficient gradient boosting decision tree. Adv. Neural Inf. Process. Syst. 2017, 30, 3149–3157. [Google Scholar]
  48. Dorogush, A.V.; Ershov, V.; Gulin, A. CatBoost: Gradient boosting with categorical features support. arXiv 2018, arXiv:1810.11363. [Google Scholar]
  49. Arik, S.Ö.; Pfister, T. Tabnet: Attentive interpretable tabular learning. AAAI 2021, 35, 6679–6687. [Google Scholar] [CrossRef]
  50. Rasifaghihi, N.; Li, S.S.; Haghighat, F. Forecast of urban water consumption under the impact of climate change. Sustain. Cities Soc. 2020, 52, 101848. [Google Scholar] [CrossRef]
  51. Hans, C. Elastic net regression modeling with the orthant normal prior. JASA 2011, 106, 1383–1393. [Google Scholar] [CrossRef]
  52. Hoerl, A.E.; Kennard, R.W. Ridge regression: Biased estimation for nonorthogonal problems. Technometrics 1970, 12, 55–67. [Google Scholar] [CrossRef]
  53. Gunst, R.F.; Mason, R.L. Biased estimation in regression: An evaluation using mean squared error. JASA 1977, 72, 616–628. [Google Scholar] [CrossRef]
  54. Akiba, T.; Sano, S.; Yanase, T.; Ohta, T.; Koyama, M. Optuna: A Next-Generation Hyperparameter Optimization Framework. In Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, Anchorage, AK, USA, 4–8 August 2019; ACM: New York, NY, USA, 2019. [Google Scholar]
  55. Halson, S.L.; Johnston, R.D.; Piromalli, L.; Lalor, B.J.; Cormack, S.; Roach, G.D.; Sargent, C. Sleep Regularity and Predictors of Sleep Efficiency and Sleep Duration in Elite Team Sport Athletes. Sport. Med. Open 2022, 8, 79. [Google Scholar] [CrossRef]
  56. Windred, D.P.; Jones, S.E.; Russell, A.; Burns, A.C.; Chan, P.; Weedon, M.N.; Rutter, M.K.; Olivier, P.; Vetter, C.; Saxena, R.; et al. Objective assessment of sleep regularity in 60 000 UK Biobank participants using an open-source package. Sleep 2021, 44, zsab254. [Google Scholar] [CrossRef]
  57. Makarem, N.; Zuraikat, F.M.; Aggarwal, B.; Jelic, S.; St-Onge, M.P. Variability in sleep patterns: An emerging risk factor for hypertension. Curr. Hypertens. Rep. 2020, 22, 19. [Google Scholar] [CrossRef]
  58. Baron, K.G.; Reid, K.J.; Malkani, R.G.; Kang, J.; Zee, P.C. Sleep variability among older adults with insomnia: Associations with sleep quality and cardiometabolic disease risk. Behav. Sleep Med. 2017, 15, 144–157. [Google Scholar] [CrossRef] [Green Version]
  59. Buman, M.P.; Phillips, B.A.; Youngstedt, S.D.; Kline, C.E.; Hirshkowitz, M. Does nighttime exercise really disturb sleep? Results from the 2013 National Sleep Foundation Sleep in America Poll. Sleep Med. 2014, 15, 755–761. [Google Scholar] [CrossRef]
  60. Stutz, J.; Eiholzer, R.; Spengler, C.M. Effects of evening exercise on sleep in healthy participants: A systematic review and meta-analysis. Sport. Med. 2019, 49, 269–287. [Google Scholar] [CrossRef] [PubMed]
  61. Frimpong, E.; Mograss, M.; Zvionow, T.; Dang-Vu, T.T. The effects of evening high-intensity exercise on sleep in healthy adults: A systematic review and meta-analysis. Sleep Med. Rev. 2021, 60, 101535. [Google Scholar] [CrossRef]
  62. Kim, J.; Lee, J.; Park, M. Identification of Smartwatch-Collected Lifelog Variables Affecting Body Mass Index in Middle-Aged People Using Regression Machine Learning Algorithms and SHapley Additive Explanations. Appl. Sci. 2022, 12, 3819. [Google Scholar] [CrossRef]
  63. Liang, Z.; CHAPA-MARTELL, M.A. Predicting Medical-Grade Sleep-Wake Classification from Fitbit Data Using Tree-Based Machine Learning. Rep. Number IPSJ SIG Tech. Rep. 2019, 2019, 14. [Google Scholar]
  64. Jiang, M.; Liu, J.; Zhang, L.; Liu, C. An improved Stacking framework for stock index prediction by leveraging tree-based ensemble models and deep learning algorithms. Phys. A Stat. Mech. Appl. 2020, 541, 122272. [Google Scholar] [CrossRef]
  65. Pavlyshenko, B. Using Stacking Approaches for Machine Learning Models. In Proceedings of the 2018 IEEE Second International Conference on Data Stream Mining & Processing (DSMP), Lviv, Ukraine, 21–25 August 2018; IEEE: Piscataway, NJ, USA, 2018; pp. 255–258. [Google Scholar]
  66. Yu, W.; Li, S.; Ye, T.; Xu, R.; Song, J.; Guo, Y. Deep ensemble machine learning framework for the estimation of PM 2.5 concentrations. Environ. Health Perspect. 2022, 130, 037004. [Google Scholar] [CrossRef] [PubMed]
Figure 1. (a) Summary flowchart; (b) specific sleep score creation process.
Figure 2. (a) Histogram of sleep habit scores for good sleep using a logistic regression model; (b) histogram of sleep habit scores generated for bad sleep data; (c) histogram of sleep habit scores for good and bad sleep data combined.
Figure 3. Summary graph for stacking technique: ML and DL indicate machine learning and deep learning, respectively.
Figure 4. Summary graph for the first step of the stacking method: training dataset of metamodels is generated by machine learning models and a deep learning model in this step.
Figure 5. Summary graph for the second step of the stacking method: training dataset of final model is generated by metamodels in this step.
Figure 6. Summary graph for the final stage of the stacking method: in this step, a sleep habit score is calculated for the intermediate sleep state.
Figure 7. Summary graph of the process of CV stacking in each model.
Figure 8. Sleep habit score distribution for all data.
Figure 9. (a) Sleep state probabilities for SRI interval; (b) sleep state probabilities for STEP_INFO_2 interval; (c) sleep state probabilities for WEEKLY_TST_VAR interval.
Table 1. Description of raw data collected from wearable devices (Samsung Galaxy watch).
Category | Value | Description
Quantity of sleep data collected by day | 67,180 rows | Sleep data set collected by day with Samsung Galaxy 4 or 5
Quantity of sleep data collected by minute | 2,494,862 rows | Sleep data set collected by day with Samsung Galaxy 4 or 5
Quantity of step (gait) data collected by day | 78,643 rows | Step data set collected by day with Samsung Galaxy Watch 4 or 5
Quantity of step data collected by minute | 18,710,423 rows | Step data set collected by day with Samsung Galaxy 4 or 5
Quantity of user information data | 918 rows | User information such as height and age
Number of users | 714 |
Period of data collection | 26 November 2020 to 1 January 2022 |
Table 2. Feature names and descriptions as generated from raw data collected daily/per minute.
Feature | Meaning
USER_CODE | User identification code
DATE | Data collection date
SLEEP_EFFICIENCY | Ratio of sleep time excluding awake time to total sleep time
DSR | Percentage of deep sleep phases per day
RSR | Percentage of rem sleep phases per day
LSR | Percentage of light sleep phases per day
ASR | Percentage of awake sleep phases per day
TST | Total sleep time per day
SLEEP_START_H | Sleep onset time (hours) per day
SLEEP_END_H | Daily sleep offset time (hours) per day
AWAKE_T | Total awake time per day
DEEP_T | Total time of deep sleep phases per day
REM_T | Total time of rem sleep phases per day
LIGHT_T | Total time of light sleep phases per day
SLEEP_EFFICIENCY_CAT | 85% cutoff criterion flag for Sleep Efficiency Index
BED_TIME_VAR | Sleep onset variability
NAP_FLAG | Daily nap occurrence status
NAP_HOUR | Total sleep time from 12 noon to 3 p.m. (less than 3 h)
WEEKLY_MEAN_SLEEP_MIDPOINT | The average time of the midpoint of sleep during the weekdays
WEEKLY_MEAN_SLEEP_START_TIME | The average time of the sleep onset during the weekdays
WEEKEND_MEAN_SLEEP_START_TIME | The average time of the sleep onset during the weekends
DIFF_WEEK_HOLI | Difference between average weekday sleep onset and average sleep onset on weekends
WEEKLY_MEAN_TST | The average time of the total sleep time per day during the weekdays
DIFF_SLEEP_START_WEEKLY | The difference between the average weekly sleep onset time and daily average sleep onset time
DIFF_SLEEP_END_WEEKLY | The difference between the average weekly sleep offset time and daily average sleep offset time
WEEKLY_MIDPOINT_VAR | The variation of the midpoint of sleep during the weekdays
WEEKLY_TST_VAR | The variation of total sleep time during the weekdays
GOOD_SLEEP_FLAG | Sleep quality status based on various sleep dimensions
GENDER | User’s gender
AGE | User’s age
AGE_CATEGORY | User’s age category
FIRST_AWAKE_MIN | Total time of awake sleep phases in earlier (90 min) sleep cycles
Table 3. Interval range information for each feature based on WoE values.
Feature | Calculation of Bin Interval Excluding Missing Values
SRI_2 (Sleep regular index observed over 2 days) | [−inf, 52], [52, 72], [72, 86], [86, inf]
SRI_3 (Sleep regular index observed over 3 days) | [−inf, 56], [56, 70], [70, 82], [82, 90], [90, inf]
SRI_4 (Sleep regular index observed over 4 days) | [−inf, 52], [52, 62], [62, 70], [70, 82], [82, inf]
SRI_5 (Sleep regular index observed over 5 days) | [−inf, 52], [52, 58], [58, 82], [82, 88], [88, inf]
SRI_6 (Sleep regular index observed over 6 days) | [−inf, 56], [56, 80], [80, 86], [86, inf]
SRI_7 (Sleep regular index observed over 7 days) | [−inf, 56], [56, 78], [78, 84], [84, inf]
Daily Sleep offset time information (hour) | [−inf, 7.5], [7.5, inf]
Daily Sleep onset time information (hour) | [−inf, 24.5], [24.5, inf]
Average weekly sleep onset information | [−inf, 2], [2, 6.5], [6.5, 11.5], [11.5, 16.5], [16.5, 24], [24, inf]
Average weekend sleep onset information | [−inf, 2], [2, 12.5], [12.5, 23.5], [23.5, 24.5], [24.5, inf]
Sleep midpoint variability | [−inf, 3], [3, 4], [4, 4.5], [4.5, inf]
Daily total sleep time variance (HOUR) | [−inf, 0.4], [0.4, 0.6], [0.6, 0.8], [0.8, 2.2], [2.2, 2.6], [2.6, 3.6], [3.6, inf]
Weekly total sleep time variance | [−inf, 0.8], [0.8, 1.2], [1.2, 2.8], [2.8, 3.7], [3.7, inf]
REM sleep rate (%) in Initial 90 min | [−inf, 0.01], [0.01, 0.08], [0.08, 0.18], [0.18, inf]
LIGHT sleep rate (%) in Initial 90 min | [−inf, 0.54], [0.54, 0.62], [0.62, 0.84], [0.84, 0.9], [0.9, 0.96], [0.96, inf]
DEEP sleep rate (%) in Initial 90 min | [−inf, 0.01], [0.01, 0.05], [0.05, 0.09], [0.09, 0.15], [0.15, 0.23], [0.23, 0.32], [0.32, inf]
AWAKE sleep rate (%) in Initial 90 min | [−inf, 0.01], [0.01, 0.04], [0.04, 0.08], [0.08, 0.09], [0.09, 0.14], [0.14, 0.17], [0.17, 0.2], [0.2, inf]
Total REM sleep time (MINUTE) in initial 90 min | [−inf, 1], [1, 6], [6, 11], [11, inf]
Total LIGHT sleep time (MINUTE) in initial 90 min | [−inf, 29], [29, 34], [34, 37], [37, 39], [39, 50], [50, inf]
Total DEEP sleep time (MINUTE) in initial 90 min | [−inf, 1], [1, 7], [7, 12], [12, 16], [16, 22], [22, inf]
Total AWAKE sleep time (MINUTE) in initial 90 min | [−inf, 1], [1, 2], [2, 6], [6, 8], [8, 12], [12, 16], [16, inf]
Total sleep stage time in initial 90 min | [−inf, 40], [40, 48], [48, 50], [50, 52], [52, 55], [55, 58], [58, 60], [60, inf]
Total steps taken 2 h before sleep | [−inf, 10], [10, 720], [720, inf]
Table 4. Performance metric table for training data and validation data.
Category | AUROC | K–S | Gini Coefficient
Train data | 0.9847 | 0.8912 | 0.9694
Validation data | 0.9845 | 0.8882 | 0.969
Reference | >0.7 | >0.5 | >0.6
Table 5. Statistical information of the sleep habit score obtained from the logistic regression model.
Count | Mean | Standard Deviation | 25% | 50% | 75% | Max
5494 | 960.889516 | 285.114076 | 163 | 748 | 1162 | 1857
Table 6. Scorecard for each feature.
Feature | Interval Value | Score
SRI_2 | ~52 | −2
 | 52~72 | −1
 | 72~86 | 11
 | 86~ | 29
SRI_3 | ~56 | −50
 | 56~70 | −34
 | 70~82 | 6
 | 82~90 | 87
 | 90~ | 180
SRI_4 | ~52 | −45
 | 52~62 | −19
 | 62~70 | 2
 | 70~82 | 15
 | 82~ | 29
SRI_5 | ~52 | −17
 | 52~58 | −14
 | 58~82 | 5
 | 82~88 | 34
 | 88~ | 67
SRI_6 | ~56 | −19
 | 56~80 | −13
 | 80~86 | 9
 | 86~ | 13
SRI_7 | ~56 | −17
 | 56~78 | −8
 | 78~84 | 0
 | 84~ | 6
Table 7. Table of data distribution and ratio of good sleep states by score.
Sleep Score | Number of Data Points | Proportion of Data | Frequency of Good Sleep Status | Good Sleep Status Ratio
50 < ss ≤ 100 | 6 | 0.037% | 0 | 0.00%
100 < ss ≤ 150 | 21 | 0.131% | 0 | 0.00%
150 < ss ≤ 200 | 89 | 0.554% | 0 | 0.00%
200 < ss ≤ 250 | 140 | 0.872% | 0 | 0.00%
250 < ss ≤ 300 | 214 | 1.333% | 0 | 0.00%
300 < ss ≤ 350 | 186 | 1.159% | 0 | 0.00%
350 < ss ≤ 400 | 255 | 1.588% | 0 | 0.00%
400 < ss ≤ 450 | 321 | 2.000% | 0 | 0.00%
450 < ss ≤ 500 | 423 | 2.635% | 0 | 0.00%
500 < ss ≤ 550 | 583 | 3.632% | 0 | 0.00%
550 < ss ≤ 600 | 739 | 4.604% | 0 | 0.00%
600 < ss ≤ 650 | 742 | 4.622% | 0 | 0.00%
650 < ss ≤ 700 | 704 | 4.385% | 0 | 0.00%
700 < ss ≤ 750 | 819 | 5.102% | 0 | 0.00%
750 < ss ≤ 800 | 933 | 5.812% | 0 | 0.00%
800 < ss ≤ 850 | 1099 | 6.846% | 1 | 0.31%
850 < ss ≤ 900 | 1085 | 6.759% | 0 | 0.31%
900 < ss ≤ 950 | 848 | 5.283% | 0 | 0.31%
950 < ss ≤ 1000 | 901 | 5.613% | 0 | 0.31%
1000 < ss ≤ 1050 | 872 | 5.432% | 1 | 0.61%
1050 < ss ≤ 1100 | 938 | 5.843% | 2 | 1.23%
1100 < ss ≤ 1150 | 788 | 4.909% | 1 | 1.53%
1150 < ss ≤ 1200 | 676 | 4.211% | 7 | 3.68%
1200 < ss ≤ 1250 | 544 | 3.389% | 12 | 7.36%
1250 < ss ≤ 1300 | 432 | 2.691% | 13 | 11.35%
1300 < ss ≤ 1350 | 448 | 2.791% | 14 | 15.64%
1350 < ss ≤ 1400 | 477 | 2.971% | 41 | 28.22%
1400 < ss ≤ 1450 | 319 | 1.987% | 44 | 41.72%
1450 < ss ≤ 1500 | 167 | 1.040% | 57 | 59.20%
1500 < ss ≤ 1550 | 160 | 0.997% | 68 | 80.06%
1550 < ss ≤ 1600 | 117 | 0.729% | 59 | 98.16%
1600 < ss ≤ 1650 | 7 | 0.044% | 6 | 100.00%
Total | 16,053 | 100.00% | 326 | 2.03%
