*4.2. Inter-Device Comparisons for Sleep Stages and Metrics*

Table 2 shows the summary statistics for all device-produced metrics and SRSMs. TSD was reported by all devices and by the participants themselves (i.e., as part of the SRSMs). Figure 2A shows a correlation matrix of TSD. The correlations were generally weak to moderate (correlation coefficients < 0.7 for all pairwise comparisons), although, surprisingly, the correlations of the SRSM with device estimates were on par with the correlations among the devices themselves. Figure 2B shows the correlations of REM sleep (in seconds) across the Oura, Hexoskin, and Withings devices (Fitbit did not report an estimate of REM sleep). The correlation between Oura and Withings was the highest at 0.44, while Oura and Hexoskin had the lowest correlation (0.22). Figure 2C shows Kendall's rank correlation across overall sleep stages for Withings, Hexoskin, and Oura (see Section 3.9). All of these correlations were statistically significant at the *p* < 0.05 threshold. We report the *p* values from these analyses in Supplemental S5.
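As an illustration of how such pairwise comparisons can be computed, the following is a minimal sketch in plain Python. It is not the analysis code used in this study: the function names (`pearson_r`, `kendall_tau_b`, `pairwise_matrix`) and the TSD values are hypothetical, and Pearson's coefficient is assumed here only for illustration, since the coefficient underlying Figure 2A,B is not restated in this section. Kendall's rank correlation is implemented in its tau-b form, which adjusts for tied ranks. The sketch does not compute *p* values.

```python
from itertools import combinations
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (spread_x * spread_y)


def kendall_tau_b(x, y):
    """Kendall's tau-b rank correlation (handles ties in x and/or y)."""
    concordant = discordant = ties_x = ties_y = 0
    for i, j in combinations(range(len(x)), 2):
        dx, dy = x[i] - x[j], y[i] - y[j]
        if dx == 0 and dy == 0:
            continue  # tied in both variables: not counted anywhere
        if dx == 0:
            ties_x += 1  # tied only in x
        elif dy == 0:
            ties_y += 1  # tied only in y
        elif dx * dy > 0:
            concordant += 1
        else:
            discordant += 1
    untied = concordant + discordant
    return (concordant - discordant) / sqrt((untied + ties_x) * (untied + ties_y))


def pairwise_matrix(series, corr):
    """Apply `corr` to every ordered pair of named series."""
    names = list(series)
    return {(a, b): corr(series[a], series[b]) for a in names for b in names}


# Hypothetical nightly TSD values in seconds (NOT data from this study).
tsd = {
    "Oura": [25200, 27000, 21600, 28800, 23400],
    "Withings": [24000, 27900, 23100, 27000, 24600],
    "SRSM": [25200, 25200, 21600, 30600, 25200],
}

tsd_pearson = pairwise_matrix(tsd, pearson_r)
tsd_kendall = pairwise_matrix(tsd, kendall_tau_b)
```

Each entry of the resulting dictionaries corresponds to one cell of a correlation matrix such as the one in Figure 2A; the diagonal entries equal 1 because each series is compared with itself.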

**Figure 2.** (**A**) A correlation matrix of total sleep duration (TSD) (in seconds) by device and self-reported estimation (i.e., self-reported sleep metrics (SRSMs)) with *p* value significance indication (\* *p* < 0.1; \*\* *p* < 0.05; \*\*\* *p* < 0.01). Each point represents data from one night for one participant. The plots in the diagonals of A and B reflect the distribution of the sleep metric of interest (TSD and REM, respectively) for each individual device. The plots in the bottom left of A and B show the trend line with 95% confidence intervals between devices. (**B**) A REM sleep (in seconds) correlation matrix across the Oura, Hexoskin, and Withings devices with *p* value significance indication (same as above). The Fitbit was excluded, as it does not track REM vs. NREM sleep. (**C**) A correlation matrix of overall sleep stages (awake, NREM, and REM) between the Oura, Hexoskin, and Withings devices (Fitbit does not differentiate between NREM and REM) with *p* value significance indication (same as above).


**Table 2.** Summary metrics of device data and SRSMs. All values are in hours, except wakeups (number of occurrences) and efficiency (unitless). Sleep efficiency is the percentage of time in bed spent asleep. TSD is total sleep duration, which is similar to start-end duration; related features, including latency and other measures, were also utilized.
