Article

Deriving Accurate Nocturnal Heart Rate, rMSSD and Frequency HRV from the Oura Ring

Centre for Sleep and Cognition, Yong Loo Lin School of Medicine, National University of Singapore (NUS), Singapore 117549, Singapore
*
Author to whom correspondence should be addressed.
Sensors 2024, 24(23), 7475; https://doi.org/10.3390/s24237475
Submission received: 23 September 2024 / Revised: 14 November 2024 / Accepted: 19 November 2024 / Published: 23 November 2024
(This article belongs to the Section Wearables)

Abstract

Cardiovascular diseases are a major cause of mortality worldwide. Long-term monitoring of nighttime heart rate (HR) and heart rate variability (HRV) may be useful in identifying latent cardiovascular risk. The Oura Ring has shown excellent correlation with ECG-derived HR, but not with ECG-derived HRV. We thus assessed whether stringent data quality filters can improve the accuracy of time-domain and frequency-domain HRV measures. Ninety-two younger (<45 years) and 22 older (≥45 years) participants from two in-lab sleep studies with concurrent overnight Oura and ECG data acquisition were analyzed. For each 5 min segment during time-in-bed, the validity proportion (percentage of interbeat intervals rated as valid) was calculated. We evaluated the accuracy of Oura-derived HR and HRV measures against ECG at different validity proportion thresholds (80%, 50%, and 30%) and aggregated over different durations (5 min, 30 min, and night level). Strong correlations and agreement were obtained for both age groups across all HR and HRV metrics and window sizes. More stringent validity proportion thresholds and averaging over longer time windows (i.e., 30 min and night) improved accuracy. Higher discrepancies were found for HRV measures, with more than half of older participants exceeding 10% Median Absolute Percentage Error. Accurate HRV measures can be obtained from Oura’s PPG-derived signals with a stringent validity proportion threshold of around 80% for each 5 min segment and aggregating over time windows of at least 30 min.

1. Introduction

Cardiovascular diseases (CVD) have become the global leading cause of death, accounting for over 10 million deaths worldwide annually [1]. Continuous monitoring of physiological signals such as heart rate (HR) contributes to better detection of abnormalities during free-living conditions. Higher HR, especially during sleep, is strongly associated with increased mortality and CVD risk [2]. On the other hand, lower heart rate variability (HRV), which is an indicator of altered autonomic nervous system activity [3], predicts higher CVD risk in population studies [4,5]. Long-term and large-scale monitoring of HR and HRV during sleep may contribute to more effective CVD prevention with earlier risk detection. However, the standard electrocardiogram (ECG) measurement used in clinical and research settings is neither scalable nor suitable for long-term use under free living conditions [6,7].
Commercially available sleep trackers with embedded photoplethysmography (PPG) sensors can support the long-term continuous acquisition of HR and HRV data with high temporal resolution. PPG is an established, non-invasive optical technique that can be used to estimate changes in blood volume coupled to heart rate [8]. The green, red, and infrared light sensors are commonplace in existing wrist- or finger-worn devices, as they are cost-efficient to implement and produce reliable results [6,9]. Wearable-derived HR and HRV show good agreement with reference measurements when taken during rest and sleep [10,11,12]. The Oura Ring, a finger-worn sleep tracker, is a validated device for measuring sleep and HR [13,14,15], and a potential candidate for long-term cardiovascular monitoring. The ring form factor is smaller and less obtrusive than wrist-wearables, and thus is well-suited for physiological measurements during sleep. Like most other wearables [16], Oura only provides time-domain HRV metrics (such as root mean square of successive differences, rMSSD). However, insufficient sleep disrupts cardiac autonomic balance by increasing sympathetic activity while decreasing parasympathetic activity [17,18,19], which is indexed by High Frequency HRV [20]. In the long run, this may result in CVD [5,21]. Hence, in addition to rMSSD, acquiring frequency-based HRV measures from wearables could provide a more comprehensive profile of cardiovascular well-being. However, despite showing high HR accuracy, a recent study [13] which evaluated a comprehensive suite of frequency-based HRV metrics from Oura data found only moderate agreement with concurrent reference ECG measures, falling short of acceptable accuracy.
A simple processing detail may have contributed to the poor correspondence in HRV measures between Oura and ECG. For each inter-beat interval (IBI) reading, Oura provides a validity rating based on its quality assessment filters (for details, see [10]). Following [10], the authors of [13] accepted a 5 min segment of Oura IBI readings if at least 30% of readings within that time window were rated as valid. While this validity proportion threshold could still yield a relatively accurate measure of mean HR within a 5 min period, its impact on HRV accuracy is substantially more detrimental and deserves careful consideration. Variability metrics like rMSSD rely on the continuity of data, which is compromised by gaps left by missing or rejected IBI readings [22,23]. Spectral components can also be distorted by missing IBIs, especially since up to 70% of readings may be missing at a 30% validity proportion threshold, thus reducing the accuracy of frequency measures [24,25].
Hence, we propose a simple solution to improve the accuracy of HRV metrics from the Oura Ring: apply a more stringent validity proportion threshold. Stringent filtering reduces data gaps, ensuring that only epochs with sufficient valid IBIs to represent actual heart rate fluctuations are included, thus yielding more accurate and reliable HRV measures. Importantly, this reduces the likelihood of inaccurate or unstable HRV readings that may lead to undue concern for users. Our approach is also expected to improve HRV measures derived from other PPG-based wearables with ring or watch form factors, as they face similar challenges of increased susceptibility to transient artefacts related to poor contact or motion. During such periods, the total number of IBIs recorded may not be accurate due to invalid or missed IBIs, so a count-based approach to segment validity may not be appropriate [13]. Thus, we also used a different definition of validity proportion based on duration, i.e., the sum of valid IBIs divided by the time window.
Here, we showed that a more stringent validity proportion threshold of 80% can indeed provide accurate HRV measures of rMSSD and HF (compared against ECG). However, a higher validity proportion threshold also results in a higher data rejection rate, which needs to be taken into consideration [26]. Thus, we compared the accuracy of HR and HRV measures and data rejection rates at 80%, 50%, and 30% validity thresholds to evaluate the practicality of increasing validity proportion thresholds.

2. Materials and Methods

2.1. Participants and Protocol

Data from two previously published studies with concurrent ECG (SOMNOmedics, GmbH, Randersacker, Germany) and Oura Ring (Gen 3, Oura Ring Inc., Oulu, Finland) recordings [27,28] were combined for the current analyses. Participants had no clinical diagnosis of sleep, neurological, or psychiatric disorders; did not experience excessive daytime sleepiness (Epworth Sleepiness Scale < 11, [29]); were not taking wake-promoting medication; habitually slept more than 5 h per night; and had a body mass index (BMI) < 35 kg/m². Anthropometric measurements (height, weight, and waist circumference) and office blood pressure (BP) were taken during a daytime briefing prior to the sleep session. Participants slept according to their habitual bed- and wake-times in a sleep laboratory while polysomnography (PSG), ECG, and Oura Ring data were collected simultaneously. Informed consent was obtained during the briefing for both studies. The Institutional Review Board of the National University of Singapore approved the protocols, which were compliant with the Declaration of Helsinki.
For both studies, a study night’s recordings were included if they passed quality control checks. For Study 2, only recordings from the first night were included, as the second night involved a sleep disruption protocol, which is expected to affect HR and HRV measurements [30]. After excluding 28 participants from Study 1 and 23 from Study 2, 114 participants were included in the analyses (68 from Study 1; 50 males; 28.0 ± 15.8 years old). Participants were further divided into two age groups: younger (20–44 years old; N = 92; 45.7% male) versus older (45–68 years old; N = 22; 36.4% male) to examine whether HR and HRV measurement accuracies differed between younger and older participants [31].

2.2. Devices

ECG data were collected using the ECG electrodes of SOMNOtouch devices (SOMNOmedics, GmbH, Randersacker, Germany). The Oura Ring (Oura Ring Inc., Oulu, Finland) estimates IBI using PPG signals (sampling rate 250 Hz) collected via multiple infrared (900 nm) light sensors. A real-time moving average filter is applied to the raw PPG signals to locate local maxima and minima in order to compute IBI, after which IBI normalcy is assessed through median filters [32]. In both studies, participants wore Oura Rings (Generation 3) on their non-dominant hand, as shown in Figure 1.

2.3. Data Analysis

ECG waveform analyses were conducted in MATLAB (R2021b; The MathWorks, Inc., Natick, MA, USA). Oura IBI processing, HR and HRV calculation, and statistical comparisons were performed in RStudio (version 2023.12.0+369). Because the Oura Ring’s efficacy as a standalone device for cardiovascular monitoring during sleep is being evaluated, the start and end of sleep periods were defined using Oura-assessed bed- and wake times, which have been shown to be highly accurate when compared against PSG [33,34]. The Time-In-Bed period (TIB) was divided into 5 min segments for comparing HR and HRV metrics derived from ECG normal-to-normal (NN) and Oura IBI signals. Before further analyses, both time series were checked for: (1) validity of each NN or IBI reading; (2) physiological plausibility during sleep; and (3) validity proportion for each 5 min segment (Figure 2).

2.3.1. Oura IBI

Raw PPG signals first underwent Oura’s internal processing to derive IBI readings. A validity rating was provided for each IBI reading, based on a 5-point ordinal scale from 0 to 4. Ratings other than ‘1’ indicated unreliable readings, so only IBIs rated ‘1’ were retained. Additionally, following [10], valid IBIs were retained only if two immediately preceding and succeeding IBIs were also ‘1’. Next, physiologically implausible IBIs during sleep were removed (IBI < 375 ms or HR > 160 bpm; IBI > 2000 ms or HR < 30 bpm; [13,35]). Finally, we evaluated the impact of different validity proportion thresholds on the degree of correlation and agreement between Oura Ring- and ECG-derived HR and HRV measures at 80%, 50%, and 30%. To ensure representative coverage of the night, we only accepted nights where at least twenty 5 min epochs survived the validity proportion threshold. Reference [13] accepted a 5 min segment if at least 30% of the number of readings within the 5 min were rated as valid. However, invalid readings can be unusually long or short, sometimes due to artefacts like poor contact or motion, so the IBI count may not be a reliable indicator of the actual number of heartbeats. This issue also precluded beat-to-beat matching of Oura IBI and ECG NN. Thus, we adopted a duration-based definition of validity proportion: the fraction of time accounted for by valid IBIs within a time window, i.e., for a 5 min segment to pass a validity proportion threshold of 80%, the sum of valid IBIs must be at least 240 s.
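The IBI screening and the duration-based validity proportion described above can be sketched as follows. This is a minimal illustration, not the authors' code: the function names are hypothetical, the neighbour rule is read as "two IBIs before and two after must also be rated 1", and rejecting IBIs that lack a full neighbourhood at the record edges is an assumption.

```python
from typing import List, Sequence

MIN_IBI_MS = 375.0   # shorter IBIs imply HR > 160 bpm (implausible during sleep)
MAX_IBI_MS = 2000.0  # longer IBIs imply HR < 30 bpm (implausible during sleep)


def clean_ibis(ibis_ms: Sequence[float], ratings: Sequence[int],
               neighbours: int = 2) -> List[bool]:
    """Return one keep/reject flag per IBI."""
    n = len(ibis_ms)
    keep = []
    for i in range(n):
        lo, hi = i - neighbours, i + neighbours
        # Assumption: IBIs without a full neighbourhood (record edges) are rejected.
        if lo < 0 or hi >= n:
            keep.append(False)
            continue
        # The IBI itself and its neighbours must all carry rating 1.
        neighbourhood_ok = all(ratings[j] == 1 for j in range(lo, hi + 1))
        plausible = MIN_IBI_MS <= ibis_ms[i] <= MAX_IBI_MS
        keep.append(neighbourhood_ok and plausible)
    return keep


def validity_proportion(ibis_ms: Sequence[float], keep: Sequence[bool],
                        window_s: float = 300.0) -> float:
    """Duration-based validity: summed valid IBI time divided by window length."""
    valid_ms = sum(ibi for ibi, k in zip(ibis_ms, keep) if k)
    return valid_ms / (window_s * 1000.0)
```

For a 300 s segment, a return value of at least 0.8 corresponds to the 240 s of valid IBIs required by the 80% threshold.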

2.3.2. ECG NN

R-peaks were automatically identified using the QRS detector from the PhysioNet Cardiovascular Signal Toolbox [36], with the following validity and plausibility criteria: (1) lower and upper limits of peak-to-peak duration were set to 375 ms and 2000 ms; (2) a maximum acceptable change in peak-to-peak duration between consecutive peaks of 30%. Misidentified R-peaks were deleted manually in a custom-made Graphic User Interface. No R-peak additions were performed. The NN interval time series was checked for abnormal patterns and any clustering in Poincaré plots suggestive of ectopy. Coverage checks were performed at the night and 5 min segment levels: (1) only 5 min segments in which at least 150 NN-intervals passed the above criteria (equivalent to 50% at 60 bpm) were accepted; and (2) only nights with at least 50% of 5 min segments accepted were retained.
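The plausibility and coverage checks listed above can be illustrated with a short sketch. It is a simplified stand-in, not the PhysioNet toolbox logic: the function names are hypothetical, and comparing each interval against the immediately preceding raw interval (rather than the last accepted one) is an assumption.

```python
from typing import List, Sequence


def nn_flags(rr_ms: Sequence[float], min_ms: float = 375.0,
             max_ms: float = 2000.0, max_change: float = 0.30) -> List[bool]:
    """Flag each peak-to-peak interval as acceptable or not."""
    flags = []
    for i, rr in enumerate(rr_ms):
        ok = min_ms <= rr <= max_ms
        if ok and i > 0:
            prev = rr_ms[i - 1]
            # Assumption: relative change is measured against the previous raw interval.
            ok = abs(rr - prev) / prev <= max_change
        flags.append(ok)
    return flags


def segment_accepted(flags: Sequence[bool], min_nn: int = 150) -> bool:
    """A 5 min segment is accepted if at least 150 NN intervals pass."""
    return sum(flags) >= min_nn


def night_accepted(segment_results: Sequence[bool],
                   min_fraction: float = 0.5) -> bool:
    """A night is retained if at least 50% of its 5 min segments are accepted."""
    if not segment_results:
        return False
    return sum(segment_results) / len(segment_results) >= min_fraction
```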

2.4. HR and HRV Metrics

A 5 min window is conventionally used for short-term HRV assessment [37,38], and allows for comparisons across devices and studies. (Although we use the term HRV for consistency, the derived metric actually indicates pulse rate variability, as PPG does not directly measure the heart’s electric signatures.) Only 5 min segments that passed both ECG and Oura IBI quality control checks were included for further analyses. Mean HR and rMSSD were calculated for Oura IBI and ECG-NN for each 5 min segment. HF HRV was calculated using the Lomb-Scargle periodogram [39,40] via the Lomb package in R [41]. Per convention [38], HF power was calculated by taking the area under the NN periodogram for frequency ranges of 0.15–0.40 Hz. To facilitate comparison, HF values were normalized to the sum of HF and LF for the respective 5 min segments, resulting in HFnu values that ranged from 0 to 1 normalized units [42].
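As a rough illustration of the three per-segment metrics, the sketch below computes mean HR, rMSSD, and HFnu in plain Python. It is not the authors' R pipeline: the classic Lomb-Scargle formula is written out by hand instead of calling the Lomb package, the LF band of 0.04–0.15 Hz is the conventional definition and is assumed here because the text does not state it, mean HR is taken as 60,000 divided by the mean IBI, and successive differences are computed without special handling of gaps.

```python
import math
from typing import List, Sequence


def mean_hr_bpm(ibis_ms: Sequence[float]) -> float:
    """Mean HR from the mean IBI (assumption: 60000 / mean IBI in ms)."""
    return 60000.0 / (sum(ibis_ms) / len(ibis_ms))


def rmssd_ms(ibis_ms: Sequence[float]) -> float:
    """Root mean square of successive IBI differences (gaps are ignored here)."""
    diffs = [b - a for a, b in zip(ibis_ms, ibis_ms[1:])]
    return math.sqrt(sum(d * d for d in diffs) / len(diffs))


def lomb_scargle(times_s: Sequence[float], values: Sequence[float],
                 freqs_hz: Sequence[float]) -> List[float]:
    """Unnormalized classic Lomb-Scargle periodogram for uneven sampling."""
    mean = sum(values) / len(values)
    y = [v - mean for v in values]
    power = []
    for f in freqs_hz:
        w = 2.0 * math.pi * f
        # Time offset tau makes the sine and cosine terms orthogonal.
        tau = math.atan2(sum(math.sin(2 * w * t) for t in times_s),
                         sum(math.cos(2 * w * t) for t in times_s)) / (2 * w)
        c = [math.cos(w * (t - tau)) for t in times_s]
        s = [math.sin(w * (t - tau)) for t in times_s]
        yc = sum(a * b for a, b in zip(y, c))
        ys = sum(a * b for a, b in zip(y, s))
        power.append(0.5 * (yc * yc / sum(a * a for a in c)
                            + ys * ys / sum(a * a for a in s)))
    return power


def band_power(times_s: Sequence[float], ibis_ms: Sequence[float],
               f_lo: float, f_hi: float, step_hz: float = 0.005) -> float:
    """Rectangle-rule area under the periodogram over [f_lo, f_hi)."""
    n = int(round((f_hi - f_lo) / step_hz))
    freqs = [f_lo + k * step_hz for k in range(n)]
    return sum(lomb_scargle(times_s, ibis_ms, freqs)) * step_hz


def hf_nu(times_s: Sequence[float], ibis_ms: Sequence[float]) -> float:
    """HF / (HF + LF); the LF band of 0.04-0.15 Hz is an assumed convention."""
    lf = band_power(times_s, ibis_ms, 0.04, 0.15)
    hf = band_power(times_s, ibis_ms, 0.15, 0.40)
    return hf / (hf + lf)
```

Because HFnu is a ratio of two band powers, the absolute scaling of the periodogram cancels out, which is why no normalization constant appears in this sketch.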
Finally, to assess correspondence over larger time windows, we averaged multiple 5 min Oura and ECG segments to obtain 30 min and nightly aggregates. To ensure representativeness, only 30 min windows containing at least three 5 min segments surviving the validity proportion threshold, and night-level windows with at least twenty such 5 min segments, were included for analyses.
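The aggregation rule can be sketched as below. The names are hypothetical, and two details are assumptions: 30 min windows are taken as non-overlapping blocks of six consecutive 5 min slots counted from the start of TIB, and aggregation is a simple arithmetic mean of the surviving segments.

```python
from collections import defaultdict
from typing import Dict, List, Optional


def aggregate_30min(segments: Dict[int, float], per_window: int = 6,
                    min_segments: int = 3) -> Dict[int, float]:
    """Average surviving 5 min values within each 30 min block.

    `segments` maps a 5 min slot index (0 = first slot of TIB) to a metric
    value, and contains only slots that passed quality control.
    """
    buckets: Dict[int, List[float]] = defaultdict(list)
    for idx, value in segments.items():
        buckets[idx // per_window].append(value)
    return {w: sum(v) / len(v)
            for w, v in sorted(buckets.items())
            if len(v) >= min_segments}


def aggregate_night(segments: Dict[int, float],
                    min_segments: int = 20) -> Optional[float]:
    """Night-level mean, or None if fewer than 20 segments survived."""
    if len(segments) < min_segments:
        return None
    return sum(segments.values()) / len(segments)
```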

2.5. Statistical Analyses

In addition to Pearson’s correlation coefficient (r), the agreement between ECG- and Oura-derived metrics was also assessed using the Concordance Correlation Coefficient (CCC), which evaluates how far the linear relationship between two variables deviates from the line of perfect concordance (i.e., y = x; [43]). Like Pearson’s r, CCC values range from −1 to 1, with values closer to 1 indicating better agreement. The systematic errors between ECG and Oura IBI-based measures were evaluated using Bland–Altman plots and limits of agreement analysis [44].
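For readers unfamiliar with these agreement statistics, a minimal sketch follows. It uses Lin's standard CCC definition with population (1/n) variances and the usual Bland–Altman limits of mean difference ± 1.96 SD; the function names are hypothetical and the use of the sample SD for the limits is an assumption about the authors' settings.

```python
import math
from typing import Sequence, Tuple


def ccc(x: Sequence[float], y: Sequence[float]) -> float:
    """Lin's concordance correlation coefficient."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x) / n
    syy = sum((b - my) ** 2 for b in y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y)) / n
    # The squared mean difference penalizes departure from the line y = x.
    return 2.0 * sxy / (sxx + syy + (mx - my) ** 2)


def bland_altman(oura: Sequence[float],
                 ecg: Sequence[float]) -> Tuple[float, float, float]:
    """Return (mean bias, lower limit, upper limit) for Oura minus ECG."""
    diffs = [o - e for o, e in zip(oura, ecg)]
    n = len(diffs)
    bias = sum(diffs) / n
    sd = math.sqrt(sum((d - bias) ** 2 for d in diffs) / (n - 1))
    return bias, bias - 1.96 * sd, bias + 1.96 * sd
```

A constant offset between devices leaves Pearson's r at 1 but lowers the CCC, which is why both are reported.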
Lastly, Mean Absolute Error (MAE), Mean Absolute Percentage Error (MAPE), and Median Absolute Percentage Error (MdAPE) for all HR/HRV metrics were used to evaluate discrepancies between Oura and ECG at the individual participant level. This was carried out for readings at the 5 min level, from which the 30 min and night level values were derived. As there is no clear consensus on which error metric to report for HR/HRV analyses, we report all 3 with median (interquartile range, IQR) values due to slightly skewed distributions (Supplementary Material, Table S2). MAPE and MdAPE values < 10% were considered acceptable for both HR [45], and HRV [15] measures. The formulae for the error metrics are as follows:
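$$\text{Mean Absolute Error (MAE)} = \frac{1}{n}\sum_{i=1}^{n}\left|H_{\mathrm{Oura},i} - H_{\mathrm{ECG},i}\right|$$
$$\text{Mean Absolute Percentage Error (MAPE)} = \frac{1}{n}\sum_{i=1}^{n}\frac{\left|H_{\mathrm{Oura},i} - H_{\mathrm{ECG},i}\right|}{H_{\mathrm{ECG},i}}$$
$$\text{Median Absolute Percentage Error (MdAPE)} = \underset{i=1,\dots,n}{\operatorname{median}}\left(\frac{\left|H_{\mathrm{Oura},i} - H_{\mathrm{ECG},i}\right|}{H_{\mathrm{ECG},i}}\right)$$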
where H stands for HR, rMSSD, or HFnu.
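A direct transcription of the three error metrics is sketched below for one participant's paired 5 min readings. The function names are hypothetical. The percentage errors are multiplied by 100 so that they can be compared with the 10% acceptability criterion, which is an assumption about how the reported values were scaled.

```python
from statistics import median
from typing import Sequence


def mae(oura: Sequence[float], ecg: Sequence[float]) -> float:
    """Mean absolute error, in the units of the metric."""
    return sum(abs(o - e) for o, e in zip(oura, ecg)) / len(ecg)


def mape(oura: Sequence[float], ecg: Sequence[float]) -> float:
    """Mean absolute percentage error, in percent of the ECG value."""
    return 100.0 * sum(abs(o - e) / e for o, e in zip(oura, ecg)) / len(ecg)


def mdape(oura: Sequence[float], ecg: Sequence[float]) -> float:
    """Median absolute percentage error, in percent of the ECG value."""
    return 100.0 * median(abs(o - e) / e for o, e in zip(oura, ecg))
```

The median-based MdAPE is less affected by a few badly mismatched segments than MAPE, which matters when the per-segment error distribution is skewed.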

2.6. Data Retention Rates at Different Validity Proportion Thresholds

Data included in this study were minimally affected by artefacts, as the collection period was during sleep and in a controlled laboratory environment. However, under free-living conditions, where data collection can be affected by various factors, the 80% validity proportion threshold may result in significant data loss, compromising data representativeness and feasibility [26]. Thus, out of practical necessity, other than the accuracy of HR and HRV metrics, it is important to also consider the effect of validity proportion thresholds on data retention rates. Here, we also report the proportion of 5 min segments that are retained at 80%, 50%, and 30% validity proportion thresholds, with the expectation that these may be lower under free living conditions.
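The retention rate at each threshold is simply the share of 5 min segments whose validity proportion meets that threshold. A small sketch with a hypothetical function name and made-up input values:

```python
from typing import Dict, Sequence


def retention_rates(validity_proportions: Sequence[float],
                    thresholds: Sequence[float] = (0.8, 0.5, 0.3)) -> Dict[float, float]:
    """Fraction of 5 min segments retained at each validity proportion threshold."""
    n = len(validity_proportions)
    return {thr: sum(1 for p in validity_proportions if p >= thr) / n
            for thr in thresholds}


# Illustrative values only, not data from the study.
example = retention_rates([0.95, 0.85, 0.60, 0.40, 0.20])
```

The rejection rate at a given threshold is one minus the corresponding retention rate.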

3. Results

Table 1 summarizes participant demographics and sleep characteristics. The median BMI and blood pressure for both younger and older groups fell within the healthy range. Compared to younger participants, older participants had earlier bed- and wake times and shorter TIB.

3.1. Performance Evaluation

We first present results for a validity proportion threshold of 80%. Overall, HR and HRV showed high correlations of 0.9 and above for both Pearson’s r and CCC (Figure 3A–C), with the only exception being CCC for 5 min HFnu for the older group, at 0.889 (Figure 3A). Nevertheless, it was clear that at the 5 min level, individual HRV readings could differ substantially between Oura and ECG. However, these discrepancies were reduced with larger window sizes, with all points clustering very closely to the line of equality for all night-level measures. HRV measures, especially HFnu, appear to be more susceptible to random noise at the 5 min level. To demonstrate that higher validity proportion thresholds lead to better correlation, we repeated the analyses at 95% threshold (Figure S3) only for the 5 min window. Across all metrics and age groups, very high correlations of at least 0.9 were obtained. Most notably, the correlations between Oura and ECG HR were virtually 1.00. The HFnu correlations were also greater than 0.93, indicating very high levels of similarity between the Oura- and ECG-derived values.
We then investigated the presence and extent of systematic errors between Oura and ECG through Bland–Altman plots (Figure 4). Overall, mean biases were small across all measures (HR 0.42–0.64 bpm; rMSSD 2.50–3.79 ms; HFnu 0.03). There was a slight proportional bias for rMSSD, as seen in the downward slope across all window sizes. HR and HFnu did not show proportional biases. Across all metrics, wider limits of agreement were observed at the 5 min level relative to the 30 min and night levels. Older participants had lower HR and rMSSD ranges than younger participants, but no major differences in bias were observed between the two groups for any metric.
MdAPE was calculated for each participant’s 5 min level HR/HRV readings at different validity proportion thresholds and visualized as heatmaps (Figure 5; MAE and MAPE showing similar trends are shown in Supplementary Materials Figures S1 and S2), with every individual row as one participant. Within each heatmap per age group, the individual rows were arranged in descending order of mean MdAPE across all validity proportion thresholds.
As expected, HR was the best performing metric, with very low errors for every participant, even at a relaxed threshold of 30% (Figure 5A). In contrast, for both rMSSD and HFnu, some participants had MdAPEs of 20% or higher (red). Even though there was a clear effect of validity proportion threshold, with the lowest error rates at 80%, HRV MdAPE remained above the acceptable threshold of 10% (yellow and red) for many participants at this stringent threshold. While most younger participants had acceptably low HRV MdAPE of <10%, more than half of the older participants had HRV MdAPE >10% (yellow and red).

3.2. Lower Data Retention with Higher Validity Proportion Threshold

Even though our analyses showed that a high validity proportion threshold could improve HR and HRV metrics for the Oura Ring, this came at the cost of rejecting many 5 min segments of IBI readings, as seen in Table 2. At an 80% validity proportion threshold, about 30–35% of the data were rejected. Data retention was even worse at the 95% level, where despite very high correlation (Supplementary Material, Figure S3), only about 30–45% of the data remained. Such poor coverage could undermine the representativeness of the derived metrics.

4. Discussion

This study assessed the accuracy of Oura IBI-derived HR/HRV metrics against ECG-derived ones at different validity proportion thresholds. We showed that, using a stringent validity proportion threshold of 80%, HR, rMSSD, and HFnu derived from the Oura Ring in 5 min windows showed very high correlations with respective reference ECG metrics, even for older participants. That being said, higher discrepancies were found at the individual level between Oura and ECG for both rMSSD and HFnu, especially at more lenient validity proportion thresholds, and for older participants in particular. In contrast, less stringent validity proportion thresholds had almost no impact on HR. The concordance between Oura and ECG HR/HRV measures further improved when averaged over 30 min epochs, and substantially more so across the whole night of sleep. No evidence of substantial bias was detected at larger time windows, implying that highly accurate HRV measures can be obtained from finger PPG signals, as long as random measurement noise at the 5 min level was cancelled out. Table 3 summarizes the main results of the current study in comparison with recent studies comparing wearable HR/HRV against ECG, which reported correlations and mean biases [10,13,46,47,48]. Overall, correlations between wearable and ECG HR were very high across all studies, with very low absolute mean bias. For studies that reported rMSSD, correlations were high, and mean bias was generally low except for [13]. Compared against the only other study that reported frequency-based HRV metrics [13], we had much higher correlation and lower bias.

4.1. High Correlation in HR/HRV Between Oura and ECG

As expected, we found high correlation and agreement between Oura- and ECG-derived HR [13,15,48,49]. Here, we further showed that more stringent quality filtering of the Oura data produced more accurate HR with reduced errors. Importantly, more substantial improvements were seen for Oura HRV measures with higher validity proportion thresholds (Figure 5 and Supplementary Material, Table S1). At the 80% validity proportion threshold, both Pearson’s and Concordance correlations were tending towards 1 for HR and mostly above 0.9 for HRV metrics, much better than previously reported [10,13]. These results were achieved for the shortest acceptable time window for assessing HRV, 5 min, and even better agreement was seen for larger time windows, with neither r value nor CCC below 0.9 even for the frequency-based HFnu.

4.2. Less Accurate 5 Min HRV Measures for Older Participants

Even though Oura-derived HR/HRV metrics showed high concordance with ECG, to qualify as a heart health monitoring device, it is also important to assess the error rates for individual participants. The current analyses considered whether accurate HR/HRV metrics could also be obtained for older participants aged 45 years and above, as age has been shown to affect the accuracy of HRV measures from PPG signals [31,50,51,52]. We observed relatively high MdAPE (>10%) for 5 min HRV readings of older participants (11.36 (6.88) for rMSSD and 11.82 (8.25) for HFnu, as median (IQR), reported in Table S2). Since there was no difference in systematic errors between younger and older participants (as shown with Bland–Altman plots in Figure 4), the observed difference in error metrics is more likely due to individual variability rather than a consistent bias between devices.
In Figure 5, individual participants were sorted by average MdAPE across all measures, i.e., the same row represented the same participant across the heatmaps of all three measures. We observed that participants with high MdAPE in one HRV measure also tended to have higher errors in the other. The exact reasons for this observation in our sample remained unclear as we had access to processed Oura IBI but not the full raw PPG waveforms. One possibility is that the specific wave shapes of the PPG signals from which IBI are estimated are less temporally precise for some individuals. This could be due to reduced skin perfusion or arterial stiffness, which is likely more prevalent in older participants [53,54]. Age-related changes in the PPG waveform, such as smoothing and rounding of systolic peaks [50,55], may interfere with accurate systolic peak detection and IBI estimation. While our findings did not suggest there were systematic errors for the Oura Ring, it remains unclear whether the high error rates for some elderly participants were also associated with undetected cardiovascular conditions or other issues like poor ring fit due to age-related loss of soft tissue.
Even at an extremely high validity proportion threshold of 95%, there were substantial discrepancies between Oura and ECG HRV metrics (Supplementary Material, Figure S3) indicative of inherent limitations in the accuracy of single 5 min HRV measures. In contrast, such lack of temporal precision would not increase HR error by much, as it has little impact on the number of heartbeats detected. Thus, the HR MdAPE for all participants remained below 5%, even at more lenient validity proportion thresholds. HR is an overall average of heart beats within a minute, whereas HRV reflects the variation in intervals between successive heart beats and is more sensitive to the accuracy and continuity of the interval values. Compared to the temporal precision of the ECG R-peaks, there is likely to be some noise in the IBI estimation from PPG waveforms. The discrepancy may be greater for some older participants if the PPG waveform is distorted due to ageing, resulting in higher error values. However, as we only had access to the Oura-defined IBI values and not the raw PPG sensor data, we are unable to further evaluate the accuracy and validity ratings of the Oura IBI values.
Fortunately, despite the relatively high error rate at the 5 min level, the discrepancies between Oura and ECG HRV measures were substantially reduced when aggregated over longer time windows, without showing substantive biases.

4.3. Aggregate Oura HRV Measures over Longer Durations to Improve Accuracy

The 5 min window is conventionally used for short-term HRV assessment [37,38] and may reflect momentary autonomic fluctuations [56]. For long-term heart health monitoring, the accuracy of longer HRV measurements, especially during the night, is also important, since it can be predictive of poor cardiovascular outcomes [57,58]. Here, we found a higher concordance and correlation at larger window sizes (compared to results from 5 min), which is in line with the existing literature [59,60]. In Figure 3, concordance improved from the 5 min to 30 min time windows and lay very close to the line of perfect equality between Oura and ECG at the night level, regardless of age group. Only HFnu showed a slight over-estimation of 0.03–0.04 nu in Oura data for both younger and older participants. This implies that errors seen at the 5 min level may largely be due to random measurement noise rather than systematic errors in the Oura Ring PPG-derived IBI. The higher error rate at the 5 min level should be taken into consideration when trying to assess HRV fluctuations through the night using PPG-based consumer wearables. The likelihood of measurement noise may increase in free-living conditions and in individuals with cardiovascular disease, hindering the reliability of HRV assessment. More precise investigation of individual variability in PPG waveforms and its impact on derived IBI values is needed to reduce such measurement errors.

4.4. Data Rejection Costs of Further Increasing Validity Proportion Threshold

Alternatively, the validity proportion threshold could be increased further to improve HRV accuracy at the 5 min level. At a 95% validity proportion threshold (Supplementary Material, Figure S3), the 5 min HRV accuracy was comparable to averaging over 30 min windows at an 80% threshold (r = 0.935–0.988, CCC = 0.926–0.983; Figure 3). However, more than half of the 5 min segments (younger: 67.2%; older: 55.3%) could not meet the 95% threshold (Table 2). Under free-living conditions, the rejection rates could be even higher, potentially compromising the representativeness of the remaining HRV values. Further, nights with fewer than 50% of 5 min windows passing quality checks were excluded from our analyses (Figure 2), so at a 95% validity proportion threshold, most nights would be excluded. Thus, we do not recommend increasing the validity proportion threshold beyond 80%.

4.5. Limitations

Our findings suggest that reliable HRV measures from a PPG-based wearable device are achievable with an 80% validity proportion threshold, and averaged over at least 30 min of data. However, these recommendations need to be tested under free living conditions for different demographics. Firstly, it should be noted that our data were acquired within an ideal sleep laboratory environment, and more artefacts may be expected under free-living conditions. As mentioned above, this could result in lower data retention rates, potentially leaving less than 50% usable data over the whole sleep period at an 80% validity proportion threshold. Secondly, while we considered the effect of age and found high correlations even for our older participants as a group, their individual error rates tended to be higher than those of younger participants. Considering that our sample included only healthy participants, higher error rates would be expected in older participants with various health conditions. For example, we did not directly examine the effects of anomalies like ectopic beats. We excluded ectopic beats from ECG data, but, without directly analyzing PPG waveforms, could only rely on Oura’s IBI validity rating to exclude them. Future studies should evaluate the feasibility and accuracy of different validity proportion thresholds and window length in diverse demographic samples under free living conditions.

5. Conclusions

In this study, we evaluated whether accurate HRV measures during sleep can be derived from PPG-based signals acquired using a wearable in a ring form factor. We showed that, at more stringent validity proportion thresholds, the HR/HRV metrics derived from Oura IBI showed high correlation and agreement with the ECG reference measure regardless of age group or window size. Nevertheless, the trade-off between accuracy and data retention must be carefully considered. In addition, 5 min Oura HRV error rates were relatively high for older participants, but higher accuracy with minimal biases can be achieved when averaged over longer time windows. Overall, our results suggest that the Oura Ring has the potential to be a long-term heart health monitoring device that can provide accurate HR and HRV measures, provided HRV measures are derived using a stringent validity proportion threshold of around 80%, and 5 min readings are averaged over a longer time window of at least 30 min.

Supplementary Materials

The following supporting information can be downloaded at: https://www.mdpi.com/article/10.3390/s24237475/s1, Figure S1. MAPE HR/HRV heat maps for (A) HR, (B) rMSSD, and (C) HFnu; Figure S2. MAE HR/HRV heat maps for (A) HR, (B) rMSSD, and (C) HFnu; Figure S3. Scatter plots showing the relationship between Oura and ECG-derived HR/HRV metrics at 95% validity proportion threshold. Pearson’s r and Concordance Correlation Coefficient (CCC) are specified at the bottom right-hand corner; Table S1. Descriptive statistics for Oura IBI and ECG-derived HR/HRV metrics and mean bias; Table S2. Numerical values for correlations and error metrics (MAE, MAPE, MdAPE). Top to bottom in ascending window size: 5 min, 30 min, and Night level.

Author Contributions

Conceptualization, G.Y., C.-S.S., T.L.; methodology, G.Y., C.-S.S., T.L.; software, G.Y., T.L.; validation, T.L.; formal analysis, T.L., G.Y.; investigation, C.-S.S., G.Y., T.L.; resources, G.Y.; data curation, T.L.; writing—original draft preparation, T.L.; writing—review and editing, G.Y., C.-S.S.; visualization, T.L.; supervision, G.Y., C.-S.S.; project administration, G.Y. All authors have read and agreed to the published version of the manuscript.

Funding

This work was supported by the National Medical Research Council Singapore (STAR19may-0001), the Lee Foundation, and support funds for the Centre for Sleep and Cognition, Yong Loo Lin School of Medicine.

Institutional Review Board Statement

The study was conducted in accordance with the Declaration of Helsinki and approved by the Institutional Review Board of the National University of Singapore (Study 1: NUS-IRB-2020-463, 18 February 2021; Study 2: NUS-IRB-2021-427, 13 March 2023).

Informed Consent Statement

Written informed consent has been obtained from the patient(s) to publish this paper.

Data Availability Statement

Data and source code are available upon request.

Acknowledgments

The authors would like to thank Azrin Bin Jamaluddin, Andrew Dicom, Teo Teck Boon, Wong Kian Foong, Nicholas Chee, Chua Xin Yu, and Yashmit Lepcha for their help with data collection. They also thank Michael W. L. Chee for reading an earlier version of this paper and providing comments.

Conflicts of Interest

The authors declare no conflicts of interest. The funders had no role in the design of the study; in the collection, analyses, or interpretation of data; in the writing of the manuscript; or in the decision to publish the results.

References

  1. Vaduganathan, M.; Mensah, G.A.; Turco, J.V.; Fuster, V.; Roth, G.A. The Global Burden of Cardiovascular Diseases and Risk. J. Am. Coll. Cardiol. 2022, 80, 2361–2371.
  2. Johansen, C.D.; Olsen, R.H.; Pedersen, L.R.; Kumarathurai, P.; Mouridsen, M.R.; Binici, Z.; Intzilakis, T.; Køber, L.; Sajadieh, A. Resting, Night-Time, and 24 h Heart Rate as Markers of Cardiovascular Risk in Middle-Aged and Elderly Men and Women with No Apparent Heart Disease. Eur. Heart J. 2013, 34, 1732–1739.
  3. Stauss, H.M. Heart Rate Variability. Am. J. Physiol.-Regul. Integr. Comp. Physiol. 2003, 285, R927–R931.
  4. Nevels, T.L.; Wirth, M.D.; Ginsberg, J.P.; McLain, A.C.; Burch, J.B. The Role of Sleep and Heart Rate Variability in Metabolic Syndrome: Evidence from the Midlife in the United States Study. Sleep 2023, 46, zsad013.
  5. Zhang, L.; Wu, H.; Zhang, X.; Wei, X.; Hou, F.; Ma, Y. Sleep Heart Rate Variability Assists the Automatic Prediction of Long-Term Cardiovascular Outcomes. Sleep Med. 2020, 67, 217–224.
  6. Ghamari, M. A Review on Wearable Photoplethysmography Sensors and Their Potential Future Applications in Health Care. Int. J. Biosens. Bioelectron. 2018, 4, 195.
  7. Lin, W.-H.; Wu, D.; Li, C.; Zhang, H.; Zhang, Y.-T. Comparison of Heart Rate Variability from PPG with That from ECG. In The International Conference on Health Informatics; Zhang, Y.-T., Ed.; Springer International Publishing: Cham, Switzerland, 2014; Volume 42, pp. 213–215. ISBN 978-3-319-03004-3.
  8. Ryals, S.; Chiang, A.; Schutte-Rodin, S.; Chandrakantan, A.; Verma, N.; Holfinger, S.; Abbasi-Feinberg, F.; Bandyopadhyay, A.; Baron, K.; Bhargava, S.; et al. Photoplethysmography—New Applications for an Old Technology: A Sleep Technology Review. J. Clin. Sleep Med. 2023, 19, 189–195.
  9. Sviridova, N.; Sakai, K. Human Photoplethysmogram: New Insight into Chaotic Characteristics. Chaos Solitons Fractals 2015, 77, 53–63.
  10. Kinnunen, H.; Rantanen, A.; Kenttä, T.; Koskimäki, H. Feasible Assessment of Recovery and Cardiovascular Health: Accuracy of Nocturnal HR and HRV Assessed via Ring PPG in Comparison to Medical Grade ECG. Physiol. Meas. 2020, 41, 04NT01.
  11. Nelson, B.W.; Low, C.A.; Jacobson, N.; Arean, P.; Torous, J.; Allen, N.B. Guidelines for Wrist-Worn Consumer Wearable Assessment of Heart Rate in Biobehavioral Research. npj Digit. Med. 2020, 3, 90.
  12. Theurl, F.; Schreinlechner, M.; Sappler, N.; Toifl, M.; Dolejsi, T.; Hofer, F.; Massmann, C.; Steinbring, C.; Komarek, S.; Mölgg, K.; et al. Smartwatch-Derived Heart Rate Variability: A Head-to-Head Comparison with the Gold Standard in Cardiovascular Disease. Eur. Heart J.-Digit. Health 2023, 4, 155–164.
  13. Cao, R.; Azimi, I.; Sarhaddi, F.; Niela-Vilen, H.; Axelin, A.; Liljeberg, P.; Rahmani, A.M. Accuracy Assessment of Oura Ring Nocturnal Heart Rate and Heart Rate Variability in Comparison With Electrocardiography in Time and Frequency Domains: Comprehensive Analysis. J. Med. Internet Res. 2022, 24, e27487.
  14. Miller, D.J.; Sargent, C.; Roach, G.D. A Validation of Six Wearable Devices for Estimating Sleep, Heart Rate and Heart Rate Variability in Healthy Adults. Sensors 2022, 22, 6317.
  15. Stone, J.D.; Ulman, H.K.; Tran, K.; Thompson, A.G.; Halter, M.D.; Ramadan, J.H.; Stephenson, M.; Finomore, V.S.; Galster, S.M.; Rezai, A.R.; et al. Assessing the Accuracy of Popular Commercial Technologies That Measure Resting Heart Rate and Heart Rate Variability. Front. Sports Act. Living 2021, 3, 585870.
  16. Li, K.; Cardoso, C.; Moctezuma-Ramirez, A.; Elgalad, A.; Perin, E. Heart Rate Variability Measurement through a Smart Wearable Device: Another Breakthrough for Personal Health Monitoring? Int. J. Environ. Res. Public Health 2023, 20, 7146.
  17. Bourdillon, N.; Jeanneret, F.; Nilchian, M.; Albertoni, P.; Ha, P.; Millet, G.P. Sleep Deprivation Deteriorates Heart Rate Variability and Photoplethysmography. Front. Neurosci. 2021, 15, 642548.
  18. Dettoni, J.L.; Consolim-Colombo, F.M.; Drager, L.F.; Rubira, M.C.; Cavasin De Souza, S.B.P.; Irigoyen, M.C.; Mostarda, C.; Borile, S.; Krieger, E.M.; Moreno, H.; et al. Cardiovascular Effects of Partial Sleep Deprivation in Healthy Volunteers. J. Appl. Physiol. 2012, 113, 232–236.
  19. Zhong, X.; Hilton, H.J.; Gates, G.J.; Jelic, S.; Stern, Y.; Bartels, M.N.; DeMeersman, R.E.; Basner, R.C. Increased Sympathetic and Decreased Parasympathetic Cardiovascular Modulation in Normal Humans with Acute Sleep Deprivation. J. Appl. Physiol. 2005, 98, 2024–2032.
  20. Ernst, G. Heart-Rate Variability—More than Heart Beats? Front. Public Health 2017, 5, 240.
  21. Covassin, N.; Singh, P. Sleep Duration and Cardiovascular Disease Risk. Sleep Med. Clin. 2016, 11, 81–89.
  22. Kim, K.K.; Lim, Y.G.; Kim, J.S.; Park, K.S. Effect of Missing RR-Interval Data on Heart Rate Variability Analysis in the Time Domain. Physiol. Meas. 2007, 28, 1485–1494.
  23. Peltola, M.A. Role of Editing of R–R Intervals in the Analysis of Heart Rate Variability. Front. Physiol. 2012, 3, 148.
  24. Cajal, D.; Hernando, D.; Lázaro, J.; Laguna, P.; Gil, E.; Bailón, R. Effects of Missing Data on Heart Rate Variability Metrics. Sensors 2022, 22, 5774.
  25. Kim, K.K.; Kim, J.S.; Lim, Y.G.; Park, K.S. The Effect of Missing RR-Interval Data on Heart Rate Variability Analysis in the Frequency Domain. Physiol. Meas. 2009, 30, 1039–1050.
  26. Aygun, A.; Jafari, R. Robust Heart Rate Variability and Interbeat Interval Detection Algorithm in the Presence of Motion Artifacts. In Proceedings of the 2019 IEEE EMBS International Conference on Biomedical & Health Informatics (BHI), Chicago, IL, USA, 19–22 May 2019; pp. 1–5.
  27. Ong, J.L.; Golkashani, H.A.; Ghorbani, S.; Wong, K.F.; Chee, N.I.Y.N.; Willoughby, A.R.; Chee, M.W.L. Selecting a Sleep Tracker from EEG-Based, Iteratively Improved, Low-Cost Multisensor, and Actigraphy-Only Devices. Sleep Health 2023, 10, 9–23.
  28. Yilmaz, G.; Ong, J.L.; Ling, L.-H.; Chee, M.W.L. Insights into Vascular Physiology from Sleep Photoplethysmography. Sleep 2023, 46, zsad172.
  29. Johns, M.W. A New Method for Measuring Daytime Sleepiness: The Epworth Sleepiness Scale. Sleep 1991, 14, 540–545.
  30. Trinder, J.; Allen, N.B.; Kleiman, J.; Kralevski, V.; Taylor, D.H.; Anson, K.; Kim, Y. On the Nature of Cardiovascular Activation at an Arousal from Sleep. Sleep 2003, 26, 543–551.
  31. Chow, H.-W.; Yang, C.-C. Accuracy of Optical Heart Rate Sensing Technology in Wearable Fitness Trackers for Young and Older Adults: Validation and Comparison Study. JMIR Mhealth Uhealth 2020, 8, e14707.
  32. Altini, M.; Kinnunen, H. The Promise of Sleep: A Multi-Sensor Approach for Accurate Sleep Stage Detection Using the Oura Ring. Sensors 2021, 21, 4302.
  33. Chee, N.I.; Ghorbani, S.; Golkashani, H.A.; Leong, R.L.; Ong, J.L.; Chee, M.W. Multi-Night Validation of a Sleep Tracking Ring in Adolescents Compared with a Research Actigraph and Polysomnography. Nat. Sci. Sleep 2021, 13, 177–190.
  34. Ghorbani, S.; Golkashani, H.A.; Chee, N.I.; Teo, T.B.; Dicom, A.R.; Yilmaz, G.; Leong, R.L.; Ong, J.L.; Chee, M.W. Multi-Night at-Home Evaluation of Improved Sleep Detection and Classification with a Memory-Enhanced Consumer Sleep Tracker. Nat. Sci. Sleep 2022, 14, 645–660.
  35. Roberts, D.M.; Schade, M.M.; Mathew, G.M.; Gartenberg, D.; Buxton, O.M. Detecting Sleep Using Heart Rate and Motion Data from Multisensor Consumer-Grade Wearables, Relative to Wrist Actigraphy and Polysomnography. Sleep 2020, 43, zsaa045.
  36. Vest, A.N.; Da Poian, G.; Li, Q.; Liu, C.; Nemati, S.; Shah, A.J.; Clifford, G.D. An Open Source Benchmarked Toolbox for Cardiovascular Waveform and Interval Analysis. Physiol. Meas. 2018, 39, 105004.
  37. Lucreziotti, S.; Gavazzi, A.; Scelsi, L.; Inserra, C.; Klersy, C.; Campana, C.; Ghio, S.; Vanoli, E.; Tavazzi, L. Five-Minute Recording of Heart Rate Variability in Severe Chronic Heart Failure: Correlates with Right Ventricular Function and Prognostic Implications. Am. Heart J. 2000, 139, 1088–1095.
  38. Malik, M. Heart Rate Variability: Standards of Measurement, Physiological Interpretation, and Clinical Use: Task Force of The European Society of Cardiology and the North American Society for Pacing and Electrophysiology. Noninvasive Electrocardiol. 1996, 1, 151–181.
  39. Lomb, N.R. Least-Squares Frequency Analysis of Unequally Spaced Data. Astrophys. Space Sci. 1976, 39, 447–462.
  40. Scargle, J.D. Studies in Astronomical Time Series Analysis. II—Statistical Aspects of Spectral Analysis of Unevenly Spaced Data. Astrophys. J. 1982, 263, 835.
  41. Ruf, T. Lsp: Lomb-Scargle Periodogram in Lomb: Lomb-Scargle Periodogram. Available online: https://rdrr.io/cran/lomb/man/lsp.html (accessed on 23 January 2024).
  42. Burr, R.L. Interpretation of Normalized Spectral Heart Rate Variability Indices In Sleep Research: A Critical Review. Sleep 2007, 30, 913–919.
  43. Lin, L.I.-K. A Concordance Correlation Coefficient to Evaluate Reproducibility. Biometrics 1989, 45, 255.
  44. Menghini, L.; Cellini, N.; Goldstone, A.; Baker, F.C.; De Zambotti, M. A Standardized Framework for Testing the Performance of Sleep-Tracking Technology: Step-by-Step Guidelines and Open-Source Code. Sleep 2021, 44, zsaa170.
  45. Charlton, P.H.; Kotzen, K.; Mejía-Mejía, E.; Aston, P.J.; Budidha, K.; Mant, J.; Pettit, C.; Behar, J.A.; Kyriacou, P.A. Detecting Beats in the Photoplethysmogram: Benchmarking Open-Source Algorithms. Physiol. Meas. 2022, 43, 085007.
  46. Benedetti, D.; Olcese, U.; Frumento, P.; Bazzani, A.; Bruno, S.; d’Ascanio, P.; Maestri, M.; Bonanni, E.; Faraguna, U. Heart Rate Detection by Fitbit ChargeHR™: A Validation Study versus Portable Polysomnography. J. Sleep Res. 2021, 30, e13346.
  47. Nuuttila, O.-P.; Korhonen, E.; Laukkanen, J.; Kyröläinen, H. Validity of the Wrist-Worn Polar Vantage V2 to Measure Heart Rate and Heart Rate Variability at Rest. Sensors 2021, 22, 137.
  48. Henriksen, A.; Svartdal, F.; Grimsgaard, S.; Hartvigsen, G.; Hopstock, L.A. Polar Vantage and Oura Physical Activity and Sleep Trackers: Validation and Comparison Study. JMIR Form. Res. 2022, 6, e27248.
  49. Kinnunen, H.O.; Koskimäki, H. 0312 The HRV Of The Ring—Comparison Of Nocturnal HR And HRV Between A Commercially Available Wearable Ring And ECG. Sleep 2018, 41, A120.
  50. Allen, J.; Murray, A. Age-Related Changes in Peripheral Pulse Timing Characteristics at the Ears, Fingers and Toes. J. Hum. Hypertens. 2002, 16, 711–717.
  51. Allen, J.; Murray, A. Age-Related Changes in the Characteristics of the Photoplethysmographic Pulse Shape at Various Body Sites. Physiol. Meas. 2003, 24, 297–307.
  52. Lin, W.-H.; Zheng, D.; Li, G.; Chen, F. Age-Related Changes in Blood Volume Pulse Wave at Fingers and Ears. IEEE J. Biomed. Health Inform. 2023, 28, 5070–5080.
  53. Charlton, P.H.; Paliakaitė, B.; Pilt, K.; Bachler, M.; Zanelli, S.; Kulin, D.; Allen, J.; Hallab, M.; Bianchini, E.; Mayer, C.C.; et al. Assessing Hemodynamics from the Photoplethysmogram to Gain Insights into Vascular Age: A Review from VascAgeNet. Am. J. Physiol.-Heart Circ. Physiol. 2022, 322, H493–H522.
  54. Tsuchida, Y. The Effect of Aging and Arteriosclerosis on Human Skin Blood Flow. J. Dermatol. Sci. 1993, 5, 175–181.
  55. Allen, J.; O’Sullivan, J.; Stansby, G.; Murray, A. Age-Related Changes in Pulse Risetime Measured by Multi-Site Photoplethysmography. Physiol. Meas. 2020, 41, 074001.
  56. Li, K.; Rüdiger, H.; Ziemssen, T. Spectral Analysis of Heart Rate Variability: Time Window Matters. Front. Neurol. 2019, 10, 545.
  57. Binici, Z.; Mouridsen, M.R.; Køber, L.; Sajadieh, A. Decreased Nighttime Heart Rate Variability Is Associated With Increased Stroke Risk. Stroke 2011, 42, 3196–3201.
  58. Lagi, A.; Tamburini, C.; Fattorini, L.; Cencetti, S. Autonomic Control of Heart Rate Variability in Vasovagal Syncope: A Study of the Nighttime Period in 24-Hour Recordings. Clin. Auton. Res. 1999, 9, 179–183.
  59. McNames, J.; Aboy, M. Reliability and Accuracy of Heart Rate Variability Metrics versus ECG Segment Duration. Med. Bio Eng. Comput. 2006, 44, 747–756.
  60. Mejía-Mejía, E.; Kyriacou, P.A. Duration of Photoplethysmographic Signals for the Extraction of Pulse Rate Variability Indices. Biomed. Signal Process. Control 2023, 80, 104214.
Figure 1. Diagram showing placement of devices and data samples before filtering. The right panel shows 5 min NN/IBI timeseries and derived HR and HRV values for one night.
Figure 2. Flowchart showing ECG NN and Oura IBI processing pipelines. A stringent validity proportion threshold of 80% was applied to Oura IBI data.
Figure 3. Scatter plots showing correlations between Oura IBI-derived measures and ECG-derived measures separated by age group for different window sizes: (A) 5 min, (B) 30 min and (C) night levels. Validity proportion threshold was set at 80%. Every colour represents one participant within the age group.
Figure 4. Bland–Altman plots showing the agreement between Oura IBI-derived measures and ECG-derived measures (reference) separated by age for different window sizes: (A) 5 min, (B) 30 min, and (C) Night. IDD (inter-device difference) was defined as ECG measures subtracted from the respective Oura measures. The IDD distributions, plotted on the right of each Bland–Altman plot, showed a tight cluster around 0. Results are shown for 80% validity proportion threshold. Every colour represents one participant within the age group.
Figure 5. Heatmaps showing participant-level MdAPE values at different validity proportion thresholds for HR/HRV measures. Each row represents one individual across all validity proportion thresholds. Within each age group, participants were sorted by the mean MdAPE across the validity proportion thresholds. While errors were very low across the board for HR (dark green), many participants had unacceptable rMSSD and HFnu MdAPE (>10%, yellow and red), especially those in the older group.
Table 1. Participant demographics and sleep characteristics, in median (IQR).

| | Units | Younger (<45 Years) | Older (≥45 Years) | Mann-Whitney U Test p-Value |
|---|---|---|---|---|
| Demographics | | | | |
| Number of participants | | 92 [42 males] | 22 [8 males] | |
| Age | years | 25.0 (10.3) | 56.5 (12.0) | p < 0.001 |
| BMI | kg/m² | 22.1 (3.2) | 24.3 (3.0) | p = 0.019 |
| Office SBP | mmHg | 110.0 (16.0) | 117.0 (19.5) | p = 0.059 |
| Office DBP | mmHg | 70.0 (9.0) | 75.0 (17.0) | p = 0.169 |
| Sleep Characteristics (Oura) | | | | |
| Time-in-Bed | hours | 7.52 (1.10) | 7.38 (0.73) | p = 0.980 |
| Bed-time | hh:mm | 00:45 (1.57 h) | 23:55 (1.46 h) | p = 0.005 |
| Wake-time | hh:mm | 08:18 (1.96 h) | 07:02 (1.56 h) | p = 0.009 |
| Sleep Efficiency | % | 90.0 (7.0) | 87.0 (9.0) | p = 0.089 |
Table 2. Data retention rates at different validity proportion thresholds.

| Validity Proportion Threshold | Retention Rate (%), Younger | Retention Rate (%), Older |
|---|---|---|
| Initial 5 min segment count | 14,073 | 3485 |
| 30% | 93.3 | 95.2 |
| 50% | 87.0 | 89.4 |
| 80% | 66.7 | 72.3 |
| 95% | 32.8 | 44.7 |
Table 3. Summary of findings from wearable HR/HRV validation studies.

| Publication | Devices | Setting | Window Size | Study Sample (Age ± SD) | Correlations (r) * | Mean Bias * |
|---|---|---|---|---|---|---|
| Kinnunen et al., 2020 [10] | Oura Ring Gen 2 (PPG, finger); Somnologica/Faros 90/Faros 180 (ECG, Reference) | Free living, S | 5 min * | N = 49 (31.6 ± 11.8) | HR = 0.996; rMSSD = 0.980 | HR = −0.63 bpm; rMSSD = −1.20 ms |
| Benedetti et al., 2021 [46] | FitBit ChargeHR (PPG, wrist); Morpheus Home Portable PSG (ECG, Reference) | Free living, S | 1 min * | N = 25 (22.4 ± 3.0) | HR < 100 bpm: HR = 0.84; HR > 100 bpm: HR = 0.35 | HR = −0.66 bpm |
| Nuuttila et al., 2021 [47] | Polar Vantage V2 (PPG, wrist); Polar H10 (ECG, Reference) | Free living, S | 5 min * | N = 29 (36.0 ± 7.0) | HR = 0.998; ln(rMSSD) = 0.963 | HR = 0.70 bpm; ln(rMSSD) = 0.17 ms |
| Cao et al., 2022 [13] | Oura Ring Gen 3 (PPG, finger); Shimmer 3 (ECG, Reference) | Free living, S | 5 min *; Night | N = 46 (32.3 ± 6.4) | HR = 0.993; rMSSD = 0.915; SDNN = 0.518; AVNN = 0.825; LF (absolute) = 0.424; HF (absolute) = 0.627; LF/HF ratio = 0.354 | HR = −0.44 bpm; rMSSD = −14.97 ms; SDNN = −0.96 ms; AVNN = −13.39 ms; LF (absolute) = 23.61 ms²; HF (absolute) = 30.23 ms²; LF/HF ratio = −0.11 |
| Henriksen et al., 2022 [48] | Oura Ring Gen 2 (PPG, finger); Actiheart 4 (ECG, Reference) | Free living, S | Night * | N = 21 (33.0 ± 14.0) | RHR = 0.900 | RHR = −1.00 bpm |
| Current paper | Oura Ring Gen 3 (PPG, finger); SOMNOtouch (ECG, Reference) | In-lab, S | 5 min *; 30 min; Night | Younger: N = 92 (27.4 ± 6.5); Older: N = 22 (58.0 ± 6.9) | Younger: HR = 0.992, rMSSD = 0.979, HFnu = 0.931; Older: HR = 0.994, rMSSD = 0.937, HFnu = 0.902 | Younger: HR = −0.64 bpm, rMSSD = 2.50 ms, HFnu = 0.03; Older: HR = −0.42 bpm, rMSSD = 3.79 ms, HFnu = 0.03 |

* For each study, only results for this time window are shown. S: Sleep; W: Wake; SDNN: Standard Deviation of NN intervals; AVNN: Average of NN intervals; RHR: Resting Heart Rate; ln(rMSSD): Natural log of rMSSD.
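The agreement metrics reported in this paper (mean bias as the inter-device difference, MAE, MAPE, and MdAPE) can be computed from paired Oura and ECG values as in the minimal sketch below. The formulas are the standard definitions, with the difference taken as Oura minus ECG as stated in the Figure 4 caption; the function name and the example values are hypothetical and are not data from this study.

```python
from statistics import mean, median


def agreement_metrics(oura, ecg):
    """Return bias, MAE, MAPE, and MdAPE for paired device/reference values.

    `oura` and `ecg` are equal-length sequences of the same metric
    (for example 5 min rMSSD values), with ECG as the reference.
    """
    if len(oura) != len(ecg) or not oura:
        raise ValueError("inputs must be non-empty and of equal length")
    # Inter-device difference (IDD): ECG subtracted from Oura.
    idd = [o - e for o, e in zip(oura, ecg)]
    # Absolute percentage error relative to the ECG reference.
    ape = [100.0 * abs(d) / abs(e) for d, e in zip(idd, ecg)]
    return {
        "bias": mean(idd),
        "mae": mean(abs(d) for d in idd),
        "mape": mean(ape),
        "mdape": median(ape),
    }


# Hypothetical paired rMSSD values in ms (illustration only).
metrics = agreement_metrics(oura=[55.0, 44.0, 30.0], ecg=[50.0, 40.0, 30.0])
print(metrics)
```

Because MdAPE uses the median rather than the mean, a few segments with large errors influence it less than they influence MAPE.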
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

Share and Cite

MDPI and ACS Style

Liang, T.; Yilmaz, G.; Soon, C.-S. Deriving Accurate Nocturnal Heart Rate, rMSSD and Frequency HRV from the Oura Ring. Sensors 2024, 24, 7475. https://doi.org/10.3390/s24237475
