1. Introduction
Cryptogenic stroke, with a prevalence of 30% of cases, is a subtype of ischemic stroke without an identifiable cause after a comprehensive evaluation. However, there are several potential underlying causes that must be studied to guide treatment and secondary prevention strategies. One of the most common causes is undetected atrial fibrillation (AF) [
1].
In general, stroke is a major public health problem worldwide with a significant burden on healthcare systems in terms of costs and resources. The estimated total cost of stroke in Europe is over EUR 60 billion per year [
2]. Unfortunately, the burden of stroke is expected to rise in the incoming years with a parallel increase in healthcare service demand. To address the challenge, new stroke care pathways are needed to improve the efficiency of stroke care.
It has been demonstrated that AF is one of the main causes of recurrent stroke, and people with AF are about five times more likely to have a stroke than those without AF [
3]. Moreover, the recurrence of ischemic stroke in people with AF is also more likely to be severe and disabling [
4]. Therefore, heart rhythm monitoring to detect AF remains the basis of the follow up of cryptogenic stroke survivors, since identifying this condition may optimize secondary prevention measures, such as anticoagulation therapy, to reduce the risk of recurrent ischemic stroke [
5,
6]. However, in some cases, AF is usually complex to detect using conventional cardiac monitoring due to its paroxysmal nature, which means intermittent AF episodes of irregular and rapid heartbeats that stop on their own without treatment and are not usually associated with noticeable symptoms [
7]. Consequently, innovative methods to increase the rate of early AF detection in stroke patients using a simple and cost-effective screening protocol should be considered [
8].
New technologies are being developed to increase the rate of AF detection using prolonged rhythm monitoring, including implantable cardiac monitors (ICMs) that can be implanted under the skin to monitor the heart rhythm and detect episodes of AF over a longer period of time. Unfortunately, ICMs carry some disadvantages, such as invasiveness and high upfront cost, particularly when compared to traditional monitoring methods such as external Holter monitoring [
9]. However, it is important to note that cost is just one factor to consider when determining the appropriateness of a monitoring method. Other factors, such as patient preference, clinical judgment, availability of resources, and health insurance coverage may also play a role in decision making.
Commercially available smartwatches allow the collection of real-time heart rate (HR) data passively and continuously as well the detection of irregular heart rhythms through a non-invasive optical technology called photoplethysmography (PPG), which measures the pulse through a light sensor [
10]. It has been reported that smartwatches may be a cost-effective method for guiding AF detection in high-risk populations because they offer some advantages when compared to ICMs, such as non-invasiveness, cost, convenience, extra features, and acceptance [
11]. While PPG technology is less accurate than electrocardiogram (ECG) technology, studies have shown that it can be effective for detecting AF in some patients, particularly those with paroxysmal AF or with a high risk of developing AF [
12]. Furthermore, it was published that a PPG-based algorithm with an input of an irregular heart rhythm demonstrated a 98% positive predictive value to detect AF [
13].
Overall, the use of PPG-based smartwatches to increase the rate of AF detection in stroke patients holds promise for improving outcomes and reducing the risk of complications, such as recurrent stroke. However, further research is needed to evaluate the effectiveness of these methods in clinical practice. We conducted a pilot study to assess the HR measurement as well as the irregular rhythm notification feature of a consumer smartwatch to detect AF in stroke patients.
3. Results
During the development of the study, 76 stroke patients were screened, of whom 6 were excluded due to lost CEM signals. Then, 70 stroke patients successfully completed the HR assessment, of which 33 completed a round of 4 h of simultaneous ECG monitoring by CEM and AF data collection (IRN feature) (
Figure 8). The mean age was 79.4 years (SD ± 10.2); 62.9% were female; the median BMI was 26.3 (IQ 22.2–30.5); and the median NIHSS score was 8 (IQR 1.5–20). A history of AF diagnosis was present in 67.1% (47/70) of patients.
Table 1 displays the baseline characteristics of the patients.
A total of 526 individual pairs of HR measurements were collected from the Fitbit Charge 5 to determine the agreement of the HR measured by this device with that obtained by the reference method (CEM). For this, patient’s HR recordings were classified as sinus rhythm or AF during the measurements (
Table 2).
Figure 9 shows Lin’s concordance correlation coefficient (CCC) on paired HR data in scatterplots. For sinus rhythm, the CCC of the Fitbit Charge 5 was 0.791 (95% CI 0.350–0.452;
Figure 9a). However, the CCC of the Fitbit Charge 5 during AF episodes was 0.211 (95% CI 0.148–0.273;
Figure 9b).
The overall bias of the HR measurements by the Fitbit device based on the reference values (CEM-HR) was determined using the Bland-Altman plot. In general, this analysis revealed that the Fitbit Charge 5 tended to underestimate HR values with a mean difference of 10.18 bpm (
Figure 10). The limits of agreement (LoA) were quite broad, indicating that the measurements were less accurate over 90 bpm. These results confirm the tendency already observed in the scatterplots. The Bland-Altman plot showing the mean differences and limits of agreement is shown in
Figure 10.
Regarding the accuracy metrics of the HR measurements, we calculated the MAPE between all paired measurements of the Fitbit device and the reference values (CEM-HR). In sinus rhythm, the smartwatch had a MAPE of 6.18% (95% CI 4.12–8.24%). During an AF episode, the smartwatch had a MAPE of 16.48% (95% CI 11.64–21.67%) (
Table 3).
Figure 11 shows a graphical representation for summarizing and visualizing these statistical data.
We also searched for the influence of HR ≥ 100 bpm in the measurement accuracy of the Fitbit device.
Table 4 shows the results of our analysis, and the respective box plot is presented in
Figure 12. As expected, dividing patients according to HRs above 100 bpm led to the observation of a significant difference (
p < 0.05) in the sinus rhythm group and the AF group. We identified that HR ≥ 100 bpm significantly worsened the accuracy of the Fitbit device’s HR measurements during an AF episode (
Figure 12).
The IRN feature was collected from a total of 33 patients (
Figure 7). Participants underwent 4 h of simultaneous ECG monitoring by CEM and continuous PPG-HR recording using a Fitbit smartwatch. All the AF events referenced by CEM were recognized as AF by a blinded physician. A total of 29 (87.8%) of patients had AF during the 4-hour monitoring period. The IRN results matched the occurrence or absence of an AF episode based on the CEM system in 42.4% (14/33) of cases. Accordingly, the IRN feature had a sensitivity of 34.5% (95% CI 17.9–54.3%), and a specificity of 100% (95% CI 39.8–100%) for the detection of AF when compared to CEM. On the basis of an expected prevalence of 15% in the population, a PPV of 100% and an NPV of 86.6% (95% CI 86.9–91.9%) were estimated.
4. Discussion
In this study, we aimed to investigate the ability of a commercially available PPG-based smartwatch to aid physicians in AF screening and remote HR monitoring in stroke patients. To our knowledge, this is the first study to evaluate the accuracy of real-time average HR readings from PPG-based smartwatches reported in their health metrics dashboard. This study is part of a pilot study to establish the feasibility of an AF screening strategy for stroke patients using a consumer smartwatch (ClinicalTrials.gov Identifier: NCT05565781).
There were two different approaches tested: (1) to estimate the agreement between resting clinic HR measurements obtained by the smartwatch and by CEM, and (2) to assess the feasibility of an IRN feature that helps identify signs of AF in stroke patients.
We observed that the average HR collected from the dashboard of this PPG-based smartwatch tended to underestimate the real resting HR and that the bias was larger as the HR increased above 90 bpm. Regarding resting HR dynamics, we found that smartwatch-HR demonstrated a high concordance with CEM-HR in sinus rhythm, but a relatively weak correlation was observed during AF episodes (
Figure 9). Consistently, the MAPE for resting HR was >10% during AF and lower during sinus rhythm.
PPG-based smartwatches are specially designed to meet fitness goals and monitor HR during exercise. However, resting HR measurement via a smartwatch has not been properly evaluated for clinical purposes. Previous studies have suggested that PPG-based smartwatches underestimate the HR measurements when compared to a reference method or standard depending on the specific device and the conditions in which it is being used, such as physical activity level or clinical conditions [
5,
10].
Many studies reported that the accuracy of PPG-based smartwatches for HR measurement during physical activity is generally lower than that during rest [
10,
12,
15,
18]. For example, a recent study published in 2022 by Alfonso et al. involving PPG-based, wearable devices reported that this technology demonstrated good-to-excellent agreement in measuring HR during rest. However, the authors noted that the devices tended to underestimate HR during high-intensity exercise compared to a reference device. Overall, the study suggests that PPG-based HR sensors in smartwatches can provide fairly accurate readings during rest [
18]. Nevertheless, these studies focused mainly on healthy and younger populations, and it is important to consider the potential differences between younger and elderly populations when interpreting the results.
On the other hand, when referring to clinical conditions, several studies have compared the accuracy of PPG-based HR sensors in smartwatches to electrocardiogram (ECG) measurements during AF episodes, and the results have been mixed. Some studies have reported good agreement between PPG-based HR measurements and ECG measurements, while others have found significant differences between the two methods [
19].
Since we found significant differences for the smartwatch during AF with fast heartbeat, studies focused on the accuracy of HR estimation from smartwatches in tachyarrhythmias at rest were reviewed. For example, a study published by Koshy et al. in 2018 evaluated the accuracy of smartwatch HR measurements during rest in hospitalized patients, most of them aged 60 years and older, and found that several popular devices, including the Fitbit, had lower accuracy compared to a medical-grade ECG for HR in AF [
20]. Another study published by Sequeira et al. in 2020 to evaluate the accuracy of different wearable devices in measuring HR during supraventricular tachycardia (SVT) found that the Fitbit Charge 2 had lower accuracy in measuring HR during SVT compared to other devices among middle-aged and older patients with documented reentrant SVT [
21]. In 2020, Al-Kaisey et al. also reported that HR during AF was underestimated compared to that during sinus rhythm by wrist-worn heart rate monitors, including the Fitbit Charge 3. This study found that the Fitbit Charge 3 had lower sensitivity (75%) and positive predictive value (74%) while assessing HR monitoring during AF compared to a 12-lead electrocardiogram (ECG) among ambulant patients aged 55 and over [
22].
Our findings may be driven by the physical characteristics of our study population, especially those related to age, considering that the mean age of our patients was around 79 years. PPG technology estimates the HR by detecting subtle variations in the color of the skin, which correspond to changes in blood volume in the microvascular bed of tissue beneath the skin caused by the contraction and relaxation of the heart; however, compared to younger people, elderly people may have changes in the blood volume in their cutaneous microcirculation that can affect PPG measurements of HR [
23]. As people age, their skin becomes thinner and less elastic, which can cause a decrease in blood flow to the microvascular bed. This reduction in blood flow can make it more difficult for PPG sensors to detect changes in blood volume and accurately estimate HR. Additionally, elderly people may have other factors that can affect the accuracy of PPG measurements, such as medication use, chronic health conditions, or reduced physical activity levels. These factors can affect blood circulation and lead to changes in the microvascular bed, which can in turn affect PPG measurements [
23].
Overall, while PPG technology is generally effective at estimating HR in people of all ages, there may be some age-related differences in the accuracy of the measurements due to changes in the cutaneous microcirculation and other factors. This poses a challenge as to how smartwatches could help achieve preventive goals in elderly stroke patients. Further studies to assess the HR are needed in order to recommend whether the smartwatch is a reliable tool for monitoring HR in this population.
In general, studies have shown that PPG-based HR sensors in smartwatches can provide reliable HR measurements during sinus rhythm but may be less accurate during AF episodes. The accuracy of HR measurements during AF episodes can be affected by various factors, such as the irregularity of the heartbeat, motion artifacts, and changes in blood flow. Overall, while PPG-based HR sensors in smartwatches can be a useful tool for monitoring the HR, they may not be as accurate as ECG measurements during AF episodes, especially in challenging situations, such as with elderly stroke patients. It is important to use caution when interpreting HR measurements from smartwatches during AF episodes. Therefore, smartwatches may not be recommended to guide therapeutic decisions to control the HR in patients with AF. Researchers are working on the design of new PPG-based algorithms to provide better detection accuracy for the HR in various rhythm scenarios including AF [
24].
In contrast, promising results were reported regarding the IRN feature available on smartwatches. The IRN feature in smartwatches, such as the Fitbit devices, works by using PPG sensors, which continuously monitor the wearer’s heart rate and heart rhythm. If the device detects an irregular rhythm that is consistent with atrial fibrillation or other arrhythmias, it will alert the wearer with a notification. Recently, the Fitbit Heart Study, a large-scale clinical trial, was conducted by Fitbit in collaboration with the American Heart Association (AHA) and other organizations to evaluate the accuracy of the IRN feature on the Fitbit devices for detecting AF. The study involved over 400,000 participants who wore a Fitbit device that was equipped with the IRN feature for a period of up to one year. The results of the study, which were published in the Journal of the American College of Cardiology in 2022, showed that the IRN feature had a sensitivity of 84% and a specificity of 93% for detecting AF compared to ECG measurements. The authors of the study concluded that the IRN feature on Fitbit devices may be a useful tool for detecting AF and other arrhythmias in individuals at risk [
13].
Considering our results regarding the low sensitivity of the IRN feature, two in three cases of AF could be undetected by the PPG-based algorithm. From the perspective of screening programs for AF, where the sensitivity must be high, the use of the IRN feature should be complemented with more data, such as the age of the patients, biomarkers, and chronical conditions, to improve the accuracy of the algorithm. Conversely, the specificity of the IRN feature for the group of patients studied seems to provide very good reliability as all cases of AF were confirmed. In this way, ruling out false-positive alarms of AF may reduce patient anxiety and cardiologist overload. Nevertheless, the risk of misdiagnosis by the IRN must be assessed in prospective studies.
We assumed that a short episode of AF was barely detected by the algorithm. Consequently, the PPG-based algorithm may be more effective in patients with prolonged AF episodes. It has been reported that the IRN feature warned of an AF episode at least a few hours after it occurred, so it is not a real-time tool to detect an active AF episode unless the episode endures for a long time [
19]. Therefore, it is important to consider the overall accuracy and limitations of the PPG-based irregular rhythm algorithm feature in the context of early detection and management of AF in specific patient populations, such as older stroke patients, and to use it as an adjunct to standard diagnostic methods to improve AF diagnostics. This tool may be use as part of a low-cost screening strategy to accurately guide most AF diagnosis interventions by improving patient selection [
25].
Overall, smartwatches have the potential to improve the management of stroke patients in the long term. In patients with cryptogenic stroke, after standard-of-care cardiac monitoring, wearing a smartwatch could uncover episodes of AF by means of the IRN feature. However, it is important to note that smartwatches should not replace regular medical care and should be used in conjunction with medical assistance.
When interpreting our findings, a few limitations must be considered when talking about the IRN feature’s accuracy. In our study, we did not consider the inclusion of patients who presented with other arrhythmias different from AF. It would be important to know the performance of this PPG-based algorithm against other types of arrhythmias. For use as part of an AF screening program, the high specificity of this method can be of great advantage to identify undiagnosed patients with high risk. However, the nature of this study’s design did not enable us to investigate this further. Most notably, our study was designed to include only stroke patients in hospital settings. Our findings and their implications should not be translated to ambulatory environments. In addition, the high median age of the patients may have affected the results regarding the PPG assessment. With regard to the accuracy of the HR measurements, the impact of other clinical variables, such as stroke symptoms, was not considered in this report. Nevertheless, no PPG signals were discarded due to limb weakness.