1. Introduction
According to the physical activity guidelines of the American College of Sports Medicine (ACSM), adults should receive health benefits when they undertake 150 min of moderate-intensity cardiorespiratory exercise training or 75 min of vigorous-intensity cardiorespiratory exercise training per week [1, 2, 3]. To maximize the benefits gained from exercise, it is essential for people to monitor their heart rate (HR) and energy expenditure (EE) during exercise. HR monitoring makes it possible for people to track their exercise intensity and quantify it in a valuable way, and EE monitoring could help people better control the time and volume of exercise.
Electrocardiography (ECG) and Polar HR monitors are generally used for HR monitoring [4, 5, 6]. However, ECG is more suitable for laboratory and hospital settings than for individual use, and wearing a Polar HR chest strap is comparatively cumbersome. EE can be measured with several methods, of which indirect calorimetry and the doubly labeled water method are recognized as the “gold standards” [7, 8, 9]. However, their high cost, large equipment size, complex processes, cumbersome analysis, and data lag make them suitable for use only in a laboratory or for small groups.
Today, wearable devices, represented by sports bracelets, are rapidly gaining popularity worldwide due to their low cost, ease of wear, low activity restrictions, and ability to provide simple and timely data on some physical activity indicators (e.g., HR, EE, and step count). In one American study, approximately 62% of adults reported that they met the ACSM’s physical activity guidelines; however, when their physical activity was monitored using an accelerometer, the percentage of those meeting the guidelines was found to be 9.6% lower [10]. Thus, wearable devices that monitor exercise, such as accelerometers, can provide accurate information on physical activity, which is key to both the development of effective exercise intervention strategies and the improvement of people’s health [11].
As wearable devices become more common, it is critical that they produce reliable and valid data. In terms of HR measurement, many peer-reviewed studies have been conducted to evaluate the reliability and validity of wearable devices produced by different brands, such as Fizzo, Fitbit, Apple Watch, and Garmin [6, 12, 13, 14, 15, 16, 17]. Some studies have shown that these devices more accurately measure HR at lower treadmill speeds [18, 19, 20]; however, others have shown that they are more accurate at higher treadmill speeds [21]. For the measurement of EE, most studies have revealed that wearable devices return inaccurate measurements [12, 16, 17, 22]. Shcherbina et al. [16] tested seven devices on healthy subjects and found that while HR measurements were within acceptable margins of error, none of the tested devices’ EE measurements were within acceptable ranges. The limitations of that study include the fact that different devices and exercise regimens were used, different exercise intensities were tested, and different reference standards were used. Due to these limitations, the reliability and validity of the findings cannot be generalized to all wearable devices. Moreover, considering the individual differences within large populations, wearable devices should have good reliability and validity for people with different physical activity levels and different exercise intensities. However, no previous study has evaluated the measurement accuracy of wearable devices in individuals with different physical activity levels and different exercise intensities.
Some previous studies compared the measurement results of wearable devices when they are worn on the dominant and non-dominant hands. However, the findings of these studies are inconsistent [6, 23, 24]. Considering the accelerating rate of technological change and the increasing speed at which wearable devices are being updated, it is important to evaluate the reliability and validity of current devices, especially those that are popular. In this study, we evaluated the new and popular Lexin Mio smart bracelet (Mio Band 1) due to its reasonable cost and compatibility with both Android and iOS systems. Previous studies on Mio smart bracelets have been sparse and based primarily on the accuracy of predicted EE [22]. The manufacturer of the device used in the present study does not specify whether the smart bracelet should be worn on the dominant or non-dominant wrist or whether different results should be expected based on the wrist type.
Therefore, the first objective of this study was to test the reliability and validity of a popular wearable activity monitor in assessing the HR and EE of people with different physical activity levels exercising at different intensities. The second objective was to determine whether there was a difference in the physical activity assessment when the device was worn on the dominant wrist and non-dominant wrist. This was a preliminary study conducted in a laboratory environment to simplify data recording and exclude unexpected perturbations. Based on the outcomes of this study, future studies may be conducted in a real-world environment.
Based on the findings of previous studies, we hypothesized that the smart bracelet has good reliability and validity in measuring HR and that the EE measurement results are inferior to the HR measurement results. In addition, we hypothesized that wearing the smart bracelet on the dominant or non-dominant hand does not affect the effectiveness of the physical activity assessment.
2. Materials and Methods
2.1. Ethics
The study was conducted according to the guidelines of the Declaration of Helsinki and approved by the Experimental Ethics Committee for Sports Science of Beijing Sport University (Approval number: 2020133H, 16 September 2021). Informed consent was obtained from all subjects involved in the study.
2.2. Participants
The number of participants was set at 65 (at least 20 people per group) based on the findings of Wallen et al. [25], who used a power of 0.5 and a type I error probability of 5%. This sample size is comparable to that of previous studies [6, 16, 26] that examined the use of wrist-worn health-tracking devices, which included anywhere from 20 to 60 individuals. All of the participants were healthy, right-handed students (41 females and 24 males) aged 18–38 years (mean = 22.06 years) from Beijing Sport University. They were selected based on the following inclusion criteria: (a) aged 18–40 years; (b) no contraindications to exercise; and (c) not concurrently participating in another biomedical study. Some previous articles have shown that skin tone affects the HR readings of optical sensors [27]. To avoid the effect of skin tone, all of the subjects recruited for this study were Chinese.
Before each test, the participants were prohibited from consuming alcohol or caffeine for 12 h and from eating anything for >1 h. They were reminded to wear comfortable sportswear and shoes during the test. If a participant had not complied with these conditions, their test was rescheduled. Before the first test, each participant was asked to complete an informed consent form, the Physical Activity Readiness Questionnaire (PAR-Q), and the International Physical Activity Questionnaire (IPAQ). The IPAQ is one of the most valid and widely used questionnaires for measuring physical activity levels in adults aged 15–69 years and has been used in Chinese population studies [28, 29, 30]. Each participant’s physical activity level was calculated using the IPAQ results, and the results were used to group the participants into three categories: those with a low physical activity level (LPAL, n = 22), those with a moderate physical activity level (MPAL, n = 23), and those with a high physical activity level (HPAL, n = 20). After completing all of the tests, each participant was given a personalized aerobic exercise prescription.
2.3. Wearable Devices
During the test, the participants wore two wearable motion monitors at the same time: a Mio smart bracelet on each wrist (henceforth, “left Mio” and “right Mio”) (Figure 1).
The Mio smart bracelet is a wristband wearable device made in China and based on photoplethysmography (PPG) and a three-axis acceleration sensor. This device provides multidimensional health indicator data on HR, EE, mileage, step count, sleep length, stress level, and blood oxygen saturation.
HR is measured using PPG, an optical technique in which HR is determined by monitoring changes in blood volume beneath the skin [31]. The HR sensor collects dynamic HR data at a rate of once per second and displays the HR in real time on the active-matrix organic light-emitting diode (AMOLED) screen of the bracelet. In addition, an exercise HR range and a high-HR warning value can be set through a mobile phone application, which alerts the wearer when the HR is too high to ensure their safety. EE is calculated using algorithms that are not openly disclosed [25].
2.4. Criterion Measures
In addition to the Mio smart bracelets, the participants wore the following standard measurement devices to examine the accuracy of the Mio smart bracelets.
2.4.1. Polar Heart Rate Monitor
A Polar H10 HR monitor (Polar Electro, Helsinki, Finland) was used as the standard instrument for measuring HR along with the Mio smart bracelet, as studies have shown it to be a valid and reliable tool for HR measurement compared to the 12-lead ECG [4, 5, 6]. The HR sensor is worn on the chest and transmits the user’s real-time HR data to a watch.
2.4.2. Cardiorespiratory Function Test System
EE was measured using a benchtop gas-analysis system (GAS, Metalyzer 3B, Cortex, Leipzig, Germany) and indirect calorimetry. The subject’s inhaled and exhaled gases were collected through a gas collection tube (Hans Rudolph pneumotachometer) connected to a face mask and then analyzed using the GAS. The following parameters were measured: HR, total EE (kcal), oxygen consumption (VO2), carbon dioxide production (VCO2), and respiratory quotient (RQ). The Weir equation [32] was used: EE (kcal/min) = 3.9 × VO2 (L/min) + 1.1 × VCO2 (L/min). Before each test, the instrument was warmed up for 30–40 min, and then separate calibrations were performed, including room air calibration, standard gas calibration, capacity calibration, and delayed calibration. After all the calibrations were completed, formal testing began. The GAS used in this experiment has been proven to be a reliable instrument for testing EE in sports medicine research [33].
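The Weir calculation above can be sketched in a few lines of Python. This is only an illustration using the rounded coefficients quoted in the text; the function names and the per-minute sampling interval are assumptions, not the GAS software's actual implementation.

```python
def weir_ee_kcal_per_min(vo2_l_min: float, vco2_l_min: float) -> float:
    """EE (kcal/min) from VO2 and VCO2 (both in L/min), rounded Weir coefficients."""
    return 3.9 * vo2_l_min + 1.1 * vco2_l_min


def total_ee_kcal(samples, interval_s: float = 60.0) -> float:
    """Total EE (kcal) over (VO2, VCO2) samples, each assumed to cover interval_s seconds."""
    return sum(
        weir_ee_kcal_per_min(vo2, vco2) * interval_s / 60.0
        for vo2, vco2 in samples
    )


# Hypothetical 6-min stage with constant gas exchange (illustrative values only).
stage = [(2.0, 1.8)] * 6
print(round(weir_ee_kcal_per_min(2.0, 1.8), 2))  # 9.78
print(round(total_ee_kcal(stage), 2))            # 58.68
```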
2.5. Other Measures
Each participant’s height and weight were measured. They were instructed to wear sports attire and were asked to remove their shoes before having their height and weight measured twice. When there was a difference of 0.25 cm in height or 0.1 kg in weight between the two measurements, a third measurement was taken. The participant’s height and weight were entered into the Mio smart bracelet and GAS before the exercise test began. The Borg Rating of Perceived Exertion (RPE) scale was used to measure the participants’ perceived level of exercise fatigue while they performed the test. The scores ranged from 6 to 20, which corresponded to HRs of 60–200 beats per minute (bpm) [34].
2.6. Procedures
The study procedure is shown in Figure 2. After completing the informed consent form, the PAR-Q, and the IPAQ, the participants had their basic personal measurements recorded (i.e., height, weight, waist circumference, hip circumference, and blood pressure). Trained master’s students performed the anthropometric measurements.
Each participant’s measurements were used to initialize the wearable device and the GAS. Two Mio smart bracelets were placed on the left and right wrists, above the ulnar styloid process. At the same time, the Polar HR band was placed at the level of the participant’s thoracic glabella with the help of the researcher. All three devices were tightly secured to ensure contact with the skin [35], and then the participant was fitted with a breathing mask and the GAS was connected to the breathing mask.
The laboratory temperature was maintained at 25 ± 1 °C, the relative humidity was 40–60%, and the laboratory environment was quiet and free of noise. To avoid unexpected problems with the functionality of the Mio device, 20 Mio smart bracelets were available.
The test was conducted three times for each participant. The first test was mainly used to measure basic anthropometric data and maximum oxygen uptake (VO2max). This test was performed according to the Bruce treadmill protocol, and VO2max was assessed with the GAS. The standard Bruce Protocol was utilized, with increases in speed (Stage 1, 2.7 km/h; Stage 2, 4.0 km/h; Stage 3, 5.4 km/h; Stage 4, 6.7 km/h; and Stage 5, 8.0 km/h) and incline (starting at 10% grade and increasing 2% at every stage) every 3 min until exhaustion [36]. When participants met two of the following four requirements, the test was terminated: (1) a plateau in HR despite increased activity; (2) peak oxygen uptake during incremental exercise, with constant or small increases in oxygen uptake as the load increases; (3) failure to keep up with treadmill speed; and (4) respiratory exchange ratio (RER) > 1.10. On a subjective level, exercise testing could also be terminated when the subject reached volitional exhaustion (RPE of 19 or 20) [37, 38]. The second and third tests were mainly used to measure HR and EE during an exercise program. These two tests began with the participants sitting quietly for 10 min. Then, the participants completed the running test at 30% vVO2max, 60% vVO2max, and 90% vVO2max, which corresponded to three speeds on a 1% running platform slope. The speed for each intensity was set using the following relationship [39]: expected intensity × VO2max = 3.5 + (0.2 × vVO2max) + (0.9 × vVO2max × slope%). We used the 1% slope to simulate the air resistance that runners experience outdoors [40]. The test time for each exercise intensity was 6 min, with a 1-min interval between intensities. The last stage was a 10-min recovery period. The RPE was measured in the last 15 s of each phase. The latter two tests were conducted at the same time of day on different days, and the interval between consecutive tests was 48 h to 5 days.
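The speed calculation can be illustrated with a short sketch. It assumes that the relationship above follows the ACSM running equation, with oxygen uptake in mL/kg/min, speed in m/min, and slope expressed as a fraction; these units are not stated in the text, and the function name `treadmill_speed_kmh` is hypothetical.

```python
def treadmill_speed_kmh(vo2max_ml_kg_min: float, intensity: float,
                        slope: float = 0.01) -> float:
    """Solve intensity * VO2max = 3.5 + 0.2*v + 0.9*v*slope for the speed v.

    Assumed units: VO2max in mL/kg/min, v in m/min, slope as a fraction
    (0.01 for the 1% slope). The result is converted to km/h.
    """
    target_vo2 = intensity * vo2max_ml_kg_min
    if target_vo2 <= 3.5:
        raise ValueError("target VO2 must exceed the 3.5 mL/kg/min resting term")
    speed_m_min = (target_vo2 - 3.5) / (0.2 + 0.9 * slope)
    return speed_m_min * 60.0 / 1000.0


# Hypothetical participant with a VO2max of 50 mL/kg/min.
for pct in (0.3, 0.6, 0.9):
    print(pct, round(treadmill_speed_kmh(50.0, pct), 2))
```

Because the speed is derived from each participant's own VO2max, two participants running at the same relative intensity will generally run at different absolute speeds.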
If a participant was uncomfortable during any of the tests, the treadmill emergency device could be used to immediately stop the treadmill. A trained researcher was present beside the treadmill for every test to adjust the speed and protect the participants.
2.7. Data Processing
The raw HR data were obtained by accessing the backend website of the Mio smart bracelet (https://jiankang.coolplay.tv/users/index, accessed on 20 November 2021) and were exported to Excel. The required data were extracted and transferred to Statistical Product Service Solutions (SPSS) for analysis. The second and third tests used the Mio smart bracelet, the Polar HR monitor, and the GAS to simultaneously monitor the participants’ HR and EE in real-time. To allow the participants sufficient time to reach a steady state, the average HR and EE values were recorded during the last 4 min of the rest periods, during the middle 4 min of the three exercise intensity running phases, and during the entire recovery period.
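The averaging windows described above can be sketched as follows. The example assumes one HR sample per second, which matches the stated sampling rate of the bracelet; the helper names and the synthetic data are illustrative and do not reflect the export format of the backend website.

```python
from statistics import mean


def last_minutes(samples, minutes, rate_hz=1.0):
    """Mean of the final `minutes` of a phase (used here for the rest period)."""
    n = int(minutes * 60 * rate_hz)
    if n <= 0 or n > len(samples):
        raise ValueError("window does not fit inside the phase")
    return mean(samples[-n:])


def middle_minutes(samples, minutes, rate_hz=1.0):
    """Mean of the central `minutes` of a phase (used here for the running stages)."""
    n = int(minutes * 60 * rate_hz)
    if n <= 0 or n > len(samples):
        raise ValueError("window does not fit inside the phase")
    start = (len(samples) - n) // 2
    return mean(samples[start:start + n])


# Synthetic 1 Hz data: a 10-min rest and a 6-min running stage.
rest = [90] * 360 + [70] * 240               # HR settles during the last 4 min
stage = [100] * 60 + [150] * 240 + [200] * 60
recovery = [120] * 600

print(last_minutes(rest, 4))     # 70
print(middle_minutes(stage, 4))  # 150
print(mean(recovery))            # 120 (entire recovery period)
```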
2.8. Statistical Analyses
SPSS statistical software (Version 26; SPSS, IBM Corporation, Armonk, NY, USA) was used to analyze the HR and EE data collected by each device. Descriptive data were reported as mean and standard deviation values. Reliability was determined by measuring the left–right Mios’ consistency and ipsilateral measurement repeatability. The left–right Mios’ consistencies were calculated using the paired Student’s t-test (t-test), mean absolute percentage error (MAPE), Pearson’s correlation coefficient, and intraclass correlation coefficient (ICC) between the left and right Mio data. Ipsilateral measurement repeatability was calculated using the t-test, MAPE, Pearson’s correlation coefficient, and ICC between two measurements on one side.
To determine validity, the t-test, MAPE, Pearson’s correlation coefficient, ICC, and Bland–Altman statistical methods were used between the left and right Mios and the standard measurement devices (Polar and GAS).
In the t-test, a p-value ≥ 0.05 indicated no significant difference between the data and thus good reliability/validity. The MAPE was used to assess the degree of error between the standard measurement instrument output and the Mio smart bracelet output for each measured value. A MAPE < 5% indicated high reliability/validity (precise), 5% ≤ MAPE ≤ 10% was acceptable, and a MAPE > 10% indicated low reliability/validity (imprecise) [16, 41, 42]. Pearson’s correlation was calculated to determine the degree of correlation between mean values, with p < 0.05 considered statistically significant; the closer the correlation coefficient was to 1, the stronger the correlation and the better the reliability/validity. For the ICC, absolute agreement in a two-way mixed-effects model was analyzed. The ICC was interpreted as low (<0.4), moderate (0.4–0.75), or high (>0.75) [43]. For the Bland–Altman method, the bias (mean difference, MD) and the 95% limits of agreement (LOA; MD ± 1.96 × standard deviation of the differences) were analyzed. If the differences between the two measurements fell within the 95% LOA, the two devices were considered to be in good agreement (the more tightly clustered the points in the scatter plot, the better). Overall, the device was considered to have good reliability/validity if more than half of these results were acceptable.
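The MAPE, ICC, and Bland–Altman criteria described above can be sketched in Python. This is a minimal illustration with hypothetical function names and made-up HR values; it does not reproduce the SPSS procedures used in the study, and the ICC itself is not computed here, only classified.

```python
from statistics import mean, stdev


def mape(device, criterion):
    """Mean absolute percentage error of device readings against the criterion."""
    return 100.0 * mean([abs(d - c) / c for d, c in zip(device, criterion)])


def classify_mape(value):
    """Thresholds from the text: <5% high, 5-10% acceptable, >10% low."""
    if value < 5.0:
        return "high"
    if value <= 10.0:
        return "acceptable"
    return "low"


def classify_icc(icc):
    """Thresholds from the text: <0.4 low, 0.4-0.75 moderate, >0.75 high."""
    if icc < 0.4:
        return "low"
    if icc <= 0.75:
        return "moderate"
    return "high"


def bland_altman(device, criterion):
    """Return (bias, lower LOA, upper LOA) using bias +/- 1.96 * SD of the differences."""
    diffs = [d - c for d, c in zip(device, criterion)]
    bias = mean(diffs)
    sd = stdev(diffs)  # sample standard deviation of the paired differences
    return bias, bias - 1.96 * sd, bias + 1.96 * sd


# Made-up HR values (bpm): bracelet versus chest strap.
mio = [102, 98, 105, 95]
polar = [100, 100, 100, 100]
print(round(mape(mio, polar), 2), classify_mape(mape(mio, polar)))  # 3.5 high
print(bland_altman(mio, polar))
```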
4. Discussion
This study examined the reliability and validity of a commonly used wrist-worn, consumer-grade activity monitor for measuring HR and EE at different exercise intensities in people with different physical activity levels. The results of this study are generally consistent with those of previous studies [20, 25, 35]. Regarding HR measurement, in most cases, this wearable device had good reliability and validity in the laboratory. However, the device performed poorly in measuring EE. This result is also consistent with those of previous studies, in which most wearable devices were found to measure EE with low reliability and validity [12, 16, 44, 45].
In terms of HR monitoring, the reliability and validity of the Mio smart bracelet decreased as the level of physical activity and exercise intensity increased. This was similar to the results of previous studies [20, 46]. This may be because the acceleration sensor is not sensitive enough to measure exact values when the treadmill is operating at high speeds.
Regarding EE, it remains unclear why the device was not able to accurately measure this parameter, as the calculation of EE values depends on many anthropometric characteristics derived from the HR [47]. According to the manufacturer, the Mio smart bracelet’s proprietary algorithms estimate EE using HR, active calories, and basal metabolic rate (BMR) data. Hence, it was expected that the EE values reported by the device would be more accurate; instead, they showed a large bias in most of the activities. This was probably due to the effects of the different physical activity levels and the exercise intensities. Therefore, the results should be viewed with caution when this device is used to measure EE.
Negative correlations were found in the validity tests for HR and EE measurement in some participants. However, analysis of data from repeated measurements showed good consistency of the results. Considering that the majority of the negative correlations were observed in high-intensity tests, we concluded that the device is unstable during high-intensity activity and that this was a significant contributing factor.
In our study, we found that the reliability of the left Mio for EE measurement was relatively poor compared with that of the right Mio. Because all participants were right-handed, this suggests that the Mio smart bracelet measures EE more reliably when worn on the wrist of the dominant hand. Since the dominant arm is stronger and used more often than the non-dominant arm [48], the wrist selected may affect the outcome in assessing physical activity [24]. However, few studies have investigated the differences in EE measurement between smart bracelets worn on the left and right hands, which may be related to the generally poor validity of smart bracelets for measuring EE.
Many previous studies [5, 16, 41] have tested protocols in uniform speed progression schemes, which means that the speed was not adjusted for people with different physical activity levels. As a result, some of the speeds used in those experiments may not have matched the abilities of some participants, which could give the impression that exercise intensity monitoring is accurate only at certain speeds and thus discourage the application of these devices to the general public. To avoid this problem, the speed progression scheme used in this study was personalized to ensure the safety of the trial and to ensure the suitability of the smart bracelet for a variety of situations.
4.1. Implications
In this study, the Mio smart bracelet showed good reliability and validity in terms of HR detection. The next step should involve conducting outdoor field testing of the ability of the Mio smart bracelet to detect HR and exploring its application to people with chronic diseases. In the future, wrist-worn devices can be integrated into medical treatments, such as prescriptions for exercise for patients with type 2 diabetes, hyperlipidemia, and digestive tract tumors. However, these applications require devices to provide more accurate HR and EE data than those provided by current models. Nevertheless, the good reliability and validity of the Mio smart bracelet assessed in this study for HR measurement creates the possibility of using this device to measure additional metrics, such as blood pressure and VO2max.
Our findings suggest that using accurate and convenient wrist-worn devices could provide benefits to people’s fitness and in competitive sports. The real-time and efficient monitoring of exercise load could help address the needs of sportspeople for scientific exercise purposes and injury prevention.
Wearable consumer devices are constantly being improved by their respective manufacturers. Future development should focus on improving or personalizing the formulas used to estimate EE so that they provide more accurate data. The algorithms used by these wearable devices to measure HR and EE are proprietary and may change without the knowledge of the general public. Therefore, those who want to use these devices in healthcare settings should always interpret their output critically and with caution.
4.2. Limitations
This study had the following limitations. First, we tested only on a treadmill in a laboratory, an environment that is more controlled than daily life. Second, our participants were all healthy students; athletes and people with health conditions were not included. As a consequence, our results cannot be generalized to other conditions.