1. Introduction
During the 2020 Olympic Games in Tokyo and the 2022 FIFA World Cup in Qatar, temperatures above 30 °C are expected [1,2]. Prolonged, intense exercise in such a hot environment impairs athletic performance [3], raises core body temperature (Tc), and increases the risk of potentially life-threatening exertional heat illness (heat stroke), which is associated with a Tc above 40 °C [4]. Numerous strategies have been developed to offset the impact of thermally stressful environmental conditions. Of these, heat acclimation and heat acclimatization appear to provide the greatest benefits [5]. For optimal heat acclimation, training guidelines advise athletes to exercise for a prolonged time (e.g., 60–90 min) at a Tc above 38.5 °C [6,7]. Monitoring Tc during training is therefore important both for achieving the desired stimulus and adaptation in a given training session and for preventing heat-related medical issues. It is thus of utmost importance to provide athletes and coaches with a valid, reliable, and easily applicable method of monitoring Tc.
Tc can be assessed at different body sites, such as the rectum, esophagus, pulmonary artery, mouth, aural canal, armpit, and forehead. For a comprehensive overview of the different Tc measurement methodologies, readers are referred to a recent review by Taylor et al. [8]. With respect to validity, measurements in the mouth, aural canal, and armpit and on the forehead remain questionable [9], while measurements in the pulmonary artery, esophagus, and rectum have been shown to provide valid Tc data. However, measuring temperature in the pulmonary artery or esophagus is invasive and requires trained medical personnel, and thus remains of limited use, even in laboratory settings [10]. Rectal temperature (Trec), on the other hand, is a valid and reliable measure of Tc for individuals at rest and during exercise [11,12] and is employed in the majority of sports science thermoregulatory studies. In addition, Trec serves as the criterion standard for temperature measurement in hyperthermic athletes [13,14]. Despite its widespread use, the measurement of Trec has several important limitations: prolonged sitting with an inserted rectal probe may be uncomfortable for athletes, the measurement is mostly limited to laboratory conditions, body movement can displace the sensor from its original position, and the probe may restrict movement of the hips [15]. Therefore, measuring Trec during training (heat acclimatization) or competition remains inconvenient.
In recent decades, ingestible temperature sensors (pills) have become a popular alternative in research and professional sport. Several studies suggest that ingestible pills are valid sensors for the assessment of Tc [15,16,17]. However, this technique also has several limitations: the pill has to be ingested a few hours before exercise, its readings can be contaminated by the ingestion of food or fluid, and it is expensive. Therefore, monitoring Tc with ingestible pills during each training session or competition is not widespread.
Hence, a noninvasive sensor that allows monitoring of Tc during each training session or competition would provide significant benefits for heat acclimation/acclimatization training and help prevent heat stroke. Such a sensor, if accurate, would be of considerable benefit not only to athletes but also to workers exposed to high thermal loads (e.g., firemen and soldiers), and could provide important diagnostic information in clinical settings [18].
Recently, one such sensor, the CORE (greenTEG AG, Rümlang, Switzerland), has become commercially available [19]. The CORE device uses a novel type of thermal energy transfer sensor (a heat flux sensor) and determines core temperature using machine learning algorithms based on measurements of heat flux and skin temperature and, during exercise, on data provided by an external heart rate sensor connected to the CORE via the ANT+ protocol. Although the CORE sensor is already used by many athletes, even in world-class competitions (such as the Tour de France in cycling), to our knowledge no peer-reviewed validation study has yet been published. Unfortunately, wearables such as the CORE sensor are often marketed with aggressive and potentially exaggerated claims that lack a sound scientific basis [20,21]. Accordingly, the present investigation was designed to assess the validity and reliability of the CORE sensor against a rectal sensor under various laboratory conditions, as well as to examine the reproducibility of the values obtained with the CORE sensor during exercise under the same conditions on two separate occasions.
2. Materials and Methods
2.1. Study Design
Twenty-four healthy and physically active male volunteers (age = 30 ± 5 years; body mass = 77.9 ± 9.6 kg; height = 180 ± 7 cm; peak oxygen uptake (V̇O2peak) = 58 ± 7 mL min−1 kg−1; means ± standard deviations) participated in this study. The study was approved by the Ethics Committee for sport at the University of Ljubljana, Slovenia (033-3/2021-2), which adheres to the principles outlined in the World Medical Association Declaration of Helsinki. Informed consent was obtained from all subjects involved in the study. The subjects included were men younger than 40 years of age who cycled for at least 8 h each week.
2.2. Design of Studies I and II
Study I was designed to assess the reliability and validity of the CORE sensor under low-to-moderate heat load, while Study II was designed to assess the validity of the CORE sensor under moderate-to-high heat load. For these evaluations, the participants came to the laboratory on three and two separate occasions, respectively. In both studies, the first visit consisted of a pretest. In Study I, the protocols for Trials 1a and 1b, performed during the two subsequent visits, were identical, allowing us to determine the test-retest reliability of the CORE sensor.
Upon each arrival, participants’ body mass was measured and the participants were asked to insert a rectal probe 12 cm past the anal sphincter in a private room. Additionally, each participant was equipped with a heart rate chest strap with an attached CORE sensor (explained in detail below). All exercise trials were performed using participants’ own bicycles mounted on an electrically braked cycle ergometer (Kickr V5, Wahoo, Atlanta, GA, USA).
2.2.1. Pretest
The pretest visit, which served to assess the participants’ baseline characteristics, i.e., peak oxygen uptake and the associated power output (Wmax), was the same for both studies. Additionally, the exercise intensities corresponding to the first ventilatory threshold (VT1) and the respiratory compensation point (RCP) were determined as previously described by Iannetta et al. [22]. Ambient conditions were kept thermoneutral, with temperature and relative humidity at 21.9 ± 0.2 °C and 36 ± 1%, respectively.
In brief, the exercise test began with an 8-min warm-up: 2 min at 80 W followed by 6 min at 120 W. This was followed by an incremental ramp test to maximal volitional exertion, during which the exercise intensity was increased in a stepped manner by 30 W min−1. Pedaling frequency was self-selected, and participants were encouraged to continue until task failure. After 30 min of passive rest, participants cycled for 10 min at 50–65% Wmax (i.e., in the heavy exercise intensity domain) to obtain the parameters required to determine the exercise intensities corresponding to VT1 and RCP.
During this test, gas exchange was monitored by an automated online system (MetaLyzer 3B-3R, Cortex, Biophysics GmbH, Leipzig, Germany). Before each trial, the gas analyzers were calibrated with a known gas mixture (15.10% O2, 5.06% CO2; Linde Gas A.S., Prague, Czech Republic), and the volume transducer was calibrated with a 3-L syringe (Cortex, Leipzig, Germany). Peak oxygen uptake was calculated as the highest 30-s average value of O2 consumption.
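As an illustration of the final step, the highest 30-s average can be found with a moving window. The following Python sketch is not the authors' implementation; it assumes that the oxygen uptake signal has already been resampled to one value per second, and the function name `peak_rolling_mean` is hypothetical.

```python
def peak_rolling_mean(samples, window=30):
    """Return the highest mean over any `window` consecutive samples.

    Assumption: `samples` is evenly spaced (one value per second),
    so a 30-sample window corresponds to a 30-s average.
    """
    if window <= 0 or len(samples) < window:
        raise ValueError("need at least `window` samples")
    # Sum of the first window, then slide one sample at a time.
    running = sum(samples[:window])
    best = running
    for i in range(window, len(samples)):
        running += samples[i] - samples[i - window]
        best = max(best, running)
    return best / window
```

Breath-by-breath data are usually unevenly spaced in time, so an interpolation or time-based binning step would be needed before applying such a window.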
2.2.2. Study I
The 12 men (29 ± 5 years, 78.6 ± 10.2 kg, 181 ± 6 cm) who participated in the first study had a mean V̇O2peak of 57.3 ± 6.4 mL kg−1 min−1 and a Wmax of 413 ± 49 W. The ambient conditions (laboratory temperature and relative humidity) were similar in the two trials: 19.1 ± 0.6 °C and 33 ± 7% in Trial 1a, and 19.1 ± 0.5 °C and 32 ± 5% in Trial 1b.
After the pretest, participants visited the laboratory on two further occasions (Trial 1a and Trial 1b) at the same time of day. They arrived after an overnight fast, having abstained from exercise for 24 h before each trial. Additionally, participants recorded their diet for the 24 h before Trial 1a and replicated it before Trial 1b. The protocol began with cycling for 5 min at 60% VT1, followed by 60 min of steady-state exercise (SS) at 90% VT1.
2.2.3. Study II
The 13 participants (31 ± 5 years, 178 ± 8 cm, 77.0 ± 9.0 kg) had a mean V̇O2peak of 59.0 ± 8.9 mL kg−1 min−1, corresponding to a mean peak power output of 410 ± 60 W. The ambient conditions (laboratory temperature and relative humidity) were 30.7 ± 0.7 °C and 39.0 ± 6.0%, respectively.
The exercise started with a 5-min warm-up at 100 W, followed by 10 min of exercise with graded increases in power output corresponding to RCP, in order to increase heat production. Thereafter, the participants cycled for 60 min at SS intensity, followed by a 15-min cool-down. The SS intensity was reduced if the participant’s thermal discomfort or Trec became too high (Trec above 39.5 °C). Participants were instructed to drink enough fluids before arriving at the laboratory, and drinks were provided ad libitum during the exercise session.
2.3. Measurement of Temperature and Heart Rate
2.3.1. Body Temperature
Core body temperature was measured with an MSR rectal sensor (MSR, Seuzach, Switzerland), which served as the reference, and with the greenTEG CORE sensor. The CORE wearable sensor (4 cm × 5 cm × 0.8 cm) estimates Tc from measurements of skin temperature, heat flux, and, optionally, heart rate. According to the manufacturer’s instructions, the sensor must be positioned on the torso/chest approximately 20 cm below the armpit using a heart rate monitor strap. For measurements during physical activity, the CORE should be paired with a heart rate monitor (HRM); otherwise this is not necessary. The manufacturer offers two versions of the sensor, CORE and COREresearch; the latter, which samples data every second and can store them for 3.5 days, was employed here. Data stored on the device can be downloaded to the Android or iOS CORE app for further analysis. The manufacturer states the accuracy of the CORE device as ± 0.26 °C.
2.3.2. Rectal Temperature
Trec was determined with a rectal sensor connected to a data logger (MSR145WD, Seuzach, Switzerland), from which the collected data were later transferred to a personal computer via USB. The manufacturer reports the accuracy of the MSR sensor as ± 0.20 °C.
2.3.3. Heart Rate
Heart rate was measured with a Polar H10 heart rate sensor (Polar OY, Kempele, Finland) connected to the CORE sensor.
2.4. Data Analysis
The acquisition frequency was 1 Hz for the CORE sensor and 0.1 Hz for the rectal MSR sensor. Therefore, 10-s averages were calculated for the CORE sensor. These values were used for the statistical analysis, which was performed with Matlab R2020b (MathWorks Inc., Natick, MA, USA).
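The down-sampling step can be sketched as follows. This is a minimal Python illustration rather than the Matlab code used in the study; it assumes that the 1 Hz CORE series starts at the same instant as the rectal series, that each block of ten CORE samples corresponds to one rectal sample, and that an incomplete final block is discarded. The function name `block_means` is hypothetical.

```python
def block_means(samples, block=10):
    """Average consecutive, non-overlapping blocks of `block` samples.

    Assumption: `samples` is a 1 Hz series, so block=10 yields one
    value per 10 s, matching a 0.1 Hz reference series.
    Samples left over after the last full block are dropped.
    """
    n_blocks = len(samples) // block
    return [
        sum(samples[i * block:(i + 1) * block]) / block
        for i in range(n_blocks)
    ]
```

How the averaging windows were aligned with the rectal time stamps is not stated in the text, so the alignment used here is only one plausible choice.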
2.5. Statistics
The data, as well as the differences between paired data, were tested for normality with the Kolmogorov-Smirnov test. Because normality was rejected for all temperature data (p < 0.05), statistical tests that do not assume normality were used.
2.5.1. Reliability of the Device for Measuring the Tc
Device measurements that were performed twice (Study I, Trials 1a and 1b) were evaluated for intra-device reliability. The Wilcoxon signed-rank test was used to assess the systematic bias between trials, with statistical significance set at p < 0.05. Limits of agreement (LoA) were calculated according to the nonparametric approach proposed by Bland and Altman [23]. Briefly, the values outside which 10% of the observations fell were identified by removing 5% of the observations from each end of the distribution of differences. In addition, the peak temperature values of both trials and the largest differences at a discrete time point were compared with the Wilcoxon signed-rank test. Values are expressed as means ± standard deviations (SD). Ambient conditions data were normally distributed; therefore, a paired t-test was used to assess between-trial differences in ambient conditions, again with statistical significance set at p < 0.05.
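The nonparametric limits of agreement described above can be sketched in a few lines. This Python example is an illustration under stated assumptions, not the study's Matlab code: it assumes the paired differences are already available as a list, it trims a whole number of observations (rounded down) from each end, and the function name `nonparametric_loa` is hypothetical.

```python
import math


def nonparametric_loa(differences, trim_fraction=0.05):
    """Return (lower, upper) limits of agreement.

    The differences are sorted and `trim_fraction` of the observations
    is removed from each end; the smallest and largest remaining values
    are taken as the limits, so that about 90% of the differences lie
    between them when trim_fraction is 0.05.
    """
    if not differences:
        raise ValueError("no differences supplied")
    ordered = sorted(differences)
    k = math.floor(trim_fraction * len(ordered))  # observations cut per end
    kept = ordered[k:len(ordered) - k]
    return kept[0], kept[-1]
```

With 100 paired differences, for example, the five lowest and the five highest values are discarded and the limits are the 6th and 95th ordered values. Percentile definitions that interpolate between observations would give slightly different limits.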
2.5.2. Validity of the Device for Measuring the Tc
Validity was assessed as concurrent validity, i.e., the association between the data provided by the new device (CORE) and those provided by a device considered to be more valid (the rectal sensor). The validity statistics were similar to those described in Section 2.5.1 (i.e., bias, limits of agreement, peak values, maximal differences). The acceptable difference between devices was taken as ≤ 0.3 °C [9,15,16].
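The share of paired readings that meet this acceptance criterion can be computed as in the following Python sketch. It is illustrative only: the function name `share_within` is hypothetical, the difference is assumed to be taken as CORE minus rectal, and the two series are assumed to be already time-aligned.

```python
def share_within(core, rectal, threshold=0.3):
    """Fraction of paired readings whose absolute difference
    (CORE minus rectal, in degrees Celsius) is at most `threshold`."""
    if len(core) != len(rectal) or not core:
        raise ValueError("series must be non-empty and of equal length")
    diffs = [c - r for c, r in zip(core, rectal)]
    within = sum(1 for d in diffs if abs(d) <= threshold)
    return within / len(diffs)
```

Multiplying the returned fraction by 100 gives a percentage of data points within the criterion threshold.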
4. Discussion
The main purpose of the current investigation was to evaluate the reliability and validity of a novel device (CORE) that is claimed to estimate Tc accurately during indoor cycling under conditions of low-to-moderate heat load, as well as the validity of the same sensor under moderate-to-high heat load. The main finding was that the reliability of the CORE sensor was acceptable, with a non-significant mean bias between Trials 1a and 1b in Study I of only 0.02 °C. However, in comparison to the “gold standard” MSR rectal sensor, the Tc indicated by the CORE sensor showed poor agreement during cycling under both low-to-moderate and moderate-to-high heat load: the difference between the devices was within the predefined acceptable level of ≤ 0.3 °C in only 51% and 45% of all values measured, respectively. These findings do not support the claim that the CORE sensor provides a valid measure of core body temperature.
4.1. Reliability
In Study I, exercise-induced changes in Tc were similar between the repeated exercise trials (i.e., Trial 1a and Trial 1b). We observed a systematic bias of 0.02 ± 0.23 °C and LoA of −0.30 to +0.42 °C. Gant et al. [16] and Ruddock et al. [2] have published studies on the reliability of ingestible pills. The mean bias and LoA in our study were lower than the mean bias of −0.07 ± 0.31 °C and LoA of ± 0.61 °C reported by Ruddock et al. [2]. Exercise intensity and duration in that study were similar to ours, while the ambient temperature in the laboratory was higher (35 ± 0.2 °C vs. 19.1 ± 0.6 °C). The mean bias reported by Gant et al. [16] was similar to ours (0.01 °C vs. 0.02 ± 0.23 °C), while their LoA were narrower than ours (± 0.23 °C vs. −0.30 to +0.42 °C). They assessed reliability during intermittent running in a cool environment.
More detailed analysis showed that the mean bias between Trial 1a and Trial 1b was statistically significant during the warm-up period (0.25 ± 0.34 °C, p = 0.027), while during SS there was no statistically significant difference. According to the manufacturer’s instructions, the CORE sensor should be connected to a heart rate monitor during exercise and disconnected during rest to obtain the most accurate readings. However, the state of the heart rate connection cannot be changed during the measurement and therefore, the temperature values obtained during rest (at the beginning of the exercise) may not be accurate. These potentially inaccurate CORE temperature values at the beginning of the exercise can explain the statistically significant difference in mean bias between Trial 1a and Trial 1b during the warm-up period. In addition, this can explain the statistically different increase in temperature during the entire exercise for Trial 1a and Trial 1b. However, it has to be acknowledged that from a sports science perspective the initial 5 min of exercise (warm-up), when the core body temperatures are still below 38 °C, are less important in terms of training/performance.
To allow evaluation of the reliability of the CORE sensor, the participants had to exercise under the same conditions (i.e., environmental conditions and exercise intensity). During Study II, the laboratory temperature was high, raising the possibility that the body core could become too hot (above 39.5 °C), which would require a reduction in exercise intensity. Moreover, exercise under such hot conditions could result in inducing a certain degree of heat acclimation and thereby influence the participants’ subsequent responses. Therefore, sensor reliability could not be assessed in connection with Study II.
4.2. Validity
4.2.1. Study I
The results of Study I show that a systematic bias between the temperature values obtained from the two sensors was evident throughout the protocol (0.23 ± 0.35 °C, p < 0.001), with the temperatures from the CORE sensor being systematically higher than those from the MSR rectal sensor (see Figure 2a). The differences in temperature between devices were within the sum (±0.46 °C) of the measurement errors stated by the manufacturers of each device (±0.2 °C for the rectal sensor and ±0.26 °C for the CORE sensor) in 66% of all measured data points. Moreover, the difference between devices was below the criterion threshold of 0.3 °C in 51% of all measured data points, which is much lower than the percentage reported by Gosselin et al. (91%) [15]. Gosselin et al. tested the validity of an ingestible sensor during treadmill running in a hot environment (ambient temperature 38 °C).
A more detailed analysis showed that the mean bias in temperature between the two devices was statistically significant and varied from around 0.22 ± 0.33 to 0.33 ± 0.33 °C across all phases, except for the last 20 min of SS. The observed systematic bias was higher than that reported by Gosselin et al. [15] and Gant et al. [16], who compared an ingestible temperature sensor (pill) with a rectal sensor and reported a mean bias ranging from 0.1 to 0.2 °C. Gant et al. assessed the validity of an ingestible temperature sensor during intermittent running in a cool environment. Interestingly, in that study the temperature measured by the ingestible sensor was systematically higher than the temperature from the rectal sensor, whereas in the study by Gosselin et al. the ingestible sensor underestimated the temperature measured with the rectal sensor.
Although a systematic difference in temperature was observed between the CORE sensor and the MSR rectal sensor, the total temperature increase did not differ significantly between devices for the entire exercise or for any phase of exercise except the warm-up period (Table 4). The statistically significant difference in temperature increase between the two sensors during the warm-up period can be explained in the same way as in Section 4.1.
4.2.2. Study II
The results of Study II show that a systematic bias between the temperature values obtained from two different sensors was evident throughout the protocol (−0.10 ± 0.38 °C, p < 0.001). In contrast to Study I, the mean temperature obtained with the CORE sensor was lower compared to the values obtained from the MSR rectal sensor during the entire exercise.
The differences in temperature between devices were within the sum (±0.46 °C) of the measurement errors stated by the manufacturers of each device in 73% of all measured data points, a slightly higher percentage than in Study I. Moreover, the difference between devices was below the criterion threshold of 0.3 °C in 45% of all measured data points, which is much lower than the percentage reported by Gosselin et al. (91%) [15].
A more detailed analysis showed that the mean bias was not constant across the phases of the exercise. At the beginning and the end of the exercise bout, the CORE sensor underestimated the temperature obtained with the MSR rectal sensor, while in the middle (SS from 15 to 35 min) the CORE sensor overestimated it.
Although Trec is the method preferred and recommended by one of the governing bodies, the National Athletic Trainers’ Association, for assessing core body temperature [24], athletes and coaches use a variety of devices to measure temperature that are less invasive than the rectal sensor. Compared to the data published by Ganio et al. [17], the CORE sensor proved more accurate than other non-invasive devices (i.e., devices that assess forehead, oral, temporal, aural, and axillary temperature) used in sports. Nevertheless, previous studies [15,16,17] indicate that ingestible temperature sensors are still more valid than the CORE sensor, although they are not entirely non-invasive and are associated with high costs.
The results of the present study must be interpreted with the following limitations in mind. We only tested continuous, steady-state cycling exercise. The main reason is that, as stated by Taylor et al. [8], rectal temperature is perfectly acceptable during steady states but inadequate in certain dynamic phases. Therefore, the sensor response during intermittent exercise, for example, remains unknown. Moreover, the exercise was not performed in either very cold or very hot (above 30 °C) environmental conditions. In addition, only males were included here, primarily because the temperature changes associated with the menstrual cycle [25] could have influenced our evaluation of reliability. Clearly, this limitation should be kept in mind when interpreting data on women obtained with the CORE sensor. Finally, we used Trec as the Tc reference value. As reported previously, Trec, gastrointestinal, and esophageal temperatures are comparable when changes in core temperature are small and/or gradual [26], whereas during rapid changes only Trec and gastrointestinal temperature correlate well [27]. Therefore, although Trec does, in fact, reflect the actual Tc in most situations, in some cases this value may be an under- or overestimation [8]. This potential limitation should be taken into consideration when interpreting our present findings, and in future studies measurement of Tc at multiple sites could provide an even better reference value.