1. Introduction
Intensive care units (ICUs) are among the most vital hospital wards, as they are reserved for patients with critical health conditions [1]. Here, continuous monitoring of vital signs is crucial for the early detection of acute deterioration in health. The basic parameters monitored are heart rate (HR), blood pressure, respiratory rate (RR), and body temperature (BT), which provide information about the general physical status [2].
Monitoring BT allows hypo- and hyperthermia to be observed, e.g., in the case of inflammation. According to a study by Laupland et al., 16% of ICU patients have some type of hypothermia and up to 26% suffer from fever [3]. Erkens et al. observed dysregulation of BT in half of all patients in a German ICU. In general, BT is considered a significant predictor of mortality [4].
In addition, observing changes in the respiratory rate can reveal serious respiratory failure, which is the most common cause of admission to the ICU [5]. In 2015, 8% of all deaths in EU countries could be linked to respiratory diseases, which makes them the third leading cause of mortality [6]. Although respiratory activity is monitored continuously in ICUs and the RR is an important early indicator of deterioration, it is the least accurately recorded vital sign in hospitals [7]. Almost all sensors currently used for patient monitoring require direct contact with the body, but for a number of reasons, including handling and hygiene, contactless monitoring would be preferable. Moreover, the measurement quality of, e.g., electrodes can vary with displacement. In the worst case, monitoring can cause medical adhesive-related skin injuries (MARSI) in patients with sensitive skin, such as infants or burn patients [8]. Replacing disposable equipment (e.g., electrodes) is usually expensive, and its operation requires advanced medical knowledge. Moreover, the environmental impact of medical waste production must not be underestimated.
To overcome the disadvantages of wired patient monitoring, contactless vital parameter acquisition has been investigated by research groups worldwide [9]. The development of camera-based techniques was initiated by Wu et al. in 2000, who used a CCD camera to extract dermal perfusion changes from the skin surface [10]. In addition to illumination-dependent camera technologies, Murthy et al. introduced infrared thermography (IRT) cameras in 2004 to extract the body surface temperature (BST) and RR from respiration-induced temperature changes in the mouth and nose regions [11]. Since then, these techniques have progressed considerably in accuracy and performance, owing to improved computational efficiency and rapid developments in the field of machine vision. In this paper, a deep learning (DL)-based algorithm for the extraction of relative BST changes and RR from patients in the ICU using a low-resolution IRT camera is presented. A real-time object detection algorithm was used to extract signal-containing regions of interest (ROIs) in the frames. The head region was cropped to measure BST changes, and the chest region was cropped to measure breathing-related thorax movements between consecutive frames using an optical flow (OF) algorithm. Finally, a performance analysis was conducted to show real-time capability on embedded GPU modules for a low-cost implementation.
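The last step of this processing chain, turning a chest-motion signal into an RR value, can be illustrated with a minimal sketch. It is not the implementation used in this work: it assumes that the OF stage has already been reduced to a one-dimensional motion signal per frame, and it estimates the RR as the dominant spectral peak within a breathing band. The function name `estimate_rr_bpm`, the band limits of 0.1 to 1.0 Hz, and the 10 frames/s sampling rate are assumptions chosen for illustration only.

```python
import numpy as np


def estimate_rr_bpm(motion, fs, f_lo=0.1, f_hi=1.0):
    """Estimate the respiratory rate in breaths per minute.

    motion: 1-D chest-motion signal (assumed output of an OF stage)
    fs:     sampling rate in frames per second
    f_lo, f_hi: assumed breathing band in Hz (6 to 60 breaths/min)
    """
    x = np.asarray(motion, dtype=float)
    x = x - x.mean()                           # remove the constant offset
    window = np.hanning(x.size)                # reduce spectral leakage
    spectrum = np.abs(np.fft.rfft(x * window))
    freqs = np.fft.rfftfreq(x.size, d=1.0 / fs)
    band = (freqs >= f_lo) & (freqs <= f_hi)
    if not band.any():
        raise ValueError("signal too short for the chosen breathing band")
    peak_freq = freqs[band][np.argmax(spectrum[band])]
    return 60.0 * peak_freq


# Synthetic test signal: 0.25 Hz breathing (15 breaths/min) plus noise,
# sampled at an assumed 10 frames/s for 60 s.
fs = 10.0
n = 600
t = np.arange(n) / fs
rng = np.random.default_rng(0)
signal = np.sin(2 * np.pi * 0.25 * t) + 0.1 * rng.standard_normal(n)
rr = estimate_rr_bpm(signal, fs)
print(f"estimated RR: {rr:.1f} breaths/min")   # close to 15 for this signal
```

A spectral peak is only one possible estimator; peak counting in the time domain or autocorrelation would be alternatives, and all of them are sensitive to the movement artifacts that occur in clinical recordings.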
The remainder of this work is structured as follows:
Section 2 provides an overview of related works in the field of camera-based RR monitoring.
Section 3 describes the dataset and the DL-based algorithm for vital sign extraction.
Section 4 presents the performance results of the object detector and the contactless monitoring of RR and BST.
Section 5 analyzes and reflects on the results of the presented approach.
Finally, Section 6 summarizes the major findings and describes limitations of the algorithm.
2. Related Works
In the last decade, major advances have been made in the field of camera-based vital sign monitoring. In 2011, Abbas et al. presented a method for respiratory monitoring of neonates from a tracked nose region in infrared images [12]. Around the same time, Lewis et al. worked on a similar tracking approach for adults to additionally estimate relative tidal volume changes [13]. In 2015, Pereira et al. presented an advanced approach to estimate RR from the nostrils using a high-definition thermography camera [14]. Sun et al. measured RR and HR simultaneously from the face/nose region with a dual RGB/IRT camera system [15]. Elphick et al. conducted a larger study with more than 70 participants using a technique for facial analysis to track the nose region [16]. Although these methods showed high accuracy for the extraction of respiration, all approaches used highly expensive camera hardware and required a consistent line of sight to the nostrils, which restricts the position and angle of the camera. Furthermore, several tracking algorithms had to be applied offline after the actual recordings, so real-time operation was not possible.
Subsequently, computationally complex tracking algorithms were increasingly replaced by efficient DL-based methods. Real-time capable face and nose detectors (e.g., [17,18]) offer great potential to enhance existing monitoring systems. In 2019, Kwasniewska et al. combined neural networks with low-resolution thermography camera modules in a so-called super-resolution approach to show the feasibility of RR monitoring in nose-region images of only 80 × 60 px [19]. Furthermore, Jagadev et al. presented a machine learning-based measurement method in which regression trees were used to track the nostrils [20]. The authors additionally investigated gradient techniques and support vector machines for ROI tracking [21]. Despite the high potential of machine vision in the field of thermography-based monitoring techniques, research is still at an early stage. Moreover, most groups worked on the extraction of respiration-related signals from the nasal region in thermography videos, which are, however, difficult to obtain in clinical environments. So far, the number of publications in which IRT was used to monitor thorax movement for the extraction of RR is very limited. Nevertheless, studies were conducted in an animal trial with anesthetized pigs [22] and for RR monitoring of infants [23]. The application of DL methods for segmentation/detection in this context has not yet been covered in the literature. Finally, although commercial devices for medical thermography are available and used for, e.g., tumor examinations, there is no approved IRT-based equipment for non-contact measurement of RR.
6. Conclusions and Outlook
In this paper, we presented an approach for DL-based real-time extraction of vital signs using contactless IRT. A dataset of 26 patients recorded in an ICU was used to train and validate the object detectors YOLOv4 and YOLOv4-Tiny. A 10-fold CV was performed to quantify the overall detection performance and showed promising results for robust detection of the trained labels. While an IoU of 0.70 was observed for the YOLOv4 model on the test dataset, the tiny model showed a superior IoU of 0.75. The BST trend was measured from the detected head region, and the RR was extracted using an OF algorithm that tracks chest movements. A corrected regression analysis for the trend analysis resulted in an MSE of . The RR extraction showed MAEs of bpm (YOLOv4) and bpm (YOLOv4-Tiny).
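For readers less familiar with the metrics named above, the following minimal sketch shows how an IoU between a predicted and a labeled bounding box and an MAE between estimated and reference RR values can be computed. It assumes boxes given as (x_min, y_min, x_max, y_max) in pixel coordinates; the function names and all numbers in the example are illustrative and are not taken from the evaluation of this work.

```python
def iou(box_a, box_b):
    """Intersection over union of two axis-aligned boxes.

    Boxes are (x_min, y_min, x_max, y_max); this layout is an assumption.
    """
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    # Width and height of the overlap, clipped at zero for disjoint boxes
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - inter
    return inter / union if union > 0 else 0.0


def mean_absolute_error(estimates, references):
    """Mean absolute difference between paired estimates and references."""
    pairs = list(zip(estimates, references))
    if not pairs:
        raise ValueError("at least one pair of values is required")
    return sum(abs(e - r) for e, r in pairs) / len(pairs)


# Illustrative values only
predicted_box = (0, 0, 10, 10)
labeled_box = (5, 0, 15, 10)
print(iou(predicted_box, labeled_box))                    # overlap 50, union 150

rr_camera = [14.0, 16.0, 18.0]       # hypothetical camera-based RR values
rr_reference = [15.0, 15.0, 20.0]    # hypothetical reference RR values
print(mean_absolute_error(rr_camera, rr_reference))
```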
While the extraction of temperature trends from the relative changes of head surface temperature showed the potential of detecting and tracking pathological changes, the comparison of the extracted RR with the reference revealed several challenges. In particular, movement artifacts complicated the camera-based extraction of RR. Nevertheless, during the night and for patients with few movement artifacts, the algorithms showed promising results. In future work, more IRT recordings with additional reference data for BT should be analyzed. We assume that larger training datasets could improve the overall results of the YOLOv4 detection models. The RR extraction could be improved by using tracking information of clinical staff and visitors: the corruption of raw motion signals caused by their overlap with the patient ROI could be avoided by excluding the affected signal components in advance. We are confident that the use of a low-cost IRT camera system in combination with DL algorithms on an embedded GPU module could contribute to a reduction of wired sensor technologies for patient monitoring in ICUs and enhance the use of unobtrusive, real-time capable vital sign acquisition.
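The proposed exclusion of corrupted signal components could, for example, be realized as a simple per-frame overlap check. The sketch below is one possible reading of this idea, not an evaluated method: it assumes that the detector provides one patient ROI and a list of staff/visitor boxes per frame, all as (x_min, y_min, x_max, y_max), and it drops every motion sample whose frame contains an overlap. All names and values are hypothetical.

```python
def boxes_overlap(a, b):
    """Return True if two (x_min, y_min, x_max, y_max) boxes share any area."""
    return a[0] < b[2] and b[0] < a[2] and a[1] < b[3] and b[1] < a[3]


def drop_overlapped_samples(motion, patient_rois, other_boxes):
    """Keep only motion samples from frames without staff/visitor overlap.

    motion:       one motion value per frame
    patient_rois: one patient ROI box per frame
    other_boxes:  per frame, a list of boxes of other detected persons
    Returns a list of (frame_index, value) pairs for the retained frames.
    """
    kept = []
    for i, (value, roi, others) in enumerate(zip(motion, patient_rois, other_boxes)):
        if not any(boxes_overlap(roi, other) for other in others):
            kept.append((i, value))
    return kept


# Hypothetical four-frame example on an 80 x 60 px image
motion = [0.1, 0.4, -0.2, 0.3]
patient_rois = [(20, 10, 60, 40)] * 4
other_boxes = [
    [],                    # frame 0: nobody else in the image
    [(50, 30, 80, 59)],    # frame 1: visitor overlaps the patient ROI
    [(70, 0, 79, 20)],     # frame 2: staff present, but no overlap
    [],                    # frame 3: nobody else in the image
]
clean = drop_overlapped_samples(motion, patient_rois, other_boxes)
print(clean)               # frame 1 is removed
```

Dropping frames leaves gaps in the motion signal, so a subsequent RR estimation would have to handle non-uniform sampling, for instance by interpolating short gaps or by analyzing only uninterrupted segments.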