*2.5. IRT Sensor Processing for RR Estimation Using Nasal and Oral Breathing Decision based on SQI and MUSIC Algorithm and Body Temperature Estimation*

The conventional approach to respiration measurement using an IRT is based on nasal temperature change. However, mouth breathing is reported in 17% of the total population [25]. For a stable RR measurement using an IRT, we must also measure oral temperature changes and select whichever trace, nasal or oral, more strongly contains the respiration signal. To choose between nasal and oral breathing, we quantified the temperature traces of the nasal and oral areas using an SQI. Moreover, the MUSIC algorithm enabled rapid RR estimation. Figure 5 shows an overview of the respiration measurement, which introduces the nasal and oral breathing measurement method and the MUSIC algorithm.

**Figure 5.** Block diagram of signal processing for respiration rate (RR) estimation. (**a**) Thermal video frame with facial landmarks detected by the fusion sensor system described in Section 2. (**b**) Time-series data extracted from the nasal and oral areas. (**c**) Respiration signal selected from the four signals in (**b**) based on the SQI. (**d**) Power spectra obtained by MUSIC.

First, the nasal and oral areas were detected using the fusion sensor system described in Section 2, and candidate respiration signals were extracted from the two areas. The mean temperature fluctuation *xmean*(*t*) and the minimum temperature fluctuation *xmin*(*t*) in each ROI are expressed as

$$x\_{\text{mean}}(t) = \frac{1}{mn} \sum\_{x=0}^{m-1} \sum\_{y=0}^{n-1} I(x, y, t), \qquad x\_{\text{min}}(t) = \min\_{\substack{0 \le x \le m-1 \\ 0 \le y \le n-1}} I(x, y, t), \tag{6}$$

where *I*(*x*, *y*, *t*) is the pixel temperature at image coordinate (*x*, *y*) in the ROI at time *t*, *m* is the width of the ROI, and *n* is the height of the ROI. Both *xmean*(*t*) and *xmin*(*t*) contain the respiration signal.
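Equation (6) amounts to a spatial mean and a spatial minimum over the ROI at each frame. A minimal sketch in Python/NumPy is shown below; the function name and the (x0, y0, m, n) ROI layout are illustrative, not taken from the paper:

```python
import numpy as np

def extract_roi_signals(frames, roi):
    """Extract the mean and minimum temperature traces of Eq. (6) from an ROI.

    frames : ndarray of shape (T, H, W), pixel temperatures per thermal frame
    roi    : (x0, y0, m, n) -- top-left corner, width m, height n (assumed layout)
    """
    x0, y0, m, n = roi
    patch = frames[:, y0:y0 + n, x0:x0 + m]  # (T, n, m) crop of the ROI
    x_mean = patch.mean(axis=(1, 2))         # spatial mean per frame
    x_min = patch.min(axis=(1, 2))           # spatial minimum per frame
    return x_mean, x_min
```

The same routine serves both the nasal and oral ROIs, yielding the four candidate signals used in the next step.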

Second, the respiration signal is selected from the nasal and oral temperature traces using the four extracted signals: *xmean,nose*(*t*), *xmin,nose*(*t*), *xmean,mouth*(*t*) and *xmin,mouth*(*t*). The selection is conducted using a nasal SQI and an oral SQI, based on the agreement of the frequencies estimated by power spectral density (PSD), autocorrelation (ACR) and cross-power spectral density (CPSD). The PSD frequency was estimated from the peak of the power spectrum of *xmean*(*t*) within 0.1–0.75 Hz, which defines the range of RR measurement. The ACR frequency was estimated from the average peak interval of the autocorrelation of *xmean*(*t*). The CPSD frequency was estimated from the peak of the cross-power spectrum of *xmean*(*t*) and *xmin*(*t*) within 0.1–0.75 Hz. If the temperature change in the nasal or oral area contains a dominant respiration frequency, the CPSD indicates that frequency by reinforcing the components shared between *xmean*(*t*) and *xmin*(*t*) of the ROI. The following two rules are adopted sequentially:


If neither rule is satisfied, we select the nasal-area trace as the respiration signal.

This system applies the MUSIC method separately to the HR and RR time-series data obtained from the video. For respiration, the peak of the obtained spectrum within 0.1–0.75 Hz (6–45 bpm) was taken as the RR. Body temperature was determined as the maximum facial temperature in the facial ROI detected by the sensor fusion technique.
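For reference, a minimal MUSIC pseudospectrum for the RR band can be sketched as below. The paper does not report the model order, snapshot length, or frame rate used, so `order=2`, `m=64`, and `fs=30.0` are illustrative assumptions:

```python
import numpy as np

def music_rr(x, fs=30.0, order=2, m=64, band=(0.1, 0.75)):
    """Estimate RR (bpm) from a respiration trace via the MUSIC pseudospectrum.

    fs, order (number of real sinusoids) and snapshot length m are assumed
    values, not parameters reported in the paper.
    """
    x = x - x.mean()
    # Sample covariance matrix from overlapping m-sample snapshots
    snaps = np.array([x[i:i + m] for i in range(len(x) - m)])
    R = snaps.T @ snaps / len(snaps)
    w, v = np.linalg.eigh(R)              # eigenvalues in ascending order
    En = v[:, : m - 2 * order]            # noise subspace (2 eigvecs per real sine)
    freqs = np.linspace(band[0], band[1], 200)
    t = np.arange(m) / fs
    pseudo = []
    for f in freqs:
        a = np.exp(2j * np.pi * f * t)    # steering vector at frequency f
        # Pseudospectrum peaks where a is orthogonal to the noise subspace
        pseudo.append(1.0 / np.linalg.norm(En.conj().T @ a) ** 2)
    return 60.0 * freqs[int(np.argmax(pseudo))]  # peak frequency in bpm
```

Because MUSIC resolves closely spaced spectral lines from short records, it supports the rapid RR estimation described above better than a plain periodogram of the same window length.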
