Article

Electromyography- and Bioimpedance-Based Detection of Swallow Onset for the Control of Dysphagia Treatment

1
Control Systems Group, Technische Universität Berlin, Einsteinufer 17, 10587 Berlin, Germany
2
Clinic for Ear, Nose and Throat Medicine, Unfallkrankenhaus Berlin (UKB), Warener Str. 7, 12683 Berlin, Germany
*
Author to whom correspondence should be addressed.
Sensors 2024, 24(20), 6525; https://doi.org/10.3390/s24206525 (registering DOI)
Submission received: 30 August 2024 / Revised: 2 October 2024 / Accepted: 8 October 2024 / Published: 10 October 2024
(This article belongs to the Special Issue Biomedical Sensors for Diagnosis and Rehabilitation, 2nd Edition)

Abstract

Several studies support the benefits of biofeedback and Functional Electrical Stimulation (FES) in dysphagia therapy. Most commonly, adhesive electrodes are placed on the submental region of the neck to conduct Electromyography (EMG) measurements for controlling gamified biofeedback and functional electrical stimulation. Due to the diverse origin of EMG activity at the neck, it can be assumed that EMG measurements alone do not accurately reflect the onset of the pharyngeal swallowing phase (onset of swallowing). To date, no study has addressed the timing and detection performance of swallow onsets on a comprehensive database including dysphagia patients. This study includes EMG and BioImpedance (BI) measurements of 41 dysphagia patients to compare the timing and performance in the Detection of Swallow Onsets (DoSO) using EMG alone versus combined BI and EMG measurements. The latter approach employs a BI-based data segmentation of potential swallow onsets and a machine-learning-based classifier to distinguish swallow onsets from non-swallow events. Swallow onsets labeled by an expert serve as a reference. In addition to the F1 score, the mean and standard deviation of the detection delay regarding reference events have been determined. The EMG-based DoSO achieved an F1 score of 0.289 with a detection delay of 0.018 s ± 0.203 s. In comparison, the BI/EMG-based DoSO achieved an F1 score of 0.546 with a detection delay of 0.033 s ± 0.1 s. Therefore, the BI/EMG-based DoSO has better timing and detection performance compared to the EMG-based DoSO and potentially improves biofeedback and FES in dysphagia therapy.

1. Introduction

1.1. Swallowing and Dysphagia

The ability to swallow is a body function of great importance. Dysphagia is an impairment of the swallowing process that severely reduces the quality of life and health status of patients. Dysphagia therapy is a time-consuming and demanding process for clinicians and patients. Manual treatment methods aim to enable safe food intake and help patients regain regular swallowing functions.
Worldwide, strokes occur in fifteen million people each year [1]. Almost 42% of stroke patients develop dysfunctional swallowing [2], and 33% aspirate, leading to pneumonia in 50% of these cases [1]. The mortality rate of patients with dysphagia causing aspiration is 18.9% [3]. Head and neck cancer treatment includes surgery, radiotherapy, and chemotherapy. Surgical interventions cause specific anatomical or neurological insults and likely result in specific swallow impairments. Radiotherapy and chemotherapy introduce several side effects, such as loss of appetite, mucous membrane inflammation (mucositis), and dysphagia [4]. The prevalence of dysphagia after treatment for head and neck cancer ranges from 33.0% to 71.0% [5].
Matsuo et al. [6] describe four swallowing phases: the preparatory oral phase, the propulsive oral phase, the pharyngeal phase, and the esophageal phase. During the preparatory oral phase, jaw and tongue movements masticate the food and mix it with saliva to produce a swallowing-appropriate bolus consistency. The tongue and soft palate seal the access to the pharynx for airway protection. After bolus preparation, the tongue transports the bolus backward to the pharynx in the propulsive oral phase.
The pharyngeal swallow phase is a rapid, reflexive sequence of actions to protect the airway and open the upper esophageal sphincter, giving the bolus passage to the esophagus. The soft palate and constrictor wall muscles close the access to the nasopharynx. The tongue base pushes the bolus to the pharynx, and the contraction of muscles in the pharyngeal wall forces the bolus downwards, triggering the excursion of the hyoid and larynx for epiglottis closure. Relaxation of the cricopharyngeus muscle enables the opening of the upper esophageal sphincter before laryngeal elevation. A complex contraction of suprahyoid and thyrohyoid muscles elevates the larynx and opens the upper esophageal sphincter. The tongue and the pharyngeal constrictor muscles introduce pressure on the bolus and push it into the upper esophageal sphincter. In the esophageal phase, the bolus arrives in the esophagus and is moved toward the stomach by peristaltic contractions of the esophagus.
In this study, we employ BI and EMG measurements recorded with the PhysioSense device developed by Nahrstaedt et al. [7], utilizing a four-electrode setup, originally proposed by Yamamoto et al. [8], extended with a reference electrode. Figure 1 illustrates the electrode positions. The electrodes placed on the sternocleidomastoideus close to the ear on both sides of the neck inject a current, and two electrodes placed laterally to the larynx between the hyoid bone and the thyroid cartilage measure the voltage over the enclosed tissue. The bioimpedance is the absolute value of the transfer impedance at a frequency of 50 kHz between the injected sinusoidal current and the voltage measured over the larynx.
Figure 2 illustrates the typically observed BI valley of a swallow, which usually coincides with an active EMG period. The BI and EMG data in Figure 2 are cleaned with the preprocessing procedures presented in Section 2.4. Tongue and jaw movements in the oral swallowing phase introduce EMG activity and some variation in the BI data. In the propulsive oral swallowing phase, tongue movements transport the bolus to the back of the throat and the hyoid burst puts an initial force on the larynx and upper esophageal sphincter [9]. These movements coincide with a small peak in the bioimpedance that precedes the BI swallow valley. The elevation of the larynx and the contraction of the pharynx cause a rapid BI drop. Once the pharynx relaxes and the larynx returns to its resting position, the BI data return to the level before the swallow. Simultaneous Videofluoroscopic Swallowing Studies (VFSS) and BI/EMG recordings showed a high correlation of BI with the movement of the larynx and hyoid bone [10].

1.2. Biofeedback and FES for Dysphagia Therapy

Biofeedback is a technique to convert physiological processes into a representation in another modality. The visual, auditory, or haptic representation enables a patient to gain voluntary control over the physiological processes, which proceed automatically under healthy conditions. Feedback learning supports operant conditioning by providing information and motivation about the progress in physiological abilities and generates awareness about unconscious processes [11].
The majority of studies concerning biofeedback in dysphagia therapy employ electromyography (EMG) measurements with electrodes placed on the submental region. The most basic biofeedback approaches visualize the EMG or the EMG envelope on a screen to increase strength [12] and voluntary timing [13] of swallowing and to practice the Mendelsohn maneuver [14], the effortful swallow [15,16,17], or the volitional laryngeal vestibule closure [18]. More sophisticated techniques utilize onsets of EMG activity to control an avatar in gamified biofeedback [19,20,21]. Both approaches intend to improve the strength, duration, or timing of swallowing and increase motivation. In addition, Li et al. [22,23] employed an accelerometer placed on the larynx to trigger gamified biofeedback. Kwong et al. [24] showed the advantage of ultrasound biofeedback in learning the Mendelsohn maneuver compared to EMG-based biofeedback.
Nasal pressure [25] and the sound of Eustachian tube opening [26] have been used in recent studies with healthy adults as potential new non-invasive measurements for the characterization of the swallowing process. An assessment of these alternative measurements for biofeedback or triggered FES with dysphagia patients is still pending.
To date, meta-studies have found little definitive evidence of the benefit of most methods employed in dysphagia therapy [27,28]. Speyer et al. [28] reported studies supporting significant evidence with large effect sizes only for the Shaker exercise, the chin-tuck against resistance exercise, and expiratory muscle strength training. Moreover, none of the available studies concerning biofeedback in dysphagia therapy proved the benefit of the proposed methods [29,30].
Generally, electrical stimulation applies a current to nerves or muscle fibers through transcutaneous or percutaneous electrodes. The pulse duration, frequency, form, and amplitude define the stimulation current [31]. Functional electrical stimulation introduces action potentials in motor neurons to activate the connected muscle and cause a potent muscle contraction, resulting in a functional movement. Therefore, FES enables neuro-prostheses by restoring impaired motor functions. Usually, feedback control techniques adapt the stimulation current to compensate for unknown effects, such as the individual state of fatigue and voluntary contribution of the patients [32].
Several studies investigated triggered FES to support the swallowing process. Leelamanit et al. [33] achieved an improvement in swallow functions in 20 out of 23 dysphagia patients by administering triggered FES with adhesive electrodes to the thyrohyoid muscle. Burnett et al. [34] employed triggered FES to the mylohyoid, thyrohyoid, and thyrohyoid muscles with needle electrodes causing an up to 50% increase in larynx elevation compared to swallows without FES. In a follow-up study, Burnett et al. [35] showed that voluntary muscle activation is not reduced by triggered FES. Therefore, triggered FES has the potential to support swallowing [35]. Humbert et al. [36] applied transcutaneous FES at ten electrode positions and achieved an excursion of the larynx with two positions and an excursion of the hyoid with one position. Nahrstaedt et al. [37] reported a greater amplitude and velocity of the larynx excursion for triggered FES in a pilot study including a single dysphagia patient. Schultheiss et al. [38] studied the effect of triggered FES with multiple electrode setups on the amplitude and velocity of the larynx excursion in healthy subjects. They reported increased amplitude and velocity of the larynx excursion for most subjects and reduced amplitude and velocity of the larynx excursion for some subjects. Li et al. [23] measured a significantly shorter duration of the propulsive oral phase from the active EMG onset to the onset of acceleration caused by laryngeal excursion for triggered FES compared to swallows without FES in healthy subjects.
Hadley et al. [39] provoked a much greater excursion of the larynx by stimulating the hypoglossal nerve compared to stimulating the mylohyoideus, geniohyoideus, thyrohyoideus, and genioglossus with hook electrodes in five dogs and concluded that there was great potential for dysphagia management in humans. Tyler [40] named the absence of a robust trigger method as one of the greatest challenges for developing neuro-prostheses for dysphagia management.

1.3. Summary and Article Outline

Despite promising approaches presented in the literature concerning dysphagia therapy, most methods, including biofeedback, lack clear supporting evidence based on randomized controlled studies with numerous participants [28,29]. The studies on triggered FES in dysphagia therapy suggest great potential for dysphagia management as well. Triggered FES and some biofeedback methods require the detection of swallow onsets in real-time.
To date, techniques for swallow onset detection have employed EMG measurements at the throat [19,23,33], a combination of EMG and BI measurements [37], and measurement of the acceleration of the larynx [22,23]. EMG measurements taken at the throat usually show activation during swallow preparation [37]. Onsets of active EMG occur on average 0.5 s before the reflexive oral swallowing phase [23]. All reviewed techniques for the detection of swallow onsets require manual parameter tuning and lack a systematic analysis of the detection quality and timing using a comprehensive database including dysphagia patients.
We postulate that the success of biofeedback training and FES support of larynx elevation can be improved when the triggering occurs close to the onset of the pharyngeal phase. Triggering FES or presenting biofeedback during the bolus preparation in the oral phase is considered misleading and counterproductive. Hypoglossal nerve stimulation for dysphagia management in a neuro-prosthesis requires robust detection of swallow onsets [40] even in everyday use.
Accordingly, we utilize a combination of EMG and BI data to detect swallow onsets coinciding with the start of the pharyngeal swallowing phase. Employing machine-learning methods omits manual parameter tuning in the proposed BI/EMG-based DoSO. While all single-sensor technologies are sensitive to non-swallowing activities such as speaking, chewing, and head and neck movements, the combination of different sensors is likely to decrease the false-positive rate in DoSO, especially in non-clinical environments. Using a comprehensive database with swallows and non-swallow events, an analysis of the detection quality and timing shows the advantage of the machine-learning-based BI/EMG approach over the standard threshold-based approach using EMG. The evaluation and optimization of the EMG-based and BI/EMG-based DoSO employ existing recordings of BI and EMG data. All algorithms presented in the article are suited for the online processing of BI and EMG data streams.

2. Materials and Methods

2.1. Database

The database of this study consists of four data series covering specific aspects of BI and EMG measurements in swallowing recorded with the PhysioSense device introduced by Nahrstaedt et al. [7]. The doctoral thesis of Holger Nahrstaedt [10] utilized the same database, and the doctoral thesis of Corinna Schultheiss [41] contains more detailed information about the included subjects.
The investigators labeled data segments containing swallows and movements during the recording procedure. Developing the real-time detection of swallow onsets demands annotations at the start of the pharyngeal swallowing phase. Therefore, an experienced examiner retrospectively added annotations at the start of the BI drop (Figure 2) providing references of swallow onset times. Table 1 contains an overview of the main features of data series I to IV.

2.1.1. Data Series I

The study investigated the separability of swallows and head movements in BI and EMG measurements. Additionally, the data series contains swallows of boluses with varying electric conductivity and consistency to investigate the effect on BI measurements. The study included twenty healthy subjects divided into five subgroups. The first group swallowed 20 mL of water in a specific head position, performed head, jaw, and tongue movements, and spoke. The second group swallowed saliva and ten portions of liquid, each with a volume of 20 mL. The third group swallowed boluses of different consistencies, including saliva, 5 g of yogurt, and bread. The subjects of group four swallowed 50 mL, 10 mL, 20 mL, and 30 mL of water. The fifth group swallowed liquids with different electric conductivities.

2.1.2. Data Series II

This data series examined the repeatability of the BI and EMG measurements with respect to electrode positioning. The study included fifteen subjects. Ten subjects provided four measurements conducted with slightly varying electrode positions recorded during a single day. All subjects participated in at least two measurements per day, repeated within four successive days. The subjects swallowed 200 mL of water at their own pace in a measurement session.

2.1.3. Data Series III

The study comprises measurements executed by four investigators on nine subjects to investigate the reliability of the BI/EMG measurement method concerning different investigators. Therefore, each investigator placed the electrodes on the subject’s throat to prepare the measurements.

2.1.4. Data Series IV

The data series includes measurements of 41 dysphagia patients. A total of 24 patients suffered from a neurological condition, and 17 patients from head-, ear-, nose-, and throat-related disorders. Depending on their condition and abilities, the patients swallowed saliva, dyed water, green jelly, and bread. An endoscopic examination, performed by a trained physician, accompanied the BI and EMG measurements to ensure the correct labeling of the swallows.

2.2. Assignment of Class Labels

The evaluation of a swallow onset detection requires a procedure to assign the potential swallow onsets determined at the times of interest
$$t_i^{\mathrm{TI}} = t_1^{\mathrm{TI}}, \ldots, t_I^{\mathrm{TI}}$$
with $i = 1, 2, \ldots, I$ to the reference times
$$t_r^{\mathrm{RT}} = t_1^{\mathrm{RT}}, \ldots, t_R^{\mathrm{RT}}$$
with $r = 1, 2, \ldots, R$ annotated by an expert. Furthermore, training and testing a classifier for swallow onset detection demands a labeled data set
$$D = \{ (y_i, \mathbf{x}_i, s_i) \}$$
consisting of binary class labels $y_i \in \{0, 1\}$, feature vectors $\mathbf{x}_i \in \mathbb{R}^J$ with $J$ features, and a subject index $s_i \in \mathbb{N}^+$.
Calculating the absolute differences
$$\Delta t_i = \left| t_i^{\mathrm{TI}} - t_r^{\mathrm{RT}} \right| \quad \text{with } i = 1, 2, \ldots, I \text{ and } r = \text{const.}$$
between the times $t_i^{\mathrm{TI}}$ and a reference time $t_r^{\mathrm{RT}}$ delivers the index
$$i_{\min} = \arg\min_i \Delta t_i$$
of the shortest absolute difference $\Delta t_{i_{\min}}$. If $\Delta t_{i_{\min}}$ falls below the threshold $\theta_{\mathrm{LA}}$, the corresponding time of interest $t_i^{\mathrm{TI}}$ with $i = i_{\min}$ is a swallow onset. The difference between the time of interest with index $i_{\min}$ and the reference time with index $r$ defines the detection delay
$$d = t_{i_{\min}}^{\mathrm{TI}} - t_r^{\mathrm{RT}}.$$
The time $t_{i_{\min}}^{\mathrm{TI}}$ is removed from the series to prevent multiple assignments. Iterating over all reference times $t_r^{\mathrm{RT}}$ with $r = 1, 2, \ldots, R$ defines $I_t$ correctly detected swallow onsets and produces a vector of detection delays $\mathbf{d}$ containing $I_t$ detection delays. The remaining $I_f$ times of interest count as non-swallow events.
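The assignment procedure can be sketched in a few lines of Python. This is a minimal illustration and not the authors' implementation: the function name `assign_labels` and its arguments are hypothetical, and the sketch assumes that a time of interest is removed from the series only when it has actually been assigned to a reference time.

```python
def assign_labels(times_of_interest, reference_times, theta_la=0.5):
    """Assign each reference time to its nearest unassigned time of interest.

    Returns one binary label per time of interest (1 = swallow onset,
    0 = non-swallow event) and the detection delays of the assigned onsets.
    """
    remaining = list(range(len(times_of_interest)))  # indices not yet assigned
    labels = [0] * len(times_of_interest)
    delays = []
    for t_ref in reference_times:
        if not remaining:
            break
        # index of the shortest absolute difference to this reference time
        i_min = min(remaining, key=lambda i: abs(times_of_interest[i] - t_ref))
        if abs(times_of_interest[i_min] - t_ref) < theta_la:
            labels[i_min] = 1
            delays.append(times_of_interest[i_min] - t_ref)  # signed delay d
            remaining.remove(i_min)  # assumption: remove only after a match
    return labels, delays


# Four candidate times, two expert annotations (values are made up)
labels, delays = assign_labels([1.0, 2.45, 2.6, 7.0], [2.5, 5.0])
```

In this made-up example, only the candidate at 2.45 s lies within 0.5 s of a reference time, so the reference at 5.0 s remains unassigned and would count as a missed swallow onset.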

2.3. Evaluation Scores

An objective analysis of the timing and performance of swallow onset detection requires the definition of evaluation scores.

2.3.1. Timing

Evaluating the timing of swallow onset detection employs the mean
$$\mu_d = \frac{1}{I_t} \sum_{i=1}^{I_t} d_i$$
and standard deviation
$$\sigma_d = \sqrt{\frac{1}{I_t} \sum_{i=1}^{I_t} \left( d_i - \mu_d \right)^2}$$
of the detection delays $\mathbf{d} = [\, d_1 \; d_2 \; \cdots \; d_i \; \cdots \; d_{I_t} \,]$ for the $I_t$ detected swallow onsets. The mean represents the average delay and the standard deviation measures the scattering of the detected swallow onsets relative to the manually marked references. Therefore, a small mean and standard deviation of the detection delays represent a precise timing of swallow onset detection.
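As a small worked example of the two timing scores, the following sketch computes them with Python's standard library. The helper name `delay_statistics` and the delay values are illustrative assumptions; `pstdev` is used because it divides by the number of samples, matching the normalization by $I_t$ above.

```python
from statistics import fmean, pstdev


def delay_statistics(delays):
    """Return the mean and the population standard deviation of the delays."""
    return fmean(delays), pstdev(delays)


# Made-up signed detection delays in seconds
mu_d, sigma_d = delay_statistics([-0.1, 0.0, 0.1, 0.2])
```

A negative delay means that the detection precedes the expert annotation, so the mean can be close to zero even when individual detections scatter widely; this is why the standard deviation is reported alongside it.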

2.3.2. Preselection

The BI/EMG-based DoSO employs a preselection of potential swallow onsets in the BI data. The classifier uses a feature vector extracted from the BI and EMG data before the times of potential swallow onsets to discriminate between swallow onsets and non-swallow events. The ratio
$$\Upsilon = \frac{I_t}{I_f}$$
of swallow onsets $I_t$ to non-swallow events $I_f$ is essential for classifier training. The optimal ratio $\Upsilon$ is close to one because the training works best with balanced data sets. The preselection's sensitivity $S_{\mathrm{PS}}$ measures the share of detected swallow onsets among the swallow onsets in the data given as reference times $t_r^{\mathrm{RT}}$, which correspond to the times of manual annotations. Calculating the sensitivity of the preselection
$$S_{\mathrm{PS}} = \frac{I_t}{R}$$
employs the number of identified swallow onsets $I_t$ and the number of reference times $R$.

2.3.3. Detection Performance

Evaluating the performance of swallow onset detection uses standard scores for classifier evaluation (see, e.g., [42]), which employ the True Positive (TP), False Negative (FN), False Positive (FP), and True Negative (TN) classified swallow onsets to calculate the sensitivity
$$S = \frac{\mathrm{TP}}{\mathrm{TP} + \mathrm{FN}},$$
the precision
$$P = \frac{\mathrm{TP}}{\mathrm{TP} + \mathrm{FP}},$$
and the specificity
$$C = \frac{\mathrm{TN}}{\mathrm{TN} + \mathrm{FP}}.$$
Controlling biofeedback and functional electrical stimulation demands the start of an intervention for every swallow onset, represented by the sensitivity, and a low number of falsely triggered interventions, measured by the precision. The $F_1$ score unites properties of the sensitivity and precision without taking the true negative samples into account. Therefore, the $F_1$ score
$$F = \frac{2\,\mathrm{TP}}{2\,\mathrm{TP} + \mathrm{FP} + \mathrm{FN}} = \frac{2}{\frac{1}{P} + \frac{1}{S}}$$
is the preferred measure for classification performance in this work. As the harmonic mean of the precision $P$ and sensitivity $S$, it summarizes the overall classification performance.
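The scores above follow directly from the four counts of the confusion matrix. The sketch below is an illustration with a hypothetical function name and made-up counts; returning 0.0 for an empty denominator is a convention chosen here, not one stated in the article.

```python
def detection_scores(tp, fn, fp, tn):
    """Sensitivity S, precision P, specificity C, and F1 score from counts."""

    def ratio(num, den):
        # assumption: an undefined score (empty denominator) is reported as 0.0
        return num / den if den else 0.0

    return {
        "sensitivity": ratio(tp, tp + fn),
        "precision": ratio(tp, tp + fp),
        "specificity": ratio(tn, tn + fp),
        "f1": ratio(2 * tp, 2 * tp + fp + fn),
    }


# Made-up counts: 6 detected swallows, 2 missed swallows, 4 false triggers
scores = detection_scores(tp=6, fn=2, fp=4, tn=88)
```

With these counts, the sensitivity is 0.75 and the precision is 0.6, and both forms of the $F_1$ equation give 12/18 = 2/3. The large number of true negatives raises the specificity but leaves the $F_1$ score unchanged.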
The evaluation of the EMG-based and BI/EMG-based DoSO employs a Leave-One-Subject-Out (LOSO) cross-validation (see Section 2.6.6), which generates a score for each subject in the test data. The median of the scores represents the average, and the interquartile range measures the dispersion of the scores. The formulation $F = 0.638[0.153]$ stands for a median $F_1$ score of 0.638 with an interquartile range of 0.153.

2.4. Preprocessing of BI and EMG Data

The bioimpedance data change slowly and contain additive noise. Removing the noise with a third-order Butterworth low-pass filter with a 15 Hz cut-off frequency introduces a maximum group delay of 0.03 s. Downsampling of the low-pass-filtered BI signal from 4000 Hz to 100 Hz reduces the computational expense of the BI-based preselection of swallow onsets.
After eliminating spikes, the EMG preprocessing removes the offset using a third-order high-pass filter with a 30 Hz cut-off frequency. Three notch filters with center frequencies of 50 Hz, 150 Hz, and 150 Hz suppress power line disturbances in the EMG data. A third-order high-pass filter with a 30 Hz cut-off frequency and a second-order low-pass filter with a 300 Hz cut-off frequency attenuate the transfer function of a whitening filter in the low- and high-frequency ranges. This whitening filter reverts the transfer function of the adhesive EMG electrodes and removes disturbances in a frequency range above 300 Hz.
In the following sections, $\mathrm{EMG}_{\mathrm{raw}}$ refers to the raw EMG data and EMG to the cleaned (preprocessed) EMG data. The feature calculation additionally utilizes the low-frequency trend tEMG of the raw EMG data, generated by applying a third-order Butterworth low-pass filter with a 10 Hz cut-off frequency to the raw EMG data.
Most filter parameters were selected by visual inspection of the filter results on exemplary data. The aim was to find a trade-off between the desired filter effect and phase delays. The whitening filter and its parameters were taken from [10].

2.5. EMG-Based Detection of Swallow Onsets

EMG-based detection of swallow onsets employs a threshold method to detect onsets of active EMG. The method utilizes the samples of the EMG envelope in a sliding window, adopting the proposal by Li et al. [23].

2.5.1. Detection of Active EMG Onsets

After preprocessing of the raw EMG data, the sample rate is reduced from 4000 Hz to 1000 Hz. The estimate of the standard deviation $\sigma_0$ at rest utilizes a sliding window approach on the cleaned EMG data. The data window includes $N_\sigma = 250$ samples and has a shift of one sample. The online estimate of $\sigma_0$ is updated every time the standard deviation in the EMG data window falls below the current value of $\sigma_0$. Thus, the estimated standard deviation $\sigma_0$ at rest converges to the minimum in the measurement, which might cause inaccurate detection at the measurement's beginning and reduce the performance of the EMG-based detection. A third-order low-pass filter with a 10 Hz cut-off frequency smooths the rectified clean EMG data and yields the EMG envelope signal $\mathrm{eEMG}_m$.
Two parameters control the threshold procedure: the window length $w$ and the threshold factor $\theta^0$. The threshold
$$\theta_{\mathrm{EMG}} = \theta^0 \cdot \sigma_0$$
to detect active EMG onsets is the product of the threshold factor $\theta^0 \in \mathbb{R}^+$ and the current standard deviation at rest $\sigma_0$. Detecting an active EMG onset at sample index $n_{\mathrm{ON}}$ requires all samples in a window with $w$ samples up to and including the onset sample $\mathrm{eEMG}_{n_{\mathrm{ON}}}$ to exceed the threshold $\theta_{\mathrm{EMG}}$:
$$\mathrm{eEMG}_m > \theta_{\mathrm{EMG}} \quad \text{with } m = n_{\mathrm{ON}} - w + 1, \ldots, n_{\mathrm{ON}}.$$
After a detected EMG onset at index $n_{\mathrm{ON}}$, EMG onset detection is disabled for at least 1.0 s and until a sample $\mathrm{eEMG}_i$ with $i > n_{\mathrm{ON}}$ falls below the threshold $\theta_{\mathrm{EMG}}$. During this period, the biofeedback or FES intervention takes place, so further triggering would not be meaningful anyway. Figure 3 illustrates the threshold procedure for active EMG onset detection and the eEMG data of a single swallow.
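The threshold procedure lends itself to a sample-by-sample implementation. The following Python sketch is one possible reading of the description, not the authors' code: the function name, the argument names, and the decision to pass the cleaned EMG and its envelope as two precomputed sequences are assumptions, and the filtering that produces the envelope is left out.

```python
from collections import deque
from statistics import pstdev


def detect_emg_onsets(emg, envelope, theta0, w,
                      n_sigma=250, fs=1000.0, dead_time=1.0):
    """Return sample indices of active EMG onsets (hypothetical helper)."""
    window = deque(maxlen=n_sigma)  # sliding window for the rest-level estimate
    sigma0 = float("inf")           # running minimum of the windowed std
    above = 0                       # consecutive envelope samples above threshold
    armed, dropped, last_onset = True, False, 0
    onsets = []
    for n, (x, e) in enumerate(zip(emg, envelope)):
        window.append(x)
        if len(window) == n_sigma:
            sigma0 = min(sigma0, pstdev(window))
        threshold = theta0 * sigma0
        above = above + 1 if e > threshold else 0
        if not armed:
            # re-enable detection after the dead time, once the envelope
            # has fallen below the threshold at least once since the onset
            dropped = dropped or e < threshold
            if dropped and n - last_onset >= dead_time * fs:
                armed = True
            continue
        if above >= w:  # all w samples up to and including n exceed the threshold
            onsets.append(n)
            armed, dropped, last_onset = False, False, n
    return onsets
```

A short synthetic call with deliberately small window sizes shows the intended usage: `detect_emg_onsets([1.0, -1.0] * 20, [0.0] * 10 + [5.0] * 4 + [0.0] * 26, theta0=2.0, w=3, n_sigma=5, fs=10.0)` reports a single onset at the third sample of the burst. Recomputing the standard deviation for every window is simple but slow; a real-time version would update running sums instead.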

2.5.2. Label Assignment

The evaluation of the EMG-based DoSO requires the assignment of the $I_{\mathrm{ON}}$ onsets of active EMG at the times
$$t_i^{\mathrm{ON}} = \frac{n_i^{\mathrm{ON}}}{f_T} \quad \text{with } i = 1, 2, \ldots, I_{\mathrm{ON}}$$
to the $R$ reference times of swallow onsets
$$t_r^{\mathrm{RT}} \quad \text{with } r = 1, 2, \ldots, R,$$
marked by the expert. Assigning the times of active EMG onsets $t_i^{\mathrm{ON}}$ to the reference times $t_r^{\mathrm{RT}}$ employs the assignment procedure of Section 2.2 with a threshold of $\theta_{\mathrm{LA}} = 0.5$ s (for justification of this choice, see Section 2.6.2). The assignment procedure creates $I_t$ swallow onsets counted as true positives and $I_f$ onsets of non-swallow events counted as false positives. After finishing the $R$ iterations of the assignment procedure, the unassigned reference times $t_r^{\mathrm{RT}}$ count as false negatives. The number of true negatives
$$\mathrm{TN} = M - N_{\mathrm{skip}} - N_{\mathrm{start}} - \mathrm{FN} - I_f - I_t$$
equals the sample number $M$ of the downsampled EMG data vector minus the skipped samples $N_{\mathrm{skip}}$, the excluded samples $N_{\mathrm{start}}$ at the start of an EMG measurement defined by the maximum of $w$ or $N_\sigma$, the false negatives FN, the false EMG onsets $I_f$, and the true EMG onsets $I_t$. Finally, the confusion matrix enables the calculation of evaluation scores, such as sensitivity, precision, and $F_1$ score for the subjects.

2.5.3. Optimization and Evaluation

The optimization of the parameters $\theta^0$ and $w$ employs a grid search for testing the parameter pairs $(\theta_k^0, w_l)$ generated by combining $L_\theta$ values of the threshold factor
$$\boldsymbol{\theta}^0 = [\, 1 \;\; 1.5 \;\; 2 \;\; 2.5 \;\; 3 \;\; 3.5 \;\; 4 \;\; 4.5 \;\; 5 \;\; 5.5 \;\; 6.0 \;\; 6.5 \;\; 7.0 \,]$$
and $L_w$ realizations of the window length
$$\mathbf{w} = [\, 50 \;\; 75 \;\; 100 \;\; 125 \;\; 150 \;\; 175 \;\; 200 \;\; 225 \;\; 250 \;\; 275 \;\; 300 \,].$$
The grid search produces two matrices, $\mathbf{F} \in \mathbb{R}^{L_\theta \times L_w}$ and $\boldsymbol{\mu} \in \mathbb{R}^{L_\theta \times L_w}$, containing the $F_1$ score and the mean detection delay for each parameter pair $(\theta_k^0, w_l)$, respectively. Selecting the optimal parameter pair $(\hat{\theta}^0, \hat{w})$ from the matrices applies two conditions. First, the indices
$$(\mathbf{k}, \mathbf{l}) = \arg_{\mu_{k,l} < \theta_{\mu_d}} \boldsymbol{\mu}$$
of the elements in the mean detection delay matrix $\boldsymbol{\mu}$ that fall below the threshold $\theta_{\mu_d}$ are determined to limit the mean detection delay of the EMG-based DoSO. The selection of the threshold $\theta_{\mu_d}$ is discussed in Section 3.2.1. Second, the element $F_{\hat{k},\hat{l}}$ of the $F_1$ score matrix $\mathbf{F}$ at
$$(\hat{k}, \hat{l}) = \arg\max_{k \in \mathbf{k},\, l \in \mathbf{l}} F_{k,l}$$
with the highest $F_1$ score among the elements $(\mathbf{k}, \mathbf{l})$ defines the optimal parameter pair $(\hat{\theta}^0 = \theta_{\hat{k}}^0, \hat{w} = w_{\hat{l}})$ for the given data.
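The two selection conditions can be sketched as a constrained maximum over the grid. The function name `select_parameters` and the toy matrices below are illustrative assumptions; returning `None` when no pair satisfies the delay limit is a choice made for this sketch, since the article does not describe that case.

```python
def select_parameters(theta_values, w_values, f1, mu, theta_mu_d):
    """Pick the pair with the highest F1 among those with mean delay below the limit.

    f1 and mu are nested lists indexed as [k][l], matching theta_values[k]
    and w_values[l].
    """
    best = None
    for k in range(len(theta_values)):
        for l in range(len(w_values)):
            if mu[k][l] >= theta_mu_d:
                continue  # first condition: mean detection delay below threshold
            if best is None or f1[k][l] > f1[best[0]][best[1]]:
                best = (k, l)  # second condition: highest F1 among admissible pairs
    if best is None:
        return None
    return theta_values[best[0]], w_values[best[1]]


# Toy 2 x 2 grid with made-up scores and delays
pair = select_parameters(
    theta_values=[1.0, 1.5],
    w_values=[50, 75],
    f1=[[0.2, 0.5], [0.6, 0.4]],
    mu=[[0.01, 0.02], [0.30, 0.05]],
    theta_mu_d=0.1,
)
```

In the toy grid, the pair with the overall highest $F_1$ score is rejected because its mean delay exceeds the limit, so the selection falls back to the best admissible pair.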
The LOSO cross-validation iteratively employs the measurements of all subjects except for subject $s \in \{1, 2, \ldots, S\}$ to determine the optimal parameter pair $(\hat{\theta}^0, \hat{w})$ for subject $s$ with a grid search. Detecting the active EMG onsets in the EMG measurements of subject $s$ with the optimized parameter pair $(\hat{\theta}^0, \hat{w})$ yields the evaluation score and the detection delay for the subject $s$. The final score is the median and interquartile range of the $S$ evaluation scores produced by the LOSO cross-validation iterations. Computing the mean and standard deviation of the detection delays of all $S$ subjects completes the analysis.

2.6. BI/EMG-Based Detection of Swallow Onsets

BI/EMG-based detection of swallow onsets consists of two stages. First, a BI-based preselection determines times of potential swallow onsets. Second, a classifier uses feature vectors extracted from BI and EMG data before the times of potential swallow onsets to distinguish swallow onsets from non-swallow events. Training, hyperparameter optimization, and evaluation of a classifier require a set of labeled features. This section describes the BI-based preselection, the feature extraction, the feature optimization, the hyperparameter optimization, and the evaluation procedure.

2.6.1. BI-Based Preselection

Algorithm 1 presents the BI-based preselection that searches for a significant signal drop after a detected maximum. The function Initialization($\mathrm{BI}_1$) sets max to False, and $\mathrm{BI}_{\max}$, $\mathrm{BI}_{m-1}$, and $\mathrm{BI}_{m-2}$ to the first sample $\mathrm{BI}_1$ of the low-pass-filtered BI data. Calling the preselection procedure Preselection($\mathrm{BI}_m$) with the subsequent samples $\mathrm{BI}_m$ of the BI data returns True for a potential swallow onset with the index $m$. The detection of local maxima compares three subsequent samples of the BI data vector stored in the variables $\mathrm{BI}_m$, $\mathrm{BI}_{m-1}$, and $\mathrm{BI}_{m-2}$. If $\mathrm{BI}_{m-1}$ exceeds $\mathrm{BI}_m$ and $\mathrm{BI}_{m-2}$, the data sample $\mathrm{BI}_{m-1}$ represents a new local maximum. $\mathrm{BI}_{\max}$ saves the value of the new local maximum $\mathrm{BI}_{m-1}$, and the flag max becomes True to enable the search for the BI drop.
Algorithm 1 BI-based Preselection
procedure Initialization(BI_1)
    BI_max ← BI_1        ▹ current maximum
    BI_{m-1} ← BI_1      ▹ previous sample
    BI_{m-2} ← BI_1      ▹ pre-previous sample
    max ← False          ▹ flag for maximum search
end procedure

procedure Preselection(BI_m)
    out ← False

    if BI_{m-2} < BI_{m-1} and BI_{m-1} > BI_m then    ▹ reset local maximum search
        BI_max ← BI_{m-1}
        max ← True
    end if

    if BI_max - BI_m > θ_PS and max = True then    ▹ check threshold
        max ← False
        out ← True
    end if

    BI_{m-2} ← BI_{m-1}    ▹ save values for local maximum search
    BI_{m-1} ← BI_m

    return out
end procedure
Comparing the BI difference $(\mathrm{BI}_{\max} - \mathrm{BI}_m)$ between the current local maximum $\mathrm{BI}_{\max}$ and the actual BI sample $\mathrm{BI}_m$ to the threshold $\theta_{\mathrm{PS}}$ determines potential swallow onsets. If the BI difference exceeds the threshold $\theta_{\mathrm{PS}}$, max becomes False, and the return value out is True. Setting the flag max = False prevents detecting multiple potential swallow onsets after a local maximum. A new local maximum reactivates the search for potential swallow onsets because max becomes True.
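Algorithm 1 translates almost line by line into a small stateful class. The Python sketch below follows the pseudocode; the class name `BIPreselector`, the method name `step`, and the sample values are illustrative and not taken from the article.

```python
class BIPreselector:
    """Streaming search for a BI drop larger than theta_ps after a local maximum."""

    def __init__(self, first_sample, theta_ps):
        self.theta_ps = theta_ps
        self.bi_max = first_sample  # current maximum
        self.bi_m1 = first_sample   # previous sample
        self.bi_m2 = first_sample   # pre-previous sample
        self.search = False         # the flag called "max" in Algorithm 1

    def step(self, bi_m):
        """Process one BI sample; return True for a potential swallow onset."""
        out = False
        # the previous sample is a new local maximum: restart the drop search
        if self.bi_m2 < self.bi_m1 and self.bi_m1 > bi_m:
            self.bi_max = self.bi_m1
            self.search = True
        # drop since the last local maximum exceeds the threshold
        if self.search and self.bi_max - bi_m > self.theta_ps:
            self.search = False
            out = True
        # save values for the next local maximum check
        self.bi_m2 = self.bi_m1
        self.bi_m1 = bi_m
        return out


# Made-up low-pass-filtered BI samples with one valley
bi = [10.0, 10.2, 10.1, 9.9, 9.6, 9.5, 9.7, 10.0]
selector = BIPreselector(bi[0], theta_ps=0.5)
candidates = [m for m, sample in enumerate(bi[1:], start=1) if selector.step(sample)]
```

For the made-up sequence, the local maximum 10.2 arms the search and the sample 9.6 is the first one more than 0.5 below it, so a single candidate is reported at index 4 and the further drop to 9.5 does not trigger again.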

2.6.2. Label Assignment

Applying the preselection to a BI data vector $\mathrm{BI}$ yields times of interest $t_i^{\mathrm{TI}} = m_i \cdot \frac{1}{f_T}$ for the indices $m_i$ of the BI samples $\mathrm{BI}_{m_i}$ for which Preselection($\mathrm{BI}_m$) returns True. The times of interest $t_i^{\mathrm{TI}}$ with $i = 1, 2, \ldots, I$ represent potential swallow onsets, consisting of swallow onsets and non-swallow events. Creating a data set $D = \{(y_i, x_i, s_i)\}$ of samples containing feature vectors $x_i \in \mathbb{R}^J$, subject indices $s_i \in \{0, 1, \ldots, S\}$, and class labels $y_i \in \{0, 1\}$ requires the assignment of the class labels. Assigning the class labels employs the procedure presented in Section 2.2.
Setting the threshold interval to $\theta_{\mathrm{LA}} = 0.5$ s accounts for the mean of 0.39 s and the standard deviation of 0.14 s of the duration $t_{\min}$ from the start to the minimum of the BI valleys, according to Schultheiss et al. [43]. The assignment procedure generates $I_t$ swallow onset samples (true samples), $I_f$ non-swallow event samples (false samples), and a vector of detection delays $d$ with an entry for each swallow onset.
In some cases, the preselection detects multiple potential swallow onsets on the decreasing edge of a BI swallow valley. The class-label assignment marks one of these potential swallow onsets as a swallow onset and the remaining ones as non-swallow events, even though their times $t_i^{\mathrm{TI}}$ fall on the decreasing edge of a BI swallow valley. Thus, the corresponding feature vectors have the properties of swallow onsets but are labeled as non-swallow events, which impairs the classifier training.
Furthermore, the intended use of the BI/EMG-based DoSO is to trigger FES or biofeedback, and triggering biofeedback or FES during an active intervention is pointless. Therefore, non-swallow events falling in the interval $[t_i^{\mathrm{T}}, t_i^{\mathrm{T}} + \theta_{\mathrm{skip}}]$ of length $\theta_{\mathrm{skip}}$ after a swallow onset at $t_i^{\mathrm{T}}$ are removed from $D$. The interval length of $\theta_{\mathrm{skip}} = 1.0$ s accounts for the mean of 0.763 s and the standard deviation of 0.205 s of the BI swallow valley duration $t_{\mathrm{end}}$ presented by Schultheiss et al. [43].
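The removal of non-swallow events after a swallow onset can be illustrated with a short Python sketch. It assumes that the class labels have already been assigned (the assignment procedure of Section 2.2 is not reproduced here); the function name `drop_events_after_onsets` and the list-based data layout are hypothetical.

```python
def drop_events_after_onsets(times, labels, theta_skip=1.0):
    """Remove non-swallow events (label 0) that fall within theta_skip seconds
    after any swallow onset (label 1). Returns the kept (time, label) pairs."""
    onset_times = [t for t, y in zip(times, labels) if y == 1]
    kept = []
    for t, y in zip(times, labels):
        in_skip_interval = any(t_on <= t <= t_on + theta_skip for t_on in onset_times)
        if y == 0 and in_skip_interval:
            continue  # a trigger here would occur during an active intervention
        kept.append((t, y))
    return kept
```

Swallow onsets themselves are always kept; only non-swallow events inside a skip interval are discarded.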

2.6.3. Feature Extraction

Features are extracted from the BI, EMG, and tEMG signals for time windows preceding any potential swallow onset detected by the BI-based preselection. Figure 4 and Figure 5 show examples of the signals for swallow onsets and non-swallow events in a healthy subject.
For the calculation of the $j$-th element of the feature vector $x_i$ from a signal $Z$ (BI, EMG, or tEMG), a data vector
$$Z_{i,j} = \begin{bmatrix} Z_{m_i - M_j + 1} & \cdots & Z_{m_i - 1} & Z_{m_i} \end{bmatrix}$$
is generated using $M_j$ samples up to the sample index $m_i$ indicating a potential swallow onset.
Maximizing the feature relevance yields an optimal sample number M j opt for the data vector of a feature. The feature vectors x i consist of J different features x i , j . Table A1 presents a complete list of all features with the corresponding index j.
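The standard deviation measures the scatter of the data vectors. Calculating the standard deviation $\sigma_i^{\mathrm{BI}}$ of the BI data vectors $\mathrm{BI}_{i,4}$ and the standard deviation $\sigma_i^{\mathrm{tEMG}}$ of the tEMG data vectors $\mathrm{tEMG}_{i,1}$ provides features concerning the data scatter.
$$x_{i,1} = \sigma_i^{\mathrm{tEMG}} = \sigma\left(\mathrm{tEMG}_{i,1}\right)$$
$$x_{i,4} = \sigma_i^{\mathrm{BI}} = \sigma\left(\mathrm{BI}_{i,4}\right)$$
The features $\Upsilon_i^{\mathrm{tEMG}}$ and $\Upsilon_i^{\mathrm{BI}}$ represent the number of samples in the data vectors $\mathrm{tEMG}_{i,2}$ and $\mathrm{BI}_{i,5}$ exceeding the reference values $\mathrm{tEMG}_{m_i}$ and $\mathrm{BI}_{m_i}$, respectively, divided by the sample number of the data vectors.
$$x_{i,2} = \Upsilon_i^{\mathrm{tEMG}} = \frac{\left|\{\mathrm{tEMG}_{i,2} \mid \mathrm{tEMG}_l > \mathrm{tEMG}_{m_i},\ l = m_i - M_2 + 1, \ldots, m_i - 1\}\right|}{M_2}$$
$$x_{i,5} = \Upsilon_i^{\mathrm{BI}} = \frac{\left|\{\mathrm{BI}_{i,5} \mid \mathrm{BI}_l > \mathrm{BI}_{m_i},\ l = m_i - M_5 + 1, \ldots, m_i - 1\}\right|}{M_5}$$
Here, $|\{\cdot\}|$ represents the cardinality of a set.
The next feature $x_{i,9}$ is the Average Amplitude Change (AAC) of the EMG data vectors $\mathrm{EMG}_{i,9}$. This feature estimates the extent of EMG activation.
$$x_{i,9} = \mathrm{AAC}_i = \frac{1}{M_9} \sum_{l = m_i - M_9 + 1}^{m_i - 1} \left|\mathrm{EMG}_{l+1} - \mathrm{EMG}_l\right|$$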
Furthermore, a temporal analysis of BI, EMG, and tEMG data extracts essential information. The method divides the $M_j$ data samples into $W_j \geq 2$ windows ($W_j \in \mathbb{N}$), each containing $\Delta_j \in \mathbb{N}$ samples. The sample number $M_j$ of a data vector $Z_{i,j}$ is restricted to multiples of the window length $\Delta_j$. The mean
$$\mu_n(Z_{i,j}) = \frac{1}{\Delta_j} \sum_{l = (n-1) \cdot \Delta_j + 1}^{n \cdot \Delta_j} Z_{l + m_i - M_j}$$
and standard deviation
$$\sigma_n(Z_{i,j}) = \sqrt{\frac{1}{\Delta_j} \sum_{l = (n-1) \cdot \Delta_j + 1}^{n \cdot \Delta_j} \left(Z_{l + m_i - M_j} - \mu_n(Z_{i,j})\right)^2}$$
of the $n = 1, 2, \ldots, W_j$ windows are the basis of the temporal-structure features.
Two EMG features utilize the standard deviations $\sigma_n$: the index $\bar{\sigma}_i^{\mathrm{EMG}}$ of the window with maximal standard deviation and the index $\underline{\sigma}_i^{\mathrm{EMG}}$ of the window with minimal standard deviation.
$$x_{i,11} = \bar{\sigma}_i^{\mathrm{EMG}} = \arg\max_n \left(\sigma_n(\mathrm{EMG}_{i,11})\right)$$
$$x_{i,12} = \underline{\sigma}_i^{\mathrm{EMG}} = \arg\min_n \left(\sigma_n(\mathrm{EMG}_{i,12})\right)$$
Additionally, the difference
$$x_{i,10} = \delta_i^{\mathrm{EMG}} = \sigma_2(\mathrm{EMG}_{i,10}) - \sigma_1(\mathrm{EMG}_{i,10})$$
between the standard deviations of two subsequent windows with $W_{10} = 2$ before the times of interest $t_i^{\mathrm{TI}}$ models the steepness of the active EMG onset.
The same method assesses the time structure of the $\mathrm{BI}_{i,j}$ and $\mathrm{tEMG}_{i,j}$ data vectors. The value of the highest standard deviation $\hat{\sigma}_i^{\mathrm{BI}}$, the index of the highest standard deviation $\bar{\sigma}_i^{\mathrm{BI}}$, and the index of the lowest standard deviation $\underline{\sigma}_i^{\mathrm{BI}}$ among the $W_j$ windows of the BI data vectors $\mathrm{BI}_{i,j}$ capture information about prominent BI alterations. The value of the highest standard deviation $\hat{\sigma}_i^{\mathrm{tEMG}}$ among the $W_j$ windows of the tEMG data vectors $\mathrm{tEMG}_{i,j}$ extends the feature set.
$$x_{i,6} = \hat{\sigma}_i^{\mathrm{BI}} = \max_n \sigma_n(\mathrm{BI}_{i,6})$$
$$x_{i,7} = \bar{\sigma}_i^{\mathrm{BI}} = \arg\max_n \sigma_n(\mathrm{BI}_{i,7})$$
$$x_{i,8} = \underline{\sigma}_i^{\mathrm{BI}} = \arg\min_n \sigma_n(\mathrm{BI}_{i,8})$$
$$x_{i,3} = \hat{\sigma}_i^{\mathrm{tEMG}} = \max_n \sigma_n(\mathrm{tEMG}_{i,3})$$
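The feature definitions of this section can be sketched in plain Python. This is a hedged illustration, not the study's code: all function names are hypothetical, indices are 0-based, the absolute value in the AAC follows the usual definition of that feature, and the standard deviation uses the normalization by the window length.

```python
import math


def data_vector(signal, m_i, n_samples):
    """The n_samples samples up to and including index m_i (0-based)."""
    return signal[m_i - n_samples + 1 : m_i + 1]


def std(values):
    """Standard deviation normalized by the number of samples."""
    mean = sum(values) / len(values)
    return math.sqrt(sum((v - mean) ** 2 for v in values) / len(values))


def exceed_fraction(vec):
    """Share of samples before the last one that exceed the last (reference) sample."""
    reference = vec[-1]
    return sum(1 for v in vec[:-1] if v > reference) / len(vec)


def average_amplitude_change(vec):
    """Sum of absolute sample-to-sample changes divided by the vector length."""
    return sum(abs(b - a) for a, b in zip(vec, vec[1:])) / len(vec)


def window_stds(vec, n_windows):
    """Standard deviation of each of n_windows equally long, consecutive windows."""
    delta = len(vec) // n_windows
    return [std(vec[n * delta : (n + 1) * delta]) for n in range(n_windows)]


def temporal_features(vec, n_windows):
    """Largest window std and the 1-based indices of the largest and smallest one."""
    stds = window_stds(vec, n_windows)
    return {
        "max_std": max(stds),
        "argmax_std": stds.index(max(stds)) + 1,
        "argmin_std": stds.index(min(stds)) + 1,
    }
```

For example, `window_stds(vec, 2)[1] - window_stds(vec, 2)[0]` corresponds to the difference of the standard deviations of two subsequent windows used for the EMG steepness feature.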

2.6.4. Feature Optimization

Radivojac et al. [44] define feature relevance as the difference between the probability distributions $f_A(x_{i,j})$ and $f_B(x_{i,j})$ of the true samples $A = \{D \mid y_i = 1\}$ and the false samples $B = \{D \mid y_i = 0\}$ of a feature in a data set $D = \{(y_i, x_i, s_i)\}$ with $i = 1, 2, \ldots, I$ samples. The overlap measures the similarity of $f_A(x_{i,j})$ and $f_B(x_{i,j})$ without further restraints on the distributions [45]. Usually, a Kernel Density Estimation (KDE) generates approximations $\hat{f}_A(x_{i,j})$ and $\hat{f}_B(x_{i,j})$ of the probability distributions $f_A(x_{i,j})$ and $f_B(x_{i,j})$ based on the sets $A$ and $B$. Integrating over the minimum of the approximated probability distributions $\hat{f}_A(x_{i,j})$ and $\hat{f}_B(x_{i,j})$ yields an estimate of the overlap
$$\eta(A, B) = \int \min\left(\hat{f}_A(x_{i,j}), \hat{f}_B(x_{i,j})\right) \mathrm{d}x_{i,j}.$$
In this work, the feature optimization employs a custom method to estimate the overlap (see Appendix A).
The feature optimization for the BI/EMG-based DoSO uses the relevance measure (39) to optimize the sample number $M_j \in \mathbb{N}^+$ of the data vectors for feature extraction. Calculating feature values $x_{i,j}$ for a series of increasing sample numbers $M_j = \Delta_j n$ with $n = 2, \ldots, N_j$ yields a series of overlap values $\eta(A(M_j(n)), B(M_j(n)))$. Here, $\Delta_j$ and $N_j$ are positive integers defined for each feature in Table A1. The optimal sample number $M_j^{\mathrm{opt}} = \Delta_j n_j^{\mathrm{opt}}$ is the sample number with the index of the smallest overlap.
$$n_j^{\mathrm{opt}} = \arg\min_{n = 2, \ldots, N_j} \left(\eta(A(M_j(n)), B(M_j(n)))\right)$$
Estimating the optimal sample number utilizes the data of data series I, II, III, and IV. The feature extraction employs the optimized sample numbers $M_j^{\mathrm{opt}}$ of the data vectors to create the data set for hyperparameter optimization and evaluation. Finally, $W_j = n_j^{\mathrm{opt}}$ is used for the features 3, 6, 7, 8, 11, and 12.
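The search for the optimal sample number amounts to an argmin over a small set of candidate lengths. The following sketch assumes a user-supplied callable `overlap_for` (hypothetical) that extracts the feature for a given sample number and returns the resulting overlap of the true and false samples.

```python
def optimal_sample_number(overlap_for, delta, n_max):
    """Return (M_opt, n_opt) minimizing the overlap over M = delta * n, n = 2..n_max.

    overlap_for: callable mapping a sample number M to an overlap value
    (hypothetical placeholder for feature extraction plus overlap estimation).
    """
    n_opt = min(range(2, n_max + 1), key=lambda n: overlap_for(delta * n))
    return delta * n_opt, n_opt
```

With the increment of 15 samples and 20 possible increments listed for the windowed BI features in Table A1, the call would be `optimal_sample_number(overlap_for, 15, 20)`.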
Figure 6 presents a visualization of the overlap between two log-normal probability density functions $f_B(x) \sim \mathrm{LogNormal}(0, 0.6)$ and $f_A(x) \sim \mathrm{LogNormal}(0, 0.9)$ and clarifies the capability of the overlap to measure the similarity between two probability distributions.
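The overlap integral for two log-normal densities such as those in Figure 6 can be approximated numerically. The sketch below is an illustration only: it assumes that the second parameter of each distribution is the standard deviation of the underlying normal distribution, and it truncates the integration at a hypothetical upper limit of 50.

```python
import math


def lognormal_pdf(x, mu, sigma):
    """Density of a log-normal distribution; sigma is assumed to be a standard deviation."""
    if x <= 0.0:
        return 0.0
    exponent = -((math.log(x) - mu) ** 2) / (2.0 * sigma ** 2)
    return math.exp(exponent) / (x * sigma * math.sqrt(2.0 * math.pi))


def overlap(pdf_a, pdf_b, lower, upper, steps=20000):
    """Trapezoidal approximation of the integral of min(pdf_a, pdf_b) on [lower, upper]."""
    h = (upper - lower) / steps
    total = 0.0
    for k in range(steps + 1):
        x = lower + k * h
        weight = 0.5 if k in (0, steps) else 1.0
        total += weight * min(pdf_a(x), pdf_b(x))
    return total * h


f_b = lambda x: lognormal_pdf(x, 0.0, 0.6)
f_a = lambda x: lognormal_pdf(x, 0.0, 0.9)
eta = overlap(f_a, f_b, 0.0, 50.0)  # overlap of the two example densities
```

Identical densities give an overlap close to one, and densities with disjoint support give zero, which is why a smaller overlap indicates a more relevant feature.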

2.6.5. Hyperparameter Optimization

Choosing a classifier involved a comparison of Random Forest (RF), support vector machines, k-nearest neighbors, and multi-layer perceptron classifiers that included five methods for data scaling. The RF classifier achieved the best results for data of dysphagia patients and is independent of the scaling method. Therefore, this work employs the RF classifier and a standard scaler to distinguish swallow onsets from non-swallow events.
Usually, classifiers provide a set of $P$ hyperparameters controlling the training process. Determining the optimal hyperparameter vector $\lambda^{\mathrm{opt}}$ requires the evaluation of multiple realizations of hyperparameter vectors
$$\lambda_l = \begin{bmatrix} \lambda_{l,1} & \lambda_{l,2} & \cdots & \lambda_{l,P} \end{bmatrix}$$
with $l = 1, 2, \ldots, L$ for given hyperparameter ranges.
Grid search is a straightforward approach for hyperparameter optimization that reduces the search space $\Lambda = \{\lambda_1, \lambda_2, \ldots, \lambda_l, \ldots, \lambda_L\}$ to $L \in \mathbb{N}$ equidistant points, given by predefined hyperparameter vectors $\lambda_l$ generated from predefined hyperparameter ranges.
Random search generates the hyperparameter vectors $\lambda_l$ by sampling the hyperparameter values $\lambda_{l,p}$ randomly from the predefined ranges. Bergstra et al. [46] argue that random search outperforms grid search in higher-dimensional search spaces.
Hyperparameter optimization evaluates the hyperparameter response function $\Psi$ for each $\lambda_l$ to determine the optimal hyperparameter vector
$$\lambda^{\mathrm{opt}} = \arg\max_{\lambda \in \Lambda} \left(\Psi\right).$$
A mixed hyperparameter optimization is applied in this work. Table 2 presents all optimized hyperparameters. Grid search optimizes the class weighting, and random search optimizes four additional parameters.
The applied Python library Sklearn [47] allows the definition of weights $\omega_0$ and $\omega_1$ for the two classes (0: non-swallow events, 1: swallow onsets):
$$\omega_0 = \mathrm{weight}_0 \cdot \frac{|\{D \mid y_i = 0\}|}{2 \cdot |D|} = \mathrm{weight}_0 \cdot \frac{I_f}{2I},$$
$$\omega_1 = \mathrm{weight}_1 \cdot \frac{|\{D \mid y_i = 1\}|}{2 \cdot |D|} = \mathrm{weight}_1 \cdot \frac{I_t}{2I}.$$
The grid search optimizes the hyperparameter $\mathrm{weight}_1$ by evaluating $L_1 = 6$ values
$$\mathrm{weight}_1 \in \{0.5, 1.0, 1.5, 2.0, 2.5, 3.0\}$$
with $\mathrm{weight}_0 = 1$.
The hyperparameters with random search are ccp_alpha for cost-complexity pruning, min_impurity_decrease and min_weight_fraction_leaf for early stopping during the tree construction, and max_samples to test different sizes of the bagging data sets. The number of evaluated vectors for these four parameters is L 2 = 200 .
Setting the hyperparameters min_samples_split = 2, min_samples_leaf = 1, max_depth = None, and max_leaf_nodes = None disables these mechanisms.
The remaining hyperparameters n_estimators = 100, criterion = gini, max_features = sqrt, bootstrap = True, random_state = 1, and class_weight = balanced remain in the default state.
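The combination of grid and random search can be sketched as follows. The grid values and the number of random vectors are taken from the text; the sampling ranges in `RANDOM_RANGES` are purely hypothetical placeholders for the ranges of Table 2, and the function name `build_candidates` is an assumption. How a candidate is passed to the classifier is deliberately left open.

```python
import random

WEIGHT_1_GRID = [0.5, 1.0, 1.5, 2.0, 2.5, 3.0]  # grid-search values (L1 = 6)
N_RANDOM = 200                                   # random-search vectors (L2 = 200)

# Hypothetical sampling ranges; the ranges actually used are listed in Table 2.
RANDOM_RANGES = {
    "ccp_alpha": (0.0, 0.01),
    "min_impurity_decrease": (0.0, 0.01),
    "min_weight_fraction_leaf": (0.0, 0.1),
    "max_samples": (0.1, 1.0),
}


def build_candidates(seed=1):
    """Cross every grid value of weight_1 with every randomly sampled vector."""
    rng = random.Random(seed)
    random_part = [
        {name: rng.uniform(low, high) for name, (low, high) in RANDOM_RANGES.items()}
        for _ in range(N_RANDOM)
    ]
    return [
        {"weight_1": weight_1, **params}
        for weight_1 in WEIGHT_1_GRID
        for params in random_part
    ]
```

This yields 6 · 200 = 1200 candidate vectors, each combining one class-weight value with one random draw of the four remaining hyperparameters.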

2.6.6. Classifier Selection and Test

Classifier selection and unbiased testing with respect to hyperparameter optimization require a nested LOSO cross-validation. The outer LOSO cross-validation iteratively splits the complete data $D$ into evaluation data $D_e = \{D \mid s_i \neq s_{\mathrm{test}}\}$ and test data $D_{\mathrm{test}} = \{D \mid s_i = s_{\mathrm{test}}\}$ for each test subject $s_{\mathrm{test}} = 1, 2, \ldots, S$.
The inner LOSO cross-validation trains and evaluates the classifier for all $L = L_1 \cdot L_2$ hyperparameter vectors on the data $D_e$ of the remaining subjects. Iterating over all $(S - 1)$ remaining subjects and hyperparameter vectors $\lambda_l$ with $l = 1, 2, \ldots, L$ in the inner LOSO cross-validation generates a matrix $O_{s_{\mathrm{test}}} \in \mathbb{R}^{L \times (S-1)}$ of F1 scores and a matrix $C_{s_{\mathrm{test}}} \in \mathbb{R}^{L \times (S-1)}$ of complexity measures. The latter contains the leaf numbers in the decision trees of the random forest classifier as proposed by Breiman [48].
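The structure of the nested leave-one-subject-out splits can be sketched without any machine-learning library. The helper names `loso_splits` and `nested_loso_plan` are hypothetical; classifier training and scoring are omitted.

```python
def loso_splits(subject_ids):
    """Yield (held_out_subject, remaining_indices, held_out_indices) per subject."""
    for held_out in sorted(set(subject_ids)):
        remaining = [i for i, s in enumerate(subject_ids) if s != held_out]
        held = [i for i, s in enumerate(subject_ids) if s == held_out]
        yield held_out, remaining, held


def nested_loso_plan(subject_ids):
    """Map each outer test subject to the validation subjects of the inner loop."""
    plan = {}
    for s_test, eval_indices, _ in loso_splits(subject_ids):
        eval_subjects = [subject_ids[i] for i in eval_indices]
        plan[s_test] = [s_val for s_val, _, _ in loso_splits(eval_subjects)]
    return plan
```

For every outer test subject, the inner loop thus visits the S − 1 remaining subjects, which gives the score and complexity matrices their S − 1 columns.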
The classifier selection and test utilize the matrices $O_{s_{\mathrm{test}}}$ and $C_{s_{\mathrm{test}}}$ to determine the index $l_{\mathrm{opt}}$ of the optimal hyperparameter vector for the classifier training. The $l$-th rows of the matrices link to the hyperparameter vector $\lambda_l$ and contain the obtained scores and complexity measures for the $(S - 1)$ subjects. The vectors $o_l$ and $c_l$ denote the $l$-th row of $O_{s_{\mathrm{test}}}$ and $C_{s_{\mathrm{test}}}$, respectively. The selection and test process involves the following steps [49,50]:
  • Determine the mean values $\mu_l$ for all rows of $O_{s_{\mathrm{test}}}$.
  • Find the index of the largest mean value: $l_{\max} = \arg\max_l (\mu_l)$.
  • Calculate the standard error $\sigma_{\mathrm{se}} = \mathrm{std}(o_{l_{\max}}) / \sqrt{S - 1}$ [49] and the modified standard error $\sigma_l^{\mathrm{mod}} = \sigma_{\mathrm{se}} \sqrt{1 - \rho_l}$ for each row [50]. Here, $\rho_l$ is the correlation of the vectors $o_{l_{\max}}$ and $o_l$.
  • Select all rows of $O_{s_{\mathrm{test}}}$ whose $\mu_l$ lie in the interval $[\mu_{l_{\max}} - \sigma_l^{\mathrm{mod}}, \mu_{l_{\max}}]$ and, among them, select the one that has the lowest mean of the corresponding row in the matrix $C_{s_{\mathrm{test}}}$ (complexity measure). This yields the index $l_{\mathrm{opt}}$ and the vector $\lambda_{l_{\mathrm{opt}}}$.
  • Test the classifier linked to $\lambda_{l_{\mathrm{opt}}}$ on $D_{\mathrm{test}}$ to yield the unbiased score $\hat{o}[s_{\mathrm{test}}]$.
In summary, the selection of the optimal RF classifier considers all models whose performance scores are within the modified standard error interval and chooses the model with the lowest complexity. Finally, after completion of the outer LOSO cross-validation, the evaluation returns the test score vector $\hat{o} \in \mathbb{R}^S$. The median and interquartile range summarize the test scores from the vector $\hat{o}$ for the $S$ subjects.
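The selection steps listed above can be sketched in Python as follows. The sketch makes several assumptions that the text does not fix: the standard deviation is the sample standard deviation, the correlation is the Pearson correlation, constant score rows are treated as uncorrelated, and the square roots follow the usual form of the (modified) one-standard-error rule. The function names are hypothetical.

```python
import math


def mean(values):
    return sum(values) / len(values)


def sample_std(values):
    m = mean(values)
    return math.sqrt(sum((v - m) ** 2 for v in values) / (len(values) - 1))


def pearson(a, b):
    """Pearson correlation; constant vectors are treated as uncorrelated (assumption)."""
    ma, mb = mean(a), mean(b)
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    var_a = sum((x - ma) ** 2 for x in a)
    var_b = sum((y - mb) ** 2 for y in b)
    if var_a == 0.0 or var_b == 0.0:
        return 0.0
    return cov / math.sqrt(var_a * var_b)


def select_hyperparameter_index(scores, complexity):
    """scores, complexity: one row per hyperparameter vector, one column per subject.
    Returns the row index of the least complex model within the modified
    standard-error interval below the best mean score."""
    means = [mean(row) for row in scores]
    l_max = max(range(len(scores)), key=means.__getitem__)
    sigma_se = sample_std(scores[l_max]) / math.sqrt(len(scores[l_max]))
    candidates = []
    for l, row in enumerate(scores):
        rho = pearson(scores[l_max], row)
        sigma_mod = sigma_se * math.sqrt(max(0.0, 1.0 - rho))
        if means[l_max] - sigma_mod <= means[l] <= means[l_max]:
            candidates.append(l)
    return min(candidates, key=lambda l: mean(complexity[l]))
```

The best-scoring row always qualifies because its correlation with itself is one, so the candidate set is never empty.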

3. Results

3.1. BI-Based Preselection of Swallow Onsets

The choice of $\theta_{\mathrm{PS}}$ is critical because this parameter affects the timing and classification performance. Triggering the interventions synchronously with swallowing requires a low mean $\mu_d$ and standard deviation $\sigma_d$ of the detection delay. Since the mean BI drop lasts $t_{\min} = 0.39$ s according to Schultheiss et al. [43], a desired mean detection delay of $\mu_d \leq 0.039$ s results in detecting swallow onsets within the first 10.0% of the BI drop duration. The classification performance will most likely increase with a more balanced ratio $\Upsilon$ of swallow to non-swallow events. Thus, the choice of $\theta_{\mathrm{PS}}$ should consider both aspects. After manual tuning, $\theta_{\mathrm{PS}} = 0.18\ \Omega$ was obtained, leading to the desired mean detection delay of $\mu_d = 0.039$ s and a standard deviation of $\sigma_d = 0.079$ s on average over all data series.
Table 3 presents the scores used to evaluate the BI-based preselection with $\theta_{\mathrm{PS}} = 0.18\ \Omega$ separately for the data series I, II, III, and IV. The mean detection delay $\mu_d = 0.033$ s for dysphagia patients (data series IV) falls below the desired limit of 0.039 s.
Nevertheless, the differences in $\mu_d$ between the data series are small compared to the standard deviation of the detection delays. This suggests that bolus type, swallowing style, anatomical differences, and other individual factors influence the detection delay. Generally, the BI-based preselection provides an excellent sensitivity $S_{\mathrm{PS}}$ of 0.977 to 0.998 and a sufficient ratio $\Upsilon$ of swallow to non-swallow events of 0.114 to 0.256.

3.2. Optimized Test Results

3.2.1. EMG-Based Detection of Swallow Onsets

The optimization of the parameters for the EMG-based DoSO uses the mean detection delays from Table 3 as values for the threshold $\theta_{\mu_d}$ in (22).
Table 4 presents the sensitivity, precision, F1 score, and the mean $\mu_d$ and standard deviation $\sigma_d$ of the detection delays for the EMG-based DoSO. The EMG-based DoSO achieves the highest F1 score F = 0.619 [0.199] for data series II. The F1 scores F = 0.553 [0.135] and F = 0.529 [0.127] for data series I and III are more than 0.05 units lower than that of data series II because data series I contains the highest share of movements and the SNR of the EMG measurements in data series III is the lowest among the data series of healthy subjects. The median F1 score F = 0.289 [0.496] for dysphagia patients is 0.33 and 0.24 units lower than those of data series II and III, respectively.
The median sensitivity is high for data series I and II with S = 0.84 [0.131] and S = 0.759 [0.235], respectively. The EMG-based DoSO has a reduced median sensitivity of S = 0.5 [0.678] for dysphagia patients in data series IV.
The mean detection delays of the EMG-based DoSO are slightly shorter than the mean detection delays of the BI/EMG-based DoSO, presented in Table 5. Nevertheless, $\mu_d$ for the EMG-based and BI/EMG-based DoSO fall in a similar range.

3.2.2. BI/EMG-Based Detection of Swallow Onsets

The evaluation of the random forest classifier for the BI/EMG-based detection of swallow onsets employs the data of each data series I, II, III, and IV separately. Table 5 shows the median and interquartile range of the sensitivity, precision, F1 score, and specificity for the BI/EMG-based DoSO. The specificity is approximately 0.95 for all data series, indicating an excellent classification of non-swallow events. The BI/EMG-based DoSO reaches F1 scores of F = 0.705 [0.191] for data series I, F = 0.826 [0.094] for data series II, F = 0.732 [0.184] for data series III, and F = 0.546 [0.405] for data series IV. Therefore, the BI/EMG-based DoSO performs substantially better than the EMG-based DoSO. The F1 score for dysphagia patients is approximately 0.2 units lower than for healthy subjects in data series I, II, and III. The sensitivity of the BI/EMG-based DoSO is slightly lower, but the precision is higher compared to the EMG-based DoSO. The maximization of the F1 score yields $\mathrm{weight}_1$ values between 0.5 and 2.0.
Evaluating the random forest classifier for the BI/EMG-based DoSO employs two additional data combinations generated by adding data from healthy subjects to the training data and testing the classifier with the data of dysphagia patients. Table 6 presents the evaluation scores for training with data combination A = {I, II, III} or B = {I, II, III, IV}, evaluated with data from data series IV. The F1 score for combination A in row one is slightly higher, and the F1 score for combination B in row two is reduced, compared to the performance for data series IV in Table 5. Thus, extending the data for classifier training with samples from healthy subjects does not substantially enhance the classification performance for dysphagia patients.
Figure 7 visualizes the sensitivity, precision, and F1 score values of single dysphagia patients in data series IV, yielding the median and interquartile range values presented in Table 5. Usually, the sensitivity is the highest score for a patient, while the precision is lower, and the F1 score falls between the sensitivity and precision. The sensitivity reaches a sufficient level of 0.7 for 26 out of 41 patients, resulting in a median sensitivity of S = 0.871 and precision of P = 0.425 for these patients. The sensitivity falls below 0.4 for ten patients, yielding a median sensitivity of S = 0.297 and precision of P = 0.203 for these patients. Therefore, the BI/EMG-based DoSO performs well for the majority of patients but struggles for some dysphagia patients who are most likely more strongly impaired.

4. Discussion

The BI/EMG-based DoSO achieves considerably more precise timing compared to the EMG-based DoSO. The BI-based preselection determines potential swallow onsets by detecting typical patterns in the BI measurements. Therefore, the BI-based preselection determines the timing of the BI/EMG-based DoSO. Evaluating the timing employs the mean $\mu_d$ and standard deviation $\sigma_d$ of the detection delays. The BI-based preselection achieves $\mu_d\,(\sigma_d) = 0.039\,(0.079)$ s for the complete data and $\mu_d\,(\sigma_d) = 0.033\,(0.1)$ s for dysphagia patients.
In contrast, the EMG-based DoSO reached a detection delay of $\mu_d\,(\sigma_d) = 0.027\,(0.183)$ s for healthy subjects and $\mu_d\,(\sigma_d) = 0.018\,(0.203)$ s for dysphagia patients. The timing of the BI/EMG-based DoSO is more precise because the standard deviation of the EMG-based DoSO is more than twice as high for both healthy subjects and dysphagia patients.
The BI/EMG-based DoSO realizes a remarkably improved detection quality compared to the EMG-based DoSO according to the median F1 scores, with improvements of ΔF = 0.152 for data series I, ΔF = 0.207 for data series II, ΔF = 0.203 for data series III, and ΔF = 0.257 for data series IV. Thus, on average, the BI/EMG-based DoSO gained ΔF = 0.204 F1 score points compared to the EMG-based DoSO. The highest F1 score of the BI/EMG-based DoSO for dysphagia patients is F = 0.556 [0.405], using only data from healthy subjects in data series I, II, and III for the classifier training. The BI/EMG-based DoSO requires no manual parameter tuning, providing a significant advantage in a clinical application compared to existing approaches.
According to the median values in Table 5, the BI/EMG-based DoSO detects 56.0% of the performed swallow onsets (S = 0.56), and 50.0% of the detected swallow onsets are triggered by non-swallow events (P = 0.5) for dysphagia patients. Therefore, although the BI/EMG-based DoSO outperforms the EMG-based DoSO, the detection of the onset of swallowing needs to be further improved.
The BI/EMG-based DoSO might perform much better when applied in real biofeedback or triggered FES sessions because the subjects concentrate more on swallowing and execute fewer movements, which reduces the number of false positives in therapy sessions. The experimental setup for data series IV with dysphagia patients was intended to investigate the assessment of swallowing based on BI/EMG measurements. Therefore, the study protocol did not restrict the patients’ behavior beyond the monitored swallows. Consequently, the data are likely not representative of biofeedback or triggered FES sessions in dysphagia therapy. In clinical practice, most patients learn to reduce undesired movements and evolve toward healthier swallowing patterns, probably enhancing the performance of the BI/EMG-based DoSO.
The lower performance for dysphagia patients likely originates from the reduced EMG activation that causes a smaller contraction of swallow-related muscles and decreases the elevation of the larynx and the depth of BI valleys. Therefore, the reduced swallow ability might impede the separation of swallow onsets from head and jaw movements. Additionally, dysphagia patients sometimes use head movements to initiate swallows, which blurs the difference between swallow onsets and head movements.
The study has the following additional limitations. The ground truth relies on manual annotations of swallow onsets marked by only one expert. Employing several experts would lead to a more reliable ground truth and enhance the reliability of the results. The timing analysis depends on times of swallow onsets manually marked by an expert based on BI and EMG data. Therefore, the ground truth tends to favor the BI/EMG-based DoSO.
The threshold $\theta_{\mathrm{PS}}$ for the BI-based preselection of swallow onsets determines the mean detection delay $\mu_d$ of the BI/EMG-based DoSO. Realizing a desired mean detection delay of 0.039 s demands a threshold $\theta_{\mathrm{PS}}$ of 0.18 Ω. The mean detection delay of the EMG-based DoSO was limited to the mean detection delays of the BI/EMG-based DoSO to ensure comparable results for both approaches. Therefore, the results of the EMG-based and BI/EMG-based DoSO depend on the choice of the desired detection delay.
The reported DoSO performances depend on the selected pre-processing of the EMG and BI signals. In this work, no systematic investigation of the filter types and filter parameters was performed, as the complexity of such an optimization is extremely high. It is expected that a future optimization of the pre-processing could further increase DoSO performance. With more training data and sufficient computing power on the intended wearables for the use of the algorithm, deep learning methods could also be used in the future to better tackle this classification problem.
Another limitation of this study is that the involved healthy subjects are much younger in comparison to the patients with dysphagia. We expect that swallow performance will degrade in general with age, leading to less EMG activity as well as smaller and less steep BI changes during swallows. An age-matched control group would certainly have been better for this study. Future work should investigate and compare the performance of DoSO also explicitly for different patient populations, like geriatric patients, ENT-cancer patients, and patients with stroke or Parkinson’s disease.

5. Conclusions

The article investigated a threshold method based on EMG data and a machine learning approach to detect swallow onsets in BI and EMG data. The evaluation employed a comprehensive database that included measurements of 41 dysphagia patients. The ground truth relied on manual annotations of swallow onsets marked by an expert. The F1 score rated the detection performance, and the standard deviation of the detection delays measured the timing of the detected swallow onsets.
The BI/EMG-based DoSO outperforms the EMG-based DoSO with an F1 score of 0.546 compared to 0.289 for dysphagia patients. Further, the standard deviation of the detection delays is reduced by a factor of two from 0.203 s for the EMG-based DoSO to 0.1 s for the BI/EMG-based DoSO with similar mean values of the detection delay. Therefore, the timing of the BI/EMG-based DoSO is more precise compared to the EMG-based DoSO. The article presents the first analysis of the detection performance and timing of an EMG-based DoSO, based on comprehensive data from both healthy subjects and dysphagia patients.
The BI/EMG-based DoSO has the potential to improve biofeedback and triggered FES in dysphagia therapy. The results are still preliminary due to the small number of patients examined. Further studies are required with more patients who suffer from swallowing disorders due to various conditions. All developed methods of BI/EMG-based DoSO are purely causal, which makes a future real-time implementation likely. The methods omit individual parameter tuning. A long-term goal is to employ the DoSO on wearables to enable home care applications for dysphagia management. Such applications might be used in therapy sessions or in everyday life.

Author Contributions

Conceptualization, B.R.; methodology, B.R.; software, B.R.; validation, B.R. and T.S.; formal analysis, B.R. and T.S.; investigation, B.R. and T.S.; resources, R.O.S., T.S. and B.R.; data curation, B.R. and R.O.S.; writing—original draft preparation, B.R.; writing—review and editing, R.O.S., T.S., and B.R.; visualization, B.R.; supervision, R.O.S. and T.S.; project administration, B.R.; funding acquisition, B.R.; All authors have read and agreed to the published version of the manuscript.

Funding

This work was partially funded by a research scholarship for young researchers of the Verein zur Förderung des Fachgebietes Regelungssysteme an der Technischen Universität Berlin e.V.

Institutional Review Board Statement

The study was conducted in accordance with the Declaration of Helsinki and approved by the Ethics Committee of the Charité–Universitätsmedizin Berlin (EA1/019/10 and EA1/161/09).

Informed Consent Statement

Informed consent was obtained from all subjects involved in the study.

Data Availability Statement

The data presented in this study are available upon reasonable request from the corresponding author.

Conflicts of Interest

T.S. is co-founder of SensorStim Neurotechnology GmbH, which is a company developing FES stimulation devices. All other authors declare that they have no competing interest.

Abbreviations

The following abbreviations are used in this manuscript:
BI: Bioimpedance
EMG: Electromyography
eEMG: envelope of Electromyography
DoSO: Detection of Swallow Onsets
FES: Functional Electrical Stimulation
FN: False Negatives
FP: False Positives
LOSO: Leave-One-Subject-Out
KDE: Kernel Density Estimation
RF: Random Forest
RT: Reference Times
tEMG: trend in Electromyography
TI: Time of Interest
TN: True Negatives
TP: True Positives
VFSS: Videofluoroscopic Swallowing Study

Appendix A. Overlap Estimation and Results of Feature Relevance Optimization

This work utilizes an overlap estimation of distributions for feature relevance optimization.
Estimating the overlap for features with $N \in \mathbb{N}$ discrete integer values $x_{i,j} \in \{1, \ldots, N\}\ \forall i$ and $j = \mathrm{const.}$ interprets the discrete feature values as bins of the distributions $f_A(n)$ and $f_B(n)$. Given a data set of true samples $A$ and a data set of false samples $B$, the probability estimates
$$p_A(n) = \frac{|\{A \mid x_i = n\}|}{|A|} \tag{A1}$$
$$p_B(n) = \frac{|\{B \mid x_i = n\}|}{|B|} \tag{A2}$$
with $n = 1, 2, \ldots, N$ are the quotients of the number of samples with the feature value $n$ and the number of samples in the respective data set. The operator $|\cdot|$ returns the number of elements in a set. Summing up the minimum of the probability estimates $p_A(n)$ and $p_B(n)$ yields the overlap coefficient
$$\hat{\eta}(A, B) = \sum_{n=1}^{N} \min\left(p_A(n), p_B(n)\right) \tag{A3}$$
for the given data. A smaller value of the overlap coefficient coincides with a larger difference in the probability distributions, indicating a higher feature relevance.
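For integer-valued features, the overlap coefficient reduces to a sum over the observed feature values. A minimal sketch, with the hypothetical function name `discrete_overlap` and plain lists of feature values for the true and false samples:

```python
from collections import Counter


def discrete_overlap(values_a, values_b):
    """Sum over all feature values of the smaller of the two relative frequencies."""
    count_a, count_b = Counter(values_a), Counter(values_b)
    support = set(count_a) | set(count_b)
    return sum(
        min(count_a[n] / len(values_a), count_b[n] / len(values_b))
        for n in support
    )
```

Values that occur in only one of the two sets contribute nothing to the sum, so fully separated features yield an overlap of zero.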
Extending this method to real-valued feature values $x_{i,j} \in \mathbb{R}\ \forall i$ and index $j = \mathrm{const.}$ with distinct values requires the definition of $(N + 1)$ borders $x_n^{\mathrm{bin}}$ to split the data $D = \{(y_i, x_i, s_i)\}$ with $i = 1, 2, \ldots, I$ into $N \in \mathbb{N}$ subsets $D_n^{\mathrm{bin}}$. Defining an equal sample number $N_{\mathrm{bin}} = |D_n^{\mathrm{bin}}| \leq I$ for the first $N - 1$ subsets yields a rule to determine the borders and the number of bins. The first border $x_1^{\mathrm{bin}} = \min(x_{i,j})$ is the minimal feature value, and the greatest border $x_{N+1}^{\mathrm{bin}} = \max(x_{i,j})$ is equal to the maximal feature value. Setting the remaining borders $x_n^{\mathrm{bin}}$ with $n = 2, \ldots, N$ follows a simple rule
$$N_{\mathrm{bin}} = \left|D_n^{\mathrm{bin}}\right| = \left|\{D \mid x_n^{\mathrm{bin}} < x_{i,j} < x_{n+1}^{\mathrm{bin}}\}\right|.$$
The last subset $D_N^{\mathrm{bin}}$ with the borders $x_N^{\mathrm{bin}}$ and $x_{N+1}^{\mathrm{bin}}$ includes the remaining $[I - N_{\mathrm{bin}} \cdot (N - 1)]$ samples. Computing the probabilities
$$p_A(n) = \frac{|A_n|}{|A|} \quad \text{with} \quad A_n = \{D_n^{\mathrm{bin}} \mid y_i = 1\}$$
$$p_B(n) = \frac{|B_n|}{|B|} \quad \text{with} \quad B_n = \{D_n^{\mathrm{bin}} \mid y_i = 0\}$$
with $n = 1, 2, \ldots, N$ yields the overlap of real-valued features using (A3).
The overlap estimation utilizes $N_{\mathrm{bin}} = 50$ samples per bin for real-valued features with the indices $j = 1, 2, 3, 4, 5, 6, 9, 10$. The features with the indices $j = 7, 8, 11, 12$ are integer-valued.
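The equal-count binning for real-valued features can be sketched as follows. The sketch sorts the samples by feature value and cuts the sorted sequence into consecutive bins instead of computing explicit borders; it further assumes that the number of bins is the integer quotient of the sample count and the bin size, which the text does not state. The function name `equal_count_overlap` is hypothetical.

```python
def equal_count_overlap(values, labels, n_bin=50):
    """Overlap estimate with bins of n_bin samples each; the last bin also takes
    the remainder. labels: 1 for true samples, 0 for false samples."""
    n_true = sum(1 for y in labels if y == 1)
    n_false = len(labels) - n_true
    if n_true == 0 or n_false == 0:
        raise ValueError("both classes must be present")
    order = sorted(range(len(values)), key=values.__getitem__)
    n_bins = max(1, len(order) // n_bin)  # assumption: floor(I / N_bin) bins
    overlap = 0.0
    for n in range(n_bins):
        start = n * n_bin
        stop = (n + 1) * n_bin if n < n_bins - 1 else len(order)
        bin_labels = [labels[i] for i in order[start:stop]]
        p_a = sum(1 for y in bin_labels if y == 1) / n_true
        p_b = sum(1 for y in bin_labels if y == 0) / n_false
        overlap += min(p_a, p_b)
    return overlap
```

Equal-count bins adapt their width to the data density, so sparsely populated value ranges do not produce empty bins.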
Table A1 presents the parameters $\Delta_j$ and $N_j$ for the feature relevance optimization (cf. Section 2.6.4) together with the obtained results.
Table A1. Overview of the sample increment $\Delta_j$, the number $N_j$ of maximal possible increments, the sampling frequency $f_T$, the obtained number of optimal samples $M_j^{\mathrm{opt}}$, the corresponding optimal time interval $t_j^{\mathrm{opt}}$, and the optimal overlap estimate $\hat{\eta}_j^{\mathrm{opt}}$ for all features $j = 1, \ldots, 12$.
| Data | $j$ | Feature | $\Delta_j$ | $N_j$ | $f_T$ | $M_j^{\mathrm{opt}}$ | $t_j^{\mathrm{opt}}$ | $\hat{\eta}_j^{\mathrm{opt}}$ |
|---|---|---|---|---|---|---|---|---|
| tEMG | 1 | $\sigma^{\mathrm{tEMG}}$ | 200 | 40 | 4000 Hz | 400 | 0.1 s | 78.6% |
| tEMG | 2 | $\Upsilon^{\mathrm{tEMG}}$ | 200 | 40 | 4000 Hz | 6000 | 1.5 s | 82.1% |
| tEMG | 3 | $\hat{\sigma}^{\mathrm{tEMG}}$ | 600 | 20 | 4000 Hz | 1200 | 0.3 s | 80.4% |
| BI | 4 | $\sigma^{\mathrm{BI}}$ | 5 | 40 | 100 Hz | 30 | 0.3 s | 63.1% |
| BI | 5 | $\Upsilon^{\mathrm{BI}}$ | 5 | 40 | 100 Hz | 190 | 1.9 s | 74.5% |
| BI | 6 | $\hat{\sigma}^{\mathrm{BI}}$ | 15 | 20 | 100 Hz | 75 | 0.75 s | 63.8% |
| BI | 7 | $\bar{\sigma}^{\mathrm{BI}}$ | 15 | 20 | 100 Hz | 75 | 0.75 s | 78.4% |
| BI | 8 | $\underline{\sigma}^{\mathrm{BI}}$ | 15 | 20 | 100 Hz | 45 | 0.45 s | 82.6% |
| EMG | 9 | AAC | 100 | 40 | 4000 Hz | 300 | 0.075 s | 39.9% |
| EMG | 10 | $\delta^{\mathrm{EMG}}$ | 100 | 40 | 4000 Hz | 1700 | 0.425 s | 38.4% |
| EMG | 11 | $\bar{\sigma}^{\mathrm{EMG}}$ | 600 | 20 | 4000 Hz | 7200 | 1.8 s | 40.8% |
| EMG | 12 | $\underline{\sigma}^{\mathrm{EMG}}$ | 600 | 20 | 4000 Hz | 1800 | 0.45 s | 60.1% |

References

  1. Armstrong, J.R.; Mosher, B.D. Aspiration Pneumonia After Stroke. Neurohospitalist 2011, 1, 85–93.
  2. Banda, K.J.; Chu, H.; Kang, X.L.; Liu, D.; Pien, L.C.; Jen, H.J.; Hsiao, S.T.S.; Chou, K.R. Prevalence of dysphagia and risk of pneumonia and mortality in acute stroke patients: A meta-analysis. BMC Geriatr. 2022, 22, 420.
  3. Giraldo-Cadavid, L.F.; Pantoja, J.A.; Forero, Y.J.; Gutiérrez, H.M.; Bastidas, A.R. Aspiration in the Fiberoptic Endoscopic Evaluation of Swallowing Associated with an Increased Risk of Mortality in a Cohort of Patients Suspected of Oropharyngeal Dysphagia. Dysphagia 2020, 35, 369–377.
  4. Manikantan, K.; Khode, S.; Sayed, S.I.; Roe, J.; Nutting, C.M.; Rhys-Evans, P.; Harrington, K.J.; Kazi, R. Dysphagia in head and neck cancer. Cancer Treat. Rev. 2009, 35, 724–732.
  5. Groher, M.; Crary, M. Dysphagia: Clinical Management in Adults and Children, 3rd ed.; Elsevier: St. Louis, MO, USA, 2020; pp. 1–400.
  6. Matsuo, K.; Palmer, J.B. Anatomy and Physiology of Feeding and Swallowing: Normal and Abnormal. Phys. Med. Rehabil. Clin. N. Am. 2008, 19, 691–707.
  7. Nahrstaedt, H.; Schauer, T.; Seidi, R. Bioimpedance based measurement system for a controlled swallowing neuro-prosthesis. In Proceedings of the 15th Annual International FES Society Conference and 10th Vienna Int. Workshop on FES, Vienna, Austria, 8–12 September 2010; pp. 49–51.
  8. Yamamoto, Y.; Nakamura, T.; Seki, Y.; Utsuyama, K.; Akashi, K.; Jikuya, K. Neck electrical impedance for measurement of swallowing. Electr. Eng. Jpn. 2000, 130, 35–44.
  9. Smaoui, S.; Peladeau-Pigeon, M.; Steele, C.M. Determining the Relationship Between Hyoid Bone Kinematics and Airway Protection in Swallowing. J. Speech Lang. Hear. Res. 2022, 65, 419–430.
  10. Nahrstaedt, H. Automatic Detection and Assessment of Swallowing Based on Bioimpedance and Electromyography Measurements—Enabling Control of Functional Electrical Stimulation Synchronously to Volitional Swallowing in Dysphagic Patients. Ph.D. Thesis, Technische Universität Berlin, Berlin, Germany, 2017. [Google Scholar] [CrossRef]
  11. Frank, D.L.; Khorshid, L.; Kiffer, J.F.; Moravec, C.S.; McKee, M.G. Biofeedback in medicine: Who, when, why and how? Ment. Health Fam. Med. 2010, 7, 85–91. [Google Scholar]
  12. Crary, M.A. A direct intervention program for chronic neurogenic dysphagia secondary to brainstem stroke. Dysphagia 1995, 10, 6–18. [Google Scholar] [CrossRef]
  13. Loppnow, A.; Netzebandt, J.; Frank, U.; Huckabee, M.L. Skill-Training in der Dysphagietherapie: Möglichkeiten eines patientenorientierten Vorgehens mittels sEMG-Biofeedback. Spektrum Patholinguistik 2016, 9, 243–258. [Google Scholar]
  14. Crary, M.A.; Carnaby, G.D.; Groher, M.E.; Helseth, E. Functional benefits of dysphagia therapy using adjunctive sEMG biofeedback. Dysphagia 2004, 19, 160–164. [Google Scholar] [CrossRef] [PubMed]
  15. Huckabee, M.L.; Steele, C.M. An Analysis of Lingual Contribution to Submental Surface Electromyographic Measures and Pharyngeal Pressure During Effortful Swallow. Arch. Phys. Med. Rehabil. 2006, 87, 1067–1072. [Google Scholar] [CrossRef] [PubMed]
  16. Steele, C.M.; Bennett, J.W.; Chapman-Jay, S.; Polacco, R.C.; Molfenter, S.M.; Oshalla, M. Electromyography as a Biofeedback Tool for Rehabilitating Swallowing Muscle Function. In Applications of EMG in Clinical and Sports Medicine; InTech: London, UK, 2012; Chapter 19; pp. 311–328. [Google Scholar] [CrossRef]
  17. Archer, S.K.; Smith, C.H.; Newham, D.J. Surface Electromyographic Biofeedback and the Effortful Swallow Exercise for Stroke-Related Dysphagia and in Healthy Ageing. Dysphagia 2021, 36, 281–292. [Google Scholar] [CrossRef] [PubMed]
  18. Azola, A.M.; Sunday, K.L.; Humbert, I.A. Kinematic Visual Biofeedback Improves Accuracy of Learning a Swallowing Maneuver and Accuracy of Clinician Cues During Training. Dysphagia 2017, 32, 115–122. [Google Scholar] [CrossRef]
  19. Lee, Y.; Nicholls, B.; Lee, D.S.; Chen, Y.; Chun, Y.; Ang, C.S.; Yeo, W.H. Soft electronics enabled ergonomic human–computer interaction from swallowing training. Sci. Rep. 2017, 7, 46697. [Google Scholar] [CrossRef]
  20. Stepp, C.E.; Britton, D.; Chang, C.; Merati, A.L.; Matsuoka, Y. Feasibility of game-based electromyographic biofeedback for dysphagia rehabilitation. In Proceedings of the 2011 5th International IEEE/EMBS Conference on Neural Engineering, Cancun, Mexico, 27 April–1 May 2011; pp. 233–236. [Google Scholar] [CrossRef]
  21. Pollock, C.R.; Lopez, D.A.; Wambaugh, G.; Almanzar, L.; Morrissey, A.; Krings, K.; Galek, K.; Harris, F.C. Avaler’s adventure: An open source game for dysphagia therapy. In Proceedings of the 26th International Conference on Software Engineering and Data Engineering, SEDE 2017, San Diego, CA, USA, 2–4 October 2017. [Google Scholar]
  22. Li, C.M.; Wang, T.G.; Lee, H.Y.; Wang, H.P.; Hsieh, S.H.; Chou, M.; Jason Chen, J.J. Swallowing Training Combined with Game-Based Biofeedback in Poststroke Dysphagia. PM R 2016, 8, 773–779. [Google Scholar] [CrossRef]
  23. Li, C.M.; Lee, H.Y.; Hsieh, S.H.; Wang, T.G.; Wang, H.P.; Chen, J.J.J. Development of Innovative Feedback Device for Swallowing Therapy. J. Med. Biol. Eng. 2016, 36, 357–368. [Google Scholar] [CrossRef]
  24. Kwong, E.; Ng, K.W.K.; Leung, M.T.; Zheng, Y.P. Application of Ultrasound Biofeedback to the Learning of the Mendelsohn Maneuver in Non-dysphagic Adults: A Pilot Study. Dysphagia 2021, 36, 650–658. [Google Scholar] [CrossRef]
  25. Hopkins-Rossabi, T.; Rowe, M.; McGrattan, K.; Rossabi, S.; Martin-Harris, B. Respiratory–Swallow Training Methods: Accuracy of Automated Detection of Swallow Onset, Respiratory Phase, Lung Volume at Swallow Onset, and Real-Time Performance Feedback Tested in Healthy Adults. Am. J. Speech-Lang. Pathol. 2020, 29, 1012–1021. [Google Scholar] [CrossRef]
  26. Miller, K.J.W.; Macrae, P.; Sands, G.B.; Huckabee, M.l.; Cheng, L.K. An Accurate Fiducial Marker for Aligning EMG signals with Swallow Onset. In Proceedings of the 2023 45th Annual International Conference of the IEEE Engineering in Medicine & Biology Society (EMBC), IEEE, Sydney, Australia, 24–27 July 2023; pp. 1–4. [Google Scholar] [CrossRef]
  27. Langmore, S.E.; Pisegna, J.M. Efficacy of exercises to rehabilitate dysphagia: A critique of the literature. Int. J. Speech-Lang. Pathol. 2015, 17, 222–229. [Google Scholar] [CrossRef]
  28. Speyer, R.; Cordier, R.; Sutt, A.L.; Remijn, L.; Heijnen, B.J.; Balaguer, M.; Pommée, T.; McInerney, M.; Bergström, L. Behavioural Interventions in People with Oropharyngeal Dysphagia: A Systematic Review and Meta-Analysis of Randomised Clinical Trials. J. Clin. Med. 2022, 11, 685. [Google Scholar] [CrossRef] [PubMed]
  29. Benfield, J.K.; Everton, L.F.; Bath, P.M.; England, T.J. Does Therapy With Biofeedback Improve Swallowing in Adults With Dysphagia? A Systematic Review and Meta-Analysis. Arch. Phys. Med. Rehabil. 2019, 100, 551–561. [Google Scholar] [CrossRef]
  30. Battel, I.; Calvo, I.; Walshe, M. Interventions Involving Biofeedback to Improve Swallowing in People With Parkinson Disease and Dysphagia: A Systematic Review. Arch. Phys. Med. Rehabil. 2021, 102, 314–322. [Google Scholar] [CrossRef]
  31. Takeda, K.; Tanino, G.; Miyasaka, H. Review of devices used in neuromuscular electrical stimulation for stroke rehabilitation. Med. Devices Evid. Res. 2017, 10, 207–213. [Google Scholar] [CrossRef] [PubMed]
  32. Schauer, T. Sensing motion and muscle activity for feedback control of functional electrical stimulation: Ten years of experience in Berlin. Annu. Rev. Control 2017, 44, 355–374. [Google Scholar] [CrossRef]
  33. Leelamanit, V.; Limsakul, C.; Geater, A. Synchronized electrical stimulation in treating pharyngeal dysphagia. Laryngoscope 2002, 112, 2204–2210. [Google Scholar] [CrossRef] [PubMed]
  34. Burnett, T.A.; Mann, E.A.; Cornell, S.A.; Ludlow, C.L. Laryngeal elevation achieved by neuromuscular stimulation at rest. J. Appl. Physiol. 2003, 94, 128–134. [Google Scholar] [CrossRef]
  35. Burnett, T.A.; Mann, E.A.; Stoklosa, J.B.; Ludlow, C.L. Self-triggered functional electrical stimulation during swallowing. J. Neurophysiol. 2005, 94, 4011–4018. [Google Scholar] [CrossRef]
  36. Humbert, I.A.; Poletto, C.J.; Saxon, K.G.; Kearney, P.R.; Crujido, L.; Wright-Harp, W.; Payne, J.; Jeffries, N.; Sonies, B.C.; Ludlow, C.L. The effect of surface electrical stimulation on hyolaryngeal movement in normal individuals at rest and during swallowing. J. Appl. Physiol. 2006, 101, 1657–1663. [Google Scholar] [CrossRef]
  37. Nahrstaedt, H.; Schultheiss, C.; Schauer, T.; Seidl, R.O. Bioimpedance- and EMG-Triggered FES for Improved Protection of the Airway During Swallowing. Biomed. Eng./Biomed. Tech. 2013, 58, 000010151520134025. [Google Scholar] [CrossRef]
  38. Schultheiss, C.; Schauer, T.; Nahrstaedt, H.; Seidl, R.O. Efficacy of EMG/Bioimpedance-Triggered Functional Electrical Stimulation on Swallowing Performance. Eur. J. Transl. Myol. 2016, 26, 6065. [Google Scholar] [CrossRef] [PubMed]
  39. Hadley, A.J.; Kolb, I.; Tyler, D.J. Laryngeal elevation by selective stimulation of the hypoglossal nerve. J. Neural Eng. 2013, 10, 046013. [Google Scholar] [CrossRef]
  40. Tyler, D.J. Neuroprostheses for management of dysphagia resulting from cerebrovascular disorders. Acta Neurochir. Suppl. 2007, 97, 293–304. [Google Scholar] [CrossRef] [PubMed]
  41. Schultheiss, C. Die Bewertung der Pharyngalen Schluckphase Mittels Bioimpedanz: Evaluation eines Mess- und Diagnostikverfahrens. Ph.D. Thesis, Universität Potsdam, Potsdam, Germany, 2014. [Google Scholar]
  42. Tharwat, A. Classification assessment methods. Appl. Comput. Inform. 2018, 17, 168–192. [Google Scholar] [CrossRef]
  43. Schultheiss, C.; Schauer, T.; Nahrstaedt, H.; Seidl, R.O. Evaluation of an EMG bioimpedance measurement system for recording and analysing the pharyngeal phase of swallowing. Eur. Arch. Oto-Rhino-Laryngol. 2013, 270, 2149–2156. [Google Scholar] [CrossRef]
  44. Radivojac, P.; Obradovic, Z.; Keith Dunker, A.; Vucetic, S. Feature Selection Filters Based on the Permutation Test. In Proceedings of the Machine Learning: ECML 2004, Pisa, Italy, 20–24 September 2004; Springer: Berlin/Heidelberg, Germany, 2004; pp. 334–346. [Google Scholar] [CrossRef]
  45. Pastore, M.; Calcagnì, A. Measuring distribution similarities between samples: A distribution-free overlapping index. Front. Psychol. 2019, 10, 1089. [Google Scholar] [CrossRef] [PubMed]
  46. Bergstra, J.; Bengio, Y. Random Search for Hyper-Parameter Optimization. J. Mach. Learn. Res. 2012, 13, 281–305. [Google Scholar]
  47. Pedregosa, F.; Varoquaux, G.; Gramfort, A.; Michel, V.; Thirion, B.; Grisel, O.; Blondel, M.; Prettenhofer, P.; Weiss, R.; Dubourg, V.; et al. Scikit-learn: Machine Learning in Python. J. Mach. Learn. Res. 2011, 12, 2825–2830. [Google Scholar]
  48. Breiman, L.; Friedman, J.; Olshen, R.; Stone, C.J. Classification and Regression Trees; Chapman and Hall/CRC: New York, NY, USA, 1984; pp. 1–358. [Google Scholar]
  49. Chen, Y.; Yang, Y. The One Standard Error Rule for Model Selection: Does It Work? Stats 2021, 4, 868–892. [Google Scholar] [CrossRef]
  50. Yates, L.A.; Aandahl, Z.; Richards, S.A.; Brook, B.W. Cross validation for model selection: A review with examples from ecology. Ecol. Monogr. 2023, 93, e1557. [Google Scholar] [CrossRef]
Figure 1. Electrode placement of the four-electrode setup with an additional reference electrode. The current electrodes (red), placed on the sternocleidomastoid muscle close to the ear, inject a sinusoidal current at 50 kHz. The measurement electrodes (green), placed on each side of the larynx, measure the voltage over the enclosed tissue. The reference electrode (gray) is used to suppress common-mode disturbances.
Figure 2. BI and EMG measurement of a saliva swallow. The swallow preparation phase (green background) shows some variation in the BI data caused by tongue movements from collecting saliva. The oral swallowing phase (red background) displays a small peak in the BI data, which continuously transitions to the BI swallow valley caused by the larynx elevation during the pharyngeal swallowing phase (blue background). The vertical line defines the time of the swallow onset marked by an expert, shortly after the start of the pharyngeal swallowing phase.
Figure 3. A visualization of the cleaned EMG (EMG), the envelope EMG (eEMG), and the BI of a single swallow for EMG-based swallow onset detection. The black section of the eEMG trace highlights the $w$ samples of eEMG that exceed $\theta_{\mathrm{EMG}}$. The red vertical line denotes the manually marked swallow onset, while the black vertical line denotes the time of the detected swallow onset. The light red area marks the period with disabled onset detection that starts at a detected onset.
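The detection rule described in the Figure 3 caption can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' implementation: the function name `detect_onsets`, the requirement that the `w` samples above the threshold be consecutive, the choice to place the onset at the last of those samples, and the `refractory` length in samples are all hypothetical.

```python
from typing import List, Sequence


def detect_onsets(envelope: Sequence[float], threshold: float, w: int,
                  refractory: int) -> List[int]:
    """Return the sample indices at which an onset is declared.

    Assumption: an onset is declared at the sample where the envelope has
    exceeded `threshold` for `w` consecutive samples. Detection is then
    disabled for the next `refractory` samples.
    """
    onsets: List[int] = []
    run = 0             # length of the current run above the threshold
    blocked_until = -1  # last index (inclusive) with detection disabled
    for i, value in enumerate(envelope):
        if i <= blocked_until:
            run = 0
            continue
        run = run + 1 if value > threshold else 0
        if run >= w:
            onsets.append(i)
            blocked_until = i + refractory
            run = 0
    return onsets


if __name__ == "__main__":
    # Toy envelope, not measured data.
    eemg = [0.0, 0.1, 0.6, 0.7, 0.8, 0.9, 0.9, 0.2, 0.7, 0.8, 0.9, 0.1]
    print(detect_onsets(eemg, threshold=0.5, w=3, refractory=4))  # prints [4]
```

In the toy signal, the second burst starts while detection is still disabled, so it does not trigger a second onset; shortening the disabled period changes that.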
Figure 4. Left: Visualization of the BI data vectors of swallow onsets (12) and non-swallow events (34) in a healthy subject, shifted to zero at the time zero of potential swallow onsets. Right: Visualization of the tEMG data vectors of swallow onsets and non-swallows shifted to zero at the time zero of potential swallow onsets.
Figure 5. A visualization of EMG data vectors of swallow onsets and non-swallow events in a healthy subject preceding the potential swallow onsets at time zero. The left subplot shows the EMG vectors of 12 swallow onsets (blue), and the right subplot shows the EMG vectors of 34 non-swallow events (green).
Figure 6. Illustration of the overlap $\eta(A, B)$ between two log-normal probability density functions: $f_B(x)$, with distribution $\mathrm{LogNormal}(0, 0.6)$, and $f_A(x)$, with distribution $\mathrm{LogNormal}(0, 0.9)$.
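For readers who want to reproduce the quantity shown in Figure 6, the following sketch approximates an overlap index numerically. It assumes the common definition $\eta(A, B) = \int \min(f_A(x), f_B(x))\,dx$ and reads the second argument of LogNormal as the standard deviation of $\ln x$; both are assumptions, as are the function names and the integration range.

```python
import math
from typing import Callable


def lognormal_pdf(x: float, mu: float, sigma: float) -> float:
    """Log-normal density; sigma is taken as the standard deviation of ln(x)."""
    if x <= 0.0:
        return 0.0
    z = (math.log(x) - mu) / sigma
    return math.exp(-0.5 * z * z) / (x * sigma * math.sqrt(2.0 * math.pi))


def overlap_index(pdf_a: Callable[[float], float],
                  pdf_b: Callable[[float], float],
                  lower: float, upper: float, steps: int = 20000) -> float:
    """Approximate the integral of min(pdf_a, pdf_b) on [lower, upper]
    with the midpoint rule."""
    h = (upper - lower) / steps
    total = 0.0
    for k in range(steps):
        x = lower + (k + 0.5) * h
        total += min(pdf_a(x), pdf_b(x))
    return total * h


if __name__ == "__main__":
    f_a = lambda x: lognormal_pdf(x, 0.0, 0.9)
    f_b = lambda x: lognormal_pdf(x, 0.0, 0.6)
    # The upper limit of 50 is an assumed cut-off for the right tail.
    print(round(overlap_index(f_a, f_b, 0.0, 50.0), 3))
```

An index near 1 means the two distributions are almost indistinguishable, while an index near 0 means they barely overlap.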
Figure 7. Visualization of sensitivity (green dots), precision (red triangles), and $F_1$ score (blue dots) of BI/EMG-based detection of swallow onsets for individual dysphagia patients. A black line connects the sensitivity and the precision of each patient to visualize the span width between the scores.
Table 1. The main properties of data series I to IV. Data series I to III contain data from healthy subjects, and data series IV consists of measurements from dysphagia patients. The table contains the number and sex of subjects, the mean age of the subjects with the standard deviation in brackets, the number of swallows, the cumulative duration of the measurements, and a brief commentary on the research intention for data series I to IV.
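| Data Series | Subjects | Age | Swallows | Duration | Commentary |
| --- | --- | --- | --- | --- | --- |
| I | 20 (8♀, 12♂) | 30.5 (7.7) | 965 | 3.68 h | movements |
| II | 15 (4♀, 11♂) | 29.0 (4.5) | 2044 | 7.10 h | repeatability |
| III | 9 (7♀, 2♂) | 38.6 (9.4) | 130 | 0.24 h | investigators |
| IV | 41 (15♀, 26♂) | 63.4 (13.8) | 704 | 2.49 h | patients |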
Table 2. The hyperparameter space for the random forest classifier, consisting of the optimization method, the hyperparameter sampling distribution, the hyperparameter name, and the range of the hyperparameter.
| Optimization | Hyperparameter | Distribution | Range |
| --- | --- | --- | --- |
| grid search | $\mathrm{weight}_1$ | n.a. | [0.5, 3] |
| random search | min_impurity_decrease | uniform distribution | [0.0, 0.001] |
| random search | ccp_alpha | uniform distribution | [0.0, 0.00125] |
| random search | min_weight_fraction_leaf | uniform distribution | [0.0, 0.0025] |
| random search | max_samples | uniform distribution | [0.65, 0.85] |
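The search space in Table 2 could be enumerated roughly as follows. This is a sketch, not the authors' procedure: the grid step of 0.5 for $\mathrm{weight}_1$, the nesting of the random search inside the grid, the number of random draws, and the name `sample_candidates` are assumptions. The four randomly searched names coincide with parameters of scikit-learn's `RandomForestClassifier`, but the sketch uses only the standard library and does not train a classifier.

```python
import random
from typing import Dict, Iterator

# Ranges copied from Table 2 (uniform sampling).
RANDOM_SEARCH_RANGES = {
    "min_impurity_decrease": (0.0, 0.001),
    "ccp_alpha": (0.0, 0.00125),
    "min_weight_fraction_leaf": (0.0, 0.0025),
    "max_samples": (0.65, 0.85),
}

# Table 2 gives only the interval [0.5, 3]; the step of 0.5 is assumed.
WEIGHT1_GRID = [0.5, 1.0, 1.5, 2.0, 2.5, 3.0]


def sample_candidates(n_random: int, seed: int = 0) -> Iterator[Dict[str, float]]:
    """Yield n_random uniformly sampled candidates per weight1 grid value."""
    rng = random.Random(seed)
    for weight1 in WEIGHT1_GRID:
        for _ in range(n_random):
            candidate = {
                name: rng.uniform(low, high)
                for name, (low, high) in RANDOM_SEARCH_RANGES.items()
            }
            candidate["weight1"] = weight1
            yield candidate


if __name__ == "__main__":
    for candidate in sample_candidates(n_random=2):
        print(candidate)
```

Each candidate would then be scored in the inner loop of a cross-validation, and the best-scoring one kept.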
Table 3. Scores to evaluate the BI-based preselection of swallow onsets with the parameters $\theta_{\mathrm{PS}} = 0.18\ \Omega$, $\theta_{\mathrm{LA}} = 0.5$ s, and $\theta_{\mathrm{skip}} = 1.0$ s. The sensitivity $S_{\mathrm{PS}}$, the ratio of swallow to non-swallow events $\Upsilon$, the mean detection delay $\mu_d$, and the standard deviation $\sigma_d$ of the detection delay are presented for data series I, II, III, and IV. The last row contains the average scores calculated from the rows for data series I, II, III, and IV.
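| Data Series | $S_{\mathrm{PS}}$ [-] | $\Upsilon$ [-] | $\mu_d$ ($\sigma_d$) [s] |
| --- | --- | --- | --- |
| I | 0.993 | 0.139 | 0.031 (0.068) |
| II | 0.998 | 0.256 | 0.044 (0.076) |
| III | 0.977 | 0.199 | 0.048 (0.056) |
| IV | 0.98 | 0.114 | 0.033 (0.1) |
| Mean I, II, III, IV | 0.992 | 0.191 | 0.039 (0.079) |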
Table 4. Test results for EMG-based detection of swallow onsets at the highest median $F_1$ score. The table presents the median and interquartile range (in square brackets) of the sensitivity, precision, and $F_1$ score for data series I, II, III, and IV. A LOSO cross-validation determined the optimal parameters with respect to the maximal $F_1$ score while limiting the mean detection delay to that of the BI/EMG-based DoSO. The last column displays $\mu_d$ and $\sigma_d$, describing the timing of the EMG-based DoSO.
| Data Series | Sensitivity [-] | Precision [-] | $F_1$ Score [-] | $\mu_d$ ($\sigma_d$) [s] |
| --- | --- | --- | --- | --- |
| I | 0.84 [0.131] | 0.425 [0.147] | 0.553 [0.135] | 0.019 (0.172) |
| II | 0.759 [0.235] | 0.603 [0.463] | 0.619 [0.199] | 0.039 (0.195) |
| III | 0.625 [0.292] | 0.455 [0.509] | 0.529 [0.127] | 0.022 (0.182) |
| IV | 0.5 [0.678] | 0.197 [0.484] | 0.289 [0.496] | 0.018 (0.203) |
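As a reading aid for the entries in Table 4, the sketch below shows how per-subject sensitivity, precision, and $F_1$ score can be computed from detection counts and then summarized as a median with an interquartile range. The function names and the quartile convention (the `inclusive` method of Python's `statistics.quantiles`) are assumptions; the quartile convention used for the table is not stated here, and the numbers in the example are made up. Because each score is summarized by its own median over subjects, the median $F_1$ need not equal the $F_1$ computed from the median sensitivity and median precision.

```python
import statistics
from typing import Sequence, Tuple


def precision_sensitivity_f1(tp: int, fp: int, fn: int) -> Tuple[float, float, float]:
    """Compute precision, sensitivity, and F1 from detection counts.

    tp: detections matched to a reference onset,
    fp: detections without a reference onset,
    fn: reference onsets without a detection.
    """
    precision = tp / (tp + fp) if tp + fp else 0.0
    sensitivity = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + sensitivity
    f1 = 2.0 * precision * sensitivity / denom if denom else 0.0
    return precision, sensitivity, f1


def median_and_iqr(values: Sequence[float]) -> Tuple[float, float]:
    """Return the median and the interquartile range (Q3 - Q1)."""
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return statistics.median(values), q3 - q1


if __name__ == "__main__":
    # Hypothetical (tp, fp, fn) counts for four subjects.
    counts = [(8, 2, 2), (6, 2, 6), (5, 5, 1), (3, 9, 3)]
    f1_scores = [precision_sensitivity_f1(*c)[2] for c in counts]
    print(median_and_iqr(f1_scores))
```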
Table 5. Test results for BI/EMG-based detection of swallow onsets at the maximum median $F_1$ score. The table presents the median and interquartile range (in square brackets) of the sensitivity, precision, $F_1$ score, and specificity for data series I, II, III, and IV at the maximal $F_1$ score reached by the hyperparameter optimization using a nested LOSO cross-validation. The column $\mu_d$ ($\sigma_d$) describes the timing, and the last column lists $\mathrm{weight}_1$.
| Data Series | Sensitivity [-] | Precision [-] | $F_1$ Score [-] | Specificity [-] | $\mu_d$ ($\sigma_d$) [s] | $\mathrm{weight}_1$ |
| --- | --- | --- | --- | --- | --- | --- |
| I | 0.827 [0.129] | 0.656 [0.253] | 0.705 [0.191] | 0.945 [0.064] | 0.031 (0.068) | 2.0 |
| II | 0.875 [0.139] | 0.75 [0.183] | 0.826 [0.094] | 0.956 [0.139] | 0.044 (0.076) | 0.5 |
| III | 0.698 [0.277] | 0.846 [0.159] | 0.732 [0.184] | 0.943 [0.034] | 0.048 (0.056) | 1.5 |
| IV | 0.56 [0.482] | 0.5 [0.321] | 0.546 [0.405] | 0.943 [0.084] | 0.033 (0.1) | 2.0 |
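The nested LOSO cross-validation named in the Table 5 caption can be illustrated with a small split generator. LOSO is read here as leave-one-subject-out; the function names and the subject labels are hypothetical, and the sketch only produces the splits, leaving model fitting and scoring out.

```python
from typing import Iterator, List, Sequence, Tuple


def loso_splits(subjects: Sequence[str]) -> Iterator[Tuple[List[str], str]]:
    """Yield (training subjects, held-out subject) for each subject."""
    for held_out in subjects:
        train = [s for s in subjects if s != held_out]
        yield train, held_out


def nested_loso(subjects: Sequence[str]) -> Iterator[Tuple[str, List[Tuple[List[str], str]]]]:
    """Yield (test subject, inner LOSO splits over the remaining subjects).

    The inner splits would be used to choose hyperparameters; the test
    subject never appears in them, so the test score stays unbiased by
    the hyperparameter choice.
    """
    for outer_train, test_subject in loso_splits(subjects):
        yield test_subject, list(loso_splits(outer_train))


if __name__ == "__main__":
    for test_subject, inner in nested_loso(["s1", "s2", "s3"]):
        print(test_subject, inner)
```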
Table 6. Test results for BI/EMG-based detection of swallow onsets at the maximum median $F_1$ score with mixed data for classifier training. The first row presents the classification performance for training data extracted from data series I to III and test data extracted from data series IV. The second row contains the classification performance for training data extracted from data series I to IV, with the data of one subject of data series IV excluded from the training and used for testing.
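| Training | Test | Sensitivity [-] | Precision [-] | $F_1$ Score [-] | Specificity [-] | $\mathrm{weight}_1$ |
| --- | --- | --- | --- | --- | --- | --- |
| I, II, III | IV | 0.544 [0.539] | 0.556 [0.458] | 0.556 [0.405] | 0.949 [0.067] | 1.5 |
| I, II, III, IV | IV | 0.603 [0.457] | 0.455 [0.333] | 0.518 [0.378] | 0.925 [0.083] | 1.0 |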

Riebold, B.; Seidl, R.O.; Schauer, T. Electromyography- and Bioimpedance-Based Detection of Swallow Onset for the Control of Dysphagia Treatment. Sensors 2024, 24, 6525. https://doi.org/10.3390/s24206525
