*2.7. Performance Analysis*

Models were evaluated exclusively on hold-out data using leave-one-patient-out cross-validation (Figure 2, pink and red boxes). Models were fitted on the concatenated trials of all but one patient (training data) and evaluated on all trials of the held-out (test) patient, with each patient serving as the held-out patient once. This evaluation scheme provided an unbiased assessment of prediction performance. Model predictions $\hat{y}(t) = \beta^{T}\tilde{x}^{*}(t)$ were obtained by multiplying the coefficient vector $\beta$, estimated using the training data, with the embedded EMG data $\tilde{x}^{*}(t)$ from each trial of the test patient.
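The evaluation scheme described above can be illustrated with a minimal sketch. It is not the implementation used in this study: the function names, the zero-padding at trial edges, the lag sign convention, and the absence of an intercept term are all assumptions made for illustration.

```python
import numpy as np

def embed(emg, lags):
    """Temporal embedding: stack time-shifted copies of every channel.

    emg  : array (n_samples, n_channels)
    lags : shifts in samples; a positive lag uses past EMG samples
           (assumed convention), edges are zero-padded (assumption).
    Returns an array (n_samples, n_channels * len(lags)).
    """
    n, c = emg.shape
    feats = np.zeros((n, c * len(lags)))
    for j, lag in enumerate(lags):
        shifted = np.zeros_like(emg, dtype=float)
        if lag > 0:
            shifted[lag:] = emg[:-lag]
        elif lag < 0:
            shifted[:lag] = emg[-lag:]
        else:
            shifted[:] = emg
        feats[:, j * c:(j + 1) * c] = shifted
    return feats

def leave_one_patient_out(trials, lags):
    """trials: list of (patient_id, emg, imu) with imu of shape (n_samples,).

    For each patient, fit least-squares coefficients on all other patients
    and return the per-trial Pearson r on the held-out patient.
    """
    patients = sorted({p for p, _, _ in trials})
    scores = {}
    for held_out in patients:
        X_train = np.vstack([embed(e, lags) for p, e, _ in trials if p != held_out])
        y_train = np.concatenate([y for p, _, y in trials if p != held_out])
        # Ordinary least squares without intercept (assumption)
        beta, *_ = np.linalg.lstsq(X_train, y_train, rcond=None)
        scores[held_out] = [
            float(np.corrcoef(embed(e, lags) @ beta, y)[0, 1])
            for p, e, y in trials if p == held_out
        ]
    return scores
```

Because the coefficients are never fitted on any trial of the held-out patient, the returned correlations reflect generalization to an unseen patient rather than within-patient fit.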

Predicted and ground-truth data were compared on a per-trial and per-leg basis using the Pearson correlation coefficient (r). Gait events (SWP, HC, and TO) were extracted from the predicted IMU time series as described in Section 2.5. Separately for each leg, ground-truth and predicted HC and TO events were used to divide each trial into alternating segments representing the swing and stance phases of the gait cycle. The resulting binary time series were compared using the F1-score (see also [34]). In addition, the absolute displacement between matching true and predicted events was measured. Matching events were defined as those being <600 ms apart from each other. Predicted events lacking a matching ground-truth counterpart were counted as false detections. The false discovery rate (FDR) was defined per event type as the number of false detections divided by the total number of event detections. Conversely, true events lacking a matching prediction were counted as false negatives (misses). The false-negative rate (FNR) for each event type was defined as the number of missed events divided by the total number of true events.
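The event-level metrics defined above can be sketched as follows. The greedy nearest-neighbor, one-to-one matching and the choice of which phase counts as the positive class for the F1-score are assumptions; the text only specifies the 600 ms matching tolerance and the FDR and FNR definitions.

```python
import numpy as np

def match_events(true_t, pred_t, tol_ms=600.0):
    """Match true and predicted event times (ms) one-to-one.

    Returns (absolute displacements of matched pairs, FDR, FNR).
    Greedy nearest-neighbor matching is an illustrative assumption.
    """
    true_t = np.sort(np.asarray(true_t, dtype=float))
    pred_t = np.sort(np.asarray(pred_t, dtype=float))
    unused = list(range(len(pred_t)))
    displacements = []
    for t in true_t:
        if not unused:
            break
        k = min(unused, key=lambda i: abs(pred_t[i] - t))
        if abs(pred_t[k] - t) < tol_ms:
            displacements.append(float(abs(pred_t[k] - t)))
            unused.remove(k)
    n_matched = len(displacements)
    # FDR: unmatched predictions / all predictions
    fdr = (len(pred_t) - n_matched) / len(pred_t) if len(pred_t) else 0.0
    # FNR: unmatched true events / all true events
    fnr = (len(true_t) - n_matched) / len(true_t) if len(true_t) else 0.0
    return displacements, fdr, fnr

def f1_binary(true_phase, pred_phase):
    """F1-score of two binary phase series (True = positive class, assumed)."""
    true_phase = np.asarray(true_phase, dtype=bool)
    pred_phase = np.asarray(pred_phase, dtype=bool)
    tp = int(np.sum(true_phase & pred_phase))
    fp = int(np.sum(~true_phase & pred_phase))
    fn = int(np.sum(true_phase & ~pred_phase))
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0
```

For example, with true events at 1000, 2100, 3200, and 4300 ms and predictions at 1040, 2080, and 5500 ms, two pairs match (displacements of 40 and 20 ms), one prediction is a false detection, and two true events are missed.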

#### **3. Results**

Ninety-three minutes of gait activity and 5253 full gait cycles were analyzed across the six patients. The median gait cycle duration ranged from 1045 to 1140 ms, corresponding to cadences between approximately 53 and 57 cycles per minute (see Table 2). The gait cycle duration variability was measured as the median absolute deviation from the median duration and ranged from 10 to 30 ms.
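The relationship between cycle duration, variability, and cadence can be made explicit with a short sketch. It assumes that a gait cycle is delimited by consecutive heel contacts of the same leg; the function name and this delimiting choice are illustrative and not taken from the study's code.

```python
import numpy as np

def cycle_stats(hc_times_ms):
    """Median cycle duration (ms), median absolute deviation (ms),
    and cadence (cycles per minute) from same-leg heel-contact times."""
    durations = np.diff(np.asarray(hc_times_ms, dtype=float))
    median = float(np.median(durations))
    mad = float(np.median(np.abs(durations - median)))
    cadence = 60000.0 / median  # 60,000 ms per minute
    return median, mad, cadence
```

Under this conversion, a median cycle duration of 1045 ms corresponds to about 57.4 cycles per minute and 1140 ms to about 52.6 cycles per minute.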

Figure 3 illustrates the average activation patterns of individual muscles (measured by means of EMG) relative to the angular velocity profiles (measured by the IMUs). The upper panels show the average IMU and EMG activity across the gait cycles of all patients as a function of time within a cycle. All ten muscles exhibited stable activation patterns relative to the individual gait events of both legs. Importantly, due to the stable timing of the gait cycle in patients with mild PD, the left leg muscles showed precise activation in well-defined time windows relative to the HC and TO events of both the left and the right leg, and vice versa. The Vl displayed particularly consistent timings (as indicated by dark red colors) for both the left and right legs. The lower panels depict cross-correlations (computed on the concatenated data of all trials) of temporally shifted EMG activity traces relative to the IMU signal. The same 21 lags as reported above for the machine learning models were analyzed, ranging from −500 ms to +500 ms relative to the IMU signal. Thus, the depicted correlograms represent the independent linear predictive quality of each of the 10 × 21 = 210 EMG features considered in our models, thereby indicating the influence of each muscle and delay combination on prediction (see also [45]). The activity profiles of all ten individual muscles showed substantial positive and negative correlations with the IMU signal within a window of 1 s. The highest absolute correlations were observed for the Vl. Specifically, left Vl activity lagged behind the left IMU trace by 150 ms (r = 0.78) and anticipated the right IMU trace by 350 ms (r = 0.76); likewise, right Vl activity lagged behind the right IMU trace by 150 ms (r = 0.66) and anticipated the left IMU trace by 350 ms (r = 0.67). All reported cross-correlations were statistically significant (*p* < 0.05 after Bonferroni correction).
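A correlogram of this kind can be computed with a few lines of code. The sketch below assumes evenly spaced lags (21 lags between −500 and +500 ms imply a 50 ms step), a known sampling rate `fs_hz`, and a sign convention in which a positive lag means that the EMG precedes the IMU signal; none of these details are confirmed by the text.

```python
import numpy as np

def lagged_correlations(emg, imu, fs_hz, lags_ms=None):
    """Pearson r between one EMG channel and the IMU trace at several lags.

    Positive lag: EMG sample at time t is paired with the IMU sample at
    t + lag (EMG leads; assumed convention).
    Returns a dict {lag_ms: r}.
    """
    emg = np.asarray(emg, dtype=float)
    imu = np.asarray(imu, dtype=float)
    if lags_ms is None:
        lags_ms = np.arange(-500, 501, 50)  # 21 evenly spaced lags (assumption)
    result = {}
    for lag_ms in lags_ms:
        s = int(round(lag_ms * fs_hz / 1000.0))  # lag in samples
        if s > 0:
            a, b = emg[:-s], imu[s:]
        elif s < 0:
            a, b = emg[-s:], imu[:s]
        else:
            a, b = emg, imu
        result[int(lag_ms)] = float(np.corrcoef(a, b)[0, 1])
    return result
```

Applying this function to each of the ten channels yields the 10 × 21 grid of correlations; the lag with the largest absolute value indicates the delay at which a single muscle is most informative about the IMU signal.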

**Figure 3.** Relative timings of muscular and kinematic signals. Upper panels show average angular velocity measured by inertial measurement units (IMUs) and electromyographic (EMG) activity across all gait cycles of all patients as a function of time within a cycle. Percentages are relative to the 95th percentile of the raw data. Averages were cropped below 40%. All ten muscles exhibited stable activation patterns relative to the individual gait events of both legs. Lower panels depict cross-correlations (computed on the concatenated data of all trials) of temporally shifted EMG activity relative to the IMU signal. All ten muscles showed substantial absolute correlations with the IMU signal within a window of 1 s. The highest correlations (Pearson correlation, r > 0.66) were observed for Vl activity with delays of 150 ms relative to the same leg or −350 ms relative to the opposing leg. Abbreviations: left and right gastrocnemius medialis (LGm and RGm) and lateralis (LGl and RGl); left and right soleus (LS and RS); left and right tibialis anterior (LTa and RTa); left and right vastus lateralis (LVl and RVl); TO, toe-off.


**Table 2.** Gait cycle statistics of individual patients.

Figure 4 shows an example segment of the preprocessed EMG and IMU data of one patient, the EMG-based predictions of the IMU time courses based on all ten available EMG probes, and the gait parameters extracted from true and predicted IMU time series. The EMG time courses of three selected individual muscles (bilateral Ta, S, and Vl) showed the clear periodic pattern of the gait cycle (bottom row). Out-of-sample predictions based on temporal embeddings of the activity of ten muscles showed a high correlation with the true IMU data (top row). Furthermore, gait events extracted from the predicted time series closely matched those extracted from the original IMU traces (top row). True and predicted gait phases based on the extracted events were consequently also closely aligned (center row). Results of similar quality were obtained when predictions were based on the left and right Vl only (see quantitative evaluation below).

**Figure 4.** Example segment of preprocessed electromyography (EMG) and inertial measurement unit (IMU; angular velocity at the left and right anklebones) recordings of one patient (P5), as well as the EMG-based predictions of the IMU time courses and the gait parameters extracted from true and predicted IMU time series. Top row: true IMU data and predictions derived from temporally embedded EMG activity of ten muscles. Predictions were derived from an ordinary least-squares regression model that had been fitted to data of the other five patients. Gait-related events (swing peak velocity (SWP), heel contact (HC), and toe-off (TO)) extracted from the predicted time series closely matched those extracted from the original IMU traces. Center row: True and predicted gait phases based on the extracted events were closely aligned. Bottom row: EMG time courses of three selected individual muscles (bilateral soleus, tibialis anterior, and vastus lateralis).

Figure 5 quantitatively summarizes the performance of EMG-based reconstructions of IMU time courses and gait events. The median (IQR across all 26 trials) Pearson correlation between measured and reconstructed IMU time courses, based on all ten muscles, was r = 0.80 (0.74 to 0.87) for the left ankle and r = 0.85 (0.78 to 0.90) for the right ankle. Using the left and right Vl, the performance was on par, with r = 0.86 (0.78 to 0.88) for the left IMU probe and r = 0.83 (0.80 to 0.88) for the right IMU probe. Using the left and right Ta and S muscles did not lead to competitive performance, with r = 0.47 (0.35 to 0.66) for the left IMU and r = 0.55 (0.46 to 0.66) for the right IMU. Importantly, the combination of left and right Vl was found to be on par with the full model for all the performance metrics, whereas the combination of S and Ta was competitive in none. For this reason, we restricted our reporting to the model comprising left and right Vl. With few exceptions, gait events could be reconstructed with median absolute temporal displacements of <50 ms using IMU predictions derived from this model. The median (IQR) displacement for SWP was 40 (20 to 60) ms for the left leg and 38 (25 to 60) ms for the right leg. For HC events, median temporal displacements were 35 (25 to 55) ms for the left leg and 45 (30 to 60) ms for the right leg. For TO events, median displacements were 43 (30 to 100) ms for the left leg and 43 (20 to 95) ms for the right leg. Segmentations of the recordings into dichotomous gait phases based on detected HC and TO events were similar for measured and reconstructed IMU data. Median (IQR) F1-scores were 0.89 (0.87 to 0.93) for the left leg and 0.89 (0.86 to 0.93) for the right leg.

**Figure 5.** Performance of electromyography (EMG)-based reconstructions of inertial measurement unit (IMU) time courses and gait events. Higher values represent better performance for the Pearson correlation and the F1-score; lower values represent better performance for the event displacements. Top row: Pearson correlation (r) between measured and reconstructed angular velocity profiles of the left and right ankles. Second row: Accuracy of the reconstructed dichotomous (swing vs. stance) gait phases compared with the IMU-based ground truth, as measured by the F1-score. Bottom three rows: Absolute displacement of three types of events (swing peak velocity, heel contact, and toe-off) determined using reconstructed rather than measured IMU data. Results are shown separately for the left and right leg and for the best-performing prediction models utilizing between one and five pairs of EMG channels. In addition, results for the combination of the left and right soleus and tibialis anterior are shown. Bar plots depict median performance across 26 walking trials of six patients in total, while overlaid whiskers depict first and third quartiles. Abbreviations: gastrocnemius medialis (Gm); gastrocnemius lateralis (Gl); soleus (S); tibialis anterior (Ta); vastus lateralis (Vl).

Event detection errors were rare and did not occur in most trials. Across all trials, events were missed in 1.4% (n = 71; left leg) and 1.3% (n = 68; right leg) of cases. Numbers were nearly identical for all three event types, as HC and TO events were always determined relative to the two enclosing SWP events (see Section 2.5). False event discoveries were rare (<0.1% of the total events detected for both legs and all three event types). In absolute terms, between 2 and 4 out of over 5000 detected events were false discoveries.

#### **4. Discussion**

We have demonstrated the feasibility of accurately determining gait events such as HC and TO, defining the swing and stance phases of the gait cycle, in PD patients using a single pair of EMG probes placed bilaterally on the Vl muscle. Our proposed method may have substantial practical benefits in experimental setups in which EMG derivations are indispensable and where additional equipment for kinematic analysis (e.g., foot switches, IMUs, or a motion-capturing system) is either unavailable or would introduce undesired complexity, especially in severely ill patients. Furthermore, robust acquisition of EMG signals is necessary in experimental and commercial applications to achieve control of myoelectric interfaces for neuroprosthetics [29], including future adaptive DBS devices [30].

Rather than framing the prediction problem as one of binary classification [34], our approach consisted of two steps. First, the angular velocity at the left and right anklebones was predicted using the activity of between two and ten EMG probes. This mapping was learned a priori from training data for which both EMG and IMU recordings were available. Using carefully designed data features (temporally embedded, smoothed muscle activation time courses), a simple linear regression approach was found to be suitable to achieve sufficient reconstruction performance. Second, predefined rules were used to extract prominent events and the main phases of the gait cycle. These rules accommodate domain knowledge about the timing of events relative to each other, which constitutes a substantial advantage over algorithms that are completely naïve to the underlying data and frame gait cycle prediction as an abstract classification problem. Importantly, our approach does not require any calibration involving real IMU data, as models fitted a priori on a training cohort (e.g., the data reported here) can be readily applied to new patients. Due to the simplicity of our model, its application amounts to a simple linear filtering of the appropriately recorded and preprocessed EMG data and does not require any advanced machine learning software. In addition, our approach of approximating IMU time courses instead of individual events or categorical segmentation labels offers further advantages. These include the direct interpretation of the predicted time courses in terms of gait mechanics. Potential failure modes of the model (e.g., due to misplaced or noisy EMG probes) can easily be detected through visual inspection of the predicted time courses. Since SWP could be accurately detected even using reconstructed angular velocities and HC and TO were defined relative to SWP, our system achieved low numbers of event-detection errors and high overall accuracy regarding the determination of gait phases. It is also likely that our approach could be generalized to the extraction of other biomechanically relevant parameters of the upper and lower extremities.
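The statement that applying the model amounts to linear filtering can be illustrated with a minimal sketch: a linear model on temporally embedded features is equivalent to summing one finite impulse response filter per EMG channel. The function name, the coefficient layout, and the zero-padding at the trial edges are assumptions for illustration only.

```python
import numpy as np

def predict_by_filtering(emg, coefs):
    """Apply a fitted linear model as one FIR filter per channel.

    emg   : array (n_samples, n_channels) of preprocessed EMG
    coefs : array (n_channels, 2 * L + 1); coefs[c, k] weights channel c
            delayed by (k - L) samples (hypothetical layout)
    Returns the predicted trace of shape (n_samples,).
    """
    emg = np.asarray(emg, dtype=float)
    coefs = np.asarray(coefs, dtype=float)
    y = np.zeros(emg.shape[0])
    for ch in range(emg.shape[1]):
        # mode="same" centers the kernel, so lags run from -L to +L
        y += np.convolve(emg[:, ch], coefs[ch], mode="same")
    return y
```

Because half of the lags in such a symmetric window are negative, the filter in this sketch is non-causal: an online application would incur a delay equal to the largest negative lag.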

Contrary to our prediction, the EMG profiles of the S and Ta muscles were insufficient to reliably identify major gait cycle events in parkinsonian patients. We had based this prediction on the distinctive and synergistic activity of these two monoarticular (i.e., ankle) muscles during human locomotion. Indeed, normal EMG activity of the plantar flexors has been reported to mainly occur during the stance phase. In this phase, the triceps surae restrains the tibial rotation controlling for disequilibrium torque, which is responsible for propelling the body [46,47]. The ankle dorsiflexors are instead mainly active during the swing phase, controlling for sufficient foot clearance, with an additional contribution in the loading response phase for the lowering of the foot to the ground after HC [48], thus assisting the forward momentum of the tibia during the heel rocker action at the ankle [49]. These muscles, however, may show large stride-to-stride variability in EMG profiles [48], especially in patients with PD [50,51]. In particular, great intra- and inter-subject variability of Ta activity during gait has been described in parkinsonian patients in the meds-off state [50].

The prediction model did not improve when replacing the S muscle with the Gm or Gl or when adding either of these muscles to the S-Ta pair (data not shown). This was unexpected because, while the S muscle may provide less forward propulsion with physiological aging, the gastrocnemius muscle has been shown to maintain its contribution to initiating swinging limb movement [52,53], thus possibly allowing kinematic events to be more accurately detected. Rodriguez and colleagues demonstrated a simplification of modular control of locomotion in PD with individual muscle contribution of the gastrocnemius, but not the S, among ankle plantar flexors and the semimembranosus and biceps femoris for knee flexor musculature [54].

In our study, EMG recordings of the Vl provided the most accurate prediction of IMU time series and gait events. The activation pattern of this muscle during the gait cycle paralleled that of the Ta but was more selectively confined to the HC. This muscle controls the knee flexion that occurs after HC and ensures knee extension during terminal swing to prepare for ground contact [49,55].

In principle, there are an infinite number of different combinations of muscle activation that can be applied to maintain a particular posture or produce a given movement [56]. However, despite the apparent redundancy, four or five component activity patterns may be distributed to all the muscles that are specifically activated during locomotion; thus, the activation of each muscle involves a dynamic weighting of these basic patterns [57,58]. Interestingly, Ta, S, and Vl contributed differently to these factors [57,58]. Our results suggest that characteristic activity patterns of one pair—left and right Vl—are sufficient for the proper detection of gait events in patients with PD (H&Y: I–III).
