*2.3. EMA Data Capture and Analysis*

The CR230 remote-control device allowed subjects to control sound processor volume and sensitivity, as is typical for daily use (Figure 1). It also displayed the current listening environment class [32]. A large side button (conventionally used to enable the telecoil feature) was repurposed as a vote button. The data logging capability of the CR230 allowed the listening environment (Quiet, Speech, Speech in Noise, Noise, Wind, and Music) and the listening program to be recorded when the user pressed the vote button. These features provided a suitable platform for capturing EMA data. In this study, we investigated sound processing program preference through subject voting between a BEAM program and a ForwardFocus program in real-world environments. For analysis, only the listening environments relevant to communication were used, which excluded the Wind and Music classes.

Subjects were provided with the two programs and a sound processor remote control for a two-week period. During this period, they were asked to switch between programs each day so that they experienced both. Subjects were also instructed to complete at least one vote (data capture) each day, covering a range of their listening environments across the two-week period. To vote, subjects were instructed to switch between programs during normal use of the device and, after several changes back and forth, to vote for their preferred program by pressing the side button on the remote control.

Data capture of the subject's instantaneous listening environment was possible due to the SCAN scene classification algorithm available on the CP900 sound processor [32]. At each time instant, the environment is classified into one of six sound classes: Quiet, Speech, Speech in Noise, Noise, Music, and Wind. The algorithm extracts acoustic features such as sound level, modulation, and frequency spectrum from the microphone signal and then applies a decision tree to determine the sound class [32,33]. A data-driven machine learning approach was used to train the decision tree on sound recordings labelled by human listeners with the appropriate sound class. In contrast to the commercially available CP900, the classification system in this study did not make any automatic changes to the sound processing or program selection; it was only responsible for determining the sound class for the purpose of data logging.
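As a rough illustration of this feature-extraction-plus-decision-tree approach (and not the actual SCAN implementation), the following sketch computes a few frame-level acoustic features and classifies each frame with a decision tree trained on labelled frames; the feature set, synthetic training data, and function names are illustrative assumptions only.

```python
# Illustrative sketch of a data-driven scene classifier: acoustic features are
# extracted per analysis frame and fed to a decision tree trained on labelled
# recordings. Features, data, and names are hypothetical, not the SCAN algorithm.
import numpy as np
from sklearn.tree import DecisionTreeClassifier

CLASSES = ["Quiet", "Speech", "Speech in Noise", "Noise", "Music", "Wind"]

def extract_features(frame, fs=16000):
    """Compute simple frame-level features: level, modulation depth, spectral centroid."""
    level_db = 10 * np.log10(np.mean(frame ** 2) + 1e-12)        # broadband sound level
    envelope = np.abs(frame)
    modulation = np.std(envelope) / (np.mean(envelope) + 1e-12)  # envelope modulation depth
    spectrum = np.abs(np.fft.rfft(frame))
    freqs = np.fft.rfftfreq(len(frame), 1 / fs)
    centroid = np.sum(freqs * spectrum) / (np.sum(spectrum) + 1e-12)  # spectral balance
    return [level_db, modulation, centroid]

# Hypothetical labelled training material: synthetic (frame, class-index) pairs.
rng = np.random.default_rng(0)
train_frames = [rng.normal(size=1024) * (i + 1) for i in range(60)]
train_labels = [i % len(CLASSES) for i in range(60)]

X = np.array([extract_features(f) for f in train_frames])
clf = DecisionTreeClassifier(max_depth=4).fit(X, train_labels)

# At run time, each incoming frame is mapped to one of the six sound classes.
new_frame = rng.normal(size=1024)
print(CLASSES[clf.predict([extract_features(new_frame)])[0]])
```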

At the end of the two-week period, data logs containing the vote events, scene classification data, and program selection were downloaded. Analysis was first performed to exclude accidental votes and votes that did not show temporal coincidence with preceding changes between the two programs. To determine the sound class associated with each vote, the detected sound class was analyzed over the 10 s preceding the vote event. It was assumed that the evaluation of programs would likely have occurred over a period of time, possibly under different scene classifications. In cases where the sound class was variable, the vote was assigned to the dominant sound class over the 10 s preceding the vote event, and in the case of an equal distribution, to the most recently detected sound class. The preferred listening program was determined from the listening program that was selected at the time the vote button was pressed.
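The vote-labelling rule described above can be expressed compactly; the sketch below assumes a time-stamped log of regularly sampled (time, sound class) entries, with the names and log format chosen for illustration only.

```python
# Sketch of the vote-labelling rule: dominant class in the 10 s window before the
# vote, with ties broken by the most recently detected class. Assumes a log of
# regularly sampled (time_s, sound_class) entries; entry counts stand in for time share.
from collections import Counter

def scene_for_vote(log, vote_time_s, window_s=10.0):
    """Assign a sound class to a vote from the classification log."""
    window = [(t, c) for t, c in log if vote_time_s - window_s <= t <= vote_time_s]
    if not window:
        return None
    counts = Counter(c for _, c in window)
    top = max(counts.values())
    tied = {c for c, n in counts.items() if n == top}
    # Walk backwards in time; the first class found among the tied set wins.
    for _, c in sorted(window, key=lambda e: e[0], reverse=True):
        if c in tied:
            return c

# Example: equal time in Speech and Speech in Noise, with Speech detected most recently.
log = [(2, "Speech in Noise"), (5, "Speech in Noise"), (7, "Speech"), (9, "Speech")]
print(scene_for_vote(log, vote_time_s=10.0))  # -> "Speech"
```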

For each subject, raw vote data were aggregated separately for each acoustic scene and represented in a program versus scene matrix, where each element represented the number of votes for a given program in a given scene.
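A minimal sketch of this aggregation step is shown below, assuming each retained vote has been reduced to a (program, scene) pair; the program and scene names follow the description above, while the data format and variable names are assumptions.

```python
# Sketch of the per-subject aggregation into a program-by-scene vote matrix.
import numpy as np

PROGRAMS = ["BEAM", "ForwardFocus"]
SCENES = ["Quiet", "Speech", "Speech in Noise", "Noise"]  # communication-relevant classes

def vote_matrix(votes):
    """Return a matrix where element [i, j] counts votes for program i in scene j."""
    m = np.zeros((len(PROGRAMS), len(SCENES)), dtype=int)
    for program, scene in votes:
        m[PROGRAMS.index(program), SCENES.index(scene)] += 1
    return m

# Hypothetical votes for one subject.
votes = [("ForwardFocus", "Speech in Noise"), ("BEAM", "Quiet"),
         ("ForwardFocus", "Noise"), ("BEAM", "Speech")]
print(vote_matrix(votes))
```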

Statistical analysis was performed in R, version 4.1.1. Program preference (vote) was modelled as a binomial dependent variable using repeated measures logistic regression, fitting generalized linear models (GLMs) with the logit link function.
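The analysis itself was performed in R; purely for illustration, the sketch below shows an analogous model in Python, assuming a long-format table with one row per vote. Representing the repeated-measures structure by subject-level clustering in a generalized estimating equation is one possible analog, not necessarily the exact specification used in the study.

```python
# Rough Python analog of a repeated-measures logistic regression on vote data.
# The data frame below is fabricated for illustration only.
import pandas as pd
import statsmodels.api as sm
import statsmodels.formula.api as smf

df = pd.DataFrame({
    "subject": ["S1", "S1", "S2", "S2", "S3", "S3"],
    "scene":   ["Speech", "Noise", "Speech", "Noise", "Speech", "Noise"],
    "vote":    [1, 1, 0, 1, 1, 0],   # 1 = vote for ForwardFocus, 0 = vote for BEAM
})

# GEE with subject-level clustering; the Binomial family defaults to the logit link.
model = smf.gee("vote ~ scene", groups="subject", data=df,
                family=sm.families.Binomial())
result = model.fit()
print(result.summary())
```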

#### **3. Results**
